Many R functions return objects that are printed to the console in a special manner. For instance, t_results = t.test(c(1,2,3), c(1,2,4)) will assign a list to the t_results variable, but when I enter this variable in the console, or call it as print(t_results) or show(t_results), it prints some plain text information (such as Welch Two Sample t-test... etc.) instead of showing the actual list. (This is a base R function, but I've seen this implemented in many custom user R packages just as well.)
My question is: how do I do this for objects created in my own custom R package? I've read several related questions and answers (e.g., this, this, and this), which do give a general idea (using setMethod for my custom classes), but none of them makes it clear to me what exactly I need to do to make it work properly in a custom R package. I also cannot find any official documentation or tutorial on the matter.
To give an example of what I want to do, here is a very simple function from my hypothetical R package, which simply returns a small data.frame (with an arbitrary class name I add to it, here 'my_df_class'):
my_main_function = function() {
  my_df = data.frame(a = c('x1', 'y2', 'z2'),
                     b = c('x2', 'y2', 'z2'))
  class(my_df) = c(class(my_df), 'my_df_class')
  return(my_df)
}
I would like to have this printed/shown e.g. like this:
my_print_function = function(df) {
  cat('My results:', df$a[2], df$a[3])
}
# see my_print_function(my_main_function())
What exactly has to be done to make this work for my R package (i.e., that when someone installs my R package, assigns the my_main_function() results to a variable, and prints/shows that variable, it would be done via my_print_function())?
CodePudding user response:
Here is a small explanation. Adding to the amazing answer posted by @nya:
First, you are dealing with S3 classes. With S3, a single generic function (such as print) can handle objects differently depending on the class each object belongs to.
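To make the dispatch idea concrete, here is a minimal sketch. The generic describe() and its methods are hypothetical names made up for this illustration; they are not part of base R or of the package in the question.

```r
# Hypothetical generic: UseMethod() picks a method based on the class of x
describe <- function(x, ...) UseMethod("describe")

# Methods are ordinary functions named <generic>.<class>
describe.numeric   <- function(x, ...) "a numeric vector"
describe.character <- function(x, ...) "a character vector"
describe.default   <- function(x, ...) "something else"  # fallback method

describe(c(1.5, 2))  # dispatches to describe.numeric
describe("a")        # dispatches to describe.character
describe(TRUE)       # no logical method exists, so describe.default is used
```

The same mechanism is what print() relies on: print is the generic, and a function called print.some_class is picked up for objects of that class.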
Below is a simple class and how it operates:
- The class contains numbers,
- The values should be printed like 1K, 2K, 100K, 1M,
- The values can still be manipulated numerically.
Let's call the class my_numbers.
Now we will define the class constructor:
my_numbers = function(x) structure(x, class = c('my_numbers', 'numeric'))
Note that we added the class 'numeric', i.e. the class my_numbers inherits from the numeric class.
We can create an object of the said class as follows:
b <- my_numbers(c(100, 2000, 23455, 24567654, 2345323))
b
[1] 100 2000 23455 24567654 2345323
attr(,"class")
[1] "my_numbers" "numeric"
Nothing special has happened. Only a class attribute has been added to the vector. You can easily strip off the attribute by calling c(b):
c(b)
[1] 100 2000 23455 24567654 2345323
Apart from that attribute, b is just a normal vector of numbers.
Note that the class attribute could have been added in any of the following ways (and many more):
class(b) <- c('my_numbers', 'numeric')
attr(b, 'class') <- c('my_numbers', 'numeric')
attributes(b) <- list(class = c('my_numbers', 'numeric'))
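As a quick check that these routes are equivalent, the following sketch builds the object four ways and compares the results. The variable names b1 to b4 and cls are only for illustration.

```r
cls <- c('my_numbers', 'numeric')
v <- c(100, 2000, 23455)

b1 <- structure(v, class = cls)               # as in the constructor
b2 <- v; class(b2) <- cls                     # class<- replacement
b3 <- v; attr(b3, 'class') <- cls             # attr<- replacement
b4 <- v; attributes(b4) <- list(class = cls)  # attributes<- replacement

# TRUE if all four routes produced the same object
identical(b1, b2) && identical(b1, b3) && identical(b1, b4)

inherits(b1, 'numeric')  # checks whether 'numeric' is in the class vector
unclass(b1)              # removes the class attribute, leaving the plain vector
```

unclass() is another way, besides c(), to get back the underlying vector.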
Where is the magic?
I will write a simple function with recursion. Don't worry about the function implementation. We will just use it as an example.
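my_numbers_print = function(x, ..., digs = 2, d = 1, L = c('', 'K', 'M', 'B', 'T')){
  # while values >= 1000 remain, divide by 1000 and move to the next suffix
  ifelse(abs(x) >= 1000, Recall(x/1000, digs = digs, d = d + 1, L = L),
         sprintf(paste0('%.', digs, 'f%s'), x, L[d]))
}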
my_numbers_print(b)
[1] "100.00" "2.00K" "23.45K" "24.57M" "2.35M"
There is still no magic. That is just a normal function called on b.
Instead of calling my_numbers_print directly, we could write another function named print.my_numbers, following the pattern generic.class_name, where print is the generic. (Note that I added the parameter quote = FALSE.)
print.my_numbers = function(x, ..., quote = FALSE){
  print(my_numbers_print(x), quote = quote)
}
b
[1] 100.00 2.00K 23.45K 24.57M 2.35M
Now b has been printed nicely. We can still do math on b:
b^2
[1] 10.00K 4.00M 550.14M 603.57T 5.50T
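The result of b^2 is printed in the short form because arithmetic operators carry the attributes of their operand over to the result, so it is still a my_numbers object. A small sketch to check this; it only needs the constructor, and the name squared is just for illustration.

```r
my_numbers <- function(x) structure(x, class = c('my_numbers', 'numeric'))
b <- my_numbers(c(100, 2000))

squared <- b^2
class(squared)    # the class attribute is carried over by the arithmetic
unclass(squared)  # the underlying values are ordinary doubles
```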
Can we add b to a data frame?
data.frame(b)
         b
1      100
2     2000
3    23455
4 24567654
5  2345323
Inside the data frame, b is displayed as plain numbers instead of in its short form. That is because data frames are printed via the format function rather than print, so we need to provide a format method as well.
Ideally, the correct way to do this is to create a format method first and then have the print method call it. (This is getting long, so here is a summary.)
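To see the relationship between format and print in isolation, here is a minimal sketch. The formatter is deliberately simplified (everything is shown in thousands) and is only a stand-in for illustration, not the recursive one used in this answer.

```r
my_numbers <- function(x) structure(x, class = c('my_numbers', 'numeric'))

# Simplified, hypothetical formatter: show every value in thousands
format.my_numbers <- function(x, ...) sprintf('%.1fK', unclass(x) / 1000)

# print() delegates to format(), so both code paths produce the same text
print.my_numbers <- function(x, ...) {
  print(format(x), quote = FALSE)
  invisible(x)
}

b <- my_numbers(c(100, 2000))
shown_vector <- capture.output(print(b))              # uses print.my_numbers
shown_frame  <- capture.output(print(data.frame(b)))  # uses format.my_numbers
shown_vector
shown_frame
```

Keeping the formatting logic in one place means the vector and the data frame column cannot drift apart.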
Summary: Everything Put Together
# Create a my_numbers class definition function
my_numbers = function(x) structure(x, class = c('my_numbers', 'numeric'))
# format the numbers
format.my_numbers = function(x, ..., digs = 1, d = 1, L = c('', 'K', 'M', 'B', 'T')){
  ifelse(abs(x) >= 1000, Recall(x/1000, digs = digs, d = d + 1, L = L),
         sprintf(paste0('%.', digs, 'f%s'), x, L[d]))
}
#printing the numbers
print.my_numbers = function(x, ...) print(format(x), quote = FALSE)
# ensure class is maintained after extraction to allow for sort/order etc
'[.my_numbers' = function(x, ..., drop = FALSE) my_numbers(NextMethod('['))
b <- my_numbers(c(2000, 100, 20, 23455, 24567654, 2345323))
data.frame(x = sort(-b) / 2)
       x
1 -12.3M
2  -1.2M
3 -11.7K
4  -1.0K
5  -50.0
6  -10.0
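The '[' method matters because default subsetting drops the class attribute. The following sketch compares the class of a subset before and after the method is defined; the variable names class_without and class_with are only for illustration.

```r
my_numbers <- function(x) structure(x, class = c('my_numbers', 'numeric'))
b <- my_numbers(c(2000, 100, 20))

# Without a '[' method, default subsetting drops the class attribute
class_without <- class(b[1:2])

# Re-wrap the result of the default method with the constructor
`[.my_numbers` <- function(x, ..., drop = FALSE) my_numbers(NextMethod('['))
class_with <- class(b[1:2])

class_without
class_with
```

Functions such as sort() subset classed objects with '[', which is why the method is needed for the sorted column above to keep its formatting.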
CodePudding user response:
The easiest way to use a specific function for a class is to define it as an S3 method for that class, here a method for the print generic.
print.my_df_class = function(x, ...) {
  # the signature matches the print(x, ...) generic
  cat('My results:', x$a[2], x$a[3], '\n')
  invisible(x)
}
Note that because the data.frame class comes first in the class vector on the line class(my_df) = c(class(my_df), 'my_df_class'), print() dispatches to the data.frame method and prints the object as a regular data frame.
print(my_main_function())
# a b
# 1 x1 x2
# 2 y2 y2
# 3 z2 z2
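The order of the class vector decides which method is tried first. One option not spelled out in this answer is to put the new class in front, which keeps the data.frame behaviour for everything except printing. A minimal sketch, where the names appended and prepended are only for illustration:

```r
# Custom print method for the new class
print.my_df_class <- function(x, ...) {
  cat('My results:', x$a[2], x$a[3], '\n')
  invisible(x)
}

my_df <- data.frame(a = c('x1', 'y2', 'z2'),
                    b = c('x2', 'y2', 'z2'),
                    stringsAsFactors = FALSE)

# Appended: 'data.frame' comes first, so the data.frame method is used
appended <- my_df
class(appended) <- c(class(appended), 'my_df_class')

# Prepended: 'my_df_class' comes first, so print.my_df_class is used
prepended <- my_df
class(prepended) <- c('my_df_class', class(prepended))

out_appended  <- capture.output(print(appended))
out_prepended <- capture.output(print(prepended))
out_appended
out_prepended
```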
You can either call print.my_df_class() directly, or modify the class assignment in my_main_function().
my_main_function = function() {
  my_df = data.frame(a = c('x1', 'y2', 'z2'),
                     b = c('x2', 'y2', 'z2'))
  class(my_df) = 'my_df_class'
  return(my_df)
}
Then you can call print without naming the class-specific function to get the class-specific output.
print(my_main_function())
# My results: y2 z2
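For the package side of the question, the method also has to be registered so that it is found once the package is installed. Below is a minimal sketch, assuming the package is documented with roxygen2 (an assumption, since the question does not say which tooling is used). The file name R/print.R is hypothetical.

```r
# R/print.R (hypothetical file inside the package)

#' Create the example result object
#' @export
my_main_function <- function() {
  my_df <- data.frame(a = c('x1', 'y2', 'z2'),
                      b = c('x2', 'y2', 'z2'),
                      stringsAsFactors = FALSE)
  class(my_df) <- 'my_df_class'
  my_df
}

#' Print method for my_df_class objects
#'
#' With roxygen2, the export tag on an S3 method writes
#' S3method(print, my_df_class) into NAMESPACE. Without roxygen2,
#' that line can be added to NAMESPACE by hand.
#' @param x an object of class my_df_class
#' @param ... ignored, present to match the print() generic
#' @export
print.my_df_class <- function(x, ...) {
  cat('My results:', x$a[2], x$a[3], '\n')
  invisible(x)  # print methods conventionally return their input invisibly
}
```

With the method registered, a user who installs the package and types the name of a variable holding the my_main_function() result at the console should see the output of print.my_df_class(), because auto-printing goes through the print generic.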