I have a function to which I pass certain values that are levels of a factor. I would like to revalue the levels of the factor based on these values.
For example, if I want to revalue levels in the factor my_factor ("cats" -> "animals", "pines" -> "trees"), I use:
my_factor <- revalue(my_factor, c("cats"="animals", "pines"="trees"))
But now I want to revalue levels based on the values of arguments passed to the function:
myFunction <- function(..., member1 = "cats", member2 = "pines") {
  my_factor <- revalue(my_factor, c(member1="animals", member2="trees"))
}
This fragment of code isn't working (Error: The following from values were not present in x: member1, member2).
How can I do this correctly? Perhaps I need to use something other than revalue.
CodePudding user response:
You reference plyr, but that package is "retired" and its use is generally not recommended. I'm not going to attempt a solution in the dplyr manner, since I don't have a sufficient command of its various levels of abstraction.
The base function levels<- will do this cleanly. When you do something like:
levels(fac)[some_index] <- "something"
you change the printed label of that level without changing the underlying pattern of factor integers that carry the information. So use levels(fac) twice: once to get the current values of the levels to create a logical index to use inside "[", and again on the "outside" of the LHS to do the reassignment:
levels(fac)[ levels(fac) == "cats"] <- "animals"
levels(fac)[ levels(fac) == "pines"] <- "trees"
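As a quick check of the claim that only the labels change, here is a minimal sketch on a made-up factor (the data in fac are an assumption; the original post does not supply any). It applies the same two assignments and compares the integer codes before and after:

```r
# Hypothetical example factor; not taken from the question
fac <- factor(c("cats", "pines", "cats", "dogs"))

# Remember the underlying integer codes before relabelling
codes_before <- as.integer(fac)

# Relabel two levels with the base replacement function `levels<-`
levels(fac)[levels(fac) == "cats"] <- "animals"
levels(fac)[levels(fac) == "pines"] <- "trees"

print(levels(fac))
# The integer codes are untouched; only the labels differ
print(identical(codes_before, as.integer(fac)))
```

Note that this holds when the new labels are not already existing levels; assigning a label that already exists would merge the two levels.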
You are actually using two different functions: levels<- (on the outside) and levels (on the inside). To make this process into a function that can handle an arbitrary number of reassignments, you would want the reassignment pairs to be carried in a list of lists so you could iterate over the pairs. Your current request is attempting to use a language-like expression such as "cats" = "animals", but inside c() that creates an element named cats with a value of "animals"; in the same way, member1 = "animals" creates an element literally named member1, which is why revalue complains. Looking at the code of plyr::revalue, I can see that it then needs to undo that construction before it sends the names and values to mapvalues, which works with two separate sets of parameters. At any rate, here's an old-school attempt.
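reval <- function(fac, reassigns) {
  # Start from the current level labels
  levs <- levels(fac)
  # Each pair is list(current_label, new_label)
  for (pair in reassigns) {
    levs[levs == pair[[1]]] <- pair[[2]]
  }
  # Return the updated labels, ready for levels(fac) <- ...
  levs
}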
And you would call it like this:
levels(facname) <- reval(facname, list(list("curlev1", "newlev1"),
                                       list("curlev2", "newlev2")))
If you have data that follows the example naming you used, my_factor ("cats" -> "animals", "pines" -> "trees"), then test it with:
levels(my_factor) <- reval(my_factor, reassigns = list(list("cats", "animals"),
                                                       list("pines", "trees")))
If it doesn't work, then you should post R code to create an example that can be used for further development and testing. Also, looking at the dplyr index, I see the recode function, which has a factor method. This is the example from that help page that appears to match your needs:
# For factor values, use only named replacements
# and supply default with levels()
factor_vec <- factor(c("a", "b", "c"))
recode(factor_vec, a = "Apple", .default = levels(factor_vec))
As (almost) always, R will not actually modify factor_vec unless you assign the result of recode back to the original name:
factor_vec <- recode(factor_vec, a = "Apple", .default = levels(factor_vec))
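Coming back to the original question of driving the renaming from function arguments, the following is only a sketch in base R. The function name revalue_by_args, its signature, and the sample data are assumptions made for illustration. The key point is that setNames() builds the names from the values of member1 and member2, whereas c(member1 = "animals") would name the element literally "member1":

```r
# Hypothetical wrapper; name and arguments are illustrative only
revalue_by_args <- function(fac, member1 = "cats", member2 = "pines") {
  # Names come from the argument *values*, not from the argument names
  replacements <- setNames(c("animals", "trees"), c(member1, member2))

  # Position of each current level in the replacement table (NA if absent)
  hit <- match(levels(fac), names(replacements))
  found <- !is.na(hit)

  # Relabel only the levels that have a replacement
  levels(fac)[found] <- unname(replacements[hit[found]])
  fac
}

# Made-up data for the demonstration
my_factor <- factor(c("cats", "pines", "oaks", "cats"))
my_factor <- revalue_by_args(my_factor)
print(levels(my_factor))
```

If you would rather stay with plyr, the same named vector built by setNames() should be usable as the replace argument of revalue, since that function reads the old values from the names of the vector.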