How can I revalue elements in factor based on values of arguments, passed to the function in R?

Time:11-07

I have a function to which I pass certain values that are levels of a factor. I would like to revalue the levels of the factor based on these values.

For example, if I want to revalue levels in the factor my_factor ("cats" -> "animals", "pines" -> "trees"), I use: my_factor <- revalue(my_factor, c("cats"="animals", "pines"="trees")). But now I want to revalue levels based on the values of arguments passed to a function:

myFunction <- function(..., member1 = "cats", member2 = "pines") {
  my_factor <- revalue(my_factor, c(member1 = "animals", member2 = "trees"))
}

This fragment of code isn't working (Error: The following `from` values were not present in `x`: member1, member2).

How can I do this correctly?
Perhaps I need to use something other than revalue.

CodePudding user response:

You reference plyr, but that package is "retired" and its use is generally not recommended. I'm not going to attempt a solution in the dplyr manner, since I don't have a sufficient command of its various levels of abstraction.

The base function levels<- will do this cleanly. When you do something like:

levels(fac)[some_index] <- "something"

This changes the printed label of that level without changing the underlying integer codes that carry the factor's information. So call levels(fac) twice: once inside "[" to get the current level labels and build a logical index, and again on the "outside" of the left-hand side to do the reassignment:

levels(fac)[ levels(fac) == "cats"] <- "animals"
levels(fac)[ levels(fac) == "pines"] <- "trees"
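
To check that only the labels change, a small sketch like the following may help. The factor here is made up for illustration and does not come from the question:

```r
# Hypothetical example data (not from the question)
fac <- factor(c("cats", "pines", "dogs", "cats"))
codes_before <- as.integer(fac)

# Relabel two levels in place
levels(fac)[levels(fac) == "cats"] <- "animals"
levels(fac)[levels(fac) == "pines"] <- "trees"

codes_after <- as.integer(fac)

# The integer codes behind the factor should be untouched
identical(codes_before, codes_after)

# Only the level labels differ
levels(fac)
```

The level that was not mentioned ("dogs" in this made-up data) keeps its label.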

You are actually using two different functions: levels<- (on the outside) and levels (on the inside). To turn this process into a function that can handle an arbitrary number of reassignments, you would want the reassignment pairs to be carried in a list of lists so you can iterate over the pairs. Your current request attempts to use a language-like expression such as "cats" = "animals", which creates an element named cats with a value of "animals"; written as member1 = "animals", the name is taken literally as member1 rather than as the value stored in that argument. Looking at the code of plyr::revalue, I can see that it then needs to undo that construction before it sends the names and values to mapvalues, which works with two separate sets of parameters. At any rate, here's an old-school attempt.

reval <- function(fac, reassigns) {
  levs <- levels(fac)
  # each element of reassigns is a pair: list(current_level, new_level)
  for (pair in reassigns) {
    levs[levs == pair[[1]]] <- pair[[2]]
  }
  levs  # return the updated level labels
}

And you would call it like this:

levels(facname) <- reval(facname, list(list("curlev1", "newlev1"),
                                       list("curlev2", "newlev2")))

If you have a factor that matches the naming in your example (my_factor with "cats" -> "animals", "pines" -> "trees"), then test it with:

levels(my_factor) <- reval(my_factor, reassigns = list(list("cats", "animals"),
                                                       list("pines", "trees")))
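
As a rough, self-contained sketch of how the pairs approach could be driven by function arguments, as the question asks: the names relabel_pairs and my_function and the sample data below are hypothetical, chosen only for illustration.

```r
# Hypothetical helper: returns updated level labels for a list of
# (current_level, new_level) pairs
relabel_pairs <- function(fac, reassigns) {
  levs <- levels(fac)
  for (pair in reassigns) {
    levs[levs == pair[[1]]] <- pair[[2]]
  }
  levs
}

# Hypothetical wrapper mirroring the question's member1/member2 arguments;
# the values of the arguments, not their names, select the levels
my_function <- function(fac, member1 = "cats", member2 = "pines") {
  levels(fac) <- relabel_pairs(fac, list(list(member1, "animals"),
                                         list(member2, "trees")))
  fac
}

# Made-up data for testing
my_factor <- factor(c("cats", "pines", "dogs", "cats"))
my_factor <- my_function(my_factor)
as.character(my_factor)
```

Passing the factor in as an argument and returning it avoids relying on a my_factor object that lives outside the function.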

If it doesn't work, then you should post R code that creates an example that can be used for further development and testing. Also, looking at the dplyr index, I see the recode function, which has a factor method. This is the example from that help page that appears to match what you want:

library(dplyr)
# For factor values, use only named replacements
# and supply default with levels()
factor_vec <- factor(c("a", "b", "c"))
recode(factor_vec, a = "Apple", .default = levels(factor_vec))

As (almost) always, R will not actually modify factor_vec unless you assign the result of recode back to the original name:

factor_vec <- recode(factor_vec, a = "Apple", .default = levels(factor_vec))
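
Coming back to the original error, the sketch below uses only base R to show why c(member1 = ...) produced names that were not levels, and how setNames() can build the named vector from the argument values instead. The variable names and sample data are made up for illustration. A vector shaped like key (old values as names, new values as elements) should also be usable as the replace argument of revalue, though that is not run here.

```r
member1 <- "cats"
member2 <- "pines"

# c() takes the argument names literally, so the names are
# "member1" and "member2", not "cats" and "pines"
literal <- c(member1 = "animals", member2 = "trees")
names(literal)

# setNames() uses the values held in the variables as the names
key <- setNames(c("animals", "trees"), c(member1, member2))
names(key)

# Base R relabelling driven by the named vector (made-up data)
my_factor <- factor(c("cats", "pines", "dogs", "cats"))
hit <- levels(my_factor) %in% names(key)
levels(my_factor)[hit] <- unname(key[levels(my_factor)[hit]])
levels(my_factor)
```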