Transform a common variable within a series of datasets without placing those datasets in a list in

Time:03-01

This seems like something I should know how to do. Say I have a series of datasets:

df1 <- data.frame(x = letters[1:6], y = rnorm(6))
df2 <- data.frame(x = letters[1:6], y = rnorm(6))
df3 <- data.frame(x = letters[1:6], y = rnorm(6))

Suppose I want to convert the x variable in each of these datasets from a character vector to a factor. I can put the datasets into a list:

dfList <- list(df1, df2, df3)

And then loop over that list, performing the necessary transformation:

for (i in seq_along(dfList)) {
  dfList[[i]][,"x"] <- factor(dfList[[i]][,"x"])
}

str(dfList[[2]])

# 'data.frame': 6 obs. of  2 variables:
#   $ x: Factor w/ 6 levels "a","b","c","d",..: 1 2 3 4 5 6
#   $ y: num  ...

But now the only way I can access the transformed versions of the datasets is through the list. How do I transform the datasets without having to place them inside a list?

I tried this

for (i in c(df1, df2, df3)) {
  i[,"x"] <- factor(i[,"x"])
}

But I got an error:

# Error in i[, "x"] : incorrect number of dimensions

CodePudding user response:
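
On the error in the question first: as far as I can tell, c() applied to data frames does not keep them as separate objects. It concatenates their columns into one plain list, so the loop variable is a single column (an atomic vector) and i[, "x"] has no second dimension to index. The sketch below only illustrates that; the names flat and first are made up for the example.

```r
df1 <- data.frame(x = letters[1:6], y = rnorm(6))
df2 <- data.frame(x = letters[1:6], y = rnorm(6))

# c() on data frames concatenates their columns into one plain list
flat <- c(df1, df2)
class(flat)    # "list"
length(flat)   # 4
names(flat)    # "x" "y" "x" "y"

# a for loop over `flat` therefore visits one column at a time
first <- flat[[1]]
is.data.frame(first)  # FALSE
dim(first)            # NULL, so first[, "x"] has nothing to index
```

Even if the loop variable were a data frame, assigning to i inside the loop would only modify a copy and leave df1 and df2 unchanged.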

It is usually advisable to keep the data in a list and work with it there. If you want to copy the changed values back to the individual data frames, you can use the list2env function.

dfList <- dplyr::lst(df1, df2, df3)
dfList <- lapply(dfList, function(df) transform(df, x = factor(x)))
list2env(dfList, .GlobalEnv)

str(df1)
#'data.frame':  6 obs. of  2 variables:
# $ x: Factor w/ 6 levels "a","b","c","d",..: 1 2 3 4 5 6
# $ y: num  -0.867 0.198 -1.628 -0.481 1.228 ...

The benefit of using dplyr::lst over base::list is that it returns a named list, which list2env needs in order to work.
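
If you would rather not depend on dplyr, a base R variant of the same idea should work as well. This is a minimal sketch, assuming the data frames are called df1, df2 and df3 as in the question and live in the environment where the code runs: mget() looks them up by name and returns a named list, which can then be passed to list2env.

```r
df1 <- data.frame(x = letters[1:6], y = rnorm(6))
df2 <- data.frame(x = letters[1:6], y = rnorm(6))
df3 <- data.frame(x = letters[1:6], y = rnorm(6))

# mget() fetches the objects by name and returns a named list
dfList <- mget(c("df1", "df2", "df3"))

# convert x to a factor in every data frame
dfList <- lapply(dfList, function(df) transform(df, x = factor(x)))

# copy each element back to a variable of the same name;
# environment() is the global environment when run at top level
list2env(dfList, envir = environment())

is.factor(df1$x)
```

The list names come from the character vector passed to mget(), so there is no need to name the elements by hand.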
