Home > Software design >  Apply same recoding rules to multiple data frames
Apply same recoding rules to multiple data frames

Time:01-19

I have 5 data frames. I want to recode all variables ending with "_comfort", "_agree", and "effective" using the same rules for each data frame. As is, the values in each column are 1:5 and I want is to recode 5's to 1, 4's to 2, 2's to 4, and 5's to 1 (3 will stay the same).
I do not want the end result to one merged dataset, but instead to apply the same recoding rules across all 5 independent data frames. For simplicity sake, let's just assume I have 2 data frames:

df1 <- data.frame(a_comfort = c(1, 2, 3, 4, 5),
        b_comfort = c(1, 2, 3, 4, 5),
        c_effective = c(1, 2, 3, 4, 5))

df2 <- data.frame(a_comfort = c(1, 2, 3, 4, 5),
        b_comfort = c(1, 2, 3, 4, 5),
        c_effective = c(1, 2, 3, 4, 5))

What I want is:

df1 <- data.frame(a_comfort = c(5, 4, 3, 2, 1),
        b_comfort = c(5, 4, 3, 2, 1),
        c_effective = c(5, 4, 3, 2, 1))

df2 <- data.frame(a_comfort = c(5, 4, 3, 2, 1),
        b_comfort = c(5, 4, 3, 2, 1),
        c_effective = c(5, 4, 3, 2, 1))

Conventionally, I would use dplyr's mutate_atand ends_withto achieve my goal, but have not been successful with this method across multiple data frames. I am thinking a combination of the purr and dplyr packages will work, but haven't nailed down the exact technique.

Thanks in advance for any help!

CodePudding user response:

You can use get() and assign() in a loop:

library(dplyr)

for (df_name in c("df1", "df2")) {
  df <- mutate(
    get(df_name),
    across(
      ends_with(c("_comfort", "_agree", "_effective")),
      \(x) 6 - x
    )
  )
  assign(df_name, df)
}

Result:

#> df1
  a_comfort b_comfort c_effective
1         5         5           5
2         4         4           4
3         3         3           3
4         2         2           2
5         1         1           1

#> df2
  a_comfort b_comfort c_effective
1         5         5           5
2         4         4           4
3         3         3           3
4         2         2           2
5         1         1           1

Note, however, it’s often better practice to keep multiple related dataframes in a list than loose in the global environment (see). In this case, you can use purrr::map() (or base::lapply()):

library(dplyr)
library(purrr)

dfs <- list(df1, df2)

dfs <- map(
  dfs,
  \(df) mutate(
    df,
    across(
      ends_with(c("_comfort", "_agree", "_effective")),
      \(x) 6 - x
    )
  )
)

Result:

#> dfs
[[1]]
  a_comfort b_comfort c_effective
1         5         5           5
2         4         4           4
3         3         3           3
4         2         2           2
5         1         1           1

[[2]]
  a_comfort b_comfort c_effective
1         5         5           5
2         4         4           4
3         3         3           3
4         2         2           2
5         1         1           1

CodePudding user response:

You can use ls(pattern = 'df\\d ') to find all objects whose names match a certain pattern. Then store them into a list and pass to purrr::map or lapply for recoding.

library(dplyr)

df.lst <- purrr::map(
  mget(ls(pattern = 'df\\d ')),
  ~ .x %>% mutate(6 - across(ends_with(c("_comfort", "_agree", "effective"))))
)

# $df1
#   a_comfort b_comfort c_effective
# 1         5         5           5
# 2         4         4           4
# 3         3         3           3
# 4         2         2           2
# 5         1         1           1
# 
# $df2
#   a_comfort b_comfort c_effective
# 1         5         5           5
# 2         4         4           4
# 3         3         3           3
# 4         2         2           2
# 5         1         1           1

You can further overwrite those dataframes in your workspace from the list through list2env().

list2env(df.lst, .GlobalEnv)

CodePudding user response:

Please try the below code where i convert the columns to factor and then recode them

data

  a_comfort b_comfort c_effective
1         1         1           1
2         2         2           2
3         3         3           3
4         4         4           4
5         5         5           5

code

library(tidyverse)

df1 %>% mutate(across(c(ends_with('comfort'),ends_with('effective')), ~ factor(.x, levels=c('1','2','3','4','5'), labels=c('5','4','3','2','1'))))

output

  a_comfort b_comfort c_effective
1         5         5           5
2         4         4           4
3         3         3           3
4         2         2           2
5         1         1           1

  • Related