How do I loop through several dataframes and variables in a nested loop to change values of variable

Time:04-14

I have several dataframes that share some variables ("a" and "c" in the example). I would like to loop through those variables and change certain values (-1, 9, 98) to missing values (NA). I would like to do this with a nested loop, putting the dataframes in a list and the names of the variables to loop through in a vector.

df1 <- data.frame(a = c(-1, 1, 0, 3), b = c(4, 9, 0, -1), c = c(2, 0, 98, -1), d = c(3, 4, 3, 0))
df2 <- data.frame(a = c(3, 4, -1, 98), b = c(1, 3, 2, 9), c = c(9, -1, 0, 2), d = c(1, 4, 0, -1))
df3 <- data.frame(a = c(2, 4, 3, -1), b = c(9, 98, 0, 2), c = c(1, 2, -1, 1), d = c(3, 3, 0, 1))
df4 <- data.frame(a = c(-1, -1, 0, 0), b = c(4, -1, 9, 0), c = c(9, -1, 2, 0), d = c(1, -1, 2, 0))

dfs <- list(df1, df2, df3, df4)

vars <- c("a", "c")



for(i in dfs) {
  for(x in vars) {
    i %>% replace_with_na(replace = list(x = c(-1, 9, 98)))
  } 
}

I am imagining something like the code above. replace_with_na comes from the naniar package.

As a last step, I would like to extract the dataframes from the list again (which I have not figured out how to do either).

Thanks for any suggestions!

CodePudding user response:

You could use a tidyverse approach on this task:

library(dplyr)
library(purrr)

dfs %>% 
  map(~ mutate(.x, across(all_of(vars), ~ ifelse(.x %in% c(-1, 9, 98), NA, .x))))

This returns

[[1]]
   a  b  c d
1 NA  4  2 3
2  1  9  0 4
3  0  0 NA 3
4  3 -1 NA 0

[[2]]
   a b  c  d
1  3 1 NA  1
2  4 3 NA  4
3 NA 2  0  0
4 NA 9  2 -1

[[3]]
   a  b  c d
1  2  9  1 3
2  4 98  2 3
3  3  0 NA 0
4 NA  2  1 1

[[4]]
   a  b  c  d
1 NA  4 NA  1
2 NA -1 NA -1
3  0  9  2  2
4  0  0  0  0
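
For anyone who wants to keep the nested loop from the question, here is a minimal base R sketch. It does not use naniar; it swaps in plain %in% indexing instead. The loop in the question iterates over copies of the data frames and never stores the result, and inside list(x = ...) the x is taken as the literal name "x" rather than the loop variable. This version therefore loops over list positions and assigns back into dfs. The name na_codes is made up for illustration.

```r
# Same example data as in the question
df1 <- data.frame(a = c(-1, 1, 0, 3), b = c(4, 9, 0, -1), c = c(2, 0, 98, -1), d = c(3, 4, 3, 0))
df2 <- data.frame(a = c(3, 4, -1, 98), b = c(1, 3, 2, 9), c = c(9, -1, 0, 2), d = c(1, 4, 0, -1))
df3 <- data.frame(a = c(2, 4, 3, -1), b = c(9, 98, 0, 2), c = c(1, 2, -1, 1), d = c(3, 3, 0, 1))
df4 <- data.frame(a = c(-1, -1, 0, 0), b = c(4, -1, 9, 0), c = c(9, -1, 2, 0), d = c(1, -1, 2, 0))

dfs <- list(df1, df2, df3, df4)
vars <- c("a", "c")
na_codes <- c(-1, 9, 98)  # hypothetical name for the values to treat as missing

for (i in seq_along(dfs)) {      # loop over positions so we can write back into dfs
  for (x in vars) {              # x holds the column name as a string
    hit <- dfs[[i]][[x]] %in% na_codes   # TRUE where the value should become NA
    dfs[[i]][[x]][hit] <- NA
  }
}
```

Columns "b" and "d" are left untouched because they are not in vars.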

CodePudding user response:

I had the same tidyverse approach as Martin Gal. To get the dataframes back out of the list, you would need to assign the output of map to dfs, and then run

dfs %>% walk2(seq_along(dfs), ~assign(paste0("df",.y), .x, envir = .GlobalEnv))

Alternatively, just call walk2 in the same pipe, directly after the map call.
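
If you would rather avoid assign, a base R alternative is to name the list elements and hand the list to list2env. This is only a sketch: the small dfs below is stand-in data, not the real cleaned list, and it assumes you want the objects named df1, df2, and so on in the global environment.

```r
# Stand-in for the cleaned list (illustrative data only)
dfs <- list(
  data.frame(a = c(NA, 1), c = c(2, NA)),
  data.frame(a = c(3, NA), c = c(NA, 0))
)

# list2env needs a named list; each name becomes an object name
names(dfs) <- paste0("df", seq_along(dfs))

# Copy every list element into the global environment;
# invisible() just suppresses printing of the returned environment
invisible(list2env(dfs, envir = globalenv()))
```

After this, df1 and df2 exist as ordinary dataframes. Note that this overwrites any existing objects with the same names.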
