I have several dataframes with similar variables that I would like to loop through (variables "a" and "c" in the example) in order to change certain values (-1, 9, 98) to missing values (NA). I would like to achieve this through a nested loop, by putting the dataframes in a list and the variable names that I want to loop through in a vector.
df1 <- data.frame(a = c(-1, 1, 0, 3), b = c(4, 9, 0, -1), c = c(2, 0, 98, -1), d = c(3, 4, 3, 0))
df2 <- data.frame(a = c(3, 4, -1, 98), b = c(1, 3, 2, 9), c = c(9, -1, 0, 2), d = c(1, 4, 0, -1))
df3 <- data.frame(a = c(2, 4, 3, -1), b = c(9, 98, 0, 2), c = c(1, 2, -1, 1), d = c(3, 3, 0, 1))
df4 <- data.frame(a = c(-1, -1, 0, 0), b = c(4, -1, 9, 0), c = c(9, -1, 2, 0), d = c(1, -1, 2, 0))
dfs <- list(df1, df2, df3, df4)
vars <- c("a", "c")
for(i in dfs) {
for(x in vars) {
i %>% replace_with_na(replace = list(x = c(-1, 9, 98)))
}
}
I am imagining something like the code above. replace_with_na is taken from the naniar package.
In the last step I would like to extract the dataframes from the list again (which I have not found out how to do either).
Thanks for any suggestions!
CodePudding user response:
You could use a tidyverse approach on this task:
library(dplyr)
library(purrr)
dfs %>%
map(~mutate(.x, across(all_of(vars), ~ifelse(.x %in% c(-1, 9, 98), NA, .x))))
This returns
[[1]]
   a  b  c d
1 NA  4  2 3
2  1  9  0 4
3  0  0 NA 3
4  3 -1 NA 0

[[2]]
   a b  c  d
1  3 1 NA  1
2  4 3 NA  4
3 NA 2  0  0
4 NA 9  2 -1

[[3]]
   a  b  c d
1  2  9  1 3
2  4 98  2 3
3  3  0 NA 0
4 NA  2  1 1

[[4]]
   a  b  c  d
1 NA  4 NA  1
2 NA -1 NA -1
3  0  9  2  2
4  0  0  0  0
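If you would rather keep the nested loop from the question, a base R version is possible too. The original loop has two problems: `for (i in dfs)` gives you a copy of each data frame, and the result of the pipe is never assigned back, so the list is never changed. The following is only a sketch of one way around that, looping over positions instead of elements; `na_codes` is a name made up for this example, and only two of the four data frames are used to keep it short.

```r
# Two of the data frames from the question, for brevity
df1 <- data.frame(a = c(-1, 1, 0, 3), b = c(4, 9, 0, -1), c = c(2, 0, 98, -1), d = c(3, 4, 3, 0))
df2 <- data.frame(a = c(3, 4, -1, 98), b = c(1, 3, 2, 9), c = c(9, -1, 0, 2), d = c(1, 4, 0, -1))
dfs <- list(df1, df2)
vars <- c("a", "c")

# Hypothetical helper name: the codes that should become NA
na_codes <- c(-1, 9, 98)

# Loop over list positions so that the assignment modifies dfs itself
for (i in seq_along(dfs)) {
  for (x in vars) {
    # Logical index of the cells in column x that hold one of the codes
    is_code <- dfs[[i]][[x]] %in% na_codes
    dfs[[i]][[x]][is_code] <- NA
  }
}

dfs
```

Columns b and d are left alone because they are not listed in vars.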
CodePudding user response:
I had the same tidyverse approach as Martin Gal. To extract the dataframes back, you would need to assign the output of map to dfs, and then do
dfs %>% walk2(seq_along(dfs), ~assign(paste0("df",.y), .x, envir = .GlobalEnv))
Or alternatively, just call walk2 in a pipe after the map call.
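A base R alternative for the extraction step might be list2env, which copies each named element of a list into an environment. This is only a sketch: it assumes the list elements should be called df1, df2, ... by position, and the small list below is a stand-in for the cleaned output of map. Note that it overwrites any existing objects with the same names.

```r
# Stand-in for the cleaned list (two small data frames for brevity)
dfs <- list(
  data.frame(a = c(NA, 1, 0, 3), b = c(4, 9, 0, -1)),
  data.frame(a = c(3, 4, NA, NA), b = c(1, 3, 2, 9))
)

# Name the elements df1, df2, ... so each one maps to a variable name
names(dfs) <- paste0("df", seq_along(dfs))

# Copy every named element into the global environment;
# invisible() suppresses printing of the returned environment
invisible(list2env(dfs, envir = .GlobalEnv))

df1
```

That said, it is often easier to keep the data frames in the (named) list and access them as dfs$df1 or dfs[["df1"]].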