I have a list containing several data frames, like this:
list:
$dataframes1
A | B |
---|---|
1 | 2 |
NA | 3 |
1 | 2 |
NA | NA |
1 | NA |
NA | NA |
1 | 2 |
NA | 3 |
1 | 2 |
1 | NA |
$dataframes2
A | B |
---|---|
1 | 2 |
1 | 3 |
1 | 2 |
1 | 3 |
7 | 2 |
1 | 3 |
1 | 2 |
5 | 3 |
7 | 2 |
1 | 3 |
$dataframes3
A | B |
---|---|
NA | 2 |
1 | 3 |
NA | 2 |
1 | 3 |
NA | 2 |
1 | 3 |
NA | 2 |
1 | 3 |
NA | 2 |
1 | 3 |
$dataframes4
A | B |
---|---|
1 | 2 |
1 | 3 |
3 | 2 |
1 | 3 |
3 | 4 |
1 | 3 |
5 | 5 |
5 | 3 |
1 | NA |
1 | 3 |
They all have the same number of rows and the same variables (A and B), and the data contain some NAs. If a variable in a data frame contains more than 3 NAs, the whole column should be replaced with NAs; otherwise it should stay unchanged. For example:
$dataframes3
A | B |
---|---|
NA | 2 |
NA | 3 |
NA | 2 |
NA | 3 |
NA | 2 |
NA | 3 |
NA | 2 |
NA | 3 |
NA | 2 |
NA | 3 |
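To make the rule concrete, here is a minimal sketch that rebuilds dataframes3 from the table further up and counts the NAs per column. The names df3, na_counts and over_limit are only for illustration.

```r
# dataframes3 as shown in the input list (before conversion)
df3 <- data.frame(
  A = c(NA, 1, NA, 1, NA, 1, NA, 1, NA, 1),
  B = c(2, 3, 2, 3, 2, 3, 2, 3, 2, 3)
)

# Number of NAs in each column
na_counts <- colSums(is.na(df3))

# TRUE for columns that exceed the limit of 3 NAs
over_limit <- na_counts > 3
```

Here column A holds 5 NAs, so it is over the limit, while column B holds none and should be left alone.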
My expected output:
list:
$dataframes1
A | B |
---|---|
NA | NA |
NA | NA |
NA | NA |
NA | NA |
NA | NA |
NA | NA |
NA | NA |
NA | NA |
NA | NA |
NA | NA |
$dataframes2
A | B |
---|---|
1 | 2 |
1 | 3 |
1 | 2 |
1 | 3 |
7 | 2 |
1 | 3 |
1 | 2 |
5 | 3 |
7 | 2 |
1 | 3 |
$dataframes3
A | B |
---|---|
NA | 2 |
NA | 3 |
NA | 2 |
NA | 3 |
NA | 2 |
NA | 3 |
NA | 2 |
NA | 3 |
NA | 2 |
NA | 3 |
$dataframes4
A | B |
---|---|
1 | 2 |
1 | 3 |
3 | 2 |
1 | 3 |
3 | 4 |
1 | 3 |
5 | 5 |
5 | 3 |
1 | NA |
1 | 3 |
Is there any way to convert the data frames in the list without using more than two for loops?
I used three for loops to do the conversion, and it runs very slowly.
Maybe applying lapply to each data frame would be a good solution, but would the code become difficult to read and debug?
CodePudding user response:
This should do the trick:
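library(purrr)
library(dplyr)
# For each data frame, replace any column holding more than 3 NAs with NAs
map(dataframes, function(df) {
  mutate(
    df,
    across(
      everything(),
      # `col` is a single column of the current data frame
      function(col) if (sum(is.na(col)) > 3) rep(NA, length(col)) else col
    )
  )
})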
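If you would rather stay in base R, the lapply idea from the question can be sketched as below. This is only one possible way to write it: the helper name blank_sparse_columns and its max_na argument are made up for illustration, and the sample list rebuilds three of the data frames from the question.

```r
# Hypothetical helper: blank out every column that has more than `max_na` NAs
blank_sparse_columns <- function(df, max_na = 3) {
  # df[] <- ... keeps the data frame structure while replacing its columns
  df[] <- lapply(df, function(col) {
    if (sum(is.na(col)) > max_na) rep(NA, length(col)) else col
  })
  df
}

# Sample input copied from the tables in the question
dataframes <- list(
  dataframes1 = data.frame(
    A = c(1, NA, 1, NA, 1, NA, 1, NA, 1, 1),
    B = c(2, 3, 2, NA, NA, NA, 2, 3, 2, NA)
  ),
  dataframes3 = data.frame(
    A = c(NA, 1, NA, 1, NA, 1, NA, 1, NA, 1),
    B = c(2, 3, 2, 3, 2, 3, 2, 3, 2, 3)
  ),
  dataframes4 = data.frame(
    A = c(1, 1, 3, 1, 3, 1, 5, 5, 1, 1),
    B = c(2, 3, 2, 3, 4, 3, 5, 3, NA, 3)
  )
)

# One lapply over the list, one lapply over the columns; no explicit for loops
result <- lapply(dataframes, blank_sparse_columns)
```

Keeping the per-data-frame logic in a small named function should also make it easier to read and debug than nested loops, since you can call it on a single data frame and inspect the result.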