Remove rows from list of dataframes based on condition

Time:02-22

I have a list of dataframes. It looks something like this:

df1 <- data.frame(Var1 = c(1, 7, 9, 4, 2),
                  Var2 = c(7, 2, 4, 4, 3),
                  Var3 = c(3, 6, 2, 0, 8)) 

df2 <- data.frame(Var1 = c(5, 6, 2, 2, 1),
                  Var2 = c(8, 6, 6, 7, 4),
                  Var3 = c(9, 0, 1, 3, 4)) 

df3.wxyz <- data.frame(Var1 = c("w", "x", "y", "z", 3, 7, 3, 6, 6),
                       Var2 = c(NA, NA, NA, NA, 7, 5, 8, 0, 2),
                       Var3 = c(NA, NA, NA, NA, 3, 3, 4, 1, 9)) 

df4 <- data.frame(Var1 = c(2, 7, 2, 4, 8),
                  Var2 = c(8, 3, 1, 7, 3),
                  Var3 = c(9, 1, 1, 6, 5)) 

df5.wxyz <- data.frame(Var1 = c("w", "x", "y", "z", 2, 7, 3, 1, 6),
                       Var2 = c(NA, NA, NA, NA, 7, 4, 8, 1, 9),
                       Var3 = c(NA, NA, NA, NA, 8, 0, 4, 1, 2)) 

df.list <- list(df1, df2, df3.wxyz, df4, df5.wxyz)

names(df.list) <- c("df1", "df2", "df3.wxyz", "df4", "df5.wxyz")

I would like to remove the first 4 rows of df3.wxyz and df5.wxyz, because those rows contain information that I do not need. I tried the following code, but instead of removing the first 4 rows only from df3.wxyz and df5.wxyz, it removes the first 4 rows from every dataframe in my list. I'm not sure what the issue is.

df.list <- lapply(df.list, function(i){
  ifelse(grepl("wxyz", names(df.list)), i <- i[-c(1:4), ], df.list)
  i
})

This is what I would like to achieve:

df1 <- data.frame(Var1 = c(1, 7, 9, 4, 2),
                  Var2 = c(7, 2, 4, 4, 3),
                  Var3 = c(3, 6, 2, 0, 8)) 

df2 <- data.frame(Var1 = c(5, 6, 2, 2, 1),
                  Var2 = c(8, 6, 6, 7, 4),
                  Var3 = c(9, 0, 1, 3, 4)) 

df3.wxyz <- data.frame(Var1 = c(3, 7, 3, 6, 6),
                       Var2 = c(7, 5, 8, 0, 2),
                       Var3 = c(3, 3, 4, 1, 9)) 

df4 <- data.frame(Var1 = c(2, 7, 2, 4, 8),
                  Var2 = c(8, 3, 1, 7, 3),
                  Var3 = c(9, 1, 1, 6, 5)) 

df5.wxyz <- data.frame(Var1 = c(2, 7, 3, 1, 6),
                       Var2 = c(7, 4, 8, 1, 9),
                       Var3 = c(8, 0, 4, 1, 2)) 

df.list <- list(df1, df2, df3.wxyz, df4, df5.wxyz)

names(df.list) <- c("df1", "df2", "df3.wxyz", "df4", "df5.wxyz")

CodePudding user response:

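You can apply na.omit only to the dataframes whose names contain "wxyz". This works here because the first 4 rows of those dataframes have NA in Var2 and Var3: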

df.list[grepl('wxyz', names(df.list))] <- lapply(df.list[grepl('wxyz', names(df.list))], na.omit)
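
If the real data could contain NA values in rows you want to keep, na.omit would drop those rows too. A minimal sketch of a position-based alternative, which removes exactly the first 4 rows of the matching dataframes, might look like this. The shortened two-element list and the name is.wxyz are made up for illustration, and the type.convert step is an optional extra that assumes you want Var1 to become numeric once the letters are gone:

```r
# Small stand-in for the list in the question (two data frames only).
df.list <- list(
  df1      = data.frame(Var1 = c(1, 7, 9), Var2 = c(7, 2, 4)),
  df3.wxyz = data.frame(Var1 = c("w", "x", "y", "z", 3, 7),
                        Var2 = c(NA, NA, NA, NA, 7, 5))
)

# Logical index of the list elements whose name contains "wxyz".
is.wxyz <- grepl("wxyz", names(df.list), fixed = TRUE)

# Drop the first 4 rows of only those elements; the others are untouched.
df.list[is.wxyz] <- lapply(df.list[is.wxyz], function(d) {
  d <- d[-(1:4), , drop = FALSE]   # remove rows 1 to 4 by position
  rownames(d) <- NULL              # renumber the remaining rows from 1
  type.convert(d, as.is = TRUE)    # Var1 was character because of "w".."z"
})
```

Subsetting the list with the same logical index on both sides of the assignment keeps the names and the order of the list as they were.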

CodePudding user response:

You can also apply na.omit to every dataframe with Map. Dataframes without NA values are returned unchanged:

> Map(na.omit,df.list)
$df1
  Var1 Var2 Var3
1    1    7    3
2    7    2    6
3    9    4    2
4    4    4    0
5    2    3    8

$df2
  Var1 Var2 Var3
1    5    8    9
2    6    6    0
3    2    6    1
4    2    7    3
5    1    4    4

$df3.wxyz
  Var1 Var2 Var3
5    3    7    3
6    7    5    3
7    3    8    4
8    6    0    1
9    6    2    9

$df4
  Var1 Var2 Var3
1    2    8    9
2    7    3    1
3    2    1    1
4    4    7    6
5    8    3    5

$df5.wxyz
  Var1 Var2 Var3
5    2    7    8
6    7    4    0
7    3    8    4
8    1    1    1
9    6    9    2
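
As for why the attempt in the question trims every dataframe: inside the lapply call, names(df.list) is still the full vector of five names, so grepl returns five TRUE/FALSE values rather than one. ifelse evaluates its "yes" argument whenever at least one of those values is TRUE, and here that argument is the assignment i <- i[-c(1:4), ], so i is shortened on every iteration. A minimal sketch of a per-element version, assuming a shortened two-element list for illustration, passes each dataframe together with its own name and uses a plain if:

```r
# Small stand-in for the list in the question (two data frames only).
df.list <- list(
  df1      = data.frame(Var1 = c(1, 7, 9, 4, 2)),
  df3.wxyz = data.frame(Var1 = c("w", "x", "y", "z", 3, 7))
)

# Map() hands the function one data frame and its own name at a time,
# so the condition is a single TRUE/FALSE and a scalar if/else is enough.
df.list <- Map(function(d, nm) {
  if (grepl("wxyz", nm, fixed = TRUE)) d[-(1:4), , drop = FALSE] else d
}, df.list, names(df.list))
```

Map takes the names of the result from its first list argument, so the returned list is still named df1, df3.wxyz, and so on.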