I have a dataframe that contains NA
values, and I want to remove some rows that have an NA
(i.e., not complete cases). However, I only want to remove rows at the end of the dataframe. So, I want to keep any rows that have an NA
that are not at the end of the dataframe. What is the most efficient way to remove the ending rows with NA
s without using a row index?
Data
df <- structure(list(var1 = 1:15, var2 = c(3, 6, 3, NA, 2, NA, 3, 4,
2, NA, 4, 2, 45, 2, 1), var3 = c(6L, 7L, 8L, 9L, 10L, 11L, 12L,
13L, 14L, 15L, 16L, NA, NA, NA, NA), var4 = c(NA, 7L, 8L, 9L,
10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, NA)), class = "data.frame", row.names = c(NA,
-15L))
Expected Output
So, in this example, I removed rows 12 through 15 since they have an NA
and row 11 does not have an NA
.
var1 var2 var3 var4
1 1 3 6 NA
2 2 6 7 7
3 3 3 8 8
4 4 NA 9 9
5 5 2 10 10
6 6 NA 11 11
7 7 3 12 12
8 8 4 13 13
9 9 2 14 14
10 10 NA 15 15
11 11 4 16 16
I know that if I wanted to remove the first row of data frame until reaching the first row with no NA, then I could do the following:
df[cumsum(complete.cases(df)) != 0,]
CodePudding user response:
We could use na.trim
from zoo
package:
library(zoo)
library(dplyr)
df %>%
slice(1:nrow(na.trim(df, "right", is.na = "any")))
var1 var2 var3 var4
1 1 3 6 NA
2 2 6 7 7
3 3 3 8 8
4 4 NA 9 9
5 5 2 10 10
6 6 NA 11 11
7 7 3 12 12
8 8 4 13 13
9 9 2 14 14
10 10 NA 15 15
11 11 4 16 16
CodePudding user response:
To remove the last rows, you can use rev
in the same approach. So, we put the output from complete.cases
in reverse order, so that we can calculate cumsum
starting with the end of the dataframe. Then, we put the cumsum
back in the original order (the reason for the second rev
). Now, we can remove the rows with 0
(i.e., rows that contain an NA
).
df[rev(cumsum(rev(complete.cases(df)))) != 0,]
Or with dplyr
:
library(dplyr)
df %>%
filter(rev(cumsum(rev(complete.cases(.)))) != 0)
Output
var1 var2 var3 var4
1 1 3 6 NA
2 2 6 7 7
3 3 3 8 8
4 4 NA 9 9
5 5 2 10 10
6 6 NA 11 11
7 7 3 12 12
8 8 4 13 13
9 9 2 14 14
10 10 NA 15 15
11 11 4 16 16