I would like to count the non-NA values in each column of my dataset and, if the count is under 4, delete that column.
A | B | C |
---|---|---|
1 | NA | 2 |
2 | NA | 5 |
3 | 1 | 2 |
3 | NA | 2 |
3 | NA | NA |
Here the count is the number of values that are not NA.
In this case, if count(value <> NA) < 4, I need to delete the column. My original dataset is much bigger than this, so I would like something handy.
CodePudding user response:
Removing columns with column sums below 4:
df[, colSums(df, na.rm = TRUE) >= 4]
A C
1 1 2
2 2 5
3 3 2
4 3 2
5 3 NA
To delete columns with an NA count below 4, try something like this:
df[, colSums(is.na(df)) >= 4, drop = FALSE]
B
1 NA
2 NA
3 1
4 NA
5 NA
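If the goal is what the question describes, i.e. keep a column only when it has at least 4 non-NA values, a minimal sketch would count with `!is.na()` rather than summing the values or counting the NAs. The names `non_na`, `min_count` and `res` are only illustrative, and the data frame is rebuilt here so the snippet runs on its own.

```r
# Example data from the question (same values as in the Data section)
df <- data.frame(A = c(1L, 2L, 3L, 3L, 3L),
                 B = c(NA, NA, 1, NA, NA),
                 C = c(2L, 5L, 2L, 2L, NA))

# Number of non-NA values in each column: A = 5, B = 1, C = 4
non_na <- colSums(!is.na(df))

# Threshold taken from the question
min_count <- 4

# Keep columns with at least min_count non-NA values;
# drop = FALSE keeps a data frame even if only one column survives
res <- df[, non_na >= min_count, drop = FALSE]

names(res)
```

With the example data this keeps A and C and drops B. Unlike the column-sum version, the result does not depend on how large the values are, only on how many of them are present.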
Data
df <- structure(list(A = c(1L, 2L, 3L, 3L, 3L), B = c(NA, NA, 1, NA,
NA), C = c(2L, 5L, 2L, 2L, NA)), row.names = c(NA, -5L), class = "data.frame")