I have a data frame with NULLs or NAs in R. Example:
people1 <- c("Variety 1", "Variety 2", "Variety 3", "Variety 4", "Variety 5")
people2 <- c("Variety 4", "Variety 3", "Variety 2", "Variety 1", "NULL")
people3 <- c("Variety 3", "Variety 2", "NULL", "Variety 4", "Variety 1")
df2 <- data.frame(people1, people2, people3)
For each column, if there is a NULL or NA, I want it removed and the items below it moved up to fill the gap, so that a column may end with NULLs but never has them in the middle.
Can you help? Thanks!
CodePudding user response:
Note that the question was changed after I answered it. The answer to the original question is at the bottom, under Old. The answer to the new question is:
replace(df2, TRUE, lapply(df2, function(x) c(x[x != "NULL"], x[x == "NULL"])))
giving:
    people1   people2   people3
1 Variety 1 Variety 4 Variety 3
2 Variety 2 Variety 3 Variety 2
3 Variety 3 Variety 2 Variety 4
4 Variety 4 Variety 1 Variety 1
5 Variety 5      NULL      NULL
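The question also mentions NA, and comparing a real NA against "NULL" yields NA rather than TRUE or FALSE, which would leave stray NA entries in the result. Below is a minimal sketch of a variant that treats both cases as missing. It assumes character columns; `push_missing_down` is a hypothetical helper name, and one value in people2 is changed to a real NA purely for illustration.

```r
# Data from the question, with one real NA added for illustration (an assumption).
people1 <- c("Variety 1", "Variety 2", "Variety 3", "Variety 4", "Variety 5")
people2 <- c("Variety 4", NA, "Variety 2", "Variety 1", "NULL")
people3 <- c("Variety 3", "Variety 2", "NULL", "Variety 4", "Variety 1")
df2 <- data.frame(people1, people2, people3, stringsAsFactors = FALSE)

# Hypothetical helper: keep the non-missing values in order,
# then append the missing ones (real NA or the string "NULL").
push_missing_down <- function(x) {
  missing <- is.na(x) | x == "NULL"   # TRUE | NA is TRUE, so no NA in the mask
  c(x[!missing], x[missing])
}

# Apply to every column while keeping the data frame structure.
result <- df2
result[] <- lapply(df2, push_missing_down)
print(result)
```

Using `result[] <- lapply(...)` keeps the column names and the data.frame class, so it should behave like the `replace(df2, TRUE, ...)` call above.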
Old
Convert the "NULL" values to NA and use na.locf from the zoo package (last observation carried forward). It also works if there are multiple consecutive "NULL" values.
library(zoo)
m <- as.matrix(df2)
replace(df2, TRUE, na.locf(ifelse(m == "NULL", NA, m)))
giving:
    people1   people2   people3
1 Variety 1 Variety 4 Variety 3
2 Variety 2 Variety 3 Variety 2
3 Variety 3 Variety 2 Variety 2
4 Variety 4 Variety 1 Variety 4
5 Variety 5 Variety 1 Variety 1
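To show what carrying the last observation forward does, here is a rough base R sketch that does not need zoo. The function name `locf` is hypothetical and this is not how zoo implements na.locf; zoo's function has more options and may treat leading NAs differently from this sketch, which leaves them as NA.

```r
# Data from the question.
people1 <- c("Variety 1", "Variety 2", "Variety 3", "Variety 4", "Variety 5")
people2 <- c("Variety 4", "Variety 3", "Variety 2", "Variety 1", "NULL")
people3 <- c("Variety 3", "Variety 2", "NULL", "Variety 4", "Variety 1")
df2 <- data.frame(people1, people2, people3, stringsAsFactors = FALSE)

# Hypothetical helper: fill each NA with the most recent non-NA value above it.
locf <- function(x) {
  idx <- cumsum(!is.na(x))   # position of the latest non-NA value seen so far
  idx[idx == 0] <- NA        # nothing seen yet: leading NAs stay NA
  x[!is.na(x)][idx]
}

# Turn "NULL" strings into NA, then fill column by column.
filled <- df2
filled[] <- lapply(df2, function(x) locf(replace(x, x %in% "NULL", NA)))
print(filled)
```

The `cumsum` trick counts how many non-NA values have been seen up to each position, which is exactly the index of the value to carry forward, so runs of consecutive missing values are handled without a loop.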
CodePudding user response:
# For each column, put the non-"NULL" values first and the "NULL" values last
df3 <- data.frame(
  people1 = c(people1[people1 != "NULL"], people1[people1 == "NULL"]),
  people2 = c(people2[people2 != "NULL"], people2[people2 == "NULL"]),
  people3 = c(people3[people3 != "NULL"], people3[people3 == "NULL"])
)
This works for "NULL" strings.
> df3
    people1   people2   people3
1 Variety 1 Variety 4 Variety 3
2 Variety 2 Variety 3 Variety 2
3 Variety 3 Variety 2 Variety 4
4 Variety 4 Variety 1 Variety 1
5 Variety 5      NULL      NULL