I have this data frame giving the rank in which each person cited each variety:
variety <- c("variety1","variety2","variety3","variety4","variety5")
people1 <- c(1, NA, 3, 2, NA)
people2 <- c(4, 3, 2, 1, NA)
people3 <- c(3, 2, NA, 4, 1)
df <- data.frame(variety, people1, people2, people3); df
Instead, I want to create a data frame without the first column (easy enough :-D), in which each remaining column lists the varieties cited by that person, in the proper order. It would look like this:
people1 <- c("variety1", "variety3", "variety2",NA,NA)
people2 <- c("variety4", "variety3", "variety2", "variety1", NA)
people3 <- c("variety3", "variety2", "variety4", "variety1", NA)
df2 <- data.frame(people1, people2, people3); df2
So each column would become the list of variety names cited, in the order given by their rank in the initial df.
NAs could be removed, so the columns would not all have the same length (that is fine for the software I plan to submit this to).
Thanks
CodePudding user response:
Well, we can't have a non-rectangular data frame, but here's a list with the entries you want and the NAs removed:
lapply(df[-1], \(x) df$variety[x[!is.na(x)]])
# $people1
# [1] "variety1" "variety3" "variety2"
#
# $people2
# [1] "variety4" "variety3" "variety2" "variety1"
#
# $people3
# [1] "variety3" "variety2" "variety4" "variety1"
If you need it as a data frame, we can pad the lengths:
result = lapply(df[-1], \(x) df$variety[x[!is.na(x)]])
n = max(lengths(result))
result = lapply(result, \(x) {length(x) = n; x}) |> as.data.frame()
result
# people1 people2 people3
# 1 variety1 variety4 variety3
# 2 variety3 variety3 variety2
# 3 variety2 variety2 variety4
# 4 <NA> variety1 variety1
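In case the padding step looks opaque: assigning a larger length to a vector fills the new positions with NA. A minimal sketch on a single column (the name `padded` is only for illustration, and the values are the `people1` entries from the list above):

```r
# The three entries found for people1, before padding.
padded <- c("variety1", "variety3", "variety2")

# Extending the length adds NA in the new slots (here up to n = 4).
length(padded) <- 4
padded
# [1] "variety1" "variety3" "variety2" NA
```

This is what `{length(x) = n; x}` does to each list element, so that all of them have the same length by the time `as.data.frame()` is called.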
CodePudding user response:
list2DF(lapply(df[-1], \(x) df$variety[c(na.omit(x), x[is.na(x)])]))
people1 people2 people3
1 variety1 variety4 variety3
2 variety3 variety3 variety2
3 variety2 variety2 variety4
4 <NA> variety1 variety1
5 <NA> <NA> <NA>
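To see why this one-liner works, it may help to run the inner expression on a single column. A small sketch using `people1` from the question (the names `x` and `idx` are just illustrative):

```r
variety <- c("variety1", "variety2", "variety3", "variety4", "variety5")
x <- c(1, NA, 3, 2, NA)            # people1 from the question

# Non-NA values first, then the NAs moved to the end.
idx <- c(na.omit(x), x[is.na(x)])
idx
# [1]  1  3  2 NA NA

# Indexing with NA yields NA, so the column keeps its full length.
variety[idx]
# [1] "variety1" "variety3" "variety2" NA         NA
```

Because the NAs are kept, every column has the same length and `list2DF()` can bind them directly. As far as I know, `list2DF()` needs R 4.0.0 or later and the `\(x)` shorthand needs R 4.1.0 or later. Note also that this uses the numbers as positions into `variety`, which reproduces the expected output in the question; if they were instead meant as the rank given to the variety in each row, `df$variety[order(x, na.last = NA)]` would be the analogous call.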