R: transform a dataframe into another based on rank

I have this data frame, which holds the citation rank each person gave to each variety (NA means the variety was not cited):

variety <- c("variety1","variety2","variety3","variety4","variety5")
people1 <- c(1, NA, 3, 2, NA)
people2 <- c(4, 3, 2, 1, NA)
people3 <- c(3, 2, NA, 4, 1)
df <- data.frame(variety, people1, people2, people3); df

Instead, I want a data frame without the first column (easy enough :-D), in which each remaining column lists the varieties cited by that person, in the proper order. It would look like this:

people1 <- c("variety1", "variety3", "variety2",NA,NA)
people2 <- c("variety4", "variety3", "variety2", "variety1", NA)
people3 <- c("variety3", "variety2", "variety4", "variety1", NA)
df2 <- data.frame(people1, people2, people3); df2

So each column would list the names of the cited varieties, in the right order based on their rank in the initial df.

The NAs could be removed, so the columns would not need to have the same length (that is fine for the software I plan to submit this to).

Thanks

CodePudding user response:

Well, we can't have a non-rectangular data frame, but here's a list with the entries you want and the NAs removed:

lapply(df[-1], \(x) df$variety[x[!is.na(x)]])
# $people1
# [1] "variety1" "variety3" "variety2"
# 
# $people2
# [1] "variety4" "variety3" "variety2" "variety1"
# 
# $people3
# [1] "variety3" "variety2" "variety4" "variety1"
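
To make the indexing explicit, here is a minimal sketch of the same computation unrolled for a single column. The names `kept`, `picked`, and `by_rank` are only illustrative. The last step is an assumption about intent and not part of the one-liner above: it sorts the varieties by their rank with `order()`, which gives a different result from the desired `df2` for this column.

```r
variety <- c("variety1", "variety2", "variety3", "variety4", "variety5")
people1 <- c(1, NA, 3, 2, NA)

# Drop the NA entries, keeping the remaining values in their original order
kept <- people1[!is.na(people1)]       # 1 3 2

# Use those values as positions into `variety` (this reproduces df2)
picked <- variety[kept]
picked
# [1] "variety1" "variety3" "variety2"

# Assumption: if the goal were "varieties sorted by their rank" instead,
# order() with na.last = NA gives the row positions in rank order, NAs dropped
by_rank <- variety[order(people1, na.last = NA)]
by_rank
# [1] "variety1" "variety4" "variety3"
```

The two results coincide for people2 but not for people1 or people3, so it is worth checking which one the downstream software expects.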

If you need it as a data frame, we can pad the shorter vectors with NA so that they all have the same length:

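# Index the varieties by each person's non-missing ranks
result <- lapply(df[-1], \(x) df$variety[x[!is.na(x)]])
# Pad every vector with NA up to the longest length, then build the data frame
n <- max(lengths(result))
result <- lapply(result, \(x) {length(x) <- n; x}) |> as.data.frame()
result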
#    people1  people2  people3
# 1 variety1 variety4 variety3
# 2 variety3 variety3 variety2
# 3 variety2 variety2 variety4
# 4     <NA> variety1 variety1
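
The padding step relies on the fact that assigning a longer length to a vector fills the new positions with NA. A small standalone sketch (the vector `x` here is just an example value):

```r
x <- c("variety1", "variety3", "variety2")

# Extending the length of a vector pads the new slots with NA
length(x) <- 4
x
# [1] "variety1" "variety3" "variety2" NA
```

Since every padded vector ends up with length `n`, `as.data.frame()` can combine them into rectangular columns.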

CodePudding user response:

# Move each person's NAs to the end, then index the varieties (an NA index yields NA)
list2DF(lapply(df[-1], \(x) df$variety[c(na.omit(x), x[is.na(x)])]))

   people1  people2  people3
1 variety1 variety4 variety3
2 variety3 variety3 variety2
3 variety2 variety2 variety4
4     <NA> variety1 variety1
5     <NA>     <NA>     <NA>
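
As a rough illustration of why this version keeps all five rows, here is the same step unrolled for one column (the name `idx` is only illustrative). Note that `list2DF()` needs R 4.0.0 or later, and the `\(x)` shorthand needs R 4.1.0 or later.

```r
variety <- c("variety1", "variety2", "variety3", "variety4", "variety5")
people3 <- c(3, 2, NA, 4, 1)

# Non-missing values first, then the NA entries moved to the end
idx <- c(na.omit(people3), people3[is.na(people3)])   # 3 2 4 1 NA

# Indexing with NA returns NA, so the column keeps its full length
variety[idx]
# [1] "variety3" "variety2" "variety4" "variety1" NA
```

Because each column keeps the same length as the original, no separate padding step is needed before building the data frame.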