I would like to duplicate/repeat lists based on the number of rows matching the same name in a data frame.
For example, given this list and data frame:
mylist <- list(A = c(1,2,5,6), B = c(2,4,6,5), C = c(2,4,2,35))
> mylist
$A
[1] 1 2 5 6
$B
[1] 2 4 6 5
$C
[1] 2 4 2 35
mydf <- as.data.frame(c("A", "A", "A", "B", "B", "C"))
colnames(mydf) <- "Freq"
> mydf
Freq
1 A
2 A
3 A
4 B
5 B
6 C
I would like the following output, in which element A of mylist is repeated 3 times because it has 3 rows in mydf, B is repeated 2 times because it has the next 2 rows, and C is repeated once because it has the last row:
desired.output <- list(A = c(1,2,5,6), A = c(1,2,5,6), A = c(1,2,5,6), B = c(2,4,6,5), B = c(2,4,6,5), C = c(2,4,2,35))
> desired.output
$A
[1] 1 2 5 6
$A
[1] 1 2 5 6
$A
[1] 1 2 5 6
$B
[1] 2 4 6 5
$B
[1] 2 4 6 5
$C
[1] 2 4 2 35
I have tried to use the rep function, but every attempt results in a NULL object.
attempt1 <- rep(mylist[[]], times=as.vector(mydf$Freq))
attempt2 <- rep(mylist[[]], times = match(mydf$Freq, names(mylist)))
attempt3 <- rep(mylist[[]], times = length(match(mydf$Freq, names(mylist))))
Ultimately, my goal is for mylist to contain as many items as mydf has rows, with each group (A, B, C) replicated according to its sample size in mydf.
CodePudding user response:
We can index mylist by name with the Freq column of mydf, which does the selection (and the repetition) for us:
mylist[mydf$Freq]
$A
[1] 1 2 5 6
$A
[1] 1 2 5 6
$A
[1] 1 2 5 6
$B
[1] 2 4 6 5
$B
[1] 2 4 6 5
$C
[1] 2 4 2 35
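This works because indexing a list with a character vector returns one element per name, so repeated names give repeated elements. The sketch below restates that and adds a rep-based variant closer to the original attempts. The names by_name, counts and by_rep are only illustrative. The as.character() call is an added precaution, assuming Freq might be stored as a factor (the default for data frames before R 4.0), in which case indexing would use the integer codes rather than the labels.

```r
mylist <- list(A = c(1, 2, 5, 6), B = c(2, 4, 6, 5), C = c(2, 4, 2, 35))
mydf <- data.frame(Freq = c("A", "A", "A", "B", "B", "C"))

# Name-based indexing: one list element per row of mydf, in row order.
# as.character() makes sure we index by label even if Freq is a factor.
by_name <- mylist[as.character(mydf$Freq)]

# rep-based variant: count the rows per name, reorder the counts to match
# names(mylist), then repeat each list element that many times.
# Assumes every name in mylist occurs at least once in mydf$Freq.
counts <- table(mydf$Freq)[names(mylist)]
by_rep <- rep(mylist, times = as.vector(counts))

length(by_name)
names(by_rep)
```

The two variants agree here because the rows of mydf are already grouped in the same order as mylist. In general, the name-based version follows the row order of mydf, while the rep version follows the order of mylist.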