Repeat lists based on length of matching column in R

Time:11-17

I would like to duplicate/repeat lists based on the number of rows matching the same name in a data frame.

For example, given this list and data frame:

mylist <- list(A = c(1,2,5,6), B = c(2,4,6,5), C = c(2,4,2,35))

> mylist
$A
[1] 1 2 5 6

$B
[1] 2 4 6 5

$C
[1]  2  4  2 35


mydf <- as.data.frame(c("A", "A", "A", "B", "B", "C"))
colnames(mydf) <- "Freq"

> mydf
  Freq
1    A
2    A
3    A
4    B
5    B
6    C

I would like the following output: A from mylist is repeated 3 times because it has 3 rows in mydf, B is repeated 2 times because it has the next 2 rows, and C is repeated once because it has the last row:

desired.output <- list(A = c(1,2,5,6), A = c(1,2,5,6), A = c(1,2,5,6), B = c(2,4,6,5), B = c(2,4,6,5), C = c(2,4,2,35))

> desired.output
$A
[1] 1 2 5 6

$A
[1] 1 2 5 6

$A
[1] 1 2 5 6

$B
[1] 2 4 6 5

$B
[1] 2 4 6 5

$C
[1]  2  4  2 35

I have tried using the rep function, but every attempt results in a NULL object:

attempt1 <- rep(mylist[[]], times=as.vector(mydf$Freq))
attempt2 <- rep(mylist[[]], times = match(mydf$Freq, names(mylist)))
attempt3 <- rep(mylist[[]], times = length(match(mydf$Freq, names(mylist)))) 

Ultimately, my goal is for mylist to contain as many items as mydf has rows, with each group (A, B, C) replicated according to its sample size in mydf.

CodePudding user response:

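We can index mylist directly with the Freq column of mydf. Subsetting a list with a character vector selects elements by name, and a name that appears several times returns its element once per occurrence: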

> mylist[mydf$Freq]
$A
[1] 1 2 5 6

$A
[1] 1 2 5 6

$A
[1] 1 2 5 6

$B
[1] 2 4 6 5

$B
[1] 2 4 6 5

$C
[1]  2  4  2 35
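
As a rough sketch of the same idea with two safeguards, the code below rebuilds the example data and converts Freq with as.character() before indexing. The conversion matters if Freq is a factor (the default for strings in data frames before R 4.0), because factors index by their integer codes rather than by name. Since the question tried rep, a rep-based variant is also shown. It assumes every name in mylist appears at least once in mydf and that the rows of mydf are already grouped in the same order as mylist. The names idx, by_name, counts and by_rep are illustrative only.

```r
# Example data from the question
mylist <- list(A = c(1, 2, 5, 6), B = c(2, 4, 6, 5), C = c(2, 4, 2, 35))
mydf <- data.frame(Freq = c("A", "A", "A", "B", "B", "C"))

# Convert to character so the lookup is by name even if Freq is a factor
idx <- as.character(mydf$Freq)

# Name-based indexing: one list element per row of mydf, in row order
by_name <- mylist[idx]

# rep-based variant: count the rows per name, ordered like mylist,
# then repeat each list element that many times
# (assumption: every name in mylist occurs in mydf, otherwise counts has NAs)
counts <- table(idx)[names(mylist)]
by_rep <- rep(mylist, times = as.vector(counts))

# Both results have as many elements as mydf has rows
length(by_name) == nrow(mydf)
length(by_rep) == nrow(mydf)
```

The name-based form follows the row order of mydf, so it also works when the groups are interleaved. The rep form always returns the groups in the order they appear in mylist.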