get repeated rows by vector of values in data table

Time:12-28

Consider this table df and this vector v:

library(data.table)
df = data.table(ID=c(50,40,30),fruit=c('mango','grape','melon'))
df
   ID fruit
1: 50 mango
2: 40 grape
3: 30 melon

v = sample(df[,ID],size=30,replace = TRUE)
v
[1] 30 50 30 30 40 50 40 40 50 30 40 50 50 40 30 30 30 40 40 30 50 40 30 30 30 40 30 40 50 30

I want to create a table with as many rows as the length of v, where each element of v selects the row of df whose ID equals it. Rows of df should therefore be repeated as often as their ID occurs in v.

I tried:

> df[v%in%ID]
Error in `[.data.table`(df, v %in% ID) : 
  i evaluates to a logical vector length 30 but there are 3 rows. Recycling of logical i is no longer allowed as it hides more bugs than is worth the rare convenience. Explicitly use rep(...,length=.N) if you really need to recycle.
> df[v==ID]
Error in `[.data.table`(df, v == ID) : 
  i evaluates to a logical vector length 30 but there are 3 rows. Recycling of logical i is no longer allowed as it hides more bugs than is worth the rare convenience. Explicitly use rep(...,length=.N) if you really need to recycle.
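
Both attempts fail for the same reason, which a quick length check makes visible. This is a minimal sketch that rebuilds the example with a fixed seed (the seed and the names i_in and i_eq are only for illustration); it shows that each expression yields one logical value per element of v, not one per row of df:

```r
library(data.table)

set.seed(1)  # arbitrary seed, only so the sketch is reproducible
df <- data.table(ID = c(50, 40, 30), fruit = c("mango", "grape", "melon"))
v  <- sample(df[, ID], size = 30, replace = TRUE)

# Both expressions return one TRUE/FALSE per element of v
i_in <- v %in% df$ID
i_eq <- v == df$ID   # df$ID is recycled to the length of v

length(i_in)  # 30
length(i_eq)  # 30
nrow(df)      # 3
```

A logical i has to have one entry per row of df, so a length-30 vector cannot be used to subset a 3-row table.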

CodePudding user response:

You could transform v to a data.table and join it to df:

df[data.table(ID=v),on=.(ID)]

       ID  fruit
    <num> <char>
 1:    40  grape
 2:    40  grape
 3:    30  melon
 4:    30  melon
 5:    50  mango
 6:    50  mango
 7:    30  melon
 8:    40  grape
 9:    50  mango
10:    50  mango
11:    30  melon
12:    30  melon
13:    50  mango
14:    50  mango
15:    30  melon
16:    40  grape
17:    30  melon
18:    40  grape
19:    40  grape
20:    40  grape
21:    40  grape
22:    40  grape
23:    30  melon
24:    30  melon
25:    40  grape
26:    40  grape
27:    50  mango
28:    50  mango
29:    50  mango
30:    50  mango
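
To check that the join returns one row per element of v, in the order of v, something like the following sketch can be used. The seed and the names res and expected_fruit are illustrative assumptions; the lookup via match() is only there as an independent cross-check:

```r
library(data.table)

set.seed(123)  # arbitrary seed, only so the sketch is reproducible
df <- data.table(ID = c(50, 40, 30), fruit = c("mango", "grape", "melon"))
v  <- sample(df[, ID], size = 30, replace = TRUE)

# Join: for every row of the lookup table, fetch the matching row of df
res <- df[data.table(ID = v), on = .(ID)]

# Independent cross-check: position of each element of v in df$ID
expected_fruit <- df$fruit[match(v, df$ID)]

nrow(res) == length(v)                 # one row per element of v
identical(res$ID, v)                   # order of v is preserved
identical(res$fruit, expected_fruit)   # fruit matches the looked-up ID
```

Since match() already gives the row positions, df[match(v, ID)] should be an equivalent way to get the same rows without building a second table.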

CodePudding user response:

library(data.table)
df = data.table(ID=c(50,40,30),fruit=c('mango','grape','melon'))
df
#>    ID fruit
#> 1: 50 mango
#> 2: 40 grape
#> 3: 30 melon
v = sample(df[,ID],size=30,replace = TRUE)
v <- data.table(ID = v)

df[v, on = list(ID)]
#>     ID fruit
#>  1: 40 grape
#>  2: 40 grape
#>  3: 40 grape
#>  4: 50 mango
#>  5: 40 grape
#>  6: 30 melon
#>  7: 30 melon
#>  8: 30 melon
#>  9: 30 melon
#> 10: 50 mango
#> 11: 40 grape
#> 12: 30 melon
#> 13: 30 melon
#> 14: 30 melon
#> 15: 50 mango
#> 16: 40 grape
#> 17: 50 mango
#> 18: 50 mango
#> 19: 40 grape
#> 20: 30 melon
#> 21: 40 grape
#> 22: 30 melon
#> 23: 30 melon
#> 24: 40 grape
#> 25: 50 mango
#> 26: 50 mango
#> 27: 40 grape
#> 28: 40 grape
#> 29: 40 grape
#> 30: 40 grape
#>     ID fruit

Created on 2021-12-28 by the reprex package (v2.0.1)

CodePudding user response:

You can convert the vector to a data frame and merge it with df on the ID column. Note that the result is sorted by ID rather than following the order of v.

v <- data.frame(ID = v)
df1 <- merge(df, v, by = 'ID')

The output looks like this:

    ID fruit
 1: 30 melon
 2: 30 melon
 3: 30 melon
 4: 30 melon
 5: 30 melon
 6: 30 melon
 7: 30 melon
 8: 30 melon
 9: 30 melon
10: 30 melon
11: 30 melon
12: 40 grape
13: 40 grape
14: 40 grape
15: 40 grape
16: 40 grape
17: 50 mango
18: 50 mango
19: 50 mango
20: 50 mango
21: 50 mango
22: 50 mango
23: 50 mango
24: 50 mango
25: 50 mango
26: 50 mango
27: 50 mango
28: 50 mango
29: 50 mango
30: 50 mango
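
The output above comes back sorted by ID. If the rows should follow v instead, one option is to put the vector's table first and pass sort = FALSE. This is a sketch under the assumption that data.table's merge method keeps the row order of its first argument when sort = FALSE, which is my understanding of its documentation; the seed and the name v_dt are illustrative:

```r
library(data.table)

set.seed(42)  # arbitrary seed, only so the sketch is reproducible
df   <- data.table(ID = c(50, 40, 30), fruit = c("mango", "grape", "melon"))
v    <- sample(df[, ID], size = 30, replace = TRUE)
v_dt <- data.table(ID = v)

# v_dt goes first so that its row order is the one retained
df1 <- merge(v_dt, df, by = "ID", sort = FALSE)

identical(df1$ID, v)  # check that the order of v survived the merge
```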