I'm trying to get the first n rows of a data frame, but with a different n per group, according to values I have in another data frame.
I have the following reproducible example:
a <- tibble(id = c(1,2,3,4,5,6,7,8,9,10),
            group = c(1,1,1,1,1,2,2,2,2,2))
b <- tibble(group = c(1,2),
            n = c(3,4))
What I want is the first 3 rows of a when the group is 1, and the first 4 rows of a when the group is 2.
I've tried doing this:
cob<- a %>% group_by(group) %>% arrange(id, .by_group = TRUE) %>%
group_map(~head(.x, b$n))
But I just get the first 3 rows in both groups, rather than a different number of rows for each group.
CodePudding user response:
We can do a join and then filter
library(dplyr)
a %>%
  left_join(b, by = "group") %>%            # attach each group's n to its rows
  group_by(group) %>%
  filter(row_number() <= first(n)) %>%      # keep the first n rows per group
  ungroup() %>%
  select(-n)
Another option is to slice each group using the n looked up for the current group:
a %>%
  group_by(group) %>%
  slice(seq_len(b$n[match(cur_group()$group, b$group)]))
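The group_map() attempt from the question can probably also be made to work by looking up n for the current group through the key argument .y, instead of passing the whole b$n vector to head(). This is only a minimal sketch, assuming the a and b from the question; res is just an illustrative name:

```r
library(dplyr)

a <- tibble(id = c(1,2,3,4,5,6,7,8,9,10),
            group = c(1,1,1,1,1,2,2,2,2,2))
b <- tibble(group = c(1,2),
            n = c(3,4))

res <- a %>%
  group_by(group) %>%
  arrange(id, .by_group = TRUE) %>%
  # .x is the current group's rows, .y is a one-row tibble with its key;
  # .keep = TRUE keeps the group column inside .x
  group_map(~ head(.x, b$n[match(.y$group, b$group)]), .keep = TRUE) %>%
  bind_rows()   # group_map() returns a list, so stack the pieces again
```

Note that group_map() returns a list of tibbles, which is why bind_rows() is needed at the end to get a single data frame back.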
CodePudding user response:
Here is a data.table solution.
library(data.table)
setDT(a) # only needed because you started with a tibble
setDT(b) # same
a[b, on=.(group)][, .(id=id[1:n]), by=.(group, n)]
   group n id
1:     1 3  1
2:     1 3  2
3:     1 3  3
4:     2 4  6
5:     2 4  7
6:     2 4  8
7:     2 4  9
The first clause, a[b, on=.(group)], joins b to a, creating a data.table with columns group, id, and n. The second clause, [, .(id=id[1:n]), by=.(group, n)], groups by group, taking the first n elements of id in each group.
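One caveat with id[1:n]: if n is larger than the number of rows in a group, indexing past the end pads the result with NA. A possible variant, sketched here with a made-up b in which n = 7 exceeds the 5 rows of group 2 (that value is an assumption for illustration, not part of the question), uses head() so the result is simply truncated at the group size:

```r
library(data.table)

a <- data.table(id = c(1,2,3,4,5,6,7,8,9,10),
                group = c(1,1,1,1,1,2,2,2,2,2))
# hypothetical n values: 7 is larger than the 5 rows available in group 2
b <- data.table(group = c(1,2),
                n = c(3,7))

# head(id, n) returns at most n elements and never pads with NA
res <- a[b, on = .(group)][, .(id = head(id, n)), by = .(group, n)]
```

With the n values from the question both forms should give the same rows, so this only matters when some n can exceed its group's size.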