head() rows with a different n per group in R

Time:04-01

I'm trying to get the first n rows of a data frame, but with a different n for each group, where the n values are stored in a second data frame.

Here is a reproducible example:

library(dplyr)  # also makes tibble() available

a <- tibble(id = c(1,2,3,4,5,6,7,8,9,10),
            group = c(1,1,1,1,1,2,2,2,2,2))
b <- tibble(group = c(1,2),
            n = c(3,4))

What I want is the first 3 rows of a when the group is 1, and the first 4 rows of a when the group is 2.

I've tried this:

  cob<- a %>%  group_by(group) %>% arrange(id, .by_group = TRUE) %>% 
  group_map(~head(.x, b$n))

But I just get the first 3 rows in both groups, instead of a different number of rows for each group.

CodePudding user response:

We can do a join and then filter

library(dplyr)
a %>%
   left_join(b, by = "group") %>%        # attach each group's n to its rows
   group_by(group) %>%
   filter(row_number() <= first(n)) %>%  # keep the first n rows per group
   ungroup() %>%
   select(-n)

or another option is

a %>%
   group_by(group) %>%
   # cur_group() returns a one-row tibble, so pull out the group value explicitly
   slice(seq_len(b$n[match(cur_group()$group, b$group)]))
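
For comparison, the attempt in the question can be kept close to its original shape by looking up n for the current group inside the lambda. This is only a sketch, not the one right way: it swaps group_map() (which returns a list) for group_modify() (which returns a grouped tibble), and it relies on .y being the one-row tibble of group keys that dplyr passes to the lambda. The name res is just for illustration.

```r
library(dplyr)

a <- tibble(id = 1:10, group = rep(c(1, 2), each = 5))
b <- tibble(group = c(1, 2), n = c(3, 4))

res <- a %>%
  group_by(group) %>%
  arrange(id, .by_group = TRUE) %>%
  # .x is the data for one group, .y is a one-row tibble with that group's key;
  # match() picks the single n that belongs to the current group
  group_modify(~ head(.x, b$n[match(.y$group, b$group)])) %>%
  ungroup()

res
```

The key difference from the original attempt is that head() now receives a single n per group instead of the whole b$n vector.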

CodePudding user response:

Here is a data.table solution.

library(data.table)
setDT(a)    # only needed because you started with a tibble
setDT(b)    # same
a[b, on=.(group)][, .(id=id[1:n]), by=.(group, n)]

   group n id
1:     1 3  1
2:     1 3  2
3:     1 3  3
4:     2 4  6
5:     2 4  7
6:     2 4  8
7:     2 4  9

The first clause, a[b, on=.(group)], joins b to a, creating a data.table with columns id, group, and n. The second clause, [, .(id=id[1:n]), by=.(group, n)], groups by group and n and takes the first n elements of id in each group. Because n is a grouping column, it is a single value inside each group.
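
To make the two clauses easier to follow, here is the same idea split into separate steps. This is a sketch under a couple of assumptions: the tables are built directly as data.tables rather than converted with setDT(), the intermediate names joined and res are made up for illustration, and head(id, n) is used in place of id[1:n] because it also behaves sensibly when n is 0 or larger than the group.

```r
library(data.table)

a <- data.table(id = 1:10, group = rep(c(1, 2), each = 5))
b <- data.table(group = c(1, 2), n = c(3, 4))

# Step 1: the join attaches each group's n to every row of a
joined <- a[b, on = .(group)]
names(joined)   # id, group, n

# Step 2: within each (group, n) pair, n is a single value,
# so head() keeps the first n ids of that group
res <- joined[, .(id = head(id, n)), by = .(group, n)]
res
```

If n should not appear in the result, it can be dropped afterwards with res[, n := NULL].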
