Disaggregate Rows according to the value of a column in R

Time:10-27

Given df1, I need to expand each row into as many rows as the value in its Trip column, with Trip then counting from 1 up to that value. The example below has only two columns, but the real data frame has more than 15. How do I expand every record according to the value recorded in the Trip column?


*df1*       
        
Id  Trip
1   3
2   2
3   2
4   4


Expected Result

*df1*       
        
Id  Trip
1   1
1   2
1   3
2   1
2   2
3   1
3   2
4   1
4   2
4   3
4   4

CodePudding user response:

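We can first replicate each row with tidyr's uncount(), using the 'Trip' column as the number of copies, and then overwrite 'Trip' with a sequence that restarts within each 'Id'.

library(dplyr)
library(tidyr)
df1 %>%
   # repeat each row Trip times; .remove = FALSE keeps the Trip column
   uncount(Trip, .remove = FALSE) %>%
   group_by(Id) %>%
   # number the copies 1, 2, ... within each Id
   mutate(Trip = row_number()) %>%
   ungroup()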

-output

# A tibble: 11 × 2
      Id  Trip
   <int> <int>
 1     1     1
 2     1     2
 3     1     3
 4     2     1
 5     2     2
 6     3     1
 7     3     2
 8     4     1
 9     4     2
10     4     3
11     4     4
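
Since the real data has more than 15 columns, here is a minimal sketch of the same uncount() approach on a wider data frame, to show that the other columns are simply copied along with each replicated row. The data frame df_wide and its Driver and Zone columns are made up for illustration; they are not from the question.

```r
library(dplyr)
library(tidyr)

# Hypothetical wider data: Driver and Zone are invented columns
df_wide <- data.frame(
  Id = 1:4,
  Trip = c(3L, 2L, 2L, 4L),
  Driver = c("A", "B", "C", "D"),
  Zone = c("north", "south", "north", "east")
)

expanded <- df_wide %>%
  # repeat each row Trip times, keeping the Trip column
  uncount(Trip, .remove = FALSE) %>%
  group_by(Id) %>%
  # replace Trip with a counter that restarts for each Id
  mutate(Trip = row_number()) %>%
  ungroup()

expanded
```

Each Id keeps its own Driver and Zone on every replicated row; only Trip changes. This assumes Id is unique per input row, as in the question.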

Another option is to turn each 'Trip' value into a sequence stored in a list column (seq_len(3) gives 1, 2, 3) and then unnest that list column, which yields one row per element.

library(purrr)
df1 %>%
   mutate(Trip = map(Trip, seq_len)) %>%
   unnest(Trip)

-output

# A tibble: 11 × 2
      Id  Trip
   <int> <int>
 1     1     1
 2     1     2
 3     1     3
 4     2     1
 5     2     2
 6     3     1
 7     3     2
 8     4     1
 9     4     2
10     4     3
11     4     4

data

df1 <- structure(list(Id = 1:4, Trip = c(3L, 2L, 2L, 4L)),
                 class = "data.frame", row.names = c(NA, -4L))
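
For comparison, the same expansion can be sketched without any packages. This is not taken from the answers here; it is a minimal base R sketch that assumes Trip holds non-negative whole numbers. The names idx and out are arbitrary.

```r
df1 <- data.frame(Id = 1:4, Trip = c(3L, 2L, 2L, 4L))

# row positions, each repeated Trip times: 1 1 1 2 2 3 3 4 4 4 4
idx <- rep(seq_len(nrow(df1)), times = df1$Trip)

# replicate the rows (all columns are carried along)
out <- df1[idx, , drop = FALSE]

# sequence() concatenates 1:3, 1:2, 1:2, 1:4
out$Trip <- sequence(df1$Trip)

# drop the duplicated-looking row names such as "1.1", "1.2"
rownames(out) <- NULL

out
```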

CodePudding user response:

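Another option that uses dplyr alone:

library(dplyr)

# Note: cur_data() and a summarise() that returns several rows per group
# are deprecated as of dplyr 1.1.0 (in favor of pick() and reframe()),
# so this may warn on newer versions.
df1 %>% 
  group_by(Id) %>% 
  # return seq(Trip) rows for each Id; the result stays grouped by Id
  summarise(cur_data()[seq(Trip),]) %>% 
  # row_number() therefore restarts at 1 within each Id
  mutate(Trip = row_number())

-output
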
     Id  Trip
   <int> <int>
 1     1     1
 2     1     2
 3     1     3
 4     2     1
 5     2     2
 6     3     1
 7     3     2
 8     4     1
 9     4     2
10     4     3
11     4     4
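
A dplyr-only variant that avoids cur_data() is sketched below. It swaps in slice() with repeated row positions rather than the summarise() idiom above, so treat it as an alternative under the same assumption (Id unique per input row), not as the answer's method. The name res is arbitrary.

```r
library(dplyr)

df1 <- data.frame(Id = 1:4, Trip = c(3L, 2L, 2L, 4L))

res <- df1 %>%
  # pick row i exactly Trip[i] times
  slice(rep(seq_len(n()), times = Trip)) %>%
  group_by(Id) %>%
  # renumber the copies within each Id
  mutate(Trip = row_number()) %>%
  ungroup()

res
```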