Home > other >  Expand dataset by count column in Dplyr
Expand dataset by count column in Dplyr

Time:02-01

I have a dataset as follows:

library(tidyverse)

df <- data.frame(
        report_date = c("2020-03-14", "2020-03-14", "2020-03-19", "2020-03-20"),
         start_date = c("2020-03-06", "2020-03-10", "2020-03-11", "2020-03-11"),
              count = c(1, 2, 1, 3)
     )

Looking like:

  report_date start_date count
1  2020-03-14 2020-03-06     1
2  2020-03-14 2020-03-10     2
3  2020-03-19 2020-03-11     1
4  2020-03-20 2020-03-11     3

I want to perform a transformation using the value count - aka - repeating each row n times as in count for starting row. I think it's clear if I show the desired result as follows:

df_final <- data.frame(
               report_date = c("2020-03-14", "2020-03-14", "2020-03-14", "2020-03-19",
                               "2020-03-20", "2020-03-20", "2020-03-20"),
                start_date = c("2020-03-06", "2020-03-10", "2020-03-10", "2020-03-11",
                               "2020-03-11", "2020-03-11", "2020-03-11"),
                     count = c(1, 1, 1, 1, 1, 1, 1)
            )

  report_date start_date count
1  2020-03-14 2020-03-06     1
2  2020-03-14 2020-03-10     1
3  2020-03-14 2020-03-10     1
4  2020-03-19 2020-03-11     1
5  2020-03-20 2020-03-11     1
6  2020-03-20 2020-03-11     1
7  2020-03-20 2020-03-11     1

Thanks!

CodePudding user response:

We may use uncount to replicate and then create the 'count'

library(dplyr)
library(tidyr)
df %>% 
    uncount(count) %>% 
    mutate(count = 1) 

-output

 report_date start_date count
1  2020-03-14 2020-03-06     1
2  2020-03-14 2020-03-10     1
3  2020-03-14 2020-03-10     1
4  2020-03-19 2020-03-11     1
5  2020-03-20 2020-03-11     1
6  2020-03-20 2020-03-11     1
7  2020-03-20 2020-03-11     1
  •  Tags:  
  • Related