Home > front end >  R: add x rows with value y
R: add x rows with value y

Time:04-04

I have a dataframe (df1) with three columns: Name, Decade and Count. For example:

Name <- c("a","b","c")
Decade <- c(1810,1850,1900)
Count <- c(2,3,1)
df1 <- data.frame(Name,Decade,Count)
print(df1)
  Name Decade Count
1    a   1810     2
2    b   1850     3
3    c   1900     1

I wish to create a new dataframe (df2) with columns Name and Decade, with repeating rows of values in df1$Name and df1$Decade by df1$Count. For each repeating row in df2, I wish the df2$Decade to increase by an increment of 10. The first instance of each df2$Name and df2$Decade combination would be equal to df1$Name and df1$Decade.

In the above example, my desired output would be:

> print(df2)
  Name Decade
1    a   1810
2    a   1820
3    b   1850
4    b   1860
5    b   1870
6    c   1900

I would love to show my workings so far, but I don't know how to start and have been manually doing it in excel. [hangs head in shame]

Thank you for your time in advance.

CodePudding user response:

Here is one option using a combination of tidyverse and splitstackshape. After duplicating rows according to Count, we can use row_number minus 1 to create a sequence for each group (e.g., 0, 1, 2...), then multiply by 10 to get the 10 year increment and add to the Decade for that group.

library(tidyverse)
library(splitstackshape)

df2 <- expandRows(df1, "Count") %>% 
  group_by(Name) %>% 
  mutate(Decade = Decade   (row_number()-1) *10)

Or if you don't want to load another package, we could create the duplicated rows with base R:

df2 <- df1[rep(seq(nrow(df1)), df1$Count),c(1:2)] %>% 
  group_by(Name) %>% 
  mutate(Decade = Decade   (row_number()-1) *10)

Output

df2

  Name  Decade
  <chr>  <dbl>
1 a       1810
2 a       1820
3 b       1850
4 b       1860
5 b       1870
6 c       1900

CodePudding user response:

Sequences should sort this very quickly:

df2 <- df1[rep(seq_len(nrow(df1)), df1$Count), 1:2]
df2$Decade <- df2$Decade   (sequence(df1$Count)-1) * 10
df2

#    Name Decade
#1      a   1810
#1.1    a   1820
#2      b   1850
#2.1    b   1860
#2.2    b   1870
#3      c   1900
  • Related