R: create new rows from a pre-existing data frame

I want to create new rows based on the values of existing rows in my dataset. There are two catches: first, some cell values need to remain constant while others have to increase by 1. Second, I need to repeat every row the same number of times.

I think it will be easier to understand with data.

Here is where I am starting from:

mydata <- data.frame(id=c(10012000,10012002,10022000,10022002),
                     col1=c(100,201,44,11),
                     col2=c("A","C","B","A"))

Here is what I want:

mydata2 <- data.frame(id=c(10012000,10012001,10012002,10012003,10022000,10022001,10022002,10022003),
                     col1=c(100,100,201,201,44,44,11,11),
                     col2=c("A","A","C","C","B","B","A","A"))

Note that each new row's id is the original id plus 1, while col1 and col2 remain constant.

Thank you

CodePudding user response：

library(tidyverse)

mydata |> 
  mutate(id = map(id, \(x) c(x, x + 1))) |> 
  unnest(id)
#> # A tibble: 8 × 3
#>         id  col1 col2 
#>      <dbl> <dbl> <chr>
#> 1 10012000   100 A    
#> 2 10012001   100 A    
#> 3 10012002   201 C    
#> 4 10012003   201 C    
#> 5 10022000    44 B    
#> 6 10022001    44 B    
#> 7 10022002    11 A    
#> 8 10022003    11 A

Created on 2022-04-14 by the reprex package (v2.0.1)

CodePudding user response：

library(data.table)
setDT(mydata)
# `:=` updates mydata by reference, so copy() captures the original rows first
final <- setorder(rbind(copy(mydata), mydata[, id := id + 1]), id)
#          id col1 col2
# 1: 10012000  100    A
# 2: 10012001  100    A
# 3: 10012002  201    C
# 4: 10012003  201    C
# 5: 10022000   44    B
# 6: 10022001   44    B
# 7: 10022002   11    A
# 8: 10022003   11    A

CodePudding user response：

I think this should do it:

library(dplyr)
df1 <- arrange(rbind(mutate(mydata, id = id + 1), mydata), id, col2)

Gives:

        id col1 col2
1 10012000  100    A
2 10012001  100    A
3 10012002  201    C
4 10012003  201    C
5 10022000   44    B
6 10022001   44    B
7 10022002   11    A
8 10022003   11    A
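
For comparison, the same stack-and-sort idea can be sketched in base R without any packages. This is only an illustration: the names `shifted`, `stacked` and `result` are made up here, and it assumes that sorting by id gives the order you want (that is, an incremented id never collides with an existing one).

```r
mydata <- data.frame(id = c(10012000, 10012002, 10022000, 10022002),
                     col1 = c(100, 201, 44, 11),
                     col2 = c("A", "C", "B", "A"))

# Copy of the data with every id shifted by 1; other columns are untouched
shifted <- transform(mydata, id = id + 1)

# Stack the original and shifted rows, then sort by id
stacked <- rbind(mydata, shifted)
result <- stacked[order(stacked$id), ]

# Reset the row names, which otherwise keep the pre-sort positions
rownames(result) <- NULL
```

Unlike the data.table version, this leaves mydata itself unchanged.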

CodePudding user response：

You could use a tidyverse approach:

library(dplyr)
library(tidyr)

mydata %>% 
  group_by(id) %>% 
  uncount(2) %>% 
  mutate(id = first(id) + row_number() - 1) %>% 
  ungroup()

This returns

# A tibble: 8 x 3
        id  col1 col2 
     <dbl> <dbl> <chr>
1 10012000   100 A    
2 10012001   100 A    
3 10012002   201 C    
4 10012003   201 C    
5 10022000    44 B    
6 10022001    44 B    
7 10022002    11 A    
8 10022003    11 A

CodePudding user response：

In base R, for nostalgic reasons:

mydata2 <- as.data.frame(lapply(mydata, function(col) rep(col, each = 2)))
# 0:1 is recycled along the column, adding 0 to each original row and 1 to its copy
mydata2$id <- mydata2$id + 0:1
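
If you need more than two rows per original row, the same base R idea might be generalised as below. This is a rough sketch, not a tested utility: `expand_rows` and its argument `n` are hypothetical names, and it assumes the column to increment is called id.

```r
# Hypothetical helper: repeat each row n times and add 0, 1, ..., n - 1 to id
expand_rows <- function(df, n = 2) {
  # Row indices 1, 1, ..., 2, 2, ... so each row appears n times in a row
  idx <- rep(seq_len(nrow(df)), each = n)
  out <- df[idx, , drop = FALSE]
  # Offsets 0..n-1, repeated once per original row
  out$id <- out$id + rep(seq_len(n) - 1, times = nrow(df))
  rownames(out) <- NULL
  out
}

mydata <- data.frame(id = c(10012000, 10012002, 10022000, 10022002),
                     col1 = c(100, 201, 44, 11),
                     col2 = c("A", "C", "B", "A"))

mydata2 <- expand_rows(mydata, n = 2)
```

Indexing by row rather than using lapply over columns keeps each column's type as it was; with n = 3 every original row would yield ids x, x + 1 and x + 2.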