How to add new rows conditionally in R

Time:08-12

I have a df like this:

v1  t1  c1  o1
1   1   9   1
1   1   12  2
1   2   2   1
1   2   7   2
2   1   3   1
2   1   6   2
2   2   3   1
2   2   12  2

And I would like to add 2 rows each time that v1 changes its value, in order to get this:

v1  t1  c1  o1
1   1   1   1
1   1   1   2
1   2   9   1
1   2   12  2
1   3   2   1
1   3   7   2
2   1   1   1
2   1   1   2
2   2   3   1
2   2   6   2
2   3   3   1
2   3   12  2

So every time v1 changes its value, I add 2 rows with t1 and c1 set to 1 (o1 stays 1 and 2), and I add 1 to the existing values of t1 in that group. This is kind of tricky. I've been able to do it in Excel, but I would like to scale it to big files in R.

CodePudding user response:

We may do the expansion within each v1 group using group_modify

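library(dplyr)
df1 %>%
  group_by(v1) %>%
  group_modify(~ .x %>%
    slice_head(n = 2) %>%          # first two rows serve as a template for the new rows
    mutate(across(-o1, ~ 1)) %>%   # set t1 and c1 to 1 (v1 is the grouping column, so it is not in .x)
    bind_rows(.x) %>%              # the original rows follow the two new ones
    mutate(t1 = as.integer(gl(n(), 2, n())))) %>%  # renumber t1 as 1, 1, 2, 2, ...
  ungroup()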

-output

# A tibble: 12 × 4
      v1    t1    c1    o1
   <int> <int> <dbl> <int>
 1     1     1     1     1
 2     1     1     1     2
 3     1     2     9     1
 4     1     2    12     2
 5     1     3     2     1
 6     1     3     7     2
 7     2     1     1     1
 8     2     1     1     2
 9     2     2     3     1
10     2     2     6     2
11     2     3     3     1
12     2     3    12     2
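
As a quick check of the t1 renumbering step, here is a small base R sketch of what gl() returns for one group of 6 rows (the original 4 plus the 2 new ones). The variable names n_rows and idx are only for illustration.

```r
# gl(n, k, length) builds a factor with n levels, each repeated k times,
# and the result is cut to `length` elements.
n_rows <- 6L                               # rows in one group after the expansion
idx <- as.integer(gl(n_rows, 2, n_rows))   # pairs of consecutive integers
print(idx)
```

Each pair of rows shares one t1 value, which is why the new rows get t1 = 1 and the original rows end up shifted by one.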

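Or do a grouped summarise

df1 %>%
  group_by(v1) %>%
  # each group returns n() + 2 rows: the two new rows plus the original ones
  # (from dplyr 1.1.0, reframe() is the recommended verb for multi-row results)
  summarise(t1 = as.integer(gl(n() + 2, 2, n() + 2)),
            c1 = c(1, 1, c1),
            o1 = rep(1:2, length.out = n() + 2),
            .groups = 'drop')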

-output

# A tibble: 12 × 4
      v1    t1    c1    o1
   <int> <int> <dbl> <int>
 1     1     1     1     1
 2     1     1     1     2
 3     1     2     9     1
 4     1     2    12     2
 5     1     3     2     1
 6     1     3     7     2
 7     2     1     1     1
 8     2     1     1     2
 9     2     2     3     1
10     2     2     6     2
11     2     3     3     1
12     2     3    12     2
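
If dplyr is not available, the same idea can be sketched in base R by following the question's description directly: split by v1, prepend two rows of ones, and add 1 to the existing t1 values. This is only a sketch. It assumes each v1 value forms one contiguous block, as in the example, and add_leading_rows is a hypothetical helper name, not part of any package.

```r
# Same data as in the question
df1 <- data.frame(
  v1 = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L),
  t1 = c(1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L),
  c1 = c(9L, 12L, 2L, 7L, 3L, 6L, 3L, 12L),
  o1 = c(1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L)
)

# Hypothetical helper: prepend two rows of ones to a single v1 group
# and shift the existing t1 values up by one.
add_leading_rows <- function(g) {
  new_rows <- data.frame(v1 = g$v1[1], t1 = 1L, c1 = 1L, o1 = 1:2)
  g$t1 <- g$t1 + 1L
  rbind(new_rows, g)
}

# split() orders the groups by the value of v1, so the result is sorted by v1
res <- do.call(rbind, lapply(split(df1, df1$v1), add_leading_rows))
rownames(res) <- NULL   # drop the "1.1", "1.2", ... row names created by rbind
print(res)
```

With the example data this should give the same 12 rows as the outputs above, except that c1 stays integer because the new rows use 1L.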

data

df1 <- structure(list(v1 = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), t1 = c(1L, 
1L, 2L, 2L, 1L, 1L, 2L, 2L), c1 = c(9L, 12L, 2L, 7L, 3L, 6L, 
3L, 12L), o1 = c(1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L)),
 class = "data.frame", row.names = c(NA, 
-8L))