Home > OS >  How to add nth rows of "NA" at every nth rows in a dataframe?
How to add nth rows of "NA" at every nth rows in a dataframe?

Time:04-09

I have a set of measurements in a data frame which I need to add in n number of rows of "NA" at every nth row. The reason is that I have a set of measurements that were taken only at every nth period of time and I will like to plot it to reflect the actual total time.

Hope there is some expert who can help me with this? I have been trying to figure this out for a very long time. Thank you.

For example, for an initial data frame below.

        x     y
 row1   1     2
 row2   1     2
 row3   1     2     
 row4   1     2
 row5   1     2
 row6   1     2
 row7   1     2
 row8   1     2
 row9   1     2
 row10  1     2

Suppose I want to add 2 "NA" rows at every other 2 rows:

        x     y
 row1   1     2
 row2   1     2
 row3   NA    NA
 row4   NA    NA
 row5   1     2
 row6   1     2
 row7   NA    NA
 row8   NA    NA
 row9   1     2
 row10  1     2 
 row11   NA    NA
 row12   NA    NA
 row13   1     2
 row14   1     2 
 row15   NA    NA
 row16   NA    NA
 row17   1     2
 row18   1     2 

CodePudding user response:

You could use a custom grouping together with group_modify:

library(dplyr)

df %>% 
  group_by(grp = (row_number()   1) %/% 2 ) %>% 
  group_modify(~ add_row(.x, x = rep(NA, 2))) %>% 
  ungroup() %>% 
  select(-grp)

This returns

# A tibble: 20 x 2
       x     y
   <dbl> <dbl>
 1     1     2
 2     1     2
 3    NA    NA
 4    NA    NA
 5     1     2
 6     1     2
 7    NA    NA
 8    NA    NA
 9     1     2
10     1     2
11    NA    NA
12    NA    NA
13     1     2
14     1     2
15    NA    NA
16    NA    NA
17     1     2
18     1     2
19    NA    NA
20    NA    NA

CodePudding user response:

You could split at the cumsum of row indeces modulo 2 using by and rbind a matrix with the desired dimensions and same dimnames.

by(dat, cumsum(seq_len(nrow(dat)) %% 2), \(x) 
   rbind(x, matrix(,2, 2, dimnames=list(NULL, names(x))))) |>
  do.call(what=rbind) |>
  {\(.) `rownames<-`(., paste0('row', seq_len(nrow(.))))}()  ## optional
#        x  y
# row1   1  2
# row2   1  2
# row3  NA NA
# row4  NA NA
# row5   1  2
# row6   1  2
# row7  NA NA
# row8  NA NA
# row9   1  2
# row10  1  2
# row11 NA NA
# row12 NA NA
# row13  1  2
# row14  1  2
# row15 NA NA
# row16 NA NA
# row17  1  2
# row18  1  2
# row19 NA NA
# row20 NA NA

Note: R >= 4.1 used.


Data:

dat <- structure(list(x = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), 
    y = c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L)), class = "data.frame", row.names = c("row1", 
"row2", "row3", "row4", "row5", "row6", "row7", "row8", "row9", 
"row10"))
  • Related