Insert NA row after new value in column

Time:03-22

How can I insert a row of NA values after each group of rows that share the same character string in a column (here, the "key" column)?

Data:

library(tidyverse)

df <- data.frame(var1 = c("M", "F", "M", "F"),
                 var2 = c(1, 2, 2, 2),
                 var3 = c("A", "B", "C", "D"))

# make data long, then count each key/value pair
df %>%
  gather(key, value) %>%
  group_by(key, value) %>%
  summarise(n = n())

`summarise()` has grouped output by 'key'. You can override using the `.groups` argument.
# A tibble: 8 x 3
# Groups:   key [3]
  key   value     n
  <chr> <chr> <int>
1 var1  F         2
2 var1  M         2
3 var2  1         1
4 var2  2         3
5 var3  A         1
6 var3  B         1
7 var3  C         1
8 var3  D         1

Looking for this output:

    key value  n
1  var1     F  2
2  var1     M  2
3  <NA>  <NA> NA
4  var2     1  1
5  var2     2  3
6  <NA>  <NA> NA
7  var3     A  1
8  var3     B  1
9  var3     C  1
10 var3     D  1

The end goal is to replace the NA values with blanks, so that the variables are visually separated in the final table.
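
For that last step, here is a minimal sketch, assuming the NA separator rows are already in place. The `separated` data frame is a hypothetical stand-in for the desired output above (shortened to five rows), and `blanked` is just an illustrative name. Every column is converted to character first, because an empty string cannot live in an integer column:

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for the summarised table with an NA separator row
separated <- data.frame(
  key   = c("var1", "var1", NA, "var2", "var2"),
  value = c("F", "M", NA, "1", "2"),
  n     = c(2L, 2L, NA, 1L, 3L),
  stringsAsFactors = FALSE
)

# Convert every column to character, then swap each NA for an empty string
blanked <- separated %>%
  mutate(across(everything(), ~ replace_na(as.character(.x), "")))

print(blanked)
```

Note that after this step `n` is a character column, so it is probably best done right before printing or exporting the table.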

CodePudding user response:

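Another tidyverse option: append an empty row to each group with add_row(), set its key to NA, then drop the separator that ends up after the last group.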

library(tidyverse)

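df %>%
  gather(key, value) %>%
  group_by(key, value) %>%
  summarise(n = n()) %>%
  group_by(key) %>%
  group_modify(~ add_row(.x)) %>%                          # append an all-NA row to each group
  mutate(key = ifelse(row_number() == n(), NA, key)) %>%   # set the key to NA on that appended row
  ungroup() %>%
  slice(1:(n() - 1))                                       # drop the trailing separator row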

Output

   key   value     n
   <chr> <chr> <int>
 1 var1  F         2
 2 var1  M         2
 3 NA    NA       NA
 4 var2  1         1
 5 var2  2         3
 6 NA    NA       NA
 7 var3  A         1
 8 var3  B         1
 9 var3  C         1
10 var3  D         1

CodePudding user response:

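You could append an NA to every column within each group. Note that this also leaves an NA row after the last group:

df %>%
  gather(key, value) %>%
  dplyr::group_by(key, value) %>%
  summarise(n = n()) %>%
  summarise(across(everything(), ~ c(.x, NA))) %>%
  mutate(key = ifelse(is.na(value), NA, key))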
#> # A tibble: 11 x 3
#> # Groups:   key [4]
#>    key   value     n
#>    <chr> <chr> <int>
#>  1 var1  F         2
#>  2 var1  M         2
#>  3 <NA>  <NA>     NA
#>  4 var2  1         1
#>  5 var2  2         3
#>  6 <NA>  <NA>     NA
#>  7 var3  A         1
#>  8 var3  B         1
#>  9 var3  C         1
#> 10 var3  D         1
#> 11 <NA>  <NA>     NA
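
On dplyr 1.1.0 or later, returning several rows per group from summarise() is deprecated in favour of reframe(). Below is a sketch of the same idea under that assumption. It also swaps in pivot_longer() and count() for gather() and group_by()/summarise(), and drops the trailing NA row with slice(-n()). These substitutions are not part of the answer above, and `result` is just an illustrative name:

```r
library(dplyr)
library(tidyr)

df <- data.frame(var1 = c("M", "F", "M", "F"),
                 var2 = c(1, 2, 2, 2),
                 var3 = c("A", "B", "C", "D"))

result <- df %>%
  # pivot_longer() needs a common column type, so convert everything to character
  mutate(across(everything(), as.character)) %>%
  pivot_longer(everything(), names_to = "key", values_to = "value") %>%
  count(key, value) %>%
  group_by(key) %>%
  # append one NA to each non-grouping column; reframe() returns ungrouped data
  reframe(across(everything(), ~ c(.x, NA))) %>%
  # set the key to NA on the appended rows
  mutate(key = ifelse(is.na(value), NA, key)) %>%
  # drop the separator that follows the last group
  slice(-n())

print(result)
```

Because reframe() always returns an ungrouped data frame, no ungroup() call is needed before the final slice().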

CodePudding user response:

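Another option, using do() to bind an NA row onto each group. As above, an NA row is also added after the last group:

df <- df %>%
  gather(key, value) %>%
  dplyr::group_by(key, value) %>%
  summarise(n = n()) %>%
  group_by(key) %>%
  # use ncol(.) so the NA row matches the summarised table, not the original df
  do(rbind(., rep(NA, ncol(.)))) %>%
  ungroup()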

Output

> df
# A tibble: 11 x 3
   key   value     n
   <chr> <chr> <int>
 1 var1  F         2
 2 var1  M         2
 3 NA    NA       NA
 4 var2  1         1
 5 var2  2         3
 6 NA    NA       NA
 7 var3  A         1
 8 var3  B         1
 9 var3  C         1
10 var3  D         1
11 NA    NA       NA