R: Add an empty row before a specific value

I am trying to add an empty row before every occurrence of a specific value, (intercept).

(I put three linear regression model summaries into one data frame, and I want to use NA rows to make the data frame easier to read.)

For example, my data frame looks like this:

var           coefficient     p_Value
(intercept)    -17,22           0.2
speed            3.82           0.001
(intercept)    -172,23          0.02
youtube          13.42          0.001
facebook          5.44          0.5
(intercept)       3.22          0.02
youtube           4.98          0.001
facebook          4.33          0.5
newspaper        1.22          0.11

I want a result like this:

var           coefficient     p_Value
(intercept)    -17,22           0.2
speed            3.82           0.001
NA                 NA            NA
(intercept)    -172,23          0.02
youtube          13.42          0.001
facebook          5.44          0.5
NA                 NA            NA
(intercept)       3.22          0.02
youtube           4.98          0.001
facebook          4.33          0.5
newspaper        1.22          0.11

I know I could hard-code empty rows at fixed row positions, but I am looking for a better way, because I may have a much longer and more complex data frame in the future. I do not want to split it into a list or into separate data frames, because I will eventually write this data frame to a CSV file, and the NA rows will let me tell the models apart just by reading the CSV.

Thank you for your time.

CodePudding user response：

Here's a tidyverse approach (updated to drop the trailing NA row):

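library(tidyverse)
df |> 
  # number the models: the counter goes up at every "(intercept)" row
  mutate(split = cumsum(ifelse(var == "(intercept)", 1, 0))) |>
  group_by(split) |> 
  # append one all-NA row to the end of each group
  group_modify(.f = ~add_row(.data = .,
                             var = NA_character_)) |> 
  ungroup() |>
  # drop the NA row that follows the last group
  slice(-n())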

# A tibble: 11 × 4
   split var         coefficient p_Value
   <dbl> <chr>       <chr>         <dbl>
 1     1 (intercept) -17,22        0.2  
 2     1 speed       3.82          0.001
 3     1 NA          NA           NA    
 4     2 (intercept) -172,23       0.02 
 5     2 youtube     13.42         0.001
 6     2 facebook    5.44          0.5  
 7     2 NA          NA           NA    
 8     3 (intercept) 3.22          0.02 
 9     3 youtube     4.98          0.001
10     3 facebook    4.33          0.5  
11     3 newspaper   1.22          0.11
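
The output above still carries the helper split column. Here is a minimal sketch of the same pipeline with that column dropped at the end. It assumes the df from the Data section of this thread (rebuilt inline so the snippet runs on its own) and loads only dplyr and tibble rather than the whole tidyverse. The name res is just illustrative.

```r
library(dplyr)
library(tibble)

# Same data as in the Data section of this thread
df <- data.frame(
  var = c("(intercept)", "speed", "(intercept)", "youtube", "facebook",
          "(intercept)", "youtube", "facebook", "newspaper"),
  coefficient = c("-17,22", "3.82", "-172,23", "13.42", "5.44",
                  "3.22", "4.98", "4.33", "1.22"),
  p_Value = c(0.2, 0.001, 0.02, 0.001, 0.5, 0.02, 0.001, 0.5, 0.11)
)

res <- df |>
  # group id: increases by one at every "(intercept)" row
  mutate(split = cumsum(var == "(intercept)")) |>
  group_by(split) |>
  # append an all-NA row to each group
  group_modify(~ add_row(.x, var = NA_character_)) |>
  ungroup() |>
  # remove the trailing NA row, then the helper column
  slice(-n()) |>
  select(-split)

print(res)
```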

CodePudding user response：

Here is a base R option using split + rbind:

> head(do.call(rbind, lapply(split(df, cumsum(startsWith(df$var, "("))), rbind, NA)), -1)
             var coefficient p_Value
1.1  (intercept)      -17,22   0.200
1.2        speed        3.82   0.001
1.3         <NA>        <NA>      NA
2.3  (intercept)     -172,23   0.020
2.4      youtube       13.42   0.001
2.5     facebook        5.44   0.500
2.41        <NA>        <NA>      NA
3.6  (intercept)        3.22   0.020
3.7      youtube        4.98   0.001
3.8     facebook        4.33   0.500
3.9    newspaper        1.22   0.110

Data

df <- structure(list(var = c(
  "(intercept)", "speed", "(intercept)",
  "youtube", "facebook", "(intercept)", "youtube", "facebook",
  "newspaper"
), coefficient = c(
  "-17,22", "3.82", "-172,23", "13.42",
  "5.44", "3.22", "4.98", "4.33", "1.22"
), p_Value = c(
  0.2, 0.001,
  0.02, 0.001, 0.5, 0.02, 0.001, 0.5, 0.11
)), class = "data.frame", row.names = c(
  NA,
  -9L
))
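
The result of the split/rbind one-liner above keeps composite row names such as 1.1 and 2.41. Here is a sketch of the same idea broken into steps, with the row names reset at the end. The intermediate names groups, padded and res are only illustrative, and the data is rebuilt inline so the snippet is self-contained.

```r
# Same data as in the Data section above
df <- data.frame(
  var = c("(intercept)", "speed", "(intercept)", "youtube", "facebook",
          "(intercept)", "youtube", "facebook", "newspaper"),
  coefficient = c("-17,22", "3.82", "-172,23", "13.42", "5.44",
                  "3.22", "4.98", "4.33", "1.22"),
  p_Value = c(0.2, 0.001, 0.02, 0.001, 0.5, 0.02, 0.001, 0.5, 0.11)
)

# 1. One piece per model: a new piece starts at each row whose var begins with "("
groups <- split(df, cumsum(startsWith(df$var, "(")))

# 2. Append an all-NA row to every piece
padded <- lapply(groups, rbind, NA)

# 3. Stack the pieces and drop the trailing NA row
res <- head(do.call(rbind, padded), -1)

# 4. Replace the composite row names with plain 1..n
row.names(res) <- NULL

print(res)
```

Note that startsWith(df$var, "(") treats any value beginning with an opening parenthesis as a model start. If other such names can occur, comparing with df$var == "(intercept)" is the stricter test.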

CodePudding user response：

This should be more efficient, as there is only a single loop through columns.

## separate an atomic vector `x` by an NA before `x[i]`
NAsep <- function (x, i) {
  y <- vector(mode(x), length(x) + length(i))
  NAind <- i + seq(0, length(i) - 1)
  y[NAind] <- NA
  y[-NAind] <- x
  y
}

data.frame(lapply(df, NAsep, i = which(df$var == "(intercept)")[-1]))
#           var coefficient p_Value
#1  (intercept)      -17,22   0.200
#2        speed        3.82   0.001
#3         <NA>        <NA>      NA
#4  (intercept)     -172,23   0.020
#5      youtube       13.42   0.001
#6     facebook        5.44   0.500
#7         <NA>        <NA>      NA
#8  (intercept)        3.22   0.020
#9      youtube        4.98   0.001
#10    facebook        4.33   0.500
#11   newspaper        1.22   0.110
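
Since the goal in the question is a CSV file, here is a sketch of that last step. It assumes blank cells are preferred over the literal text NA in the separator rows, which write.csv supports through its na argument. The temporary file path and the names res, out and csv_lines are only for illustration. NAsep is the function from this answer and the data is the df from the thread.

```r
# Same data as in the Data section of this thread
df <- data.frame(
  var = c("(intercept)", "speed", "(intercept)", "youtube", "facebook",
          "(intercept)", "youtube", "facebook", "newspaper"),
  coefficient = c("-17,22", "3.82", "-172,23", "13.42", "5.44",
                  "3.22", "4.98", "4.33", "1.22"),
  p_Value = c(0.2, 0.001, 0.02, 0.001, 0.5, 0.02, 0.001, 0.5, 0.11)
)

# Insert an NA before each x[i] of an atomic vector x
NAsep <- function(x, i) {
  y <- vector(mode(x), length(x) + length(i))
  NAind <- i + seq(0, length(i) - 1)   # positions of the NAs in the result
  y[NAind] <- NA
  y[-NAind] <- x
  y
}

# Separate the models; [-1] skips the first "(intercept)" so no NA row leads the table
res <- data.frame(lapply(df, NAsep, i = which(df$var == "(intercept)")[-1]))

# Write the NA cells as empty strings so separator rows show up as blank lines
out <- tempfile(fileext = ".csv")
write.csv(res, out, row.names = FALSE, na = "")

csv_lines <- readLines(out)
print(csv_lines)
```

With na = "" every NA in the data frame is written as an empty cell, so this only suits tables where the separator rows are the only missing values.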