How to fill a data frame with cumulative sum with missing values in another variable

Time:07-06

Imagine you have the following data frame:

x <- c(3, 5, 9, 12, 14)
y <- c(0.2, 0.4, 0.7, 1.4, 1.8)
df <- data.frame(x, y)
df


A few months ago I asked how to fill in the missing values of "x" so that the added rows take the value zero in "y". The answer was:

df <- tidyr::complete(df, x = 0:16, fill = list(y = 0))
cbind(df$x, df$y)


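Now I'd like to fill in the values as shown below, but automatically, and I don't know whether that is possible: each added row should repeat the most recent observed value of "y", with zero before the first observation.
How can I obtain "y1" automatically?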

Thanks in advance.

df$y1 <- c(0, 0, 0, 0.2, 0.2, 0.4, 0.4, 0.4, 0.4, 0.7, 0.7, 0.7, 1.4, 1.4, 1.8, 1.8, 1.8)
cbind(df$x, df$y1)


CodePudding user response:

Instead of specifying the fill argument in complete(), leave it out so that the new rows get NA by default. Then use fill() from tidyr to replace each NA with the previous non-NA value. The leading rows have no previous value and stay NA, so set those to 0 at the end.

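library(dplyr)
library(tidyr)
# df is the original five-row data frame, not the one already filled with zeros
tidyr::complete(df, x = 0:16) %>%          # add rows for x = 0..16; new y values are NA
   fill(y, .direction = "down") %>%        # carry the last non-NA y downward
   mutate(y = replace(y, is.na(y), 0))     # leading NAs have nothing to carry; set to 0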

Output:

# A tibble: 17 × 2
       x     y
   <dbl> <dbl>
 1     0   0  
 2     1   0  
 3     2   0  
 4     3   0.2
 5     4   0.2
 6     5   0.4
 7     6   0.4
 8     7   0.4
 9     8   0.4
10     9   0.7
11    10   0.7
12    11   0.7
13    12   1.4
14    13   1.4
15    14   1.8
16    15   1.8
17    16   1.8
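
As a quick sanity check, the pipeline above can be compared against the hand-written y1 from the question. This is only a sketch: it rebuilds the original five-row df itself (an assumption, in case df was already overwritten by the earlier complete() call with fill = list(y = 0)), and the names expected and result are made up for illustration.

```r
library(dplyr)
library(tidyr)

# Rebuild the original five-row data frame from the question
df <- data.frame(x = c(3, 5, 9, 12, 14),
                 y = c(0.2, 0.4, 0.7, 1.4, 1.8))

# Hand-written target column (y1) from the question
expected <- c(0, 0, 0, 0.2, 0.2, 0.4, 0.4, 0.4, 0.4,
              0.7, 0.7, 0.7, 1.4, 1.4, 1.8, 1.8, 1.8)

result <- df %>%
  complete(x = 0:16) %>%                 # new rows get NA in y
  fill(y, .direction = "down") %>%       # carry the last observed y downward
  mutate(y = replace(y, is.na(y), 0))    # leading NAs become 0

# TRUE when the filled column matches the hand-written one
matches <- isTRUE(all.equal(result$y, expected))
matches
```

Note that running fill() on a data frame whose gaps were already replaced by 0 changes nothing, because there are no NA values left to fill.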

CodePudding user response:

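library(dplyr)
library(tidyr)

df %>%
  complete(x = 0:16) %>%    # add rows for x = 0..16; new y values are NA
  fill(y) %>%               # carry the last non-NA y downward (the default direction)
  replace_na(list(y = 0))   # leading NAs have nothing to carry; set them to 0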

# A tibble: 17 x 2
       x     y
   <dbl> <dbl>
 1     0   0  
 2     1   0  
 3     2   0  
 4     3   0.2
 5     4   0.2
 6     5   0.4
 7     6   0.4
 8     7   0.4
 9     8   0.4
10     9   0.7
11    10   0.7
12    11   0.7
13    12   1.4
14    13   1.4
15    14   1.8
16    15   1.8
17    16   1.8
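
For comparison, the same step-wise fill can be sketched in base R without tidyr. This is a different technique from the answers above, not a rewrite of them: it uses findInterval() to look up, for each value in 0:16, how many of the observed x values are less than or equal to it. The names full_x, idx and result are illustrative, and the sketch assumes x is sorted in increasing order.

```r
# Original data from the question
x <- c(3, 5, 9, 12, 14)
y <- c(0.2, 0.4, 0.7, 1.4, 1.8)

full_x <- 0:16

# idx[i] = number of observed x values <= full_x[i];
# it is 0 for positions before the first observation
idx <- findInterval(full_x, x)

# Prepend 0 so that idx == 0 maps to 0 and idx == k maps to y[k]
y1 <- c(0, y)[idx + 1]

result <- data.frame(x = full_x, y1 = y1)
result
```

The lookup avoids creating NA values at all, which may be convenient when tidyr is not available, but the tidyr pipelines are easier to extend to grouped data.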