I am trying to use base R or dplyr to fill dataframe column cells that meet a specified criteria (value of 0 in this case) with the last value above before the 0. So, suppose we have the below data frame myDF
, with columns Element
, Group
, eleCnt
, and reSeq
generated by the code immediately beneath. I would like to either add a new column "ADD" or modify the existing column reSeq
with the manually input values I show under "ADD". Any recommendations for doing this in Base R or dplyr?
Element Group eleCnt reSeq
<chr> <dbl> <int> <dbl> ADD << ADD Explained
1 R 0 1 1 1 since reSeq not 0 use current row reSeq value
2 R 0 2 2 2 same as above
3 X 0 1 1 1 same as above
4 X 1 2 2 2 same as above
5 X 1 3 0 2 since reSeq = 0 use reSeq value from last row prior to the 0
6 X 0 4 4 4 since reSeq not 0 use current row reSeq value
7 X 0 5 5 5 same as above
8 X 0 6 6 6 same as above
9 B 0 1 1 1 same as above
10 R 0 3 3 3 same as above
11 R 2 4 4 4 same as above
12 R 2 5 0 4 since reSeq = 0 use reSeq value from last row prior to the 0
13 X 3 7 7 7 since reSeq not 0 use current row reSeq value
14 X 3 8 0 7 since reSeq = 0 use last reSeq value <> 0 above
15 X 3 9 0 7 since reSeq = 0 use last reSeq value <> 0 above
myDF %>%
group_by(Element) %>%
mutate(eleCnt = row_number()) %>%
ungroup()%>%
mutate(reSeq = ifelse(Group == 0 | Group != lag(Group), eleCnt,0)
)
Running dput(myDF):
> dput(myDF)
structure(list(Element = c("R", "R", "X", "X", "X", "X", "X",
"X", "B", "R", "R", "R", "X", "X", "X"), Group = c(0, 0, 0, 1,
1, 0, 0, 0, 0, 0, 2, 2, 3, 3, 3)), class = "data.frame", row.names = c(NA,
-15L))
CodePudding user response:
We may change the 0s to NA
and then do a group by fill
library(dplyr)
library(tidyr)
df1 %>%
mutate(ADD = na_if(reSeq, 0)) %>%
group_by(Element) %>%
fill(ADD) %>%
ungroup
-output
# A tibble: 15 × 5
Element Group eleCnt reSeq ADD
<chr> <dbl> <int> <dbl> <dbl>
1 R 0 1 1 1
2 R 0 2 2 2
3 X 0 1 1 1
4 X 1 2 2 2
5 X 1 3 0 2
6 X 0 4 4 4
7 X 0 5 5 5
8 X 0 6 6 6
9 B 0 1 1 1
10 R 0 3 3 3
11 R 2 4 4 4
12 R 2 5 0 4
13 X 3 7 7 7
14 X 3 8 0 7
15 X 3 9 0 7
The steps in the creation of 'ADD' can be simplified from the original data 'myDF'
library(data.table)
myDF %>%
mutate(ADD = rowid(Element) * NA^!(Group == 0 |!duplicated(Group))) %>%
group_by(Element) %>%
fill(ADD) %>%
ungroup
-output
# A tibble: 15 × 3
Element Group ADD
<chr> <dbl> <dbl>
1 R 0 1
2 R 0 2
3 X 0 1
4 X 1 2
5 X 1 2
6 X 0 4
7 X 0 5
8 X 0 6
9 B 0 1
10 R 0 3
11 R 2 4
12 R 2 4
13 X 3 7
14 X 3 7
15 X 3 7
data
df1 <- structure(list(Element = c("R", "R", "X", "X", "X", "X", "X",
"X", "B", "R", "R", "R", "X", "X", "X"), Group = c(0, 0, 0, 1,
1, 0, 0, 0, 0, 0, 2, 2, 3, 3, 3), eleCnt = c(1L, 2L, 1L, 2L,
3L, 4L, 5L, 6L, 1L, 3L, 4L, 5L, 7L, 8L, 9L), reSeq = c(1, 2,
1, 2, 0, 4, 5, 6, 1, 3, 4, 0, 7, 0, 0)), row.names = c(NA, -15L
), class = c("tbl_df", "tbl", "data.frame"))