I want all the values in column values that come before the string "%" to be flagged as "YES", and all others as "NO". This should be checked separately for each unique value of column Id.
df=data.frame('Id'=c(101,101,101,101,102,102,102,102),
'values'=c('a','%','a','c','a','d','%','c'))
All the rows preceding "%" should be flagged as "YES". For example, for Id = 102, "YES" should appear against the values "a" and "d".
CodePudding user response:
With dplyr you can use cumsum in conjunction with a vector of values YES and NO. This assumes you only have one % per group. It does not search for the last occurrence of %.
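library(dplyr)
df %>%
  group_by(Id) %>%
  # cumsum(flag) is 0 before the "%" row and 1 from it onwards, so adding 1 indexes into c("YES", "NO")
  mutate(flag = values == "%", flag = c("YES", "NO")[cumsum(flag) + 1]) %>%
  ungroup()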
# A tibble: 8 × 3
Id values flag
<dbl> <chr> <chr>
1 101 a YES
2 101 % NO
3 101 a NO
4 101 c NO
5 102 a YES
6 102 d YES
7 102 % NO
8 102 c NO
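To make the indexing trick concrete, here is a minimal base R sketch on plain vectors standing in for a single Id group. The vector names (values, values2, one_pct, two_pct, before_first) are made up for illustration, and the last line is only one possible variant, not part of the answer above.

```r
# One group's values with a single "%".
values <- c("a", "d", "%", "c")
flag <- values == "%"
idx <- cumsum(flag) + 1                # 1 1 2 2
one_pct <- c("YES", "NO")[idx]         # "YES" "YES" "NO" "NO"

# Hypothetical group with two "%": the index reaches 3, which is
# outside c("YES", "NO"), so the last element becomes NA.
values2 <- c("a", "%", "b", "%")
two_pct <- c("YES", "NO")[cumsum(values2 == "%") + 1]

# Possible variant (an assumption about the desired behaviour): flag only
# the rows before the first "%", however many "%" the group contains.
before_first <- ifelse(cumsum(values2 == "%") == 0, "YES", "NO")
```

This is why the one-% assumption matters: with a second % the lookup runs out of range. Note also that a group containing no % at all would be flagged "YES" throughout, both by the indexing trick and by the variant.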
CodePudding user response:
With tidyverse, we can use str_detect with lead to determine whether % is the next value, returning 1 if so and NA if not. Then, grouped by Id, we can use fill to carry the 1 up to the previous rows. Finally, we convert to YES and NO.
library(tidyverse)
df %>%
  # group first so that lead() does not look across Id boundaries
  group_by(Id) %>%
  mutate(flag = ifelse(str_detect(lead(values), "%"), 1, NA)) %>%
  fill(flag, .direction = "up") %>%
  mutate(flag = ifelse(is.na(flag), "NO", "YES")) %>%
  ungroup()
Id values flag
<dbl> <chr> <chr>
1 101 a YES
2 101 % NO
3 101 a NO
4 101 c NO
5 102 a YES
6 102 d YES
7 102 % NO
8 102 c NO
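One thing to watch with lead is where it sits relative to group_by. A small sketch, using a made-up data frame df3 whose second group starts with %, shows the difference; the names df3, ungrouped, grouped and next_is_pct are assumptions for illustration only.

```r
library(dplyr)

# Hypothetical data: the second Id starts with "%".
df3 <- data.frame(
  Id = c(1, 1, 2, 2),
  values = c("a", "b", "%", "c")
)

# lead() on the ungrouped frame: the last row of Id 1 sees the "%" of Id 2.
ungrouped <- df3 %>%
  mutate(next_is_pct = lead(values) == "%")

# lead() within each Id: the last row of a group has no next value, so NA.
grouped <- df3 %>%
  group_by(Id) %>%
  mutate(next_is_pct = lead(values) == "%") %>%
  ungroup()

ungrouped$next_is_pct  # FALSE TRUE FALSE NA
grouped$next_is_pct    # FALSE NA FALSE NA
```

On the data in the question both orderings give the same result, because no Id starts with %.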