I have a data frame whose rows are split into groups, and each row within a group has a code. Within each group, I want to compare every row's date with the date of the row coded 'a'. For example, in group 1, 'a' has the date '2022-01-01' and 'b' has the date '2022-01-03', which is later, so 'after' goes in the comment field. (The comment column below shows the expected result for this example.) I want to write code that populates the comment field according to this rule.
group<-c(1,1,1,1,2,2,2)
code<-c("a","b","c","d","a","b","c")
date<-c('2022-01-01','2022-01-03', '2021-12-15', '2022-05-01','2021-06-01', '2022-04-04','2022-05-10')
comment<-c(NA, "after","before","after",NA,"after","after")
data<-data.frame(group, code, date, comment, stringsAsFactors = FALSE)
Can anyone suggest any code?
CodePudding user response:
Do you mean this?
library(dplyr)
data %>%
  group_by(group) %>%
  mutate(comment2 = case_when(
    first(as.Date(date)) == as.Date(date) ~ NA_character_,
    first(as.Date(date)) > as.Date(date) ~ "before",
    TRUE ~ "after")) %>%
  ungroup()
## A tibble: 7 × 5
#   group code  date       comment comment2
#   <dbl> <chr> <chr>      <chr>   <chr>
# 1     1 a     2022-01-01 NA      NA
# 2     1 b     2022-01-03 after   after
# 3     1 c     2021-12-15 before  before
# 4     1 d     2022-05-01 after   after
# 5     2 a     2021-06-01 NA      NA
# 6     2 b     2022-04-04 after   after
# 7     2 c     2022-05-10 after   after
I'm creating a new comment2 column to explicitly show that it reproduces your expected output in column comment.
This assumes that
- There is always an "a" code, and
- Codes are ordered "a", "b", "c" by group (making "a" the first code in every group).
If codes are not ordered (but there is always an "a" code), you can do
data %>%
  group_by(group) %>%
  mutate(comment2 = case_when(
    as.Date(date)[code == "a"] == as.Date(date) ~ NA_character_,
    as.Date(date)[code == "a"] > as.Date(date) ~ "before",
    TRUE ~ "after")) %>%
  ungroup()
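If you would rather not depend on either assumption, here is a minimal base R sketch that looks up each group's "a" date with match(). The names ref_rows and ref_date are my own, and it assumes at most one "a" row per group. It behaves differently from the code above in two cases: a group with no "a" row gets NA in every row, and a non-"a" row that shares the "a" date is labelled "after" rather than NA.

```r
# Example data from the question (without the expected-output column)
data <- data.frame(
  group = c(1, 1, 1, 1, 2, 2, 2),
  code  = c("a", "b", "c", "d", "a", "b", "c"),
  date  = c("2022-01-01", "2022-01-03", "2021-12-15", "2022-05-01",
            "2021-06-01", "2022-04-04", "2022-05-10"),
  stringsAsFactors = FALSE
)

d <- as.Date(data$date)

# Rows that hold each group's reference date (assumes at most one "a" per group)
ref_rows <- data$code == "a"

# For every row, look up the "a" date of its own group;
# groups without an "a" row get NA here
ref_date <- d[ref_rows][match(data$group, data$group[ref_rows])]

# "a" rows stay NA; all other rows are compared with the reference date
data$comment2 <- ifelse(ref_rows, NA_character_,
                        ifelse(d < ref_date, "before", "after"))
```

Because the lookup is by group value rather than by position, the row order within a group does not matter.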