How to filter for all instances of X happening only if nothing else is in the data before the associ...

I'm not sure how to word the title better - I have a list of names, dates, and services. I want to find all instances of a specific service occurring only when there were 0 other services BEFORE the date of the specific one.

Example data below. The desired output is row 5 only, because Bruce Wayne had a surgery with 0 services beforehand. John Doe is disqualified because there was a check-up beforehand, and Jane Doe is disqualified because there was no surgery.

Extra question - Instead of checking for any occurrence beforehand, how would I check for any occurrence within the 6 months before the surgery?

Date <- c("2022-01-01","2022-04-01","2022-05-01","2022-07-01","2022-08-01","2022-08-05")
Name <- c("John Doe","John Doe","John Doe","Jane Doe","Bruce Wayne","Bruce Wayne")
Service <- c("Check-up","Surgery","Follow-up", "Check-up", "Surgery", "Follow-up")

df <- data.frame(Date,Name,Service)

df


        Date        Name   Service
1 2022-01-01    John Doe  Check-up
2 2022-04-01    John Doe   Surgery
3 2022-05-01    John Doe Follow-up
4 2022-07-01    Jane Doe  Check-up
5 2022-08-01 Bruce Wayne   Surgery
6 2022-08-05 Bruce Wayne Follow-up

CodePudding user response：

I don't always trust the ordering of the frame, so I compare each row against the earliest date in its group rather than relying on row position:

library(dplyr)

# keep a surgery only when it falls on that person's earliest recorded date
df %>%
  group_by(Name) %>%
  filter(Service == "Surgery", Date == min(Date)) %>%
  ungroup()
# # A tibble: 1 × 3
#   Date       Name        Service
#   <chr>      <chr>       <chr>  
# 1 2022-08-01 Bruce Wayne Surgery
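
For the extra question about a 6-month window, one possible approach is to measure the gap to each person's previous service and keep a surgery only when that gap is missing or longer than the window. This is a rough sketch, not a tested solution for real data: `window_days`, `days_since_prev`, and `surgery_no_recent` are names made up for illustration, and "6 months" is approximated as 183 days, which is an assumption.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(
  Date    = c("2022-01-01", "2022-04-01", "2022-05-01",
              "2022-07-01", "2022-08-01", "2022-08-05"),
  Name    = c("John Doe", "John Doe", "John Doe",
              "Jane Doe", "Bruce Wayne", "Bruce Wayne"),
  Service = c("Check-up", "Surgery", "Follow-up",
              "Check-up", "Surgery", "Follow-up")
)

# Assumption: "6 months" approximated as 183 days
window_days <- 183

surgery_no_recent <- df %>%
  mutate(Date = as.Date(Date)) %>%   # real dates, so subtraction yields days
  arrange(Name, Date) %>%            # do not rely on the incoming row order
  group_by(Name) %>%
  # days since the same person's previous service (NA for their first one)
  mutate(days_since_prev = as.numeric(Date - dplyr::lag(Date))) %>%
  filter(
    Service == "Surgery",
    is.na(days_since_prev) | days_since_prev > window_days
  ) %>%
  ungroup()

surgery_no_recent
```

On the sample data this should again keep only Bruce Wayne's surgery, since John Doe's check-up falls 90 days before his surgery. Note that a second service on the same day as the surgery would count as a prior service here (gap of 0 days); whether that is desired depends on the data.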

CodePudding user response：

You could filter on surgery and check whether it is the first row within each name (row_number() == 1), assuming the rows are already sorted by date:

library(dplyr)
df %>%
  group_by(Name) %>%
  filter(Service == "Surgery" & row_number() == 1)
#> # A tibble: 1 × 3
#> # Groups:   Name [1]
#>   Date       Name        Service
#>   <chr>      <chr>       <chr>  
#> 1 2022-08-01 Bruce Wayne Surgery
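
To illustrate why the row order matters for the row_number() approach, here is a small sketch that shuffles the sample rows and compares filtering by position with and without sorting first. The names `shuffled`, `by_position`, and `sorted_first` are made up for this example.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(
  Date    = c("2022-01-01", "2022-04-01", "2022-05-01",
              "2022-07-01", "2022-08-01", "2022-08-05"),
  Name    = c("John Doe", "John Doe", "John Doe",
              "Jane Doe", "Bruce Wayne", "Bruce Wayne"),
  Service = c("Check-up", "Surgery", "Follow-up",
              "Check-up", "Surgery", "Follow-up")
)

# Reorder rows so John Doe's surgery comes first by position
# and Bruce Wayne's surgery comes after his follow-up
shuffled <- df[c(2, 1, 3, 4, 6, 5), ]

# Position-based filter on the unsorted frame
by_position <- shuffled %>%
  group_by(Name) %>%
  filter(Service == "Surgery" & row_number() == 1) %>%
  ungroup()

# Same filter after sorting by date within each name
# (ISO-formatted date strings sort chronologically)
sorted_first <- shuffled %>%
  arrange(Name, Date) %>%
  group_by(Name) %>%
  filter(Service == "Surgery" & row_number() == 1) %>%
  ungroup()

by_position$Name   # "John Doe"
sorted_first$Name  # "Bruce Wayne"
```

Sorting with arrange() before grouping (or comparing against min(Date), as in the other response) makes the result independent of how the rows happen to be ordered.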

Created on 2023-01-27 with reprex v2.0.2