I have the following two data frames:
> head(Reaction_per_park_per_day_3)
Park Date Type_1_2 Number_AC_events
<chr> <date> <chr> <int>
1 Beaverdam Flats 2018-09-25 0 1
2 Nosehill 64 ave 2018-09-26 0 1
3 Nosehill 64 ave 2018-09-26 0 1
4 Nosehill Macewin 2018-09-26 0 1
5 Crestmont 2018-09-27 0 2
6 Country Hills G.C. - Nose Creek 2018-09-28 0 1
> head(All_reports_per_month2)
Month Park Code Reports_per_month
<date> <chr> <chr> <dbl>
1 2018-09-29 Beaverdam Flats 1 1
2 2018-10-12 Nosehill 64 ave 2 1
3 2018-10-25 Nosehill 64 ave 1 2
4 2018-09-21 Crestmont 1 1
5 2018-09-29 Crestmont 2 1
I would like to add a "days since last AC event" column to All_reports_per_month2 that would take into account the date and the park of the AC event as well as the date and park of the report. If the report data is prior to the first AC event in a certain park, NA would appear. See example below:
Month Park Code Reports_per_month Days_since_last_AC
<date> <chr> <chr> <dbl> <chr>
1 2018-09-29 Beaverdam Flats 1 1 4
2 2018-10-12 Nosehill 64 ave 2 1 16
3 2018-10-25 Nosehill 64 ave 1 2 29
4 2018-09-21 Crestmont 1 1 NA
5 2018-09-29 Crestmont 2 1 2
Any help would be appreciated!
CodePudding user response:
You may want to try something like this. I have not tested it with data yet.
library(tidyverse)
min_date <- Reaction_per_park_per_day_3 %>%
group_by(Park) %>%
summarise(date = min(date))
All_reports_per_month2 %>% left_join(min_date, by = "Park") %>%
mutate(Days_since_last_AC = ifelse(Month >= date, Month - date, NA))
CodePudding user response:
This is a joining and filtering operation that will use the dplyr
package.
# import the packages
library( dplyr )
# join the data tables and filter so that we are always looking back in time
All_reports_per_month2 %>%
left_join( Reaction_per_park_per_day_3, by="Park" ) %>%
filter( Date <= Month ) %>%
group_by( Park, Month ) %>%
summarize( Days_since_last_AC = Month - max(Date) )