Home > Software design >  R - Number of days since last event in another dataframe
R - Number of days since last event in another dataframe

Time:09-29

I have the following two data frames:

> head(Reaction_per_park_per_day_3)
  Park                            Date       Type_1_2 Number_AC_events
  <chr>                           <date>     <chr>               <int>
1 Beaverdam Flats                 2018-09-25 0                       1
2 Nosehill 64 ave                 2018-09-26 0                       1
3 Nosehill 64 ave                 2018-09-26 0                       1
4 Nosehill Macewin                2018-09-26 0                       1
5 Crestmont                       2018-09-27 0                       2
6 Country Hills G.C. - Nose Creek 2018-09-28 0                       1

> head(All_reports_per_month2)
  Month      Park            Code  Reports_per_month
  <date>     <chr>           <chr>             <dbl>
1 2018-09-29 Beaverdam Flats 1                     1
2 2018-10-12 Nosehill 64 ave 2                     1
3 2018-10-25 Nosehill 64 ave 1                     2
4 2018-09-21 Crestmont       1                     1
5 2018-09-29 Crestmont       2                     1

I would like to add a "days since last AC event" column to All_reports_per_month2 that would take into account the date and the park of the AC event as well as the date and park of the report. If the report data is prior to the first AC event in a certain park, NA would appear. See example below:

Month      Park            Code  Reports_per_month         Days_since_last_AC
      <date>     <chr>           <chr>             <dbl>                 <chr>
    1 2018-09-29 Beaverdam Flats 1                     1                    4
    2 2018-10-12 Nosehill 64 ave 2                     1                   16
    3 2018-10-25 Nosehill 64 ave 1                     2                   29
    4 2018-09-21 Crestmont       1                     1                   NA
    5 2018-09-29 Crestmont       2                     1                    2

Any help would be appreciated!

CodePudding user response:

You may want to try something like this. I have not tested it with data yet.

library(tidyverse)
min_date <- Reaction_per_park_per_day_3 %>%
  group_by(Park) %>%
  summarise(date = min(date))
All_reports_per_month2 %>% left_join(min_date, by = "Park")  %>%
  mutate(Days_since_last_AC = ifelse(Month >= date, Month - date, NA))

CodePudding user response:

This is a joining and filtering operation that will use the dplyr package.

# import the packages
library( dplyr )

# join the data tables and filter so that we are always looking back in time
All_reports_per_month2 %>%
  left_join( Reaction_per_park_per_day_3, by="Park" ) %>%
  filter( Date <= Month ) %>%
  group_by( Park, Month ) %>%
  summarize( Days_since_last_AC = Month - max(Date) )
  •  Tags:  
  • r
  • Related