I am new to R programming and have the following question.
I have two dataframes. The first has a minimum and a maximum time for each Id. In the second, each record represents a transaction made at a specific time.
I need a third dataframe that adds, for each Id, the number of transactions from the second dataframe whose time (HourTRX) falls within the range of minimum and maximum hours from the first dataframe. Additionally, I would appreciate it if someone could tell me how to convert these hours from character to a time data type.
df1
Id HourMIN HourMAX
A90 06:00:00 06:25:00
A91 08:00:00 08:30:00
df2
Id HourTRX Transaction
A90 06:00:00 1
A90 06:10:00 1
A90 06:25:00 1
A91 08:00:00 1
A91 08:05:00 1
A91 08:15:00 1
A91 08:25:00 1
A91 08:30:00 1
expected result
Id HourMIN HourMAX Quantity_Transaction
A90 06:00:00 06:25:00 3
A91 08:00:00 08:30:00 5
Answer:
If you are open to a dplyr-based approach, you could use a left_join combined with a filter, and finally aggregate everything using summarise:
library(dplyr)

df1 %>%
  left_join(df2, by = "Id") %>%
  group_by(Id, HourMIN, HourMAX) %>%
  # keep only transactions inside each Id's [HourMIN, HourMAX] range
  filter(HourTRX >= HourMIN, HourTRX <= HourMAX) %>%
  summarise(Quantity_Transaction = sum(Transaction), .groups = "drop")
This returns
# A tibble: 2 x 4
Id HourMIN HourMAX Quantity_Transaction
<chr> <time> <time> <dbl>
1 A90 06:00 06:25 3
2 A91 08:00 08:30 5
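
The output above shows the hour columns as <time>, so they were presumably converted from character before the join, although the answer does not show that step. A minimal sketch of one way to do it, assuming the hms package is installed and using sample data rebuilt from the tables in the question (the name result is only illustrative):

```r
library(dplyr)
library(hms)  # assumption: hms is installed; it provides as_hms()

# Sample data rebuilt from the question's tables, with the hours as character
df1 <- data.frame(
  Id      = c("A90", "A91"),
  HourMIN = c("06:00:00", "08:00:00"),
  HourMAX = c("06:25:00", "08:30:00")
)
df2 <- data.frame(
  Id          = c(rep("A90", 3), rep("A91", 5)),
  HourTRX     = c("06:00:00", "06:10:00", "06:25:00",
                  "08:00:00", "08:05:00", "08:15:00", "08:25:00", "08:30:00"),
  Transaction = 1
)

# Convert the "HH:MM:SS" character columns to a time-of-day type
df1 <- df1 %>% mutate(across(c(HourMIN, HourMAX), as_hms))
df2 <- df2 %>% mutate(HourTRX = as_hms(HourTRX))

# Same join, filter and summarise steps as in the answer
result <- df1 %>%
  left_join(df2, by = "Id") %>%
  group_by(Id, HourMIN, HourMAX) %>%
  filter(HourTRX >= HourMIN, HourTRX <= HourMAX) %>%
  summarise(Quantity_Transaction = sum(Transaction), .groups = "drop")

print(result)
```

Zero-padded "HH:MM:SS" strings also happen to compare correctly as plain character values, so the filter would give the same counts here without the conversion. Converting first is safer if the formatting is not guaranteed to be consistent.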