This data frame has two columns, Date and sum_pips. I've been trying to group_by each month and find the total sum_pips for each month. For example, all the Junes in this data frame should add up to 2700 pips. Hopefully that makes sense.
CodePudding user response:
One approach would be to use the month() function from the lubridate package, which extracts the month number from date-formatted variables, as follows:
Sample data
set.seed(123)
df <- data.frame(Date = seq(as.Date("2022/1/1"), by = "day", length.out = 100),
sum_pips = rnbinom(100, mu = 55, size = 0.75))
Code
library(dplyr)
library(lubridate)
df %>%
group_by(month(Date)) %>%
summarize(sum_all_pips = sum(sum_pips))
Output:
# `month(Date)` sum_all_pips
# <dbl> <dbl>
# 1 1 1387
# 2 2 1663
# 3 3 1783
# 4 4 803
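Grouping on month(Date) alone pools the same calendar month across years, which is what the question asks for. If the data spans several years and each year's months should be kept apart, one option is to group on the first day of each month with lubridate's floor_date(). The following is only a minimal sketch: the sample data and the names month_start, pooled and monthly are made up for illustration.

```r
library(dplyr)
library(lubridate)

# Hypothetical sample spanning two years (not the asker's data)
df <- data.frame(
  Date = as.Date(c("2021-06-01", "2021-06-15", "2022-06-01", "2022-07-01")),
  sum_pips = c(100, 200, 400, 50)
)

# Pooled by calendar month: June 2021 and June 2022 end up in one group
pooled <- df %>%
  group_by(mon = month(Date)) %>%
  summarize(sum_all_pips = sum(sum_pips))

# Kept apart by year: each Date is rounded down to the first day of its month
monthly <- df %>%
  group_by(month_start = floor_date(Date, unit = "month")) %>%
  summarize(sum_all_pips = sum(sum_pips))
```

Here pooled has one row per calendar month, while monthly has one row per year-month combination.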
CodePudding user response:
This is effectively a duplicate of "Calculate the mean by group", albeit needing to know how to extract the month for each row. Assuming your date column is of the appropriate Date class (and not strings), you can likely do something like:
# Add a month-abbreviation column, then sum sum_pips within each month
with_mon <- transform(yourdata, mon = format(Date, format = "%b"))
aggregate(sum_pips ~ mon, data = with_mon, FUN = sum)
or in dplyr,
library(dplyr)
group_by(yourdata, mon = format(Date, format = "%b")) %>%
summarize(total = sum(sum_pips))
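One thing to be aware of: format(Date, "%b") returns character strings, so the groups come back in alphabetical order (Apr before Feb) and the labels depend on the locale. If calendar order matters, a possible workaround is to turn the month into a factor whose levels follow R's built-in month.abb vector. This is a sketch in base R only; the sample data and the names mon_num and totals are assumptions for illustration.

```r
# Hypothetical sample data using the column names from the question
yourdata <- data.frame(
  Date = as.Date(c("2022-01-10", "2022-02-05", "2022-02-20", "2022-04-01")),
  sum_pips = c(10, 20, 30, 40)
)

# Month number (1-12) taken from the date, then mapped onto month.abb
mon_num <- as.integer(format(yourdata$Date, "%m"))
yourdata$mon <- factor(month.abb[mon_num], levels = month.abb)

# Groups follow the factor levels, so the rows come out in calendar order
totals <- aggregate(sum_pips ~ mon, data = yourdata, FUN = sum)
```

The same factor column can be passed to group_by() in the dplyr version.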