I have a dataset similar to this:
> dput(df)
structure(list(Date.Bought = structure(c(18262, 18262, 18262,
18265, 18266, 18266, 18268, 18269, 18270, 18270, 18272, 18273
), class = "Date"), Wasted = c(4L, 1L, 0L, 2L, 2L, 1L, 1L, 0L,
1L, 1L, 0L, 1L), Ordered = c(12L, 2L, 0L, 4L, 5L, 7L, 3L, 2L,
1L, 4L, 5L, 1L)), row.names = c(NA, -12L), class = "data.frame")
I want to divide the total wastage by the total ordered by date.
So, for example, on 2020-01-01
, we see a total of 5 wastages
(4 1) and a total of 14 orders
(12 2). I want to create a column that divides 5/14 to show the proportion of waste in relation to order by that date. The output for 2020-01-01
should be 36%.
I tried this, but it gives me row-wise calculations (whereas I want to group it by date):
df <- df %>% group_by(Date) %>%
mutate(all = Wasted/Ordered)
CodePudding user response:
The first column of your df is Date.Bought
, not Date
. You also want to use summarize
instead of mutate
in this case. This should work:
df |>
group_by(Date.Bought) |>
summarise(all = sum(Wasted)/sum(Ordered))