I have the following data frame:
> head(df,7)
        date pos cons_week
1 2020-03-30 313       169
2 2020-03-31 255       169
3 2020-04-01 282       169
4 2020-04-02 382       169
5 2020-04-03 473       169
6 2020-04-04 312       169
7 2020-04-05 158       169
pos denotes the number of positive COVID cases per day, and cons_week is the number of consecutive weeks since lockdown. I therefore have 7 entries of pos for each cons_week. I want to summarise the data so that I get the total of pos per week.
I tried different versions, like
df %>% group_by(cons_week) %>%
summarise(n = n())
or
df %>% group_by(cons_week, pos) %>%
summarise(n = sum())
Expected output
cons_week n
169 2175
170 1651
171 1179
Data
df <- structure(list(date = structure(c(18351, 18352, 18353, 18354,
18355, 18356, 18357, 18358, 18359, 18360, 18361, 18362, 18363,
18364, 18365, 18366, 18367, 18368, 18369, 18370, 18371), class = "Date"),
pos = c("313", "255", "282", "382", "473", "312", "158",
"424", "347", "301", "140", "142", "140", "157", "156", "258",
"199", "178", "168", "106", "114"), cons_week = c(169, 169,
169, 169, 169, 169, 169, 170, 170, 170, 170, 170, 170, 170,
171, 171, 171, 171, 171, 171, 171)), row.names = c(NA, 21L
), class = "data.frame")
CodePudding user response:
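This happens because pos is stored as character in your df, and sum() does not accept character input. Note also that n() only counts the rows in each group, and sum() with no arguments always returns 0, so neither attempt adds up pos. Convert pos to numeric first and group by cons_week only. E.g.: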
library(dplyr)
df %>%
mutate(pos = as.numeric(pos)) %>%
group_by(cons_week) %>%
summarise(n = sum(pos))
Or, converting inside sum() directly:
df %>%
group_by(cons_week) %>%
summarise(n = sum(as.numeric(pos)))
Output:
  cons_week     n
      <dbl> <dbl>
1       169  2175
2       170  1651
3       171  1179
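To confirm the diagnosis before converting, it can help to inspect the column classes. The snippet below is a minimal sketch on a small stand-in data frame (three rows copied from the question, not the full df); the name res is only illustrative.

```r
# Small stand-in for the question's df (values copied from its first rows)
df <- data.frame(
  pos = c("313", "255", "282"),
  cons_week = c(169, 169, 169)
)

class(df$pos)      # "character"
sapply(df, class)  # class of every column at a glance

# sum() rejects character input; capture the error message instead of stopping
res <- tryCatch(sum(df$pos), error = function(e) conditionMessage(e))
res

# After conversion the sum works as expected
sum(as.numeric(df$pos))  # 850
```

If as.numeric() meets an entry that is not a clean number, it returns NA with a warning, so a quick check with anyNA() after the conversion is a reasonable safeguard.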
CodePudding user response:
Use:
df %>% group_by(cons_week) %>%
summarise(n = sum(as.numeric(pos)))
Or convert the column beforehand:
df$pos <- as.numeric(df$pos)
df %>% group_by(cons_week) %>%
summarise(n = sum(pos))
The problem is that pos is of type (class) character, not numeric.
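For comparison, the same weekly totals can be obtained without dplyr. This is only a sketch using base R's aggregate(), not the approach taken in the answers above; it rebuilds the question's pos values by hand, and cons_week is created with rep(), so it is integer here rather than double. The name weekly is illustrative.

```r
# pos values copied from the question's data, still stored as character
df <- data.frame(
  pos = c("313", "255", "282", "382", "473", "312", "158",
          "424", "347", "301", "140", "142", "140", "157",
          "156", "258", "199", "178", "168", "106", "114"),
  cons_week = rep(169:171, each = 7)
)

# Convert once, then sum pos within each cons_week
df$pos <- as.numeric(df$pos)
weekly <- aggregate(pos ~ cons_week, data = df, FUN = sum)
names(weekly)[2] <- "n"  # match the column name used in the expected output
weekly
```

The result has one row per cons_week with totals 2175, 1651 and 1179, matching the expected output in the question.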