I have a sample of my data as follows:
id = c(1, 2, 3, 4, 5, 1, 4, 7, 8, 3)
date = c("2020-12-31", "2020-12-31", "2020-12-31", "2020-12-31",
"2020-12-31", "01-01-2021", "01-01-2021", "01-01-2021", "01-01-2021",
"01-01-2021")
total = c(1, 4, 4, 15, 0, 12, 1, 1, 1, 0)
data = data.frame(id, date, total)
I am trying to count how many times a "total" value occurs per date. So for example, for the date "2020-12-31"
the value 4
occurs twice but the value 1
only occurs once as it does for 15
and 0
for this date. And then for the date "01-01-2021"
the value 1
occurs three times and so on. Essentially I would like the out to result in:
day = c("2020-12-31", "01-01-2021")
one = c(1, 3)
two = c(0, 0)
three = c(0, 0)
four = c(2, 0)
five = c( 0, 0)
six = c(0, 0)
seven = c(0,0)
eight = c(0, 0)
nine = c(0,0)
ten = c(0,0)
eleven = c(0,0)
twelve = c(0,1)
thirteen = c(0,0)
fourteen = c(0,0)
fifteen = c(1,0)
df = data.frame(day, one, two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, thirteen,
fourteen, fifteen)
so one column for the date and then the next 15 columns represent the number I am counting. (there are more dates for my data i just have not put them all in my example)
I started by grouping the original columns by:
data %>%
group_by(date, total)
But I am not sure how to count the values per group and put it in the resulting dataframe. Thanks!
CodePudding user response:
library(tidyr)
library(dplyr)
data %>%
count(date, total) %>%
complete(date, total = 0:15, fill = list(n = 0)) %>%
pivot_wider(id_cols = date, names_from = total, values_from = n, names_prefix = "total")
# # A tibble: 2 × 17
# date total0 total1 total2 total3 total4 total5 total6 total7 total8 total9 total10 total11 total12
# <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 01-01… 1 3 0 0 0 0 0 0 0 0 0 0 1
# 2 2020-… 1 1 0 0 2 0 0 0 0 0 0 0 0
# # … with 3 more variables: total13 <dbl>, total14 <dbl>, total15 <dbl>
CodePudding user response:
`as.data.frame.table is the time-honored method:
as.data.frame( with(data, table(date, total)))
#------------------------
date total Freq
1 01-01-2021 0 1
2 2020-12-31 0 1
3 01-01-2021 1 3
4 2020-12-31 1 1
5 01-01-2021 4 0
6 2020-12-31 4 2
7 01-01-2021 12 1
8 2020-12-31 12 0
9 01-01-2021 15 0
10 2020-12-31 15 1
If you wanted this in a "wide" format, which is really a b*tch to work with then leave it as a tble:
with(data, table(date, total))
total
date 0 1 4 12 15
01-01-2021 1 3 0 1 0
2020-12-31 1 1 2 0 1