R: counting combinations of time

Time:02-27

This is hard to code and even harder to explain, so my apologies if the explanation is confusing. I will try to explain it using the data below.

I have a dataset with 3 columns:

ID        Vaccine        Time
 1         A              Winter
 1         B              Spring
 
 2         A              Spring
 2         B              Winter
 2         B              Fall
 
 3         C              Fall
 3         A              Fall
 3         B              Fall

 4         A              Winter
 4         A              Spring

 5         A              Winter

As you can see:

  • There are 5 patients.

  • Each can take any or all of the 3 vaccines A, B, C.

  • There are 3 seasons: Winter, Spring, Fall.

  • Vaccine A

    • A total of 5 patients took Vaccine A

    • 4 patients (patient 1, patient 2, patient 3, patient 5) took the vaccine only once

      Winter

      • Patient 1
      • Patient 5

      Spring

      • Patient 2

      Fall

      • Patient 3
    • 1 patient (patient 4) took the vaccine twice, in Winter and Spring


Vaccine      Winter.Only     Spring.Only       Fall.Only     Winter.Spring     Winter.Fall    Spring.Fall
A            2               1                 1             1

  • Vaccine B

    • A total of 3 patients took Vaccine B

    • 2 patients took the vaccine only once (patient 1, patient 3)

      Spring

      • Patient 1

      Fall

      • Patient 3
    • 1 patient (patient 2) took the vaccine twice, in Winter and Fall


Vaccine      Winter.Only     Spring.Only       Fall.Only     Winter.Spring     Winter.Fall    Spring.Fall
B                            1                 1                               1

  • Vaccine C

    • A total of 1 patient took Vaccine C

    • 1 patient took the vaccine only once (patient 3)

      Fall

      • Patient 3

Vaccine      Winter.Only     Spring.Only       Fall.Only     Winter.Spring     Winter.Fall    Spring.Fall
C                                              1                                

The final dataset should look like this

Vaccine      Winter.Only     Spring.Only       Fall.Only     Winter.Spring     Winter.Fall    Spring.Fall
A            2               1                 1             1 
B                            1                 1                               1 
C                                              1

Mainly, I am trying to create a dataset with one row per vaccine that counts how many patients took that vaccine only once and when (Winter, Spring, Fall), and how many patients took the same vaccine two or three times and when (Winter.Spring, Winter.Fall, Spring.Fall, or Winter.Spring.Fall).

Any thoughts or suggestions on how to do this are much appreciated.

CodePudding user response:

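library(dplyr)
library(tidyr)

df %>%
  arrange(Time) %>%   # alphabetical order, so each combination is always spelled the same way
  group_by(ID, Vaccine) %>%
  # one row per patient and vaccine, e.g. "Spring_and_Winter"
  summarize(Times = paste(Time, collapse = "_and_"), .groups = "drop") %>%
  count(Vaccine, Times) %>%   # number of patients per vaccine and combination
  pivot_wider(names_from = Times, values_from = n)   # one column per combination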

Result

# A tibble: 3 x 6
  Vaccine  Fall Spring Spring_and_Winter Winter Fall_and_Winter
  <chr>   <int>  <int>             <int>  <int>           <int>
1 A           1      1                 1      2              NA
2 B           1      1                NA     NA               1
3 C           1     NA                NA     NA              NA
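
The column names above follow alphabetical order (Spring_and_Winter rather than Winter.Spring). If the layout from the question is wanted, a variant along the following lines might work. This is only a sketch under a few assumptions that are not in the original answer: the season order Winter, Spring, Fall is taken from the question's column names, repeat doses of the same vaccine within one season are counted once, and missing combinations are filled with 0 instead of NA. The names season_levels, wanted and result are made up for illustration.

```r
library(dplyr)
library(tidyr)

# Sample data transcribed from the question
df <- data.frame(
  ID      = c(1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 5),
  Vaccine = c("A", "B", "A", "B", "B", "C", "A", "B", "A", "A", "A"),
  Time    = c("Winter", "Spring", "Spring", "Winter", "Fall",
              "Fall", "Fall", "Fall", "Winter", "Spring", "Winter")
)

# Assumed season order and column order (hypothetical helper names)
season_levels <- c("Winter", "Spring", "Fall")
wanted <- c("Winter.Only", "Spring.Only", "Fall.Only",
            "Winter.Spring", "Winter.Fall", "Spring.Fall",
            "Winter.Spring.Fall")

result <- df %>%
  mutate(Time = factor(Time, levels = season_levels)) %>%
  distinct(ID, Vaccine, Time) %>%   # assumption: repeats within one season count once
  arrange(Time) %>%                 # sorts by factor level, not alphabetically
  group_by(ID, Vaccine) %>%
  summarize(
    # a single season becomes "Winter.Only", several become "Winter.Spring"
    Times = if (n() == 1) paste0(Time, ".Only") else paste(Time, collapse = "."),
    .groups = "drop"
  ) %>%
  count(Vaccine, Times) %>%
  pivot_wider(names_from = Times, values_from = n, values_fill = 0) %>%
  select(Vaccine, any_of(wanted))   # reorder; combinations absent from the data are skipped

print(result)
```

Combinations that never occur in the data (Spring.Fall and Winter.Spring.Fall in this sample) get no column at all, because pivot_wider only creates columns for values it sees.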