Is there a way to change the date to a day of a protocol in R?-CodePudding

I have a very large data set of Subjects who underwent two Protocols. Here is an example df:

df <- data.frame (Subject = c('875','875','875','875','875','875','875','875',
                              '875','1392','1392','1392','1392','1392','1392',
                              '1392','1392','1392','1392','1392'), 
                  StartDate = c('20160915','20160916','20160917','20160918',
                                '20160926','20160927','20160928','20160929',
                                '20160930','20160917','20160918','20160919',
                                '20160920','20160921','20160922','20161005',
                                '20161006','20161007','20161008','20161009'),
                  Protocol = c('Training','Training','Training','Training',
                               'Test','Test','Test','Test','Test','Training',
                               'Training','Training','Training','Training',
                               'Training','Test','Test','Test','Test','Test'),
                  Points = c('14','32','33','50','41','56','41','48','49','34',
                              '16','25','21','34','28','61','24','45','45','47'),
                  Attempts = c('24','45','37','69','56','59','53','67','53','55',
                              '34','36','26','48','44','64','36','62','53','55'))

I would like to take the StartDate of each Protocol and change it to a Day of Testing, or even just add in a column with Day of Testing.

There are irregular days for each Protocol, so a simple numbering is not working. And the actual df has thousands of rows, so doing it by hand would be a nightmare. I am looking to be able to compare specific days of a protocol between subjects (ie Day 3 of Training between subjects, etc.)

I would like to change the data frame to look like:

df2 <- data.frame (Subject = c('875','875','875','875','875','875','875','875',
                              '875','1392','1392','1392','1392','1392','1392',
                              '1392','1392','1392','1392','1392'), 
                  StartDate = c('20160915','20160916','20160917','20160918',
                                '20160926','20160927','20160928','20160929',
                                '20160930','20160917','20160918','20160919',
                                '20160920','20160921','20160922','20161005',
                                '20161006','20161007','20161008','20161009'),
                  Day = c('1','2','3','4','1','2','3','4','5','1','2','3',
                          '4','5','6','1','2','3','4','5'),
                  Protocol = c('Training','Training','Training','Training',
                               'Test','Test','Test','Test','Test','Training',
                               'Training','Training','Training','Training',
                               'Training','Test','Test','Test','Test','Test'),
                  Points = c('14','32','33','50','41','56','41','48','49','34',
                              '16','25','21','34','28','61','24','45','45','47'),
                  Attempts = c('24','45','37','69','56','59','53','67','53','55',
                              '34','36','26','48','44','64','36','62','53','55'))

Is there a way to do this? Or does anyone have other ideas?

CodePudding user response：

What do you mean by "There are irregular days for each Protocol"?

Assuming you have your data ordered by StartDate. You could do this

df %>%
  group_by(Subject, Protocol) %>%
  mutate(Day = row_number()) %>%
  ungroup

# A tibble: 20 × 6
   Subject StartDate Protocol Points Attempts   Day
   <fct>   <fct>     <fct>    <fct>  <fct>    <int>
 1 875     20160915  Training 14     24           1
 2 875     20160916  Training 32     45           2
 3 875     20160917  Training 33     37           3
 4 875     20160918  Training 50     69           4
 5 875     20160926  Test     41     56           1
 6 875     20160927  Test     56     59           2
 7 875     20160928  Test     41     53           3
 8 875     20160929  Test     48     67           4
 9 875     20160930  Test     49     53           5
10 1392    20160917  Training 34     55           1
11 1392    20160918  Training 16     34           2
12 1392    20160919  Training 25     36           3
13 1392    20160920  Training 21     26           4
14 1392    20160921  Training 34     48           5
15 1392    20160922  Training 28     44           6
16 1392    20161005  Test     61     64           1
17 1392    20161006  Test     24     36           2
18 1392    20161007  Test     45     62           3
19 1392    20161008  Test     45     53           4
20 1392    20161009  Test     47     55           5