Home > OS >  How do I create a unique value for every 14 days of a dataframe in R
How do I create a unique value for every 14 days of a dataframe in R

Time:06-08

I have a dataframe like this:

vec <- LETTERS[1 : 4]
id <- rep(vec, each = 16)
date <- seq(as.Date("2021-01-01"), as.Date("2021-03-05"), by = "days")
data <- cbind.data.frame(id, date)

I want to create a new column, "section", which has a unique identifying value for every 14 days. I know I need to use dplyr to group_by id, because each ID has its own unique starting date and I want the first two weeks (and so on) of A to be labeled as the same "section" as the first two weeks (and so on) of B, C, and D.

Edit: Some of the dates in my real dataset skip days, so I cannot rely on counting by row.

Thanks in advance.

CodePudding user response:

Update II: thanks to @Henrik: Improved coding!

data %>% 
  group_by(id) %>% 
  mutate(g = (as.numeric(date - first(date))) %/% 14) 

Update I after clarification:

data %>% 
  as_tibble() %>% 
  mutate(diff = date - first(date)) %>% 
  group_by(Y= cumsum(as.numeric(diff) %% 14 == 0)) %>% 
  print(n=80)

   id    date       diff        Y
   <chr> <date>     <drtn>  <int>
 1 A     2021-01-01  0 days     1
 2 A     2021-01-02  1 days     1
 3 A     2021-01-03  2 days     1
 4 A     2021-01-04  3 days     1
 5 A     2021-01-05  4 days     1
 6 A     2021-01-06  5 days     1
 7 A     2021-01-07  6 days     1
 8 A     2021-01-08  7 days     1
 9 A     2021-01-09  8 days     1
10 A     2021-01-10  9 days     1
11 A     2021-01-11 10 days     1
12 A     2021-01-12 11 days     1
13 A     2021-01-13 12 days     1
14 A     2021-01-14 13 days     1
15 A     2021-01-15 14 days     2
16 A     2021-01-16 15 days     2
17 B     2021-01-17 16 days     2
18 B     2021-01-18 17 days     2
19 B     2021-01-19 18 days     2
20 B     2021-01-20 19 days     2
21 B     2021-01-21 20 days     2
22 B     2021-01-22 21 days     2
23 B     2021-01-23 22 days     2
24 B     2021-01-24 23 days     2
25 B     2021-01-25 24 days     2
26 B     2021-01-26 25 days     2
27 B     2021-01-27 26 days     2
28 B     2021-01-28 27 days     2
29 B     2021-01-29 28 days     3
30 B     2021-01-30 29 days     3
31 B     2021-01-31 30 days     3
32 B     2021-02-01 31 days     3
33 C     2021-02-02 32 days     3
34 C     2021-02-03 33 days     3
35 C     2021-02-04 34 days     3
36 C     2021-02-05 35 days     3
37 C     2021-02-06 36 days     3
38 C     2021-02-07 37 days     3
39 C     2021-02-08 38 days     3
40 C     2021-02-09 39 days     3
41 C     2021-02-10 40 days     3
42 C     2021-02-11 41 days     3
43 C     2021-02-12 42 days     4
44 C     2021-02-13 43 days     4
45 C     2021-02-14 44 days     4
46 C     2021-02-15 45 days     4
47 C     2021-02-16 46 days     4
48 C     2021-02-17 47 days     4
49 D     2021-02-18 48 days     4
50 D     2021-02-19 49 days     4
51 D     2021-02-20 50 days     4
52 D     2021-02-21 51 days     4
53 D     2021-02-22 52 days     4
54 D     2021-02-23 53 days     4
55 D     2021-02-24 54 days     4
56 D     2021-02-25 55 days     4
57 D     2021-02-26 56 days     5
58 D     2021-02-27 57 days     5
59 D     2021-02-28 58 days     5
60 D     2021-03-01 59 days     5
61 D     2021-03-02 60 days     5
62 D     2021-03-03 61 days     5
63 D     2021-03-04 62 days     5
64 D     2021-03-05 63 days     5

We could do it this way using gl:

library(dpylr)
data %>% 
  mutate(group =as.integer(gl(n(),14,n())))
 id       date group
1   A 2021-01-01     1
2   A 2021-01-02     1
3   A 2021-01-03     1
4   A 2021-01-04     1
5   A 2021-01-05     1
6   A 2021-01-06     1
7   A 2021-01-07     1
8   A 2021-01-08     1
9   A 2021-01-09     1
10  A 2021-01-10     1
11  A 2021-01-11     1
12  A 2021-01-12     1
13  A 2021-01-13     1
14  A 2021-01-14     1
15  A 2021-01-15     2
16  A 2021-01-16     2
17  B 2021-01-17     2
18  B 2021-01-18     2
19  B 2021-01-19     2
20  B 2021-01-20     2
21  B 2021-01-21     2
22  B 2021-01-22     2
23  B 2021-01-23     2
24  B 2021-01-24     2
25  B 2021-01-25     2
26  B 2021-01-26     2
27  B 2021-01-27     2
28  B 2021-01-28     2
29  B 2021-01-29     3
30  B 2021-01-30     3
31  B 2021-01-31     3
32  B 2021-02-01     3
33  C 2021-02-02     3
34  C 2021-02-03     3
35  C 2021-02-04     3
36  C 2021-02-05     3
37  C 2021-02-06     3
38  C 2021-02-07     3
39  C 2021-02-08     3
40  C 2021-02-09     3
41  C 2021-02-10     3
42  C 2021-02-11     3
43  C 2021-02-12     4
44  C 2021-02-13     4
45  C 2021-02-14     4
46  C 2021-02-15     4
47  C 2021-02-16     4
48  C 2021-02-17     4
49  D 2021-02-18     4
50  D 2021-02-19     4
51  D 2021-02-20     4
52  D 2021-02-21     4
53  D 2021-02-22     4
54  D 2021-02-23     4
55  D 2021-02-24     4
56  D 2021-02-25     4
57  D 2021-02-26     5
58  D 2021-02-27     5
59  D 2021-02-28     5
60  D 2021-03-01     5
61  D 2021-03-02     5
62  D 2021-03-03     5
63  D 2021-03-04     5
64  D 2021-03-05     5
  • Related