Home > OS >  Filling up orders based on eventdate per ID in R
Filling up orders based on eventdate per ID in R

Time:09-12

I would like to ask a simple question about filling up orders based on eventdate per ID.

I have the following table. I would like to make a new column called order so that the day when eventdate is not NA, the order of it is zero. And then, I would like to fill up negative consecutively decreasing integers to the previous rows and positive consecutive increasing integers to the below rows per ID.

 ---- ------------ ------- 
| ID | eventdate  | Value |
 ---- ------------ ------- 
|  1 | NA         |    10 |
|  1 | NA         |    11 |
|  1 | NA         |    12 |
|  1 | NA         |    11 |
|  1 | 2011-03-18 |    15 |
|  1 | NA         |    17 |
|  1 | NA         |    18 |
|  1 | NA         |    15 |
|  2 | NA         |     5 |
|  2 | NA         |     6 |
|  2 | NA         |     7 |
|  2 | 2011-05-28 |     9 |
|  2 | NA         |    10 |
|  2 | NA         |    11 |
|  2 | NA         |    15 |
|  2 | NA         |    16 |
|  3 | NA         |    20 |
|  3 | NA         |    22 |
|  3 | NA         |    23 |
|  3 | NA         |    24 |
|  3 | 2012-05-28 |    28 |
|  3 | NA         |    29 |
|  3 | NA         |    25 |
|  3 | NA         |    24 |
|  3 | NA         |    26 |
|  3 | NA         |    24 |
 ---- ------------ ------- 

In short, I would like to make the following table

 ---- ------------ ------- ------- 
| ID | eventdate  | Value | order |
 ---- ------------ ------- ------- 
|  1 | NA         |    10 |    -4 |
|  1 | NA         |    11 |    -3 |
|  1 | NA         |    12 |    -2 |
|  1 | NA         |    11 |    -1 |
|  1 | 2011-03-18 |    15 |     0 |
|  1 | NA         |    17 |     1 |
|  1 | NA         |    18 |     2 |
|  1 | NA         |    15 |     3 |
|  2 | NA         |     5 |    -3 |
|  2 | NA         |     6 |    -2 |
|  2 | NA         |     7 |    -1 |
|  2 | 2011-05-28 |     9 |     0 |
|  2 | NA         |    10 |     1 |
|  2 | NA         |    11 |     2 |
|  2 | NA         |    15 |     3 |
|  2 | NA         |    16 |     4 |
|  3 | NA         |    20 |    -4 |
|  3 | NA         |    22 |    -3 |
|  3 | NA         |    23 |    -2 |
|  3 | NA         |    24 |    -1 |
|  3 | 2012-05-28 |    28 |     0 |
|  3 | NA         |    29 |     1 |
|  3 | NA         |    25 |     2 |
|  3 | NA         |    24 |     3 |
|  3 | NA         |    26 |     4 |
|  3 | NA         |    24 |     5 |
 ---- ------------ ------- ------- 

Thank you very much in advance!

CodePudding user response:

You can find the position of non-NA value in eventdate and subtract it with row position in that group.

library(dplyr)

df <- df %>%
  group_by(ID) %>%
  mutate(order = row_number() - match(TRUE, !is.na(eventdate))) %>%
  ungroup 
df

# A tibble: 26 × 4
#      ID eventdate  Value order
#   <int> <chr>      <int> <int>
# 1     1 NA            10    -4
# 2     1 NA            11    -3
# 3     1 NA            12    -2
# 4     1 NA            11    -1
# 5     1 2011-03-18    15     0
# 6     1 NA            17     1
# 7     1 NA            18     2
# 8     1 NA            15     3
# 9     2 NA             5    -3
#10     2 NA             6    -2
# … with 16 more rows

In base R, the same thing can be written as -

df <- transform(df, order = ave(!is.na(eventdate), ID, 
              FUN = function(x) seq_along(x) - match(TRUE, x)))
  • Related