R: How can I retain the 2-3 last rows within the same IDs when the values in a column are the same i

Time:03-04

Using R, I would like to select the last row within each ID in longitudinal data. However, when the final rows of an ID share the same value in the time column, I would like to keep all of them: 2 rows for ID 1 (time 5) and 3 rows for ID 3 (time 4). If the time values differ within an ID, I want to keep only the last row (e.g., time 7 for ID 2).

My dataframe is as follows:

id time    dx    code
1   1   primary   A1
1   5   primary   D2
1   5   secondary B3
2   1   primary   A2
2   7   primary   C4
3   4   primary   A1
3   4   secondary B3
3   4   tertiary  D2

I want the following results:

id time    dx    code
1   5   primary   D2
1   5   secondary B3
2   7   primary   C4
3   4   primary   A1
3   4   secondary B3
3   4   tertiary  D2

When I ran d %>% group_by(id) %>% filter(row_number() == n()), it kept only the last row within each ID. Any help would be appreciated!

CodePudding user response:

You can group by dx as well as id and then use slice_tail() to keep the last row of each group:

library(dplyr)

# Keep the last row of each (id, dx) group
d %>%
  group_by(id, dx) %>%
  slice_tail(n = 1)

# A tibble: 6 x 4
# Groups:   id, dx [6]
     id  time dx        code 
  <int> <int> <chr>     <chr>
1     1     5 primary   D2   
2     1     5 secondary B3   
3     2     7 primary   C4   
4     3     4 primary   A1   
5     3     4 secondary B3   
6     3     4 tertiary  D2   
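
Grouping by dx gives the desired result for this sample, but it would also keep an earlier row if an ID had a dx value that appears only at an earlier time. As an alternative, here is a minimal sketch that applies the rule from the question directly: within each id, keep every row whose time equals the time of that id's last row. It assumes the rows are already ordered by time within each id (as in the sample), and the data frame name d is taken from the question; this is one possible approach, not the only one.

```r
library(dplyr)

# Example data copied from the question
d <- data.frame(
  id   = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L),
  time = c(1L, 5L, 5L, 1L, 7L, 4L, 4L, 4L),
  dx   = c("primary", "primary", "secondary", "primary", "primary",
           "primary", "secondary", "tertiary"),
  code = c("A1", "D2", "B3", "A2", "C4", "A1", "B3", "D2"),
  stringsAsFactors = FALSE
)

# Within each id, keep all rows that share the time value of the last row.
# Assumption: rows are sorted by time within id; otherwise arrange(id, time) first.
res <- d %>%
  group_by(id) %>%
  filter(time == last(time)) %>%
  ungroup()

print(res)
```

If the data might not be sorted, replacing last(time) with max(time) should give the same rows without relying on row order.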