I have a dataset consisting of patients and their encounters in hospital, each observation is an individual patient encounter, one of the columns is the patient ID, see below,
Some of the encounters are an operation date, I'm trying to get a list of encounters within 30 days after the operation date and am struggling, (using R),
See below an example of the data:
Patient ID | Encounter ID | Operation Date | Encounter Date |
---|---|---|---|
34 | 12 | 2022-01-01 | 2022-01-01 |
34 | 21 | NA | 2022-01-20 |
35 | 15 | NA | 2022-03-02 |
35 | 16 | 2022-04-01 | 2022-04-01 |
35 | 18 | NA | 2022-06-25 |
So for example, I would like to get Encounter 21, but not encounter 18,
If anyone could help me out that'd be much appreciated,
As a side note if anyone knows any packages which are good for dealing with this type of data, that would be really useful, the 'time series' packages I've tried don't seem to help for this type of problem.
CodePudding user response:
library(lubridate)
library(tidyverse)
df <- read_table("Patient_ID Encounter_ID Operation_Date Encounter_Date
34 12 2022-01-01 2022-01-01
34 21 NA 2022-01-20
35 15 NA 2022-03-02
35 16 2022-04-01 2022-04-01
35 18 NA 2022-06-25")
df %>%
group_by(Patient_ID) %>%
filter(Encounter_Date <= first(Encounter_Date) %m % days(30)) %>%
ungroup()
#> # A tibble: 4 x 4
#> Patient_ID Encounter_ID Operation_Date Encounter_Date
#> <dbl> <dbl> <date> <date>
#> 1 34 12 2022-01-01 2022-01-01
#> 2 34 21 NA 2022-01-20
#> 3 35 15 NA 2022-03-02
#> 4 35 16 2022-04-01 2022-04-01
Created on 2022-08-14 by the reprex package (v2.0.1)
CodePudding user response:
How about using two filter
s?
library(dplyr)
df |>
group_by(PatientID) |>
filter(is.na(OperationDate),
between(as.numeric(EncounterDate-min(OperationDate, na.rm = TRUE)), 0, 30)) |>
ungroup()
Output:
# A tibble: 1 × 4
PatientID EncounterID OperationDate EncounterDate
<dbl> <dbl> <date> <date>
1 34 21 NA 2022-01-20
Why it works:
- We group by each patient with
group_by
is.na(OperationDate)
secures that we don't include the encounter of the operation.between
makes sure that we do not include encounters before the operation. E.g. encounter 15.min
selects the first non-NA date (in each group).
Data:
library(readr)
df <- read_table("PatientID EncounterID OperationDate EncounterDate
34 12 2022-01-01 2022-01-01
34 21 NA 2022-01-20
35 15 NA 2022-03-02
35 16 2022-04-01 2022-04-01
35 18 NA 2022-06-25")