I want to identify and extract the subjects who switched from an old drug to a new drug.
In the code below there are two types of drug, A and B. Type A is the old drug and type B is the new drug; type B comes in several brands: 2, 3 and 4.
Following each patient over time, there are 3 patterns of drug change:
Patient 11 changed from type A to type B. Only a change from A to B counts; a change from B to A is not regarded as a change.
Patient 12 used type B throughout, but changed from brand 2 to brand 3.
Patient 13 changed from type A to type B, and then changed again from brand 2 to brand 3.
df <- data.frame(
  id = c(11, 11, 11, 11, 12, 12, 12, 12, 13, 13, 13, 13),
  drug_type = c("A", "A", "B", "B", "B", "B", "B", "B", "A", "A", "B", "B"),
  drug_brand = c(1, 1, 2, 2, 2, 3, 3, 3, 1, 1, 2, 3),
  date = c("2020-01-01", "2020-02-01", "2020-03-01", "2020-03-13",
           "2019-04-05", "2019-05-02", "2019-06-03", "2019-08-04",
           "2021-02-02", "2021-02-27", "2021-03-22", "2021-04-11")
)
df$date <- as.Date(df$date)
So how can I filter this dataset down to the patients who changed drugs?
My attempt so far: I summarized the last date of use for type A and the first date of use for type B in two separate data frames, inner joined them on id, and kept the rows where the first date of type B is later than the last date of type A. This only catches the change from type A to type B, though. Patients who never used type A, such as patient 12, are dropped by the inner join, so brand changes within type B are missed. I don't know how to identify all the patterns of drug change.
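The approach described above can be sketched roughly as follows. This is a reconstruction under assumptions, not the exact code that was run: it assumes dplyr, and the names last_a_tbl, first_b_tbl, last_a_date, first_b_date and a_to_b are made up for illustration.

```r
library(dplyr)

# Sample data from the question, repeated so the sketch runs on its own
df <- data.frame(
  id = c(11, 11, 11, 11, 12, 12, 12, 12, 13, 13, 13, 13),
  drug_type = c("A", "A", "B", "B", "B", "B", "B", "B", "A", "A", "B", "B"),
  drug_brand = c(1, 1, 2, 2, 2, 3, 3, 3, 1, 1, 2, 3),
  date = as.Date(c("2020-01-01", "2020-02-01", "2020-03-01", "2020-03-13",
                   "2019-04-05", "2019-05-02", "2019-06-03", "2019-08-04",
                   "2021-02-02", "2021-02-27", "2021-03-22", "2021-04-11"))
)

# Last recorded date on type A, one row per patient who ever used A
last_a_tbl <- df %>%
  filter(drug_type == "A") %>%
  group_by(id) %>%
  summarise(last_a_date = max(date), .groups = "drop")

# First recorded date on type B, one row per patient who ever used B
first_b_tbl <- df %>%
  filter(drug_type == "B") %>%
  group_by(id) %>%
  summarise(first_b_date = min(date), .groups = "drop")

# Keep patients whose first B date comes after their last A date
a_to_b <- inner_join(last_a_tbl, first_b_tbl, by = "id") %>%
  filter(first_b_date > last_a_date)

a_to_b$id
```

On the sample data this keeps patients 11 and 13. Patient 12 has no type A rows, so the inner join drops that id before the date comparison is ever made.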
I haven't found any solution or any similar question about this, so I sincerely hope you can share your ideas with me. Thank you for your time.
CodePudding user response:
Is your end goal simply to filter out any subject who changed drug_brand or drug_type? If so, you can use a grouped filter with dplyr::n_distinct() to remove subjects with >1 brand or >1 type:
library(dplyr)

df %>%
  group_by(id) %>%
  filter(
    n_distinct(drug_type) == 1,
    n_distinct(drug_brand) == 1
  ) %>%
  ungroup()
CodePudding user response:
Perhaps you could look at the transition of drug_type from "A" to "B", or include where the number of distinct drug_brand is greater than 1?
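library(tidyverse)

df %>%
  # lag() compares each row with the previous one, so sort by date within id first
  arrange(id, date) %>%
  group_by(id) %>%
  # keep a patient if any row is "B" directly after an "A" row, or if brands differ
  filter(any(drug_type == "B" & lag(drug_type) == "A") |
           n_distinct(drug_brand) > 1) %>%
  ungroup()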
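If it helps to see which of the three patterns each patient falls into, rather than only keeping or dropping rows, the same two conditions can be turned into per-patient flags. This is only a sketch built on the ideas above: the column names type_a_to_b, brand_change_in_b and pattern are made up for illustration, and it assumes a brand change only matters among the type B rows.

```r
library(dplyr)

# Sample data from the question, repeated so the sketch runs on its own
df <- data.frame(
  id = c(11, 11, 11, 11, 12, 12, 12, 12, 13, 13, 13, 13),
  drug_type = c("A", "A", "B", "B", "B", "B", "B", "B", "A", "A", "B", "B"),
  drug_brand = c(1, 1, 2, 2, 2, 3, 3, 3, 1, 1, 2, 3),
  date = as.Date(c("2020-01-01", "2020-02-01", "2020-03-01", "2020-03-13",
                   "2019-04-05", "2019-05-02", "2019-06-03", "2019-08-04",
                   "2021-02-02", "2021-02-27", "2021-03-22", "2021-04-11"))
)

patterns <- df %>%
  arrange(id, date) %>%
  group_by(id) %>%
  summarise(
    # TRUE if some row is type B directly after a type A row;
    # the default keeps the first row from producing NA
    type_a_to_b = any(
      drug_type == "B" & lag(drug_type, default = first(drug_type)) == "A"
    ),
    # TRUE if more than one brand appears among the type B rows
    brand_change_in_b = n_distinct(drug_brand[drug_type == "B"]) > 1,
    .groups = "drop"
  ) %>%
  mutate(
    pattern = case_when(
      type_a_to_b & brand_change_in_b ~ "A to B, then brand change",
      type_a_to_b ~ "A to B",
      brand_change_in_b ~ "brand change within B",
      TRUE ~ "no change"
    )
  )

# Rows of the original data for patients with any kind of change
changed <- df %>%
  semi_join(filter(patterns, pattern != "no change"), by = "id")
```

On the sample data, patient 11 is labelled "A to B", patient 12 "brand change within B" and patient 13 "A to B, then brand change", so all 12 rows end up in changed. A change from B back to A sets neither flag, which matches the rule that B to A is not regarded as a change.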