How to identify value change in one column in R?

I want to identify and extract the subjects who changed from an old drug to a new drug. In the code below there are two types of drugs, A and B: type A is the old drug and type B is the new drug, and type B comes in different brands: 2, 3 and 4.
Over time, there are 3 patterns of drug change across patients:

Patient 11 changed from type A to type B. Only a change from A to B counts; a change from B to A is not regarded as a change.
Patient 12 always used type B, but changed from brand 2 to brand 3.

Patient 13 changed from type A to type B, and then changed again from brand 2 to brand 3.

df <- data.frame(id = c(11,11,11,11,12,12,12,12,13,13,13,13),
              drug_type = c("A","A","B","B","B","B","B","B","A","A","B","B"),
              drug_brand = c(1,1,2,2,2,3,3,3,1,1,2,3),
              date = c("2020-01-01","2020-02-01","2020-03-01","2020-03-13",
                       "2019-04-05","2019-05-02","2019-06-03","2019-08-04",
                       "2021-02-02","2021-02-27","2021-03-22","2021-04-11"))
df$date <- as.Date(df$date)

So how should I filter this dataset for the patients who changed drugs?
To solve this, I summarized the last date of use for type A and the first date of use for type B in two data frames. I then inner joined them on id and kept the rows where the first date of type B is later than the last date of type A, but this only captures the change from type A to type B. I don't know how to identify all the patterns of drug change.
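
For readers following along, the attempt described above might look roughly like the sketch below. This is a reconstruction under assumptions, not the asker's actual code: it assumes dplyr, and the names `last_a_tbl`, `first_b_tbl` and `a_to_b` are made up for illustration. The sample data is rebuilt so the block runs on its own.

```r
library(dplyr)

# Sample data from the question, rebuilt so this block is self-contained.
df <- data.frame(
  id = rep(c(11, 12, 13), each = 4),
  drug_type = c("A","A","B","B","B","B","B","B","A","A","B","B"),
  drug_brand = c(1,1,2,2,2,3,3,3,1,1,2,3),
  date = as.Date(c("2020-01-01","2020-02-01","2020-03-01","2020-03-13",
                   "2019-04-05","2019-05-02","2019-06-03","2019-08-04",
                   "2021-02-02","2021-02-27","2021-03-22","2021-04-11"))
)

# Last date on type A, per subject (hypothetical helper table).
last_a_tbl <- df %>%
  filter(drug_type == "A") %>%
  group_by(id) %>%
  summarise(last_a = max(date), .groups = "drop")

# First date on type B, per subject (hypothetical helper table).
first_b_tbl <- df %>%
  filter(drug_type == "B") %>%
  group_by(id) %>%
  summarise(first_b = min(date), .groups = "drop")

# Keep subjects whose first B date falls after their last A date.
a_to_b <- inner_join(last_a_tbl, first_b_tbl, by = "id") %>%
  filter(first_b > last_a)

a_to_b$id
```

On the sample data this keeps subjects 11 and 13. Subject 12 never used type A, so the inner join drops it, which is why the brand-only change goes undetected.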

I haven't found any solution or any similar question about this, so I sincerely hope you can share your ideas with me. Thank you for your time.

CodePudding user response:

Is your end goal simply to filter out any subject who changed drug_brand or drug_type? If so, you can use a grouped filter with dplyr::n_distinct() to remove subjects with >1 brand or >1 type:

library(dplyr)

df %>%
  group_by(id) %>%
  filter(
    n_distinct(drug_type) == 1,
    n_distinct(drug_brand) == 1
  ) %>%
  ungroup()
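
To see what this grouped filter is testing, the per-subject counts can be inspected directly. A small sketch, assuming dplyr and the sample data from the question (rebuilt here so the block runs on its own); the names `counts`, `n_types` and `n_brands` are only illustrative:

```r
library(dplyr)

# Sample data from the question, rebuilt so this block is self-contained.
df <- data.frame(
  id = rep(c(11, 12, 13), each = 4),
  drug_type = c("A","A","B","B","B","B","B","B","A","A","B","B"),
  drug_brand = c(1,1,2,2,2,3,3,3,1,1,2,3),
  date = as.Date(c("2020-01-01","2020-02-01","2020-03-01","2020-03-13",
                   "2019-04-05","2019-05-02","2019-06-03","2019-08-04",
                   "2021-02-02","2021-02-27","2021-03-22","2021-04-11"))
)

# Number of distinct types and brands per subject.
counts <- df %>%
  group_by(id) %>%
  summarise(
    n_types  = n_distinct(drug_type),
    n_brands = n_distinct(drug_brand),
    .groups = "drop"
  )

counts
# id 11: n_types 2, n_brands 2
# id 12: n_types 1, n_brands 2
# id 13: n_types 2, n_brands 3
```

Every subject in the sample data has more than one brand, so the filter above returns zero rows here. It keeps only subjects who never changed.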

CodePudding user response:

Perhaps you could look for a transition of drug_type from "A" to "B", or also keep subjects whose number of distinct drug_brand values is greater than 1?

library(tidyverse)

df %>%
  group_by(id) %>%
  filter(any(drug_type == "B" & lag(drug_type) == "A") |
           n_distinct(drug_brand) > 1)
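
If you also need to know which pattern each subject shows, one possible variation is to compute a flag per pattern. This is a sketch under assumptions rather than a definitive answer: it sorts by date first because lag() depends on row order, it counts brands only among type "B" rows so that a B to A switch is not picked up as a brand change, and the names `patterns`, `type_a_to_b`, `brand_change_in_b` and `changed` are made up for illustration.

```r
library(dplyr)

# Sample data from the question, rebuilt so this block is self-contained.
df <- data.frame(
  id = rep(c(11, 12, 13), each = 4),
  drug_type = c("A","A","B","B","B","B","B","B","A","A","B","B"),
  drug_brand = c(1,1,2,2,2,3,3,3,1,1,2,3),
  date = as.Date(c("2020-01-01","2020-02-01","2020-03-01","2020-03-13",
                   "2019-04-05","2019-05-02","2019-06-03","2019-08-04",
                   "2021-02-02","2021-02-27","2021-03-22","2021-04-11"))
)

patterns <- df %>%
  arrange(id, date) %>%          # lag() relies on chronological row order
  group_by(id) %>%
  summarise(
    # TRUE when a type "B" row directly follows a type "A" row (A -> B only).
    type_a_to_b = any(drug_type == "B" & dplyr::lag(drug_type) == "A",
                      na.rm = TRUE),
    # TRUE when more than one brand appears among the type "B" rows.
    brand_change_in_b = n_distinct(drug_brand[drug_type == "B"]) > 1,
    .groups = "drop"
  )

# Rows for every subject that shows at least one of the two patterns.
changed <- df %>%
  semi_join(filter(patterns, type_a_to_b | brand_change_in_b), by = "id")

patterns
# id 11: type_a_to_b TRUE,  brand_change_in_b FALSE
# id 12: type_a_to_b FALSE, brand_change_in_b TRUE
# id 13: type_a_to_b TRUE,  brand_change_in_b TRUE
```

The two flags line up with the three patterns in the question: only the type change for patient 11, only the brand change for patient 12, and both for patient 13.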