R: How to delete ID from a list of multiple strings in a longitudinal format

Time:02-15

In an earlier post I asked how to delete an ID entirely if any of its rows contains certain strings (e.g., A or D) in the following longitudinal data frame. These are the R code examples I received there, in order:

  1. dat %>% group_by(id) %>% filter(!any(dx %in% c("A", "D"))) %>% ungroup()
  2. filter(df1, !id %in% id[dx %in% c("A", "D")])
  3. subset(df, !ave(dx %in% c("A", "D"), id, FUN = any))

While these all worked well, I realized that I need to remove more than 600 strings (e.g., A, D, E2, F112, G203, etc.), so I created a CSV file containing the list of these strings, without a column name. 1. Is a file like this the right approach for building the list? 2. How should I modify the R code above to use the strings in that file? I reviewed the other post and Google search results but could not figure out how to apply them to my case. I would appreciate any suggestions!

Data frame:

id   time   dx
1     1     C
1     2     B
2     1     A
2     2     B
3     1     D
4     1     G203
4     2     E1

The results I want:

id    time  dx
 1     1     C
 1     2     B

CodePudding user response:

This is a good strategy:

Put your values in a vector or list (here my_list), then filter the dx column by negating the %in% operator with !:

library(dplyr)

my_list <- c("A", "D")

df %>% 
  filter(!dx %in% my_list)

  id time   dx
1  1    1    C
2  1    2    B
3  2    3    B
4  4    1 G203
5  4    1   E1
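
Note that a plain filter() on dx drops only the matching rows, so an id can survive with its remaining rows. To drop every row of an id that has at least one match, as in the question's desired result, the vector can be combined with the grouped filter quoted in the question. The following is a minimal sketch, assuming the data frame from the question is named df and using the example strings the question mentions; it is one possible way to do it, not the only one:

```r
library(dplyr)

# Example data copied from the question
df <- data.frame(
  id   = c(1, 1, 2, 2, 3, 4, 4),
  time = c(1, 2, 1, 2, 1, 1, 2),
  dx   = c("C", "B", "A", "B", "D", "G203", "E1")
)

# Strings to exclude (the examples named in the question)
my_list <- c("A", "D", "E2", "F112", "G203")

# Keep an id only if none of its rows has a dx value in my_list
result <- df %>%
  group_by(id) %>%
  filter(!any(dx %in% my_list)) %>%
  ungroup()

result
```

With this data only the two rows of id 1 remain, because ids 2, 3 and 4 each have at least one dx value in my_list.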

Expanding the list of values to my_list <- c("A", "D", "G203", "E1") gives, with the same code:

library(dplyr)

df %>% 
  filter(!dx %in% my_list)

  id time dx
1  1    1  C
2  1    2  B
3  2    3  B
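
For the 600+ strings stored in a CSV file without a column name, the vector does not have to be typed by hand. Below is a hedged sketch in base R of one way to load it. The file path is a temporary stand-in created only for illustration, and the assumed layout is one string per line in a single column; adjust both to the real file. Passing header = FALSE keeps the first string from being read as a column name, and colClasses = "character" keeps values such as "T" or "F" from being converted to logicals.

```r
# Hypothetical stand-in for the real strings file: one value per line, no header
path <- tempfile(fileext = ".csv")
writeLines(c("A", "D", "E2", "F112", "G203"), path)

# Read the single unnamed column as character and turn it into a plain vector
strings <- read.csv(path, header = FALSE, colClasses = "character")
my_list <- trimws(strings[[1]])

# Example data copied from the question
df <- data.frame(
  id   = c(1, 1, 2, 2, 3, 4, 4),
  time = c(1, 2, 1, 2, 1, 1, 2),
  dx   = c("C", "B", "A", "B", "D", "G203", "E1")
)

# ids that have at least one dx value in the list
bad_ids <- unique(df$id[df$dx %in% my_list])

# Drop every row belonging to those ids
result <- df[!df$id %in% bad_ids, ]
result
```

The resulting my_list can be used in place of c("A", "D") in any of the three snippets quoted in the question.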