My dataframe looks like this:
df
   BEN_ID Val_1 Val_2 Val_3 Val_4 Val_5 AGE GENDER
1     ID1 vA303     .     .     .     .  25      F
2     ID1  9351  A303 53019 49390 F5D12  52      F
3     ID2 541AZ  1120   462  4019 A36B0  58      M
4     ID2 30302  5939  2768  4019  2724  65      M
5     ID2 305A1 78652  9190  4019 33829  61      M
6     ID3 305A3 29590  5715     .     .  53      M
7     ID3 Z57B9 35981  5849   570  4254  35      M
8     ID3  5693 78900 30590 30500 Z25H2  19      M
9     ID3 7AD59  7881 30301 78900 78791  57      M
10    ID4 7AD59  5780 53530 30390  3051  57      F
I want to keep the rows where any of Val_1 to Val_5 starts with the pattern "303" or "305".
So my output should look like this:
   BEN_ID Val_1 Val_2 Val_3 Val_4 Val_5 AGE GENDER
4     ID2 30302  5939  2768  4019  2724  65      M
5     ID2 305A1 78652  9190  4019 33829  61      M
6     ID3 305A3 29590  5715     .     .  53      M
8     ID3  5693 78900 30590 30500 Z25H2  19      M
9     ID3 7AD59  7881 30301 78900 78791  57      M
10    ID4 7AD59  5780 53530 30390  3051  57      F
I tried this code:
library(dplyr)
diag_cols = names(df %>% select(starts_with("Val")))
df = df %>% mutate(across(matches("Val"), as.character))
values = "303|3050"
subdf = df %>% filter(grepl(values, do.call(paste, c(df[, diag_cols], sep = ","))))
With this code, row 1 is also matched, because Val_1 contains "vA303".
I also tried values = "^303|^305", but that gives the wrong output.
TIA!
CodePudding user response:
A dplyr solution:
library(dplyr)
df %>%
  filter(if_any(starts_with("Val"), ~ grepl("^303|^305", .x)))
   BEN_ID Val_1 Val_2 Val_3 Val_4 Val_5 AGE GENDER
4     ID2 30302  5939  2768  4019  2724  65      M
5     ID2 305A1 78652  9190  4019 33829  61      M
6     ID3 305A3 29590  5715     .     .  53      M
8     ID3  5693 78900 30590 30500 Z25H2  19      M
9     ID3 7AD59  7881 30301 78900 78791  57      M
10    ID4 7AD59  5780 53530 30390  3051  57      F
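For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the sample data from the question's printout and compares the two approaches. It assumes the Val_* columns are character; the names `pasted`, `rows_pasted` and `res` are only illustrative. It also shows the likely reason the anchored pattern failed in the question: after `paste()`, `^` can only match at the very start of the combined string, so in effect only Val_1 is tested.

```r
library(dplyr)

# Sample data rebuilt from the question's printout (Val_* assumed to be character)
df <- data.frame(
  BEN_ID = c("ID1", "ID1", "ID2", "ID2", "ID2", "ID3", "ID3", "ID3", "ID3", "ID4"),
  Val_1  = c("vA303", "9351", "541AZ", "30302", "305A1", "305A3", "Z57B9", "5693", "7AD59", "7AD59"),
  Val_2  = c(".", "A303", "1120", "5939", "78652", "29590", "35981", "78900", "7881", "5780"),
  Val_3  = c(".", "53019", "462", "2768", "9190", "5715", "5849", "30590", "30301", "53530"),
  Val_4  = c(".", "49390", "4019", "4019", "4019", ".", "570", "30500", "78900", "30390"),
  Val_5  = c(".", "F5D12", "A36B0", "2724", "33829", ".", "4254", "Z25H2", "78791", "3051"),
  AGE    = c(25, 52, 58, 65, 61, 53, 35, 19, 57, 57),
  GENDER = c("F", "F", "M", "M", "M", "M", "M", "M", "M", "F"),
  stringsAsFactors = FALSE
)

# Pasting the columns first: "^" anchors only at the start of the combined
# string, so only rows whose Val_1 starts with 303/305 are found.
pasted <- do.call(paste, c(df[, grep("^Val", names(df))], sep = ","))
rows_pasted <- which(grepl("^303|^305", pasted))

# Testing each column separately applies the anchor to every single value.
res <- df %>%
  filter(if_any(starts_with("Val"), ~ grepl("^(303|305)", .x)))
```

Here `rows_pasted` contains only rows 4, 5 and 6, while `res` also picks up the rows where the match is in Val_3, Val_4 or Val_5.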
CodePudding user response:
A base R approach:
df[apply(df[, -c(1,7,8)], 1, function(x) any(grepl("^303|^305", x))), ]
   BEN_ID Val_1 Val_2 Val_3 Val_4 Val_5 AGE GENDER
4     ID2 30302  5939  2768  4019  2724  65      M
5     ID2 305A1 78652  9190  4019 33829  61      M
6     ID3 305A3 29590  5715     .     .  53      M
8     ID3  5693 78900 30590 30500 Z25H2  19      M
9     ID3 7AD59  7881 30301 78900 78791  57      M
10    ID4 7AD59  5780 53530 30390  3051  57      F
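A possible variation, in case the column positions change: select the Val columns by name instead of dropping columns 1, 7 and 8 by index, and combine the per-column matches with `Reduce()`. This is only a sketch on a cut-down sample (four rows and three Val columns taken from the question's data); `val_cols`, `hits` and `keep` are illustrative names.

```r
# Cut-down sample in the same shape as the question's data (rows 1, 4, 8 and 10,
# Val_1 to Val_3 only)
df <- data.frame(
  BEN_ID = c("ID1", "ID2", "ID3", "ID4"),
  Val_1  = c("vA303", "30302", "5693", "7AD59"),
  Val_2  = c(".", "5939", "78900", "5780"),
  Val_3  = c(".", "2768", "30590", "53530"),
  AGE    = c(25, 65, 19, 57),
  stringsAsFactors = FALSE
)

# Pick the value columns by name rather than by position
val_cols <- grep("^Val_", names(df), value = TRUE)

# One logical vector per column: does the value start with 303 or 305?
hits <- lapply(df[val_cols], function(x) grepl("^(303|305)", x))

# Element-wise OR across the columns: TRUE if any column matched in that row
keep <- Reduce(`|`, hits)

res <- df[keep, ]
```

Because each column is tested on its own, the `^` anchor applies to every value, and no character matrix conversion via `apply()` is needed.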