I have two dataframes, each with a single column. For every row of the second dataframe, I want to check whether its Description contains any of the strings from the SName column of the first dataframe. If it does, I want to return the matching string in a new column ("String") and a Yes/No flag in another new column ("Match"). I tried grepl and stringr but could not make either work. Thanks!
Sample below:
1st Dataframe
SName |
---|
svc1 |
svc123 |
svc567 |
2nd Dataframe
Description |
---|
- ls svc368 -@#@# |
mkdir test svc #*-/ |
mkdir df2 svc123 #*-/ |
mkdir random svc1 #*-/ |
mkdir test svc1 *&%^$%$ |
mkdir fr svc567 *&%@ |
mkdir 82 svc56 *&??// |
mkdir kol svc *& |
Result desired:
Description | Match | String |
---|---|---|
- ls svc368 -@#@# | No | |
mkdir test svc #*-/ | No | |
mkdir df2 svc123 #*-/ | Yes | svc123 |
mkdir random svc1 #*-/ | Yes | svc1 |
mkdir test svc1 *&%^$%$ | Yes | svc1 |
mkdir fr svc567 *&%@ | Yes | svc567 |
mkdir 82 svc56 *&??// | No | |
mkdir kol svc *& | No | |
Answer:
One approach would be to form a regex alternation of the terms in the first dataframe, then use grepl and sub to generate the output columns.
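# Build one alternation from the lookup names; \\b stops svc1 from matching inside svc123
regex <- paste0("\\b(", paste(df1$SName, collapse="|"), ")\\b")
# Flag rows whose Description contains any of the names as a whole word
hit <- grepl(regex, df2$Description)
df2$Match <- ifelse(hit, "Yes", "No")
# For matching rows keep only the captured name; otherwise leave the cell empty
df2$String <- ifelse(hit,
sub(paste0(".*", regex, ".*"), "\\1", df2$Description),
"")
df2
Description Match String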
1 - ls svc368 -@#@# No
2 mkdir test svc #*-/ No
3 mkdir df2 svc123 #*-/ Yes svc123
...
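For anyone who wants to try this end to end, here is a self-contained sketch that rebuilds the sample data from the question and produces both columns. It is a variation on the approach above, not the same code: it uses regexpr and regmatches to pull out the matched name directly instead of sub. The helper name find_name is hypothetical, and the sketch assumes the names in SName contain no regex metacharacters (they would need escaping otherwise).

```r
# Sample data copied from the question.
df1 <- data.frame(SName = c("svc1", "svc123", "svc567"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(Description = c(
  "- ls svc368 -@#@#",
  "mkdir test svc #*-/",
  "mkdir df2 svc123 #*-/",
  "mkdir random svc1 #*-/",
  "mkdir test svc1 *&%^$%$",
  "mkdir fr svc567 *&%@",
  "mkdir 82 svc56 *&??//",
  "mkdir kol svc *&"
), stringsAsFactors = FALSE)

# Hypothetical helper (not from the answer above): returns the first
# whole-word occurrence of any element of `names` in each element of `text`,
# or "" when nothing matches.
# Assumption: `names` holds plain alphanumeric strings, so no escaping is done.
find_name <- function(text, names) {
  pattern <- paste0("\\b(", paste(names, collapse = "|"), ")\\b")
  pos <- regexpr(pattern, text, perl = TRUE)  # -1 where there is no match
  found <- rep("", length(text))
  found[pos > 0] <- regmatches(text, pos)     # matched substrings only
  found
}

df2$String <- find_name(df2$Description, df1$SName)
df2$Match <- ifelse(df2$String != "", "Yes", "No")
df2[, c("Description", "Match", "String")]
```

One difference to be aware of: if a description contains more than one of the names, regexpr returns the leftmost match, whereas the sub call with a leading greedy .* keeps the last one.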