How to find match from list in data frame column

Time:10-26

I want to check whether the value in one column appears in the list of values stored in another column of the same row. If it does, I would like to put a 1 in a third column to indicate a match, and a 0 otherwise. Here is my data:

library(tidyverse)

df_original <- tribble(
~record_num, ~filedate, ~filedate_list, 
1, 1998, c(1998, 1999, 2000, 2001),
2, 1999, c(1998, 1999, 2000, 2001),
3, 2005, c(1998, 1999, 2000, 2001),
4, 2006, c(1998, 1999, 2000, 2001),
)

I would like the output to look like this:

df_solution <- tribble(
~record_num, ~filedate, ~filedate_list, ~match_found,
1, 1998, c(1998, 1999, 2000, 2001), 1, 
2, 1999, c(1998, 1999, 2000, 2001), 1, 
3, 2005, c(1998, 1999, 2000, 2001), 0,
4, 2006, c(1998, 1999, 2000, 2001), 0
)

Below is what I've already attempted (this results in a "match_found" column of all 0s):

incorrect_solution <- df_original %>%
  mutate(match_found = if_else(filedate %in% filedate_list, 1, 0))

Any ideas?

CodePudding user response:

A base R option using mapply() with %in%, which compares each filedate with the list element in the same row:

> transform(df_original, match_found = as.integer(mapply(`%in%`, filedate, filedate_list)))
  record_num filedate          filedate_list match_found
1          1     1998 1998, 1999, 2000, 2001           1
2          2     1999 1998, 1999, 2000, 2001           1
3          3     2005 1998, 1999, 2000, 2001           0
4          4     2006 1998, 1999, 2000, 2001           0
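
To see why the attempt in the question returns all 0s, it may help to look at what %in% does when its right-hand side is a list. The following is a minimal base R sketch; the names year and year_list are made up for illustration and stand in for one filedate value and the list column.

```r
# Hypothetical stand-ins for one filedate value and the whole list column.
year <- 1998
year_list <- list(c(1998, 1999, 2000, 2001), c(1998, 1999, 2000, 2001))

# Without row-wise handling, the whole list column is handed to %in%.
# match() converts each list element to a single string such as
# "c(1998, 1999, 2000, 2001)", so a bare year never matches.
whole_column <- year %in% year_list

# mapply() pairs the i-th value with the i-th list element instead.
per_row <- mapply(`%in%`, c(1998, 2005), year_list)

print(whole_column)
print(per_row)
```

The first comparison looks at the list as a whole, while the second compares element by element, which is what the question needs.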

CodePudding user response:

library(tidyverse)

df_original <- tribble(
~record_num, ~filedate, ~filedate_list, 
1, 1998, c(1998, 1999, 2000, 2001),
2, 1999, c(1998, 1999, 2000, 2001),
3, 2005, c(1998, 1999, 2000, 2001),
4, 2006, c(1998, 1999, 2000, 2001),
)
df_original %>% rowwise() %>%
  mutate(match_found = as.integer(filedate %in% filedate_list))

#> # A tibble: 4 × 4
#> # Rowwise: 
#>   record_num filedate filedate_list match_found
#>        <dbl>    <dbl> <list>              <int>
#> 1          1     1998 <dbl [4]>               1
#> 2          2     1999 <dbl [4]>               1
#> 3          3     2005 <dbl [4]>               0
#> 4          4     2006 <dbl [4]>               0
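
If you would rather not switch the data frame into rowwise mode, a similar result can probably be had with purrr::map2_int(), which walks the two columns in parallel. This is only a sketch under the assumption that dplyr, purrr and tibble are installed; the name result is made up for illustration.

```r
library(dplyr)
library(purrr)
library(tibble)

df_original <- tribble(
  ~record_num, ~filedate, ~filedate_list,
  1, 1998, c(1998, 1999, 2000, 2001),
  2, 1999, c(1998, 1999, 2000, 2001),
  3, 2005, c(1998, 1999, 2000, 2001),
  4, 2006, c(1998, 1999, 2000, 2001)
)

# map2_int() calls the function once per row with the scalar filedate (x)
# and that row's vector from the list column (y), and requires an integer back.
result <- df_original %>%
  mutate(match_found = map2_int(filedate, filedate_list,
                                function(x, y) as.integer(x %in% y)))

print(result$match_found)
```

Unlike the rowwise() version, the returned tibble carries no rowwise grouping, so later verbs behave as usual.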

CodePudding user response:

df_original %>% mutate(match_found = mapply(is.element, filedate, filedate_list))
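
Note that mapply() with is.element returns TRUE/FALSE rather than 1/0. If an integer column is wanted and the return type should be checked, one possible base R variant uses vapply(). This is a sketch only; it builds a plain data.frame named df (a hypothetical stand-in for df_original) so that it runs without the tidyverse.

```r
# Hypothetical base R copy of the example data with a list column.
df <- data.frame(record_num = 1:4, filedate = c(1998, 1999, 2005, 2006))
df$filedate_list <- rep(list(c(1998, 1999, 2000, 2001)), 4)

# vapply() checks that every call returns exactly one integer.
df$match_found <- vapply(
  seq_len(nrow(df)),
  function(i) as.integer(df$filedate[i] %in% df$filedate_list[[i]]),
  integer(1)
)

print(df$match_found)
```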