Home > Net >  I want to join two data frames one with a year range and the other with a year
I want to join two data frames one with a year range and the other with a year

Time:11-02

I have two data frames that look like this

data1<-data.frame(name=c("Jessica_Smith","John_Smith","John_Smith") ,
max_year=c(2000,1989,2005), 
min_year=c(1990,1989,2001))

data2<-data.frame(name=c("Jessica_Smith","John_Smith","John_Smith") ,
year_of_birth=c(1995,1989,2002),unique_id=c("aby37","nf93","fnd34"))

I would like to merge the data frames such that for each person in data1 I have a unique_id. If possible I would also like to get the nearest integer match for those for whom there is no overlap. To clarify I can't just left_join on name because some entries have the same name but different dates of birth.

I'm relatively new to R so I don't even really know where to begin.

Much appreciated!

CodePudding user response:

You may try,

data1 %>% 
  left_join(data2, by = "name") %>%
  rowwise %>%
  mutate(year_key = min_year <= year_of_birth && year_of_birth <= max_year) %>%
  filter(year_key == TRUE) %>% 
  select(-year_key)

  name          max_year min_year year_of_birth unique_id
  <chr>            <dbl>    <dbl>         <dbl> <chr>    
1 Jessica_Smith     2000     1990          1995 aby37    
2 John_Smith        1989     1989          1989 nf93     
3 John_Smith        2005     2001          2002 fnd34  

CodePudding user response:

With data.table you can to non-equal joins:

library(data.table)
data1<-data.table(name=c("Jessica_Smith","John_Smith","John_Smith") ,
                  max_year=c(2000,1989,2005), 
                  min_year=c(1990,1989,2001))

data2<-data.table(name=c("Jessica_Smith","John_Smith","John_Smith") ,
                  year_of_birth=c(1995,1989,2002),unique_id=c("aby37","nf93","fnd34"))

data2[data1, .(name, year_of_birth, unique_id), 
      on = .(name, year_of_birth <= max_year, year_of_birth >= min_year)]

            name year_of_birth unique_id
1: Jessica_Smith          2000     aby37
2:    John_Smith          1989      nf93
3:    John_Smith          2005     fnd34
  • Related