Home > Enterprise >  How best to do this join in R?
How best to do this join in R?

Time:09-16

Below is the sample data. I know that I have to do a left join. The question is how to have it only return values that match (indcodelist = indcodelist2) but with the highest codetype value.

 indcodelist <- c(110000,111000,112000,113000,114000,115000,121000,210000,211000,315000)
 estemp <- c(11,21,31,41,51,61,55,21,22,874)
 projemp <- c(15,25,36,45,52,61,31,29,31,899)
 nchg <- c(4,4,5,4,1,0,-24,8,9,25)


 firsttable <- data.frame(indcodelist,estemp,projemp,nchg)


  indcodelist2 <- c(110000,111000,112000,113000,114000,115000,121000,210000,211000,315000,110000,111000,112000,113000)
  codetype <- c(18,18,18,18,18,18,18,18,18,18,10,10,10,10)
  codetitle <- c("Accountant","Doctor","Lawyer","Teacher","Economist","Financial Analyst","Meteorologist","Dentist", "Editor","Veterinarian","Accounting Technician","Doctor","Lawyer","Teacher")
  secondtable <- data.frame(indcodelist2,codetype,codetitle)


  firsttable <- left_join(firsttable,secondtable, by =c(indcodelist = indcodelist2)

  Desired Result

  indcodelist       estemp     projemp      nchg        codetitle   

  110000              11          15           4           Accountant
  111000              21          25           4           Doctor

CodePudding user response:

If you only want values that match in both tables, inner_join might be what you’re looking for. You can see this answer to understand different types of joins.

To get the highest codetype, you can use dplyr::slice_max(). Be aware the default behavior is to return values that tie. If there is more than one codetitle at the same codetype, they’ll all be returned.

library(tidyverse)
firsttable %>%
  inner_join(., secondtable, by = c("indcodelist" = "indcodelist2")) %>%
  group_by(indcodelist) %>%
  slice_max(codetype)
#> # A tibble: 10 × 6
#> # Groups:   indcodelist [10]
#>    indcodelist estemp projemp  nchg codetype codetitle        
#>          <dbl>  <dbl>   <dbl> <dbl>    <dbl> <chr>            
#>  1      110000     11      15     4       18 Accountant       
#>  2      111000     21      25     4       18 Doctor           
#>  3      112000     31      36     5       18 Lawyer           
#>  4      113000     41      45     4       18 Teacher          
#>  5      114000     51      52     1       18 Economist        
#>  6      115000     61      61     0       18 Financial Analyst
#>  7      121000     55      31   -24       18 Meteorologist    
#>  8      210000     21      29     8       18 Dentist          
#>  9      211000     22      31     9       18 Editor           
#> 10      315000    874     899    25       18 Veterinarian

Created on 2022-09-15 by the reprex package (v2.0.1)

  • Related