Home > Net >  How can I cross check two data frames in R using dplyr with multiple conditions?
How can I cross check two data frames in R using dplyr with multiple conditions?

Time:06-07

I have a dataset that looks like this :

id1 sp
A SWO
B BFT
C HKE
D MUR
E OCC
F ALB
M ANN
N BFT

and a second one that contains the licenses of each id (but it could contain different ids)

id sp_lic
A SWO
B BFT
Z BFT
F SWO
S ALB
T SWO

The first data frame is the declared dataframe and the second one is the data frame that contains all the licenses paid.Now there are separate licenses for SWO for BFT and for ALB.

I want to check if the ids in the first dataframe are licensed to have what they declare. For example the id A is licensed to have SWO same as B but F is not licensed to have ALB because it's licensed to have SWO. The problem is that the N id has declared that has BFT but it is NA in the left_join() function in R and reports me "not needed" but I want to report "danger" because he/she has no license for that.

So there are three conditions here and ideally the final data frame must look like this:

id1 sp Check
A SWO ok
B BFT ok
C HKE not_needed
D MUR not_needed
E OCC not_needed
F ALB danger
M ANN not_needed
N BFT danger

How can I make this cross check and print these messages according to the logical condition in R using dplyr?

license = c("BFT","ALB","SWO")
library(tidyverse)
id =c("A","B","C","D","E","F","M","N")
sp = c("SWO","BFT","HKE","MUR","OCC","ALB","ANN","BFT")
data1 = tibble(id,sp);data1
id = c("A","B","Z","F","S","T")
sp_l = c("SWO","BFT","BFT","SWO","ALB","SWO")
data2 = tibble(id,sp_l);data2



left_join(data1, data2, by = "id")%>% 
 mutate(sp_l = case_when(is.na(sp_l) ~ "not_needed", 
                         sp == sp_l ~ "ok_checked", 
                         sp != sp_l ~ "danger", 
                         TRUE ~ NA_character_)) %>% 
 rename("check" = "sp_l")
# A tibble: 8 × 3
 id    sp    check     
 <chr> <chr> <chr>     
1 A     SWO   ok_checked
2 B     BFT   ok_checked
3 C     HKE   not_needed
4 D     MUR   not_needed
5 E     OCC   not_needed
6 F     ALB   danger    
7 M     ANN   not_needed
8 N     BFT   not_needed

CodePudding user response:

We could use case_when with the provided logic:

library(dplyr)

data1 %>% 
  left_join(data2) %>% 
  mutate(sp_l = case_when(sp == sp_l ~ "ok",
                          sp %in% license | !is.na(sp_l) ~ "danger",
                          is.na(sp_l) ~ "not_needed", 
                          TRUE ~ NA_character_)
         )    
  id    sp    sp_l      
  <chr> <chr> <chr>     
1 A     SWO   ok        
2 B     BFT   ok        
3 C     HKE   not_needed
4 D     MUR   not_needed
5 E     OCC   not_needed
6 F     ALB   danger    
7 M     ANN   not_needed
8 N     BFT   danger  
  • Related