I have a dataset that looks like this :
id1 | sp |
---|---|
A | SWO |
B | BFT |
C | HKE |
D | MUR |
E | OCC |
F | ALB |
M | ANN |
N | BFT |
and a second one that contains the licenses of each id (but it could contain different ids)
id | sp_lic |
---|---|
A | SWO |
B | BFT |
Z | BFT |
F | SWO |
S | ALB |
T | SWO |
The first data frame is the declared dataframe and the second one is the data frame that contains all the licenses paid.Now there are separate licenses for SWO for BFT and for ALB.
I want to check if the ids in the first dataframe are licensed to have what they declare. For example the id A is licensed to have SWO same as B but F is not licensed to have ALB because it's licensed to have SWO. The problem is that the N id has declared that has BFT but it is NA in the left_join() function in R and reports me "not needed" but I want to report "danger" because he/she has no license for that.
So there are three conditions here and ideally the final data frame must look like this:
id1 | sp | Check |
---|---|---|
A | SWO | ok |
B | BFT | ok |
C | HKE | not_needed |
D | MUR | not_needed |
E | OCC | not_needed |
F | ALB | danger |
M | ANN | not_needed |
N | BFT | danger |
How can I make this cross check and print these messages according to the logical condition in R using dplyr
?
license = c("BFT","ALB","SWO")
library(tidyverse)
id =c("A","B","C","D","E","F","M","N")
sp = c("SWO","BFT","HKE","MUR","OCC","ALB","ANN","BFT")
data1 = tibble(id,sp);data1
id = c("A","B","Z","F","S","T")
sp_l = c("SWO","BFT","BFT","SWO","ALB","SWO")
data2 = tibble(id,sp_l);data2
left_join(data1, data2, by = "id")%>%
mutate(sp_l = case_when(is.na(sp_l) ~ "not_needed",
sp == sp_l ~ "ok_checked",
sp != sp_l ~ "danger",
TRUE ~ NA_character_)) %>%
rename("check" = "sp_l")
# A tibble: 8 × 3
id sp check
<chr> <chr> <chr>
1 A SWO ok_checked
2 B BFT ok_checked
3 C HKE not_needed
4 D MUR not_needed
5 E OCC not_needed
6 F ALB danger
7 M ANN not_needed
8 N BFT not_needed
CodePudding user response:
We could use case_when
with the provided logic:
library(dplyr)
data1 %>%
left_join(data2) %>%
mutate(sp_l = case_when(sp == sp_l ~ "ok",
sp %in% license | !is.na(sp_l) ~ "danger",
is.na(sp_l) ~ "not_needed",
TRUE ~ NA_character_)
)
id sp sp_l
<chr> <chr> <chr>
1 A SWO ok
2 B BFT ok
3 C HKE not_needed
4 D MUR not_needed
5 E OCC not_needed
6 F ALB danger
7 M ANN not_needed
8 N BFT danger