I have two dataframes like this:
First | Second | Third | Fourth |
---|---|---|---|
12345 | 23423 | 12342 | 43214 |
23453 | 22342 | 23421 | 43214 |
23453 | NA | NA | NA |
Code | Type |
---|---|
12345 | A |
23453 | B |
23423 | Other |
12342 | A |
I would like to add a fifth column to the first data frame based on the codes in each row. Using the second dataframe, if all the codes in a row are Type A, I would like to insert "A" into the fifth column.
If all the codes in a row are Type B, I would like to insert "B" in the fifth column.
If all the codes in a row are Type "Other", I would like to insert "Other".
If there are codes that are both Type A and B, I would like to insert "Both".
If there are codes that are both Type A, B, or A and B and Other, I would like to insert A, B, or Both, respectively.
I would also like to ignore all the NAs in a row.
Thank you so much!
CodePudding user response:
A bit (probably over-)complicated. First join all columns with their respective code, then unite
columns and remove NA
. Then check for unique values with the aid of map
and finally add the rule for "both".
library(tidyverse)
# add row to show whether rule "Both" works
df <- df %>% add_row(First = 12345, Second = 23453)
df %>%
left_join(type_df, by = c("First" = "Code")) %>%
left_join(type_df, by = c("Second" = "Code")) %>%
left_join(type_df, by = c("Third" = "Code")) %>%
left_join(type_df, by = c("Fourth" = "Code")) %>%
tidyr::unite(contains("Type"), col = "type", na.rm = TRUE, sep = ",") %>%
mutate(type = stringr::str_split(type, ",") %>%
map(., ~ unique(.x)) %>%
map_chr(., ~ paste(.x, collapse = ", "))) %>%
mutate(type = case_when(
str_detect(type, "A") & str_detect(type, "B") ~ "Both",
TRUE ~ type
))
Output:
First Second Third Fourth type
1 12345 23423 12342 43214 A, Other
2 23453 22342 23421 43214 B
3 23453 NA NA NA B
4 12345 23453 NA NA Both
Data:
df <- read.table(text = "First Second Third Fourth
12345 23423 12342 43214
23453 22342 23421 43214
23453 NA NA NA", h=T)
type_df <- read.table(text = "Code Type
12345 A
23453 B
23423 Other
12342 A", h = T)