I am trying to add a column to a first data frame based upon a second data frame. Basically, in the data frame 1, I have values, that are existing in data frame 2 but with additional information that I would like to extract into data frame 1.
Down below a MWE:
Class <- c("Fruit", "Fruit", "Vegetable", "Vegetable")
Food_a <- c("Apple", "Banana", "Onion", "Carrot")
dt_a <- data.frame(Class, Food_a)
dt_a
Class Food_a
1 Fruit Apple
2 Fruit Banana
3 Vegetable Onion
4 Vegetable Carrot
Food_b <- c("Apple", "Bread", "Onion", "Banana")
dt_b <- data.frame(Food_b)
Food_b
1 Apple
2 Bread
3 Onion
4 Banana
Desired output:
Food_b Class
1 Apple Fruit
2 Bread NA
3 Onion Vegetable
4 Banana Fruit
I have looked at other questions on SO but I cannot seem to get them to work on my example. The closes one I could find were these two: R dplyr: Find a specific value in a column, then replace the adjacent cell in the subsequent columns to the right with that value and Replace NA in column with value in adjacent column.
As I have two data frames, I am unsure how to apply this.
I thought about using ifelse
or case_when
but I don't know how to link these to the desired Class
Here is what I came up with so far:
dt_c <- dt_b %>% mutate(Class=(ifelse(dt_b$Food_b %in% dt_a$Food_a, dt_a$Class, NA)))
Food_b Class
1 Apple Fruit
2 Bread <NA>
3 Onion Vegetable
4 Banana Vegetable
I would be very grateful for any help and suggestions!
CodePudding user response:
As you are using the tidyverse you just need a left_join
...
dt_b %>% left_join(dt_a %>% rename(Food_b = Food_a))
Joining, by = "Food_b"
Food_b Class
1 Apple Fruit
2 Bread <NA>
3 Onion Vegetable
4 Banana Fruit
The rename
is to ensure there is a column with the same name in each dataframe for joining. You can also use a by
argument in the left_join
to achieve the same thing (dt_b %>% left_join(dt_a, by = c("Food_b" = "Food_a"))
).