Home > Back-end >  Add column in first data frame based upon two columns in second data frame
Add column in first data frame based upon two columns in second data frame

Time:05-08

I am trying to add a column to a first data frame based upon a second data frame. Basically, in the data frame 1, I have values, that are existing in data frame 2 but with additional information that I would like to extract into data frame 1.

Down below a MWE:

Class <- c("Fruit", "Fruit", "Vegetable", "Vegetable")
Food_a <- c("Apple", "Banana", "Onion", "Carrot")
dt_a <- data.frame(Class, Food_a)

dt_a
      Class Food_a
1     Fruit  Apple
2     Fruit Banana
3 Vegetable  Onion
4 Vegetable Carrot
   
Food_b <- c("Apple", "Bread", "Onion", "Banana")
dt_b <- data.frame(Food_b)

  Food_b
1  Apple
2  Bread
3  Onion
4 Banana

Desired output:

  Food_b Class
1  Apple Fruit
2  Bread NA
3  Onion Vegetable
4 Banana Fruit

I have looked at other questions on SO but I cannot seem to get them to work on my example. The closes one I could find were these two: R dplyr: Find a specific value in a column, then replace the adjacent cell in the subsequent columns to the right with that value and Replace NA in column with value in adjacent column.

As I have two data frames, I am unsure how to apply this.

I thought about using ifelse or case_when but I don't know how to link these to the desired Class Here is what I came up with so far:

dt_c <- dt_b %>% mutate(Class=(ifelse(dt_b$Food_b %in% dt_a$Food_a, dt_a$Class, NA)))

  Food_b     Class
1  Apple     Fruit
2  Bread      <NA>
3  Onion Vegetable
4 Banana Vegetable

I would be very grateful for any help and suggestions!

CodePudding user response:

As you are using the tidyverse you just need a left_join...

dt_b %>% left_join(dt_a %>% rename(Food_b = Food_a))

Joining, by = "Food_b"

  Food_b     Class
1  Apple     Fruit
2  Bread      <NA>
3  Onion Vegetable
4 Banana     Fruit

The rename is to ensure there is a column with the same name in each dataframe for joining. You can also use a by argument in the left_join to achieve the same thing (dt_b %>% left_join(dt_a, by = c("Food_b" = "Food_a"))).

  • Related