Home > Back-end >  R: Add new column in dataframe based on values in another dataframe column with repetitive values
R: Add new column in dataframe based on values in another dataframe column with repetitive values

Time:02-23

My Dataframe1 looks like this, with Pattern and Name as my column headings:

Pattern         Name
Floral          Rose
Vector          Jess
Medieval        Monica
Victorian       Marta
Floral          Jane
Vector         Monica
Vector          Elise
Medieval        Jess
Floral          Monica

Then I have dataframe2, with Name and Rank as headings:

Name          Rank
Jess         Twenty
Elise         One
Jane          Two
Rose          Ten
Marta         Three
Monica        Five

I would like to add Dataframe1 to dataframe2, and have one row per name for Rank and Pattern to be something like this (where there is extra information for Pattern, they both merge into a single row and be separated by comma:

Name          Rank         Pattern
Jess         Twenty     Vector, Medieval
Elise         One            Vector
Jane          Two           Floral
Rose          Ten           Floral
Marta         Three        Victorian
Monica        Five       Medieval, Floral, Vector 

I have used:

dataframe2$Pattern <- Dataframe1$Pattern [match(dataframe2$Name, Dataframe1$Name)]

but it only captures one Pattern only. Is there any straightforward way to do it?

Cheers!

CodePudding user response:

Data:

df1 <- tribble(
  ~Pattern,       ~Name,
  "Floral",       "Rose",
  "Vector",       "Jess",
  "Medieval" ,    "Monica",
  "Victorian",    "Marta",
  "Floral",       "Jane",
  "Vector",       "Monica",
  "Vector",       "Elise",
  "Medieval",     "Jess",
  "Floral",       "Monica"
)

df2 <- tribble(
  ~Name,        ~Rank,
  "Jess",       "Twenty",
  "Elise",      "One",
  "Jane",       "Two",
  "Rose",       "Ten",
  "Marta",      "Three",
  "Monica",     "Five"
)

In first data.frame you need to find all Pattern for different Name and paste them together. You can achieve that by using group_by summarise str_c functions. Afterwards you need to join two tables by Name:

library(tidyverse)

df1 %>%
  group_by(Name) %>%
  summarise(Pattern = str_c(Pattern, collapse = ", ")) %>%
  inner_join(df2, by = "Name")

# A tibble: 6 x 3
  Name   Pattern                  Rank  
  <chr>  <chr>                    <chr> 
1 Elise  Vector                   One   
2 Jane   Floral                   Two   
3 Jess   Vector, Medieval         Twenty
4 Marta  Victorian                Three 
5 Monica Medieval, Vector, Floral Five  
6 Rose   Floral                   Ten 

CodePudding user response:

You also use a left join:

left_join(dataframe2,dataframe1,by='Name') %>%
group_by(Name, Rank) %>%
summarise(Pattern = str_c(Pattern, collapse = ", "))
  •  Tags:  
  • r
  • Related