Let's say I have two dataframes like below, (real dataset has many more rows and cols)
df = data.frame("Worker" = c("JBB","JDD","MB","JBB"),
"Age" = c(4,5,6,4))
df2 = data.frame("Initials" = c("JBB","JDD","MB","JOD"),
"Worker" = c("Joe Bloggs/JBB", "Jane Doe/JDD",
"Mr. Big/MB", "John Doe/JOD"))
I would like to replace the Worker
col in df
with the Worker
col from df2
In the future more workers will be added to both dataframes so it would be nice if there was a quick and easy way to do this rather than manually doing something like this for each set of initials
df$Worker<-paste(gsub("JBB", "Joe Bloggs/JBB", df$Worker, perl=TRUE))
Perhaps a loop or simply some kind of tidyverse::replace
solution
I have tried various joins
but they don't work for me.
Have also tried
df %>%
mutate(new_Worker = case_when(df$Worker == df2$Initials ~ df2$Worker)
This gives errors too.
CodePudding user response:
A possible solution:
library(dplyr)
inner_join(df, df2, by = c("Worker" = "Initials"))
#> Worker Age Worker.y
#> 1 JBB 4 Joe Bloggs/JBB
#> 2 JDD 5 Jane Doe/JDD
#> 3 MB 6 Mr. Big/MB
#> 4 JBB 4 Joe Bloggs/JBB
CodePudding user response:
A simple solution is to rename the Worker
column in df
as you do the join
left_join(rename(df, Initials = Worker),
df2)
this results in the data.frame
with columns 'Initials', 'Worker' and 'Age'. It also assumes that df
is the data and that df2
is the lookup list.
I don't think you would use case_when
for this example. Presumably there is a large number of Initials.
The other option is to filter
and pull
the values from df2.
df |>
mutate(Worker = map_chr(Worker,
~ filter(df2, Initials == .x) |>
pull(Worker)
)
)
The map_chr
above is needed otherwise the nested filter does not work