Replace string by match to another data frame

Time:09-29

I want to replace the strings in column ID of df2 with the values from column genus of df1, matching each ID against column species in df1. Any tips appreciated, especially with dplyr. Maybe left_join?

> df1
             genus                 species
1  Orthobunyavirus           Variola virus
2 Alphatorquevirus     Torque teno virus 6
3     Yatapoxvirus Yaba-like disease virus

> df2
                       ID
1           Variola virus
2     Torque teno virus 6
3 Yaba-like disease virus

Desired output:
                         ID
1           Orthobunyavirus
2          Alphatorquevirus
3              Yatapoxvirus

> dput(df1)
structure(list(genus = c("Orthobunyavirus", "Alphatorquevirus", 
"Yatapoxvirus"), species = c("Variola virus", "Torque teno virus 6", 
"Yaba-like disease virus")), class = "data.frame", row.names = c(NA, 
-3L))
> dput(df2)
structure(list(ID = c("Variola virus", "Torque teno virus 6", 
"Yaba-like disease virus")), class = "data.frame", row.names = c(NA, 
-3L))

CodePudding user response:

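You could simply use match(), which returns the position of each df2$ID value in df1$species; indexing df1$genus with those positions gives the corresponding genus: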

df2$ID <- df1$genus[match(df2$ID, df1$species)]

df2
#>                 ID
#> 1  Orthobunyavirus
#> 2 Alphatorquevirus
#> 3     Yatapoxvirus
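
One thing to watch with match(): any ID with no counterpart in df1$species comes back as NA. A minimal sketch of keeping the original string in that case, assuming that fallback is what you want; the "Unknown virus" row is made up for illustration and is not part of the question's data:

```r
# Lookup table from the question
df1 <- data.frame(
  genus   = c("Orthobunyavirus", "Alphatorquevirus", "Yatapoxvirus"),
  species = c("Variola virus", "Torque teno virus 6", "Yaba-like disease virus"),
  stringsAsFactors = FALSE
)

# Hypothetical df2 with one ID that has no match in df1$species
df2 <- data.frame(
  ID = c("Variola virus", "Unknown virus"),
  stringsAsFactors = FALSE
)

# Position of each ID in df1$species (NA where there is no match)
idx <- match(df2$ID, df1$species)

# Use the genus where a match exists, otherwise keep the original ID
df2$ID <- ifelse(is.na(idx), df2$ID, df1$genus[idx])

df2
```

With the plain one-liner, the second row would have become NA instead of staying "Unknown virus".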

CodePudding user response:

df2$ID <- df1$genus[match(df2$ID,df1$species)]

This overwrites df2$ID in place, so the original species names in df2 are lost.

df3 <- data.frame(ID = df1$genus[match(df2$ID,df1$species)])

This creates a third data frame, df3, with the results and leaves df2 unchanged.
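
Since the question asks about dplyr, here is a rough sketch of the left_join route, assuming the dplyr package is installed. The name df3 is only an illustrative choice, and transmute() is one of several ways to end up with a single ID column:

```r
library(dplyr)

# Data from the question
df1 <- data.frame(
  genus   = c("Orthobunyavirus", "Alphatorquevirus", "Yatapoxvirus"),
  species = c("Variola virus", "Torque teno virus 6", "Yaba-like disease virus"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  ID = c("Variola virus", "Torque teno virus 6", "Yaba-like disease virus"),
  stringsAsFactors = FALSE
)

# Join df2$ID against df1$species, then keep only the genus, renamed to ID
df3 <- df2 %>%
  left_join(df1, by = c("ID" = "species")) %>%
  transmute(ID = genus)

df3
```

As with match(), an ID that has no matching species ends up as NA. Unlike match(), a species listed more than once in df1 would produce duplicated rows in the join, so this assumes species values are unique.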
