I want to return the value of city_df[3,2] in a new column, in the same row as the matching value, because it matches city_df[1,1]. How can I do this without hard-coding the actual names? It needs to be generalizable because I have to use it on a huge dataset. In other words, I need help creating the column "name3".
name <- c("City1", "City2", "City3", "City4", "City5")
name2 <- c("City6", "City7", "City1", "City9", "City10")
name3 <- c("City1", "NA", "NA", "NA", "NA") ## I need help to create this column...
city_df <- data.frame(name, name2, name3 )
city_df
Desired output:
   name  name2 name3
1 City1  City6 City1
2 City2  City7    NA
3 City3  City1    NA
4 City4  City9    NA
5 City5 City10    NA
I have tried loops, ifelse, etc., but with no luck. I only manage to check whether the values of the vector name are in the column (using %in%).
CodePudding user response:
You could use dplyr's left_join:
library(dplyr)
city_df %>%
left_join(city_df, by = c("name" = "name2"), suffix = c("", ".y"), keep = TRUE) %>%
select(name, name2, name3 = name2.y)
This returns
   name  name2 name3
1 City1  City6 City1
2 City2  City7  <NA>
3 City3  City1  <NA>
4 City4  City9  <NA>
5 City5 City10  <NA>
Another possibility could be
city_df$name3 <- ifelse(city_df$name %in% city_df$name2, city_df$name, NA_character_)
creating the same output.
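If you want the value taken from name2 itself rather than from name, a base R sketch using match() could look like the following. It assumes both columns are character vectors; idx is a hypothetical helper name used only for illustration.

```r
# Rebuild the example data from the question, without name3
city_df <- data.frame(
  name  = c("City1", "City2", "City3", "City4", "City5"),
  name2 = c("City6", "City7", "City1", "City9", "City10"),
  stringsAsFactors = FALSE
)

# idx (hypothetical helper): position in name2 of each value of name,
# NA where name has no counterpart in name2
idx <- match(city_df$name, city_df$name2)

# Indexing with NA yields NA, so unmatched rows stay NA
city_df$name3 <- city_df$name2[idx]
city_df
```

Since match() only returns the first matching position, duplicated values in name2 do not add rows to the result, which may matter on a large dataset.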
CodePudding user response:
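You can use intersect, conveniently with do.call, since city_df[1:2] is already a list. To prevent recycling, we can fill with NA using `length<-`. Note that this places the shared values at the top of the column, which coincides with the matching row here only because City1 is in row 1 of name.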
city_df$name3 <- `length<-`(do.call(intersect, unname(city_df[1:2])), nrow(city_df))
city_df
#    name  name2 name3
# 1 City1  City6 City1
# 2 City2  City7  <NA>
# 3 City3  City1  <NA>
# 4 City4  City9  <NA>
# 5 City5 City10  <NA>
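To compare how the intersect approach and a row-wise test line up with the rows, here is a small sketch on hypothetical data where the shared value is not in the first row. The data frame df and the names by_intersect and by_row are made up for illustration, and intersect is called directly on the two columns instead of through do.call.

```r
# Hypothetical data: the shared value "City1" sits in row 2 of name
df <- data.frame(
  name  = c("City2", "City1", "City3"),
  name2 = c("City6", "City7", "City1"),
  stringsAsFactors = FALSE
)

# intersect() collects the shared values, `length<-` pads the end with NA
by_intersect <- `length<-`(intersect(df$name, df$name2), nrow(df))

# ifelse() with %in% tests each row separately, so the value stays in its row
by_row <- ifelse(df$name %in% df$name2, df$name, NA_character_)

by_intersect  # "City1" NA      NA
by_row        # NA      "City1" NA
```

If the new column has to sit in the same row as the matching value, the row-wise variant is probably the safer choice.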