I'm having trouble finding a solution for this. I need to compare two columns of a data frame, but not within the same row. Say the data frame is this:
df <- data.frame(
  first_column  = c("value_1", "value_5", "value_9", "value_13"),
  second_column = c("value_2", "value_6", "value_10", "value_14"),
  third_column  = c("value_3", "value_7", "value_1", "value_15"),
  fourth_column = c("value_4", "value_8", "value_12", "value_16")
)
I want to compare column 1 to column 3 and see whether there are any matches, but not in the same row, and then mark or return the rows that have matching values.
Desired output:
  first_column second_column third_column fourth_column match
1      value_1       value_2      value_3       value_4  TRUE
2      value_5       value_6      value_7       value_8 FALSE
3      value_9      value_10      value_1      value_12  TRUE
4     value_13      value_14     value_15      value_16 FALSE
or
  first_column second_column third_column fourth_column
1      value_1       value_2      value_3       value_4
3      value_9      value_10      value_1      value_12
CodePudding user response:
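# Note: %in% checks the whole column, so a row whose own first_column equals its own third_column is flagged too
df$match <- df$first_column %in% df$third_column | df$third_column %in% df$first_column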
df
#>   first_column second_column third_column fourth_column match
#> 1      value_1       value_2      value_3       value_4  TRUE
#> 2      value_5       value_6      value_7       value_8 FALSE
#> 3      value_9      value_10      value_1      value_12  TRUE
#> 4     value_13      value_14     value_15      value_16 FALSE
Created on 2022-03-22 by the reprex package (v2.0.1)
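If you'd rather have the second form of output (only the matching rows), a minimal sketch is to subset on the match column once it has been computed as above. The name matched is only illustrative, and the match column is dropped so the result has the original four columns.

```r
df <- data.frame(
  first_column  = c("value_1", "value_5", "value_9", "value_13"),
  second_column = c("value_2", "value_6", "value_10", "value_14"),
  third_column  = c("value_3", "value_7", "value_1", "value_15"),
  fourth_column = c("value_4", "value_8", "value_12", "value_16")
)

# Flag rows whose first/third value appears in the other column
df$match <- df$first_column %in% df$third_column |
  df$third_column %in% df$first_column

# Keep only flagged rows and drop the helper column (illustrative name: matched)
matched <- df[df$match, names(df) != "match"]
matched
```

The original row names are kept, so the result still shows which rows matched.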
CodePudding user response:
Here's a for loop to do it:
match = logical(nrow(df))
for (i in seq_len(nrow(df))) {
  # index -i drops row i, so a value is never compared with its own row
  match[i] = df$first_column[i] %in% df$third_column[-i] |
    df$third_column[i] %in% df$first_column[-i]
}
df$match = match
# df
#   first_column second_column third_column fourth_column match
# 1      value_1       value_2      value_3       value_4  TRUE
# 2      value_5       value_6      value_7       value_8 FALSE
# 3      value_9      value_10      value_1      value_12  TRUE
# 4     value_13      value_14     value_15      value_16 FALSE
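A loop-free sketch of the same idea, assuming both columns are plain character vectors: build the full comparison matrix with outer(), clear the diagonal so a row is never compared with itself, and check whether any TRUE is left in each row. matches_other_row is a hypothetical helper name, not something from base R.

```r
df <- data.frame(
  first_column = c("value_1", "value_5", "value_9", "value_13"),
  third_column = c("value_3", "value_7", "value_1", "value_15")
)

# Hypothetical helper: TRUE where x[i] equals some y[j] with j != i
matches_other_row <- function(x, y) {
  eq <- outer(x, y, "==")        # eq[i, j] is TRUE when x[i] == y[j]
  diag(eq) <- FALSE              # ignore same-row comparisons
  rowSums(eq, na.rm = TRUE) > 0  # any match left in row i?
}

df$match <- matches_other_row(df$first_column, df$third_column) |
  matches_other_row(df$third_column, df$first_column)
df
```

This builds an n-by-n matrix, so for very large data frames the loop may be the more memory-friendly choice.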
CodePudding user response:
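Another approach, using intersect. Comparing against the intersection with == only works when exactly one value is shared (a longer vector gets recycled position by position), so it is safer to test membership with %in%:
common <- intersect(df$first_column, df$third_column)
df$match <- df$first_column %in% common | df$third_column %in% common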