row 1 col 1 compare to row 3 col 3

Time:03-22

I am having trouble finding a solution for this. I need to compare two columns of a data frame, but not within the same row. Say the data frame is this:

df <- data.frame (first_column = c("value_1", "value_5", "value_9", "value_13"),
              second_column = c("value_2", "value_6", "value_10", "value_14"),
              third_column = c("value_3", "value_7", "value_1", "value_15"),
              fourth_column = c("value_4", "value_8", "value_12", "value_16")
              )

I want to compare col 1 to col 3 and see whether any values match, but not in the same row, and then mark or return the rows that have matching values.

Desired output:

  first_column second_column third_column fourth_column match
1      value_1       value_2      value_3       value_4 TRUE
2      value_5       value_6      value_7       value_8 FALSE
3      value_9      value_10      value_1      value_12 TRUE
4     value_13      value_14     value_15      value_16 FALSE

or

  first_column second_column third_column fourth_column
1      value_1       value_2      value_3       value_4
3      value_9      value_10      value_1      value_12

CodePudding user response:

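# Flag rows whose col 1 value occurs anywhere in col 3, or whose col 3 value occurs anywhere in col 1.
# Note: %in% searches the whole column, so a value equal to its own row's counterpart is flagged too.
df$match <- df$first_column %in% df$third_column | df$third_column %in% df$first_column
df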

#>   first_column second_column third_column fourth_column match
#> 1      value_1       value_2      value_3       value_4  TRUE
#> 2      value_5       value_6      value_7       value_8 FALSE
#> 3      value_9      value_10      value_1      value_12  TRUE
#> 4     value_13      value_14     value_15      value_16 FALSE

Created on 2022-03-22 by the reprex package (v2.0.1)

CodePudding user response:

Here's a for loop to do it:

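match = logical(nrow(df))
# seq_len() is safe even when df has zero rows
for(i in seq_len(nrow(df))) {
  # [-i] drops row i, so a value is never compared with its own row
  match[i] = df$first_column[i] %in% df$third_column[-i] ||
    df$third_column[i] %in% df$first_column[-i]
}
df$match = match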
# df
#   first_column second_column third_column fourth_column match
# 1      value_1       value_2      value_3       value_4  TRUE
# 2      value_5       value_6      value_7       value_8 FALSE
# 3      value_9      value_10      value_1      value_12  TRUE
# 4     value_13      value_14     value_15      value_16 FALSE
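
The same row-excluding check can also be sketched without an explicit loop. This is only one possible formulation and not part of the answer above: it builds a comparison matrix with outer(), blanks the diagonal so that a row is never compared with itself, and assumes the two columns hold comparable values with no NA. The name `m` is illustrative, and only the two compared columns are recreated here to keep the sketch short.

```r
df <- data.frame(first_column = c("value_1", "value_5", "value_9", "value_13"),
                 third_column = c("value_3", "value_7", "value_1", "value_15"))

# m[i, j] is TRUE when first_column[i] equals third_column[j]
m <- outer(df$first_column, df$third_column, "==")

# Ignore same-row comparisons (i == j)
diag(m) <- FALSE

# Row i is a match if its first_column value appears in another row's
# third_column (row sums), or its third_column value appears in another
# row's first_column (column sums)
df$match <- rowSums(m) > 0 | colSums(m) > 0
df$match
```

On the question's data this gives TRUE, FALSE, TRUE, FALSE, the same as the loop. The matrix has nrow(df) squared entries, so the loop may be the better choice for very long data frames.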

CodePudding user response:

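Another approach, using intersect. The shared values are tested with %in% rather than ==, since comparing with == is only reliable when the two columns share exactly one value:

common <- intersect(df$first_column, df$third_column)
df$match <- df$first_column %in% common | df$third_column %in% common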
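
For the second form of the desired output (only the matching rows, without the flag), a minimal sketch is to subset on the logical column. It reuses the %in% flag from the first answer; any of the other `match` vectors would work the same way. The name `matched` is illustrative.

```r
df <- data.frame(first_column  = c("value_1", "value_5", "value_9", "value_13"),
                 second_column = c("value_2", "value_6", "value_10", "value_14"),
                 third_column  = c("value_3", "value_7", "value_1", "value_15"),
                 fourth_column = c("value_4", "value_8", "value_12", "value_16"))

# Flag rows that share a value between col 1 and col 3
df$match <- df$first_column %in% df$third_column |
  df$third_column %in% df$first_column

# Keep only the flagged rows and drop the helper column
matched <- df[df$match, names(df) != "match"]
matched
```

The original row names (1 and 3) are kept, which matches the layout shown in the question.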