I'm working with a dataset with two columns that look something like this:
Row1 | Row2 |
---|---|
1, 2 | 2, 5 |
2, 6, 4 | 2, 6 |
3, 1 | 1, 3 |
2, 1, 4 | 1, 4, 2 |
3 | 3, 2 |
I want to run a script that identifies whether Row2 matches Row1. They need to have the exact same values, but the values don't need to be in the same order. So given the above, I'd want a result that tells me the following:
Row1 | Row2 | Match |
---|---|---|
1, 2 | 2, 5 | FALSE |
2, 6, 4 | 2, 6 | FALSE |
3, 1 | 1, 3 | TRUE |
2, 1, 4 | 1, 4, 2 | TRUE |
3 | 3, 2 | FALSE |
I've tried using match() and compare() and haven't had success with either. match() produces TRUE as long as all the elements of Row1 are found in Row2, but this isn't what I'm looking for. I need TRUE only when Row2 has exactly the same numbers as Row1 and only those numbers, irrespective of order. compare(), on the other hand, produces an error when I try to create a new column to identify matches. This is what I enter:
df$match <- compareIgnoreOrder(df$row1, df$row2)
I've also tried this way:
df$match <- compare(df$row1, df$row2, ignoreAll = TRUE)
Both methods yield the following error: "Input must be a vector, not a object." And at this point I'm stuck. I've searched high and low but can't find any solutions. Help would be much appreciated.
CodePudding user response:
Something like:
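library(dplyr)
data %>%
  rowwise() %>%
  # the two sets are equal exactly when their intersection is as large as their union
  mutate(Match = length(intersect(Row1, Row2)) == length(union(Row1, Row2)))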
Output:
Row1 Row2 Match
<list> <list> <lgl>
1 <dbl [2]> <dbl [2]> FALSE
2 <dbl [3]> <dbl [2]> FALSE
3 <dbl [2]> <dbl [2]> TRUE
4 <dbl [3]> <dbl [3]> TRUE
5 <dbl [1]> <dbl [2]> FALSE
Input:
data <- tibble(
Row1 = list(c(1,2), c(2,6,4), c(3,1), c(2,1,4), c(3)),
Row2 = list(c(2,5), c(2,6), c(1,3), c(1,4,2), c(3,2))
)
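One caveat with the intersect/union check: it treats each vector as a set, so repeated values are ignored. If duplicates should count (for example, 1, 2, 2 should not match 1, 2), comparing the sorted vectors is one option. This is only a sketch: `same_values` is a hypothetical helper name, and it assumes the vectors contain no NA values.

```r
# Hypothetical helper (not from the question): TRUE when x and y contain
# the same values the same number of times, in any order.
# Assumes no NA values, since sort() drops them by default.
same_values <- function(x, y) {
  length(x) == length(y) && all(sort(x) == sort(y))
}

same_values(c(2, 1, 4), c(1, 4, 2))  # same values, different order
same_values(c(1, 2, 2), c(1, 2))     # differs once duplicates count
setequal(c(1, 2, 2), c(1, 2))        # set comparison ignores the repeat
```

If duplicates never occur in the data, the set-based check above is enough.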
CodePudding user response:
You're comparing sets, so a set operation like ?setequal makes sense to me:
dat <- data.frame(
Row1 = I(list(c(1,2), c(2,6,4), c(3,1), c(2,1,4), c(3))),
Row2 = I(list(c(2,5), c(2,6), c(1,3), c(1,4,2), c(3,2)))
)
mapply(setequal, dat$Row1, dat$Row2)
##[1] FALSE FALSE TRUE TRUE FALSE
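If the columns actually hold comma-separated strings such as "1, 2" rather than list columns (the question doesn't say which), they would need to be split first. A minimal sketch under that assumption; the data frame `df` and the helper `to_set` are made up for illustration:

```r
# Assumed layout: each cell is a comma-separated string
df <- data.frame(
  Row1 = c("1, 2", "2, 6, 4", "3, 1", "2, 1, 4", "3"),
  Row2 = c("2, 5", "2, 6", "1, 3", "1, 4, 2", "3, 2"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: split each string on commas, trim spaces,
# and convert the pieces to numbers (returns a list of numeric vectors)
to_set <- function(x) {
  lapply(strsplit(x, ","), function(v) as.numeric(trimws(v)))
}

# Compare the parsed values row by row, ignoring order
df$Match <- mapply(setequal, to_set(df$Row1), to_set(df$Row2))
df$Match
```

The original string columns are left untouched; only the new Match column is added.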