I have a data frame. Some rows differ only in the order of the pair, e.g. "A"-"B" and "B"-"A", and these two rows have the same Value:
library(tibble)

df <- tibble(
  V1 = c("A", "C", "B", "D"),
  V2 = c("B", "D", "A", "C"),
  Value = c(1, 2, 1, 2)
)
V1 V2 Value
<chr> <chr> <dbl>
1 A B 1
2 C D 2
3 B A 1
4 D C 2
I want to remove one row from each duplicated pair (for example, either row 1 or row 3), so that the result looks like this:
V1 V2 Value
1 A B 1
2 C D 2
How can I remove those repetitive rows?
CodePudding user response:
df[!duplicated(t(apply(df,1,sort))),]
V1 V2 Value
1 A B 1
2 C D 2
or even:
df[!duplicated(cbind(pmax(df$V1, df$V2), pmin(df$V1, df$V2))),]
V1 V2 Value
1 A B 1
2 C D 2
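To spell out why this works: both one-liners build an order-independent key for each row and then keep only the first row for each key. Here is a minimal base R sketch of the same idea. It assumes the pair columns are named V1 and V2 as in the question, and that rows should only count as duplicates when Value matches too. The names `key`, `lo`, `hi` and `result` are just for illustration.

```r
# Example data from the question, as a plain data.frame (no packages needed)
df <- data.frame(
  V1 = c("A", "C", "B", "D"),
  V2 = c("B", "D", "A", "C"),
  Value = c(1, 2, 1, 2),
  stringsAsFactors = FALSE  # keep V1/V2 as character on R < 4.0 as well
)

# Order-independent key: smaller label first, larger label second, plus Value
key <- data.frame(
  lo = pmin(df$V1, df$V2),
  hi = pmax(df$V1, df$V2),
  Value = df$Value,
  stringsAsFactors = FALSE
)

# duplicated() flags every repeat after the first occurrence of a key,
# so negating it keeps the first row of each unordered pair
result <- df[!duplicated(key), ]
print(result)
```

Restricting the key to V1 and V2 (plus Value) avoids sorting unrelated columns together, which could matter if the data frame has more columns than in the example.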
CodePudding user response:
An option with the tidyverse:
library(dplyr)
library(stringr)
library(purrr)
df %>%
  filter(!duplicated(pmap_chr(across(V1:V2),
                              ~ str_c(sort(c(...)), collapse = ""))))
# A tibble: 2 × 3
V1 V2 Value
<chr> <chr> <dbl>
1 A B 1
2 C D 2