How do I find two rows with the same values but in reverse order, and remove one of them, in R?

Time:10-21

Hi everyone. I have a dataset of 1000 rows (nodes and links) with two columns, V1 and V2, stored in a txt file, which I imported with read.table. Some rows in the dataset are reversed copies of other rows, e.g.:

net <- read.table("DD242.txt", quote="\"", comment.char="")

    V1 V2
    4  5
    5  4
    6  7
    7  8
    and so on...

but I do not know which values repeat. How do I find these repeated rows and delete one of each pair? In this case I want to remove the second, inverted row (5 4), so that I only have:

V1 V2
4  5
6  7
7  8

Thanks a lot!

CodePudding user response:

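You can filter by lag, where df is your data frame (net in the question). Note that this only removes a reversed row when it directly follows its original: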

library(dplyr)
df %>% 
  filter(!(V1 == lag(V2, default = 0) & V2 == lag(V1, default = 0)))

#  V1 V2
#1  4  5
#2  6  7
#3  7  8

Or in base R:

sorted <- t(apply(df, 1, sort))  # sort the two values within each row
as.data.frame(sorted[!duplicated(sorted), ])  # keep the first of each sorted pair

  V1 V2
1  4  5
2  6  7
3  7  8
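
If the reversed rows are not guaranteed to sit next to each other, a possible alternative is to build an order-independent key with pmin() and pmax() and drop rows whose key has already been seen. This is only a minimal sketch: the extra row (8 7) is added here for illustration and is not part of the original data, and the names key, lo, hi and dedup are hypothetical.

```r
# Example data from the question, plus one non-adjacent reversed pair (8 7)
# added purely for illustration.
net <- data.frame(
  V1 = c(4, 5, 6, 7, 8),
  V2 = c(5, 4, 7, 8, 7)
)

# Order-independent key: smaller value first, larger value second.
key <- data.frame(lo = pmin(net$V1, net$V2), hi = pmax(net$V1, net$V2))

# Keep the first occurrence of each unordered pair, in its original orientation.
dedup <- net[!duplicated(key), ]
dedup
```

Unlike sorting within each row, this leaves the kept rows exactly as they appeared in the input, which may matter if the direction of a link carries meaning.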

CodePudding user response:

data.table solution

library(data.table)

setDT(net) # or use fread instead of read.table to get a data.table right away

net <- net[, sorted := apply(.SD, 1, function(x) list(unname(sort(x))))][!duplicated(sorted)]
net[, sorted := NULL]

results

net

   V1 V2
1:  4  5
2:  6  7
3:  7  8
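
To see which rows are the reversed repeats before deleting anything, one option is to flag them in a helper column first. This is a rough sketch assuming the data.table package is installed; the column name is_rev_dup and the object kept are hypothetical names chosen for illustration.

```r
library(data.table)

# Same example data as in the question.
net <- data.table(
  V1 = c(4, 5, 6, 7),
  V2 = c(5, 4, 7, 8)
)

# Flag a row when its unordered pair (smaller value, larger value)
# has already appeared in an earlier row.
net[, is_rev_dup := duplicated(data.table(lo = pmin(V1, V2), hi = pmax(V1, V2)))]

# Rows that would be removed.
net[is_rev_dup == TRUE]

# Rows that would be kept, without the helper column.
kept <- net[is_rev_dup == FALSE, .(V1, V2)]
kept
```

Because pmin() and pmax() are vectorised, this avoids the row-wise apply() call, which may help on larger tables.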

data

net <- data.frame(
  V1 = c(4, 5, 6, 7),
  V2 = c(5, 4, 7, 8)
)