How can I count the frequency of paired values without considering their column position?

I want to count the frequency of paired values in two columns, but I want to ignore which column each value is in. In the example below, the usual aggregate or table functions would report three pairs, (-0.25, 0.9), (0.9, -0.25) and (-0.77, 2.9), but I want only two pairs: (-0.25, 0.9) and (-0.77, 2.9). How should I modify my approach to count the frequency of paired values without considering the column location or names?

data <- data.frame(col1=c(-.25, 0.9, -.25, -.77, -.25),
                   col2=c(0.9, -.25, 0.9, 2.9, 0.9))

CodePudding user response：

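Try this, which puts the larger and smaller value of each row in a fixed order with pmax/pmin and then drops the duplicated pairs: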

> data[!duplicated(cbind(do.call(pmax, data), do.call(pmin, data))), ]
   col1 col2
1 -0.25  0.9
4 -0.77  2.9
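
The line above keeps one row per unordered pair, but it does not report how often each pair occurs. A minimal sketch of the counting step, assuming the same `data` as in the question; the names `lo`, `hi`, `pairs`, `n`, and `counts` are illustrative and not part of the original answer:

```r
data <- data.frame(col1 = c(-.25, 0.9, -.25, -.77, -.25),
                   col2 = c(0.9, -.25, 0.9, 2.9, 0.9))

# Put the smaller value of each row in `lo` and the larger in `hi`,
# so (0.9, -0.25) and (-0.25, 0.9) become the same pair.
pairs <- data.frame(lo = pmin(data$col1, data$col2),
                    hi = pmax(data$col1, data$col2))

# Give every row a weight of 1 and sum the weights within each (lo, hi) group.
pairs$n <- 1
counts <- aggregate(n ~ lo + hi, data = pairs, FUN = sum)
counts
```

With the example data this should give two rows, one per unordered pair, with `n` holding the number of rows that contain that pair.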

CodePudding user response：

Here is one solution. First, paste the two columns together:

paste(data$col1, data$col2)
[1] "-0.25 0.9" "0.9 -0.25" "-0.25 0.9" "-0.77 2.9" "-0.25 0.9"

Then split them into a list:

library(stringr)  # str_split() comes from the stringr package
str_split(paste(data$col1, data$col2), " ")
[[1]]
[1] "-0.25" "0.9"  

[[2]]
[1] "0.9"   "-0.25"

[[3]]
[1] "-0.25" "0.9"  

[[4]]
[1] "-0.77" "2.9"  

[[5]]
[1] "-0.25" "0.9"

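Create a custom function that sorts each pair and pastes the values back together, then apply it to the list with sapply. The values are converted back to numbers before sorting, because sorting the strings directly depends on the locale's character ordering rather than on numeric order:

count_function = function(x) {
    # x is a character vector such as c("0.9", "-0.25"); sort it numerically
    x = sort(as.numeric(x))
    # rebuild one order-independent label per pair
    paste(x, collapse=", ")
}
sapply(str_split(paste(data$col1, data$col2), " "), count_function)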
[1] "-0.25, 0.9" "-0.25, 0.9" "-0.25, 0.9" "-0.77, 2.9" "-0.25, 0.9"

Then take the unique values of this vector:

unique(sapply(str_split(paste(data$col1, data$col2), " "), count_function))
[1] "-0.25, 0.9" "-0.77, 2.9"
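
`unique()` lists the distinct pairs but not their frequencies, which is what the question asks for. A minimal sketch of the counting step using `table()`; it builds the keys with base R `apply()` instead of the `str_split()` steps above so that it runs without extra packages, and the names `keys` and `freq` are illustrative:

```r
data <- data.frame(col1 = c(-.25, 0.9, -.25, -.77, -.25),
                   col2 = c(0.9, -.25, 0.9, 2.9, 0.9))

# Build one order-independent key per row: sort the two values numerically,
# then paste them into a single label.
keys <- apply(data, 1, function(x) paste(sort(x), collapse = ", "))

# table() counts how many rows share each key.
freq <- table(keys)
freq
```

Each name in `freq` is one unordered pair and each entry is the number of rows containing that pair, regardless of which column held which value.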