Is there any existing R functionality to check whether two columns have a one-to-one relationship, regardless of column type?
Example of expected output:
A B C
0 'a' 'apple'
1 'b' 'banana'
2 'c' 'apple'
A & B are one-to-one? TRUE
A & C are one-to-one? FALSE
B & C are one-to-one? FALSE
CodePudding user response:
If you match() a vector to itself, it returns an integer vector giving, for each element, the index at which its value first occurs. Two columns are one-to-one exactly when these integer vectors are identical, so we can compare them directly:
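is_one_to_one = function(x, y) {
  # index of the first occurrence of each element's value
  xu = match(x, x)
  yu = match(y, y)
  # one-to-one iff both columns group the rows identically
  identical(xu, yu)
}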
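To see why comparing the match() results works, here is a small sketch on the question's data. The data frame name df and the way it is built are assumptions, since the question only shows the printed table.

```r
# Sample data from the question (construction is assumed)
df <- data.frame(
  A = c(0, 1, 2),
  B = c("a", "b", "c"),
  C = c("apple", "banana", "apple")
)

# First-occurrence index of each element's value, per column
ia <- match(df$A, df$A)  # 1 2 3
ib <- match(df$B, df$B)  # 1 2 3
ic <- match(df$C, df$C)  # 1 2 1 ("apple" repeats, so row 3 points back to row 1)

identical(ia, ib)  # TRUE:  A and B group the rows the same way
identical(ia, ic)  # FALSE: C merges rows 1 and 3, A does not
identical(ib, ic)  # FALSE
```

Because match() works on any atomic vector, the same comparison applies whether the columns are numeric, character, or factor.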
You could then apply this to each pair of columns.
Wrapping it up in a function:
cor_1to1 = function(df) {
  mat = vapply(df, \(x) match(x, x), FUN.VALUE = integer(nrow(df)))
  nm = combn(colnames(mat), m = 2, FUN = paste, collapse = " :: ")
  val = combn(colnames(mat), m = 2, FUN = function(i) {
    identical(mat[, i[1]], mat[, i[2]])
  }, simplify = TRUE)
  setNames(val, nm)
}
cor_1to1(df)
# A :: B A :: C B :: C
#   TRUE  FALSE  FALSE
CodePudding user response:
You can do:
one_to_one <- function(data){
  data[] <- sapply(data, \(x) match(x, x))
  pairs <- t(combn(seq_len(ncol(data)), 2))
  cbind(t(matrix(colnames(data)[t(pairs)], nrow = 2)),
        One2One = apply(pairs, 1, function(x) all(Reduce(`==`, data[, x])))) |>
    as.data.frame()
}
Test:
one_to_one(df)
# V1 V2 One2One
#1 A B TRUE
#2 A C FALSE
#3 B C FALSE