I have a matrix:
A <- t(matrix(
  c(0, 0, 1,
    0, 0, 0,
    0, 0, 1,
    0, 0, 1,
    0, 0, 0,
    1, 1, 0), 3, 6))
and I need to keep only the columns that appear exactly once. So, the expected result is just the 3rd column: (1, 0, 1, 1, 0, 0).
I have found the unique and duplicated functions, but I need something stronger that deletes every column that appears more than once (in my example, the 1st and 2nd).
CodePudding user response:
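It looks like we need a double duplicated(). On its own, duplicated() flags only the second and later copies; combining it with a second call that uses fromLast = TRUE flags the earlier copies as well.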
A[, !(duplicated(t(A)) | duplicated(t(A), fromLast = TRUE)), drop = FALSE]
     [,1]
[1,]    1
[2,]    0
[3,]    1
[4,]    1
[5,]    0
[6,]    0
The idea applies to a vector, too.
x <- c(1, 1, 2, 3, 4, 3, 4, 5)
x[!(duplicated(x) | duplicated(x, fromLast = TRUE))]
[1] 2 5
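If this is needed in more than one place, the double duplicated() idea can be wrapped in a small function. This is only a sketch: the name keep_unique_cols is made up for illustration and is not part of base R.

```r
# Hypothetical helper (the name is an assumption, not a base R function):
# keep only the columns of a matrix that occur exactly once.
keep_unique_cols <- function(m) {
  tm <- t(m)                                   # duplicated() compares rows, so transpose
  later   <- duplicated(tm)                    # TRUE for every copy after the first
  earlier <- duplicated(tm, fromLast = TRUE)   # TRUE for every copy before the last
  m[, !(later | earlier), drop = FALSE]        # drop = FALSE keeps the matrix shape
}

A <- t(matrix(
  c(0, 0, 1,
    0, 0, 0,
    0, 0, 1,
    0, 0, 1,
    0, 0, 0,
    1, 1, 0), 3, 6))

res <- keep_unique_cols(A)
res
```

For the example matrix this should return the same 6 x 1 matrix as above, i.e. only the 3rd column.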
CodePudding user response:
We can try the base R code below, which uses aggregate with toString to collapse each column of A into a single string and then ave to count how many columns share that string
with(
  aggregate(
    . ~ id,
    data.frame(id = c(col(A)), val = c(A)),
    toString
  ),
  A[, ave(id, val, FUN = length) == 1, drop = FALSE]
)
or equivalently
A[
  ,
  with(
    aggregate(
      . ~ id,
      data.frame(id = c(col(A)), val = c(A)),
      toString
    ),
    ave(id, val, FUN = length) == 1
  ),
  drop = FALSE
]
which gives
     [,1]
[1,]    1
[2,]    0
[3,]    1
[4,]    1
[5,]    0
[6,]    0
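The same "collapse each column into a key, then count the keys" idea can probably be written more compactly with apply and paste. This is a sketch rather than a drop-in replacement: key and n_same are illustrative names, and the comma separator assumes the matrix values themselves contain no commas (true for numeric data, not necessarily for character data).

```r
A <- t(matrix(
  c(0, 0, 1,
    0, 0, 0,
    0, 0, 1,
    0, 0, 1,
    0, 0, 0,
    1, 1, 0), 3, 6))

# One string per column, e.g. "0,0,0,0,0,1" for the 1st column
key <- apply(A, 2, paste, collapse = ",")

# For each column, the number of columns that share its key
n_same <- ave(seq_along(key), key, FUN = length)

# Keep the columns whose key occurs exactly once
res2 <- A[, n_same == 1, drop = FALSE]
res2
```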
CodePudding user response:
Another possible solution:
A[, apply(sapply(which(duplicated(A, MARGIN = 2)),
                 \(x) sapply(seq_len(ncol(A)), \(y) all(A[, x] == A[, y]))),
          1, \(z) all(!z)), drop = FALSE]
#>      [,1]
#> [1,]    1
#> [2,]    0
#> [3,]    1
#> [4,]    1
#> [5,]    0
#> [6,]    0
Or:
A[, colSums(outer(which(duplicated(A, MARGIN = 2)), seq_len(ncol(A)),
                  Vectorize(\(x, y) all(A[, x] == A[, y])))) == 0, drop = FALSE]
#>      [,1]
#> [1,]    1
#> [2,]    0
#> [3,]    1
#> [4,]    1
#> [5,]    0
#> [6,]    0
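A variation on the pairwise comparison, sketched here as an assumption rather than as part of the answer above: compare every column with every other column and keep those that match only themselves. It does not depend on which(duplicated(...)) finding anything, so it should also work when no column is duplicated, at the cost of ncol(A)^2 comparisons. The names n, same, and B are illustrative.

```r
A <- t(matrix(
  c(0, 0, 1,
    0, 0, 0,
    0, 0, 1,
    0, 0, 1,
    0, 0, 0,
    1, 1, 0), 3, 6))

n <- ncol(A)

# same[i, j] is TRUE when columns i and j hold the same values
same <- outer(seq_len(n), seq_len(n),
              Vectorize(function(i, j) identical(A[, i], A[, j])))

# A column that occurs once matches only itself, so its row sum is 1
res3 <- A[, rowSums(same) == 1, drop = FALSE]
res3

# A matrix without duplicated columns is returned unchanged
B <- cbind(c(1, 0), c(0, 1))
same_B <- outer(seq_len(ncol(B)), seq_len(ncol(B)),
                Vectorize(function(i, j) identical(B[, i], B[, j])))
B[, rowSums(same_B) == 1, drop = FALSE]
```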