How to select multiple columns in R when max() >= 1

Time:12-02

This is my data:

y <- data.frame(rv1 = c(0, 1, 0, 0, 0), rv2 = c(0, 0, 0, 0, 0), rv3 = c(0, 0, 0, 0, 0), rv4 = c(0, 0, 0, 0, 1), rv5 = c(0, 0, 3, 1, 0))

output:

  rv1 rv2 rv3 rv4 rv5
1   0   0   0   0   0
2   1   0   0   0   0
3   0   0   0   0   3
4   0   0   0   0   1
5   0   0   0   1   0

Expected output: select the columns where max() >= 1

  rv1 rv4 rv5
1   0   0   0
2   1   0   0
3   0   0   3
4   0   0   1
5   0   1   0

This was as close as I could get:

y %>% colSums() >= 1

Output:

  rv1   rv2   rv3   rv4   rv5
 TRUE FALSE FALSE  TRUE  TRUE

CodePudding user response:

We can use select() with where(), passing a function that returns TRUE for each column whose max() is >= 1:

library(dplyr)
y %>% 
    select(where(~ max(.) >= 1))

-output

  rv1 rv4 rv5
1   0   0   0
2   1   0   0
3   0   0   3
4   0   0   1
5   0   1   0
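
If the data frame also holds non-numeric columns, max() may error (factors) or compare values as text (characters), so a type guard inside where() can help. The sketch below is only an illustration: the data frame y2 and its id column are hypothetical and not part of the question, and na.rm = TRUE is an added assumption for columns that contain missing values.

```r
library(dplyr)

# Hypothetical variant of the question's data with an extra character column
y2 <- data.frame(id  = c("a", "b", "c", "d", "e"),
                 rv1 = c(0, 1, 0, 0, 0),
                 rv2 = c(0, 0, 0, 0, 0),
                 rv5 = c(0, 0, 3, 1, 0))

# is.numeric() is checked first, so max() only runs on numeric columns
res_dplyr <- y2 %>%
  select(where(~ is.numeric(.x) && max(.x, na.rm = TRUE) >= 1))

names(res_dplyr)
```

The id column is dropped here because it fails the is.numeric() check, not because of its values.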

Or use colMaxs() from matrixStats to create the logical vector:

library(matrixStats)
y[colMaxs(as.matrix(y)) >= 1]

Or, in base R, loop over the columns with sapply/lapply, or use apply with MARGIN = 2, to construct the logical vector:

y[sapply(y, max) >= 1]
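
The apply variant mentioned above can be sketched as follows, together with vapply, which is a stricter relative of sapply. This is a minimal illustration on the question's data; the names keep_apply, keep_vapply and res_base are made up for this example.

```r
# Same data frame as in the question
y <- data.frame(rv1 = c(0, 1, 0, 0, 0), rv2 = c(0, 0, 0, 0, 0),
                rv3 = c(0, 0, 0, 0, 0), rv4 = c(0, 0, 0, 0, 1),
                rv5 = c(0, 0, 3, 1, 0))

# apply() with MARGIN = 2 calls max() on each column
# (the data frame is converted to a matrix first)
keep_apply <- apply(y, 2, max) >= 1

# vapply() works like sapply() but declares the result type up front
keep_vapply <- vapply(y, max, numeric(1)) >= 1

# Both approaches should flag the same columns
stopifnot(all(keep_apply == keep_vapply))

# drop = FALSE keeps a data frame even if only one column qualifies
res_base <- y[, keep_apply, drop = FALSE]
names(res_base)
```

Because apply() goes through a matrix, it is best suited to data frames whose columns are all numeric; sapply or vapply work column by column and avoid that conversion.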