Given a data frame df:
df <- structure(list(ID = c("ID001", "ID002", "ID003", "ID004", "ID005",
"ID006"), Column2 = c(1L, 0L, 1L, 0L, 1L, 1L), Column3 = c(0L,
0L, 0L, 0L, 0L, 0L), Column4 = c(1L, 0L, 1L, 1L, 0L, 0L), Column5 = c(1L,
1L, 0L, 0L, 1L, 1L)), class = "data.frame", row.names = c(NA,
-6L))
How would one return the IDs of the rows that have a 1 in more than one column? For example, row 1 has three ones, in Column2, Column4 and Column5, so ID001 should be returned. However, row 2 should not be returned, because only Column5 has a 1.
These IDs should be returned: ID001, ID003, ID005, ID006
CodePudding user response:
With dplyr, you can filter to keep only the rows with more than a single one. Then, you can pull the ID column to return just the IDs.
library(dplyr)
df %>%
  filter(rowSums(select(., -ID)) > 1) %>%
  pull(ID)
# [1] "ID001" "ID003" "ID005" "ID006"
CodePudding user response:
Create a logical expression with rowSums on the columns other than the first one, subset the data, and extract the 'ID' column:
subset(df, rowSums(df[-1] == 1) > 1)$ID
[1] "ID001" "ID003" "ID005" "ID006"
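To make the rowSums step easier to follow, here is a minimal sketch that breaks the one-liner into its intermediate pieces. The names ones_per_row and keep are only illustrative, and the data frame is rebuilt from the question so the snippet runs on its own.

```r
# Rebuild the data frame from the question
df <- data.frame(
  ID      = c("ID001", "ID002", "ID003", "ID004", "ID005", "ID006"),
  Column2 = c(1L, 0L, 1L, 0L, 1L, 1L),
  Column3 = c(0L, 0L, 0L, 0L, 0L, 0L),
  Column4 = c(1L, 0L, 1L, 1L, 0L, 0L),
  Column5 = c(1L, 1L, 0L, 0L, 1L, 1L)
)

# df[-1] drops the ID column; `== 1` turns the rest into a TRUE/FALSE matrix,
# and rowSums counts the TRUEs in each row
ones_per_row <- unname(rowSums(df[-1] == 1))

# Logical vector: TRUE where a row has more than one 1
keep <- ones_per_row > 1

# Index the ID column with that logical vector
result <- df$ID[keep]
result
```

Comparing with == 1 before summing means only exact ones are counted, so this form would still work if the columns held values other than 0 and 1.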
CodePudding user response:
Base R using which:
df$ID[which(rowSums(df[-1]) > 1)]
[1] "ID001" "ID003" "ID005" "ID006"
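One reason to wrap the condition in which is how missing values are handled. The data in the question has no NA, so the following is only a sketch on a hypothetical two-column data frame (df_na is not part of the question) showing the difference:

```r
# Hypothetical data with a missing value in the second row
df_na <- data.frame(
  ID = c("ID001", "ID002", "ID003"),
  A  = c(1L, NA, 0L),
  B  = c(1L, 1L, 0L)
)

# rowSums returns NA for a row containing NA (unless na.rm = TRUE is set)
cond <- unname(rowSums(df_na[-1]) > 1)

# Plain logical indexing keeps an NA entry for that row
plain <- df_na$ID[cond]

# which() returns only the positions that are TRUE, so the NA row is dropped
with_which <- df_na$ID[which(cond)]
```

Here plain contains "ID001" followed by an NA, while with_which contains only "ID001".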