Finding rows that have columns with matching values R

Time:04-12

Given a data frame:

df <- structure(list(ID = c("ID001", "ID002", "ID003", "ID004", "ID005", 
"ID006"), Column2 = c(1L, 0L, 1L, 0L, 1L, 1L), Column3 = c(0L, 
0L, 0L, 0L, 0L, 0L), Column4 = c(1L, 0L, 1L, 1L, 0L, 0L), Column5 = c(1L, 
1L, 0L, 0L, 1L, 1L)), class = "data.frame", row.names = c(NA, 
-6L))

How would one return the IDs of the rows in which more than one column contains a 1? For example, row 1 has ones in three separate columns (Column2, Column4 and Column5), so ID001 should be returned. However, row 2 should not be returned, because only Column5 contains a 1.

These IDs should be returned: ID001, ID003, ID005, ID006

CodePudding user response:

With dplyr, you can filter to keep only the rows whose non-ID columns contain more than one 1. Then, you can pull the ID column to return just the IDs.

library(dplyr)

df %>% 
  filter(rowSums(select(., -ID)) > 1) %>% 
  pull(ID)

# [1] "ID001" "ID003" "ID005" "ID006"
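
Note that the `.` inside `select(., -ID)` is the magrittr placeholder, so that line relies on `%>%`. As a rough sketch of the same idea with the native pipe, assuming R >= 4.1 and dplyr >= 1.1.0 (where `pick()` is available), and with the question's data rebuilt via `data.frame()` so the snippet runs on its own:

```r
library(dplyr)

# Data frame from the question, rebuilt here so the example is self-contained
df <- data.frame(
  ID      = c("ID001", "ID002", "ID003", "ID004", "ID005", "ID006"),
  Column2 = c(1L, 0L, 1L, 0L, 1L, 1L),
  Column3 = c(0L, 0L, 0L, 0L, 0L, 0L),
  Column4 = c(1L, 0L, 1L, 1L, 0L, 0L),
  Column5 = c(1L, 1L, 0L, 0L, 1L, 1L)
)

# pick(-ID) selects every column except ID inside the data-masking verb;
# comparing with 1 gives a logical matrix, and rowSums() counts the TRUEs
ids <- df |>
  filter(rowSums(pick(-ID) == 1) > 1) |>
  pull(ID)

ids
# [1] "ID001" "ID003" "ID005" "ID006"
```

The name `ids` is only used for illustration. Comparing with `== 1` before summing is optional for 0/1 data like this, but it keeps the count correct if a column ever holds other values.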

CodePudding user response:

Create a logical expression with rowSums on the columns other than the first one, subset the data with it, and extract the 'ID' column.

subset(df, rowSums(df[-1] == 1) > 1)$ID
# [1] "ID001" "ID003" "ID005" "ID006"
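
To make the one-liner easier to follow, here is the same computation split into its intermediate steps. This is only an illustrative breakdown; the names `is_one`, `n_ones` and `keep` are made up for the example, and the data frame is rebuilt so the snippet runs by itself:

```r
# Data frame from the question, rebuilt here so the example is self-contained
df <- data.frame(
  ID      = c("ID001", "ID002", "ID003", "ID004", "ID005", "ID006"),
  Column2 = c(1L, 0L, 1L, 0L, 1L, 1L),
  Column3 = c(0L, 0L, 0L, 0L, 0L, 0L),
  Column4 = c(1L, 0L, 1L, 1L, 0L, 0L),
  Column5 = c(1L, 1L, 0L, 0L, 1L, 1L)
)

# Step 1: drop the first column (ID) and test each remaining cell against 1;
# the result is a logical matrix with one row per original row
is_one <- df[-1] == 1

# Step 2: count the TRUE values in each row (TRUE counts as 1, FALSE as 0)
n_ones <- rowSums(is_one)
unname(n_ones)
# [1] 3 1 2 1 2 2

# Step 3: keep the rows with more than one 1 and extract their IDs
keep <- n_ones > 1
df$ID[keep]
# [1] "ID001" "ID003" "ID005" "ID006"
```

Because the columns here only contain 0 and 1, `rowSums(df[-1])` would give the same counts; the explicit `== 1` matters only if other values can appear.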

CodePudding user response:

Base R using which:

df$ID[which(rowSums(df[-1]) > 1)]
# [1] "ID001" "ID003" "ID005" "ID006"
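
One situation where wrapping the condition in `which()` changes the result is missing data. The question's data has no NAs, so the following is a hypothetical variation (`df_na` and `cond` are names invented for the sketch) that puts an NA in row 2 to show the difference:

```r
# Data frame from the question, rebuilt here so the example is self-contained
df <- data.frame(
  ID      = c("ID001", "ID002", "ID003", "ID004", "ID005", "ID006"),
  Column2 = c(1L, 0L, 1L, 0L, 1L, 1L),
  Column3 = c(0L, 0L, 0L, 0L, 0L, 0L),
  Column4 = c(1L, 0L, 1L, 1L, 0L, 0L),
  Column5 = c(1L, 1L, 0L, 0L, 1L, 1L)
)

# Hypothetical copy with a missing value in row 2 (not part of the question)
df_na <- df
df_na$Column4[2] <- NA

# rowSums() returns NA for row 2, so the comparison is NA there as well
cond <- rowSums(df_na[-1]) > 1

# Plain logical indexing turns the NA condition into an NA result
df_na$ID[cond]
# [1] "ID001" NA      "ID003" "ID005" "ID006"

# which() keeps only the positions that are TRUE, silently dropping the NA
df_na$ID[which(cond)]
# [1] "ID001" "ID003" "ID005" "ID006"
```

If missing cells should simply be ignored when counting, `rowSums(df_na[-1], na.rm = TRUE)` is another option.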