I made a minimal reproducible example, but my real data is much larger:
sat_score <- c(100, 4, 30, 4, 20, 77, 99)
state <- c("NC", "NC", "CA", "WA", "NC", "SC", "NY")
id <- 1:7
score_1 <- c(1, 1, 0.99, 1, 1, 1, 1)
score_2 <- c(1, 0.99, 1, 1, 1, 1, 1)
score_3 <- c(1, 0.99, 1, 1, 1, 1, 0.99)
score_4 <- c(1, 1, 0.99, 1, 1, 0.99, 1)
data <- data.frame(sat_score, state, id, score_1, score_2, score_3, score_4)
So the data looks like this:
  sat_score state id score_1 score_2 score_3 score_4
1       100    NC  1    1.00    1.00    1.00    1.00
2         4    NC  2    1.00    0.99    0.99    1.00
3        30    CA  3    0.99    1.00    1.00    0.99
4         4    WA  4    1.00    1.00    1.00    1.00
5        20    NC  5    1.00    1.00    1.00    1.00
6        77    SC  6    1.00    1.00    1.00    0.99
7        99    NY  7    1.00    1.00    0.99    1.00
Across all the score columns (this example has 4, but my real data has 15), I want to extract the rows (people) where at least one score is not 1.
In this example, the rows with IDs 2, 3, 6, and 7 should be extracted because at least one of their scores is not 1 (all columns should be preserved).
How can I do this?
CodePudding user response:
Using the tidyverse:
library(tidyverse)
sat_score <- c(100,4,30,4,20,77,99)
state <- c("NC","NC","CA","WA","NC","SC","NY")
id <- 1:7
score_1 <- c(1, 1, 0.99, 1, 1, 1, 1)
score_2 <- c(1, 0.99, 1, 1, 1, 1, 1)
score_3 <- c(1, 0.99, 1, 1, 1, 1, 0.99)
score_4 <- c(1, 1, 0.99, 1, 1, 0.99, 1)
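data <- data.frame(sat_score, state, id, score_1, score_2, score_3, score_4)
# Keep rows where any column starting with "score" differs from 1
# (sat_score is not matched because its name starts with "sat").
data %>%
  filter(if_any(starts_with("score"), ~ .x != 1))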
  sat_score state id score_1 score_2 score_3 score_4
1         4    NC  2    1.00    0.99    0.99    1.00
2        30    CA  3    0.99    1.00    1.00    0.99
3        77    SC  6    1.00    1.00    1.00    0.99
4        99    NY  7    1.00    1.00    0.99    1.00
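If you would rather not load the tidyverse, the same row filter can be sketched in base R. This is only an illustration, not part of the answer above: the names `score_cols`, `has_non_one`, and `result` are made up for this example, and the pattern "^score_" assumes that all 15 real score columns share that prefix.

```r
# Rebuild the example data from the question.
data <- data.frame(
  sat_score = c(100, 4, 30, 4, 20, 77, 99),
  state     = c("NC", "NC", "CA", "WA", "NC", "SC", "NY"),
  id        = 1:7,
  score_1   = c(1, 1, 0.99, 1, 1, 1, 1),
  score_2   = c(1, 0.99, 1, 1, 1, 1, 1),
  score_3   = c(1, 0.99, 1, 1, 1, 1, 0.99),
  score_4   = c(1, 1, 0.99, 1, 1, 0.99, 1)
)

# Select the score columns by name (assumed prefix "score_");
# the pattern does not match sat_score.
score_cols <- grep("^score_", names(data), value = TRUE)

# Comparing the selected columns to 1 gives a logical matrix;
# a row sum above 0 means at least one score in that row is not 1.
has_non_one <- rowSums(data[score_cols] != 1) > 0

# Subset the rows and keep every column.
result <- data[has_non_one, ]
result
```

On the example data this keeps the rows with IDs 2, 3, 6, and 7. If the real scores are the result of calculations rather than typed-in values, an exact comparison with 1 may be too strict, and a tolerance such as `abs(x - 1) > 1e-8` might be a safer test.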