Extracting rows in R

Time:05-28

I made a minimal reproducible example, but my real data is much larger.

sat_score <- c(100, 4, 30, 4, 20, 77, 99)
state <- c("NC", "NC", "CA", "WA", "NC", "SC", "NY")
id <- 1:7
score_1 <- c(1, 1, 0.99, 1, 1, 1, 1)
score_2 <- c(1, 0.99, 1, 1, 1, 1, 1)
score_3 <- c(1, 0.99, 1, 1, 1, 1, 0.99)
score_4 <- c(1, 1, 0.99, 1, 1, 0.99, 1)
data <- data.frame(sat_score, state, id, score_1, score_2, score_3, score_4)

The data looks like this:

  sat_score state id score_1 score_2 score_3 score_4
1       100    NC  1    1.00    1.00    1.00    1.00
2         4    NC  2    1.00    0.99    0.99    1.00
3        30    CA  3    0.99    1.00    1.00    0.99
4         4    WA  4    1.00    1.00    1.00    1.00
5        20    NC  5    1.00    1.00    1.00    1.00
6        77    SC  6    1.00    1.00    1.00    0.99
7        99    NY  7    1.00    1.00    0.99    1.00

Across all the score columns (this example has 4, but my real data has 15), I want to extract the rows (people) for which at least one score is not 1.

In this example, the rows with IDs 2, 3, 6, and 7 should be extracted because at least one of their scores is not 1. All columns should be preserved.

How can I do this?

CodePudding user response:

Using the tidyverse:

library(tidyverse)
sat_score <- c(100,4,30,4,20,77,99)
state <- c("NC","NC","CA","WA","NC","SC","NY")
id <- 1:7
score_1 <- c(1, 1, 0.99, 1, 1, 1, 1)
score_2 <- c(1, 0.99, 1, 1, 1, 1, 1)
score_3 <- c(1, 0.99, 1, 1, 1, 1, 0.99)
score_4 <- c(1, 1, 0.99, 1, 1, 0.99, 1)
data <- data.frame(sat_score, state,id, score_1, score_2, score_3, score_4)

# Keep rows where any score column differs from 1; all columns are preserved
data %>%
  filter(if_any(starts_with("score"), ~ .x != 1))
  sat_score state id score_1 score_2 score_3 score_4
1         4    NC  2    1.00    0.99    0.99    1.00
2        30    CA  3    0.99    1.00    1.00    0.99
3        77    SC  6    1.00    1.00    1.00    0.99
4        99    NY  7    1.00    1.00    0.99    1.00
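
If you would rather not load the tidyverse, the same filter can be sketched in base R. This is only a minimal sketch: it assumes every score column name starts with "score_", and the names score_cols, keep and result are just illustrative. It rebuilds the example data so it runs on its own.

```r
# Rebuild the example data from the question
data <- data.frame(
  sat_score = c(100, 4, 30, 4, 20, 77, 99),
  state     = c("NC", "NC", "CA", "WA", "NC", "SC", "NY"),
  id        = 1:7,
  score_1   = c(1, 1, 0.99, 1, 1, 1, 1),
  score_2   = c(1, 0.99, 1, 1, 1, 1, 1),
  score_3   = c(1, 0.99, 1, 1, 1, 1, 0.99),
  score_4   = c(1, 1, 0.99, 1, 1, 0.99, 1)
)

# Pick the score columns by name (assumes the "score_" prefix)
score_cols <- grep("^score_", names(data), value = TRUE)

# data[score_cols] != 1 is a logical matrix; a row is kept when
# at least one of its entries is TRUE
keep <- rowSums(data[score_cols] != 1) > 0

# Subset rows only, so every column is preserved
result <- data[keep, ]
result
```

With 15 score columns nothing changes as long as they share the prefix. If the scores can contain NA, rowSums() will return NA for those rows, so you would probably want to pass na.rm = TRUE and decide separately how missing scores should be treated.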