For some context, I want a way to filter on a certain value in a certain test. If a person has the required value, all of their tests and results should remain in the data. If not, all of their data should be removed. The following code should help.
person <- c('pers1', 'pers1', 'pers2', 'pers2', 'pers2', 'pers3', 'pers3', 'pers4', 'pers5', 'pers5', 'pers6')
test <- c('a', 'b', 'a', 'b', 'c', 'b', 'c', 'a', 'a', 'c', 'b')
value <- c(2, 3, 4, 2, 1, 5, 7, 4, 1, 3, 1)
data <- data.frame(person, test, value)
head(data, 20)
With the following code, I'm removing everyone who hasn't had an "a" test, which are persons 3 and 6. I'm keeping every person who has had test "a" done, with all their other tests stored as well, so that I can do some statistics and correlations later on.
data1 <- data[data$person %in% data[data$test=='a',]$person,]
data1
However, I want to add another layer to the filtering. I want to filter out the people whose value in test "a" is 3 or above, which means I would only have persons 1 and 5 left (with their other tests as well). To make things clear, this is what I want to have left:
person1 <- c('pers1', 'pers1', 'pers5', 'pers5')
test1 <- c('a', 'b', 'a', 'c')
value1 <- c(2,3,1,3)
data1 <- data.frame(person1,test1,value1)
data1
I hope this is enough data for you to work with. This is my first time posting code here.
CodePudding user response:
You can add that requirement to your sub-query:
data[data$person %in% data[data$test=='a' & data$value<3,]$person,]
# person test value
# 1 pers1 a 2
# 2 pers1 b 3
# 9 pers5 a 1
# 10 pers5 c 3
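If the nested one-liner is hard to read, the same logic can be unpacked into two steps. This is only a sketch of the approach above; the names `keep` and `result` are made up for illustration and are not part of the original code.

```r
# Rebuild the example data from the question
data <- data.frame(
  person = c('pers1', 'pers1', 'pers2', 'pers2', 'pers2', 'pers3',
             'pers3', 'pers4', 'pers5', 'pers5', 'pers6'),
  test   = c('a', 'b', 'a', 'b', 'c', 'b', 'c', 'a', 'a', 'c', 'b'),
  value  = c(2, 3, 4, 2, 1, 5, 7, 4, 1, 3, 1)
)

# Step 1: find the people who have a test "a" with a value below 3
# (`keep` is a hypothetical helper name)
keep <- unique(data$person[data$test == 'a' & data$value < 3])

# Step 2: keep every row that belongs to one of those people,
# whatever the test on that row is
result <- data[data$person %in% keep, ]
result
```

Splitting it up this way also makes it easy to inspect `keep` on its own and check which people passed the condition before any rows are dropped.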
If you wanted to use dplyr, you could use group_by() and filter():
library(dplyr)
data %>%
group_by(person) %>%
filter(any(test=="a" & value < 3))
CodePudding user response:
Using ave from base R:
subset(data, ave(test == 'a' & value < 3, person, FUN = any))
# person test value
# 1 pers1 a 2
# 2 pers1 b 3
# 9 pers5 a 1
# 10 pers5 c 3
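To see why this works, it may help to look at the intermediate logical vector that ave builds. The sketch below uses the same data as the question; `hit`, `mask` and `result` are illustrative names only.

```r
# Rebuild the example data from the question
data <- data.frame(
  person = c('pers1', 'pers1', 'pers2', 'pers2', 'pers2', 'pers3',
             'pers3', 'pers4', 'pers5', 'pers5', 'pers6'),
  test   = c('a', 'b', 'a', 'b', 'c', 'b', 'c', 'a', 'a', 'c', 'b'),
  value  = c(2, 3, 4, 2, 1, 5, 7, 4, 1, 3, 1)
)

# Row-level condition: is this row a test "a" with a value below 3?
hit <- data$test == 'a' & data$value < 3

# ave() applies any() within each person and writes the group result
# back onto every row of that person, so `mask` has one entry per row
mask <- ave(hit, data$person, FUN = any)

# Rows where the person has at least one qualifying "a" test
result <- data[mask, ]
result
```

Because any() is evaluated per person, a single qualifying "a" row is enough to keep all of that person's rows.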