R: Compare elements from a column based upon other column conditions?

Time:03-27

I would like to create a new df that keeps only the subjects whose value in the second or third condition is greater than their value in the first condition.

Example df:

df1 <- data.frame(subject = rep(1:5, 3),
                  condition = rep(c("first", "second", "third"), each = 5),
                  values = c(.4, .4, .4, .4, .4, .6, .6, .6, .6, .4, .6, .6, .6, .4, .4))
> df1
   subject condition values
1        1     first    0.4
2        2     first    0.4
3        3     first    0.4
4        4     first    0.4
5        5     first    0.4
6        1    second    0.6
7        2    second    0.6
8        3    second    0.6
9        4    second    0.6
10       5    second    0.4
11       1     third    0.6
12       2     third    0.6
13       3     third    0.6
14       4     third    0.4
15       5     third    0.4

The resulting df would be this:

> df2
   subject condition values
1        1     first    0.4
2        2     first    0.4
3        3     first    0.4
4        4     first    0.4
6        1    second    0.6
7        2    second    0.6
8        3    second    0.6
9        4    second    0.6
11       1     third    0.6
12       2     third    0.6
13       3     third    0.6
14       4     third    0.4

Here, subject #5 does not meet the criteria and is dropped, because neither its second nor its third value is greater than its first value.

Thanks.

CodePudding user response:

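We may group by 'subject' and keep a group only if any of its second or third 'values' is greater than the first. Note that this indexes by position, so it relies on the rows within each subject being ordered first, second, third, as they are in the example.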

library(dplyr)
df1 %>%
  group_by(subject) %>%
  # within each subject: values[1] is "first", values[2:3] are "second" and "third"
  filter(any(values[2:3] > first(values))) %>%
  ungroup()

-output

# A tibble: 12 × 3
   subject condition values
     <int> <chr>      <dbl>
 1       1 first        0.4
 2       2 first        0.4
 3       3 first        0.4
 4       4 first        0.4
 5       1 second       0.6
 6       2 second       0.6
 7       3 second       0.6
 8       4 second       0.6
 9       1 third        0.6
10       2 third        0.6
11       3 third        0.6
12       4 third        0.4
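
If the row order within a subject cannot be guaranteed, the comparison can be written against the condition labels rather than positions. The following is only a sketch under that assumption; it reuses df1 from the question and assumes each subject has exactly one "first" row.

```r
library(dplyr)

# Example data from the question
df1 <- data.frame(subject = rep(1:5, 3),
                  condition = rep(c("first", "second", "third"), each = 5),
                  values = c(.4, .4, .4, .4, .4, .6, .6, .6, .6, .4, .6, .6, .6, .4, .4))

df2 <- df1 %>%
  group_by(subject) %>%
  # compare the "second"/"third" values with the single "first" value,
  # selecting rows by label so the row order does not matter
  filter(any(values[condition %in% c("second", "third")] >
               values[condition == "first"])) %>%
  ungroup()
```

With the example data this keeps the same 12 rows as the positional version, since subject 5 is the only one with no value above its first.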

CodePudding user response:

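Using ave, which needs no extra packages. It applies the function within each subject and returns a vector as long as the input; the logical result is coerced to 0/1, hence the == 1.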

df1[with(df1, ave(values, subject, FUN=\(x) any(x[2:3] > x[1])) == 1), ]
#    subject condition values
# 1        1     first    0.4
# 2        2     first    0.4
# 3        3     first    0.4
# 4        4     first    0.4
# 6        1    second    0.6
# 7        2    second    0.6
# 8        3    second    0.6
# 9        4    second    0.6
# 11       1     third    0.6
# 12       2     third    0.6
# 13       3     third    0.6
# 14       4     third    0.4
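
To see why the comparison with 1 works, the one-liner can be split into two steps. This is just an illustrative sketch: the names keep and df2 are introduced here, and function(x) is used in place of \(x) so it also runs on R versions before 4.1.

```r
# Example data from the question
df1 <- data.frame(subject = rep(1:5, 3),
                  condition = rep(c("first", "second", "third"), each = 5),
                  values = c(.4, .4, .4, .4, .4, .6, .6, .6, .6, .4, .6, .6, .6, .4, .4))

# ave() writes the per-subject result back into a copy of the numeric
# 'values' vector, so TRUE/FALSE end up stored as 1/0, one entry per row
keep <- with(df1, ave(values, subject, FUN = function(x) any(x[2:3] > x[1])))

# keep the rows whose subject passed the test
df2 <- df1[keep == 1, ]
```

Like the original, this indexes x[1] and x[2:3] by position, so it assumes the rows of each subject appear in the order first, second, third.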