Apply multiple conditions to single column-CodePudding

Code	count	AA	BB	CC
101	1	No	NO	4
101	2	Yes	NO	5
101	3	Yes	NO	10
102	1	Yes	NO	7
102	2	Yes	NO	40
102	3	Yes	NO	6
102	4	No	NO	12

I want to apply the condition as,

If the count column is 1 with respect to code column then AA should be "NO" and BB should be "NO".
For count between the max and min count with respect to code column then AA can be "NO" or "YES" and BB should be "NO".
For the max count column with respect to code column then AA should be "NO" and BB should be "NO".

Code	count	AA	BB	CC
101	1	No	NO	4
101	2	Yes	NO	5
102	2	Yes	NO	40
102	3	Yes	NO	6
102	4	No	NO	12

Hi,@Darren Tsai Whatever might be the case if the count column is 1 then it is getting deleted completely, by using you code I am getting the below output

Code	count	AA	BB	CC
101	2	Yes	NO	5
102	2	Yes	NO	40
102	3	Yes	NO	6
102	4	No	NO	12

CodePudding user response：

A dplyr solution:

library(dplyr)

df %>%
  group_by(Code) %>%
  mutate(flag = count %in% range(count)) %>%
  filter(flag & if_all(c(AA, BB), ~ toupper(.x) == 'NO') | !flag & toupper(BB) == 'NO') %>%
  ungroup() %>%
  select(-flag)

# # A tibble: 5 × 5
#    Code count AA    BB       CC
#   <int> <int> <chr> <chr> <int>
# 1   101     1 No    NO        4
# 2   101     2 Yes   NO        5
# 3   102     2 Yes   NO       40
# 4   102     3 Yes   NO        6
# 5   102     4 No    NO       12

A base equivalent:

df |>
  transform(flag = ave(count, Code, FUN = \(x) x %in% range(x))) |>
  subset(flag & toupper(AA) == 'NO' & toupper(BB) == 'NO' | !flag & toupper(BB) == 'NO', -flag)

Data

df <- structure(list(Code = c(101L, 101L, 101L, 102L, 102L, 102L, 102L),
count = c(1L, 2L, 3L, 1L, 2L, 3L, 4L), AA = c("No", "Yes",
"Yes", "Yes", "Yes", "Yes", "No"), BB = c("NO", "NO", "NO", "NO",
"NO", "NO", "NO"), CC = c(4L, 5L, 10L, 7L, 40L, 6L, 12L)), class = "data.frame", row.names = c(NA,-7L))

Update with another dataset

This dataset has 12 rows with 3 ID 8540, 2254, 607. After running my code the 2nd, 4th, 12th rows are removed.

library(dplyr)

df2 <- structure(list(Unique_Id = c(8540, 8540, 2254, 2254, 607, 607, 607, 607, 607, 607, 607, 607),
                      AA = c("No", "Yes", "No", "No", "No", "No", "No", "No", "No", "No", "No", "No"),
                      count = c(1, 2, 1, 2, 1, 2, 3, 4, 5, 6, 7, 8),
                      BB = c("No", "Yes", "No", "Yes", "No", "No", "No", "No", "No", "No", "No", "Yes")),
                      class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -12L))

df2
# A tibble: 12 × 4
   Unique_Id AA    count BB
       <dbl> <chr> <dbl> <chr>
 1      8540 No        1 No
 2      8540 Yes       2 Yes
 3      2254 No        1 No
 4      2254 No        2 Yes
 5       607 No        1 No
 6       607 No        2 No   
 7       607 No        3 No
 8       607 No        4 No
 9       607 No        5 No
10       607 No        6 No
11       607 No        7 No
12       607 No        8 Yes

df2 %>%
  group_by(Unique_Id) %>%
  mutate(flag = count %in% range(count)) %>%
  filter(flag & if_all(c(AA, BB), ~ toupper(.x) == 'NO') | !flag & toupper(BB) == 'NO') %>%
  ungroup() %>%
  select(-flag)

# A tibble: 9 × 4
  Unique_Id AA    count BB
      <dbl> <chr> <dbl> <chr>
1      8540 No        1 No
2      2254 No        1 No
3       607 No        1 No
4       607 No        2 No
5       607 No        3 No
6       607 No        4 No
7       607 No        5 No
8       607 No        6 No   
9       607 No        7 No

Code	count	AA	BB	CC
101	1	No	NO	4
101	2	Yes	NO	5
101	3	Yes	NO	10
102	1	Yes	NO	7
102	2	Yes	NO	40
102	3	Yes	NO	6
102	4	No	NO	12

Code	count	AA	BB	CC
101	1	No	NO	4
101	2	Yes	NO	5
101	3	Yes	NO	10
102	1	Yes	NO	7
102	2	Yes	NO	40
102	3	Yes	NO	6
102	4	No	NO	12

Code	count	AA	BB	CC
101	1	No	NO	4
101	2	Yes	NO	5
101	3	Yes	NO	10
102	1	Yes	NO	7
102	2	Yes	NO	40
102	3	Yes	NO	6
102	4	No	NO	12