I'm struggling with a problem in R. I want to create a new variable (qc), grouped by NAME and PLOT, using case_when: where "EH" > "PH", give me B, else give me Q.
I have a data set like this:
df <- tibble(
NAMEOFEXPERIMENT= c("A","A","A","A","A","A","A","B","B","B","B","B","B","B","B"),
PLOT= c(2,1,2,1,2,1,2,1,2,1,2,1,2,1,2),
trait= c("EH","NP","NP","PH","PH","PL","PL","EH","EH","NP","NP","PH","PH","PL","PL"),
traitValue= c(125,36,36,240,"NA",36,36,90,110,35,33,215,190,36,31)
)
# A tibble: 15 x 4
NAME PLOT trait traitValue
<chr> <dbl> <chr> <chr>
1 A 2 EH 250
2 A 1 NP 36
3 A 2 NP 36
4 A 1 PH 240
5 A 2 PH 200
6 A 1 PL 36
7 A 2 PL 36
8 B 1 EH 90
9 B 2 EH 110
10 B 1 NP 35
11 B 2 NP 33
12 B 1 PH 215
13 B 2 PH 190
14 B 1 PL 36
15 B 2 PL 31
This is what I want to achieve:
If "EH" > "PH" then give me B, else give me Q.
If "PL" > "NP" then give me B, else give me Q.
Thus, qc on line 4 should be empty, since there is no NAME "A", PLOT 1, trait "EH" to compare with.
# A tibble: 15 x 5
NAME PLOT trait traitValue qc
<chr> <dbl> <chr> <chr> <chr>
1 A 2 EH 250 B
2 A 1 NP 36 Q
3 A 2 NP 36 Q
4 A 1 PH 240
5 A 2 PH 200 B
6 A 1 PL 36 Q
7 A 2 PL 36 Q
8 B 1 EH 90 Q
9 B 2 EH 110 Q
10 B 1 NP 35 B
11 B 2 NP 33 Q
12 B 1 PH 215 Q
13 B 2 PH 190 Q
14 B 1 PL 36 B
15 B 2 PL 31 Q
When I run this code
dt2 <- df %>%
  group_by(NAME, PLOT) %>%
  mutate(data_qc = case_when(
    traitValue[trait == "EH"] > traitValue[trait == "PH"] ~ "B",
    traitValue[trait == "EH"] < traitValue[trait == "PH"] ~ "Q",
    traitValue[trait == "PL"] > traitValue[trait == "NP"] ~ "B",
    traitValue[trait == "PL"] < traitValue[trait == "NP"] ~ "Q"
  ))
I got this error:
Error in `mutate()`:
! Problem while computing `data_qc = case_when(...)`.
i The error occurred in group 1: NAME = "A", PLOT = 1.
Caused by error in `case_when()`:
! `traitValue[trait == "EH"] > traitValue[trait == "PH"] ~ "B"`, `traitValue[trait == "EH"] < traitValue[trait == "PH"] ~ "Q"`
must be length 3 or one, not 0.
CodePudding user response:
I don't fully understand your constraints. You did not specify what should happen if "PH" > "EH" and "PL" > "NP" at the same time (the first rule gives "Q", the second gives "B"). In this case, will the final outcome be "B" or "Q"?
However, to get you started I wrote the following code:
## Loading the required libraries
library(dplyr)
library(tidyverse)
## Creating the dataframe
df <- data.frame(
NAMEOFEXPERIMENT= c("A","A","A","A","A","A","A","B","B","B","B","B","B","B","B"),
PLOT= c(2,1,2,1,2,1,2,1,2,1,2,1,2,1,2),
trait= c("EH","NP","NP","PH","PH","PL","PL","EH","EH","NP","NP","PH","PH","PL","PL"),
traitValue= c(125,36,36,240,200,36,36,90,110,35,33,215,190,36,31)
)
## Removing duplicate rows (assign the result, otherwise df is unchanged)
df <- unique(df)
## Pivot longer to wider
df %>%
pivot_wider(names_from = trait, values_from = traitValue) %>%
arrange(NAMEOFEXPERIMENT,PLOT) %>%
mutate(ConditionalValue1 = ifelse(EH>PH,"B", "Q"),
ConditionalValue2 = ifelse(PL>NP,"B", "Q"))
Output
# A tibble: 4 x 8
NAMEOFEXPERIMENT PLOT EH NP PH PL ConditionalValue1 ConditionalValue2
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
1 A 1 NA 36 240 36 NA Q
2 A 2 125 36 200 36 Q Q
3 B 1 90 35 215 36 Q B
4 B 2 110 33 190 31 Q Q
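If you need the flag back on every row of the original long layout, here is a minimal sketch of one way to do it. It rests on my reading of your desired output: rows with trait "EH" or "PH" get the EH-vs-PH result, and rows with "PL" or "NP" get the PL-vs-NP result. The helper trait_value() and the temporary columns eh, ph, pl and np are hypothetical names I made up. The helper also shows why your case_when() failed: in group A / 1 there is no "EH" row, so traitValue[trait == "EH"] has length 0; returning NA instead keeps every comparison at a usable length. I renamed the first column to NAME to match your group_by() call.

```r
library(dplyr)

## Same values as the data frame above, with the first column named NAME
df <- data.frame(
  NAME = c("A","A","A","A","A","A","A","B","B","B","B","B","B","B","B"),
  PLOT = c(2,1,2,1,2,1,2,1,2,1,2,1,2,1,2),
  trait = c("EH","NP","NP","PH","PH","PL","PL","EH","EH","NP","NP","PH","PH","PL","PL"),
  traitValue = c(125,36,36,240,200,36,36,90,110,35,33,215,190,36,31)
)

## Hypothetical helper: value of one trait inside the current group,
## or NA when the group has no such row (avoids the length-0 vector).
## If a trait occurs more than once in a group, only the first value is used.
trait_value <- function(values, traits, wanted) {
  hit <- values[traits == wanted]
  if (length(hit) == 0) NA_real_ else hit[1]
}

qc_long <- df %>%
  group_by(NAME, PLOT) %>%
  mutate(
    ## One value per group, recycled to every row of that group
    eh = trait_value(traitValue, trait, "EH"),
    ph = trait_value(traitValue, trait, "PH"),
    pl = trait_value(traitValue, trait, "PL"),
    np = trait_value(traitValue, trait, "NP"),
    ## Assumed rule: EH/PH rows use EH > PH, PL/NP rows use PL > NP;
    ## a missing partner gives NA rather than an error
    qc = case_when(
      trait %in% c("EH", "PH") ~ if_else(eh > ph, "B", "Q"),
      trait %in% c("PL", "NP") ~ if_else(pl > np, "B", "Q")
    )
  ) %>%
  ungroup() %>%
  select(-eh, -ph, -pl, -np)

qc_long
```

With this data the "PH" row of A / 1 ends up with qc = NA, since there is no "EH" value to compare against. Ties count as "Q" here because the test is a strict >; change that if ties should be treated differently.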