My code is as below:
library(dplyr)  # provides %>% and case_when()
data = data.frame(x1 = c(1,1,1,1),
                  x2 = c(0,1,0,1),
                  x3 = c(1,1,0,1),
                  x4 = c(1,1,0,0)) %>% rowSums
data %>%
  case_when(. == 0 ~ 0,
            . %in% c(1,2) ~ 1,
            . %in% c(3:5) ~ 2)
The sample data is as below:
x1 | x2 | x3 | x4 |
---|---|---|---|
1 | 0 | 1 | 1 |
1 | 1 | 1 | 1 |
1 | 0 | 0 | 0 |
1 | 1 | 1 | 0 |
where x1, x2, x3, and x4 are binary variables in one data frame.
Then the row sums of x1, x2, x3, and x4 are calculated.
The result is as below:
rowsums |
---|
3 |
4 |
1 |
3 |
I would like to use case_when to classify the row sums. However, when I run the above code, I get this error:
! Case 1 (`.`) must be a two-sided formula, not a double vector.
I have not been able to solve it with other approaches.
CodePudding user response:
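The pipe inserts the left-hand expression as the first argument of the right-hand side call. This still happens when the dot appears only inside nested calls, as it does in your formulas.
That is, your call is equivalent to:
case_when(data,
          data == 0 ~ 0,
          data %in% c(1,2) ~ 1,
          data %in% c(3:5) ~ 2)
The first argument is then a plain numeric vector rather than a formula, which is what the error message reports.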
To prevent this, surround the right-hand side with {…}:
data %>% {
  case_when(. == 0 ~ 0,
            . %in% c(1,2) ~ 1,
            . %in% c(3:5) ~ 2)
}
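For reference, here is a self-contained sketch of the fix using the sample data from the question. The variable names sums, via_braces, and via_direct are made up for illustration. The second variant, which drops the pipe entirely, is an alternative that is not part of the answer above. Both should classify the row sums 3, 4, 1, 3 as 2, 2, 1, 2.

```r
library(dplyr)  # provides case_when() and re-exports %>%

# Same sample data as in the question; rowSums() returns a numeric vector
sums <- data.frame(
  x1 = c(1, 1, 1, 1),
  x2 = c(0, 1, 0, 1),
  x3 = c(1, 1, 0, 1),
  x4 = c(1, 1, 0, 0)
) %>% rowSums()

# Variant 1: braces stop the pipe from inserting `sums` as the first argument
via_braces <- sums %>% {
  case_when(
    . == 0 ~ 0,
    . %in% c(1, 2) ~ 1,
    . %in% 3:5 ~ 2
  )
}

# Variant 2 (alternative, not from the answer): call case_when() without a pipe
via_direct <- case_when(
  sums == 0 ~ 0,
  sums %in% c(1, 2) ~ 1,
  sums %in% 3:5 ~ 2
)

print(via_braces)
print(via_direct)
```

If the piped form is not needed, the second variant is arguably easier to read, since every condition names the vector explicitly.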
The magrittr documentation gives the following description:
For example, iris %>% subset(1:nrow(.) %% 2 == 0) is equivalent to iris %>% subset(., 1:nrow(.) %% 2 == 0) but slightly more compact. It is possible to overrule this behavior by enclosing the rhs in braces. For example, 1:10 %>% {c(min(.), max(.))} is equivalent to c(min(1:10), max(1:10)).
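The two examples from the documentation can be checked directly. This is a small sketch; the names implicit_first, explicit_first, and rng are made up for illustration.

```r
library(magrittr)  # provides %>%

# Dot used only inside a nested call: iris is still passed as the first argument
implicit_first <- iris %>% subset(1:nrow(.) %% 2 == 0)
explicit_first <- iris %>% subset(., 1:nrow(.) %% 2 == 0)
print(identical(implicit_first, explicit_first))

# Braces: the left-hand side is available only through the dot
rng <- 1:10 %>% {c(min(.), max(.))}
print(rng)
```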