Consider the iris dataset. Say I want to create a column, count, that holds the number of "sepal" columns in each row whose values are between 1 and 5.
Here's what I have:
iris %>%
  rowwise() %>%
  mutate(count = sum(if_any(contains("sepal", ignore.case = TRUE),
                            .fns = ~ between(.x, 1, 5)))) %>%
  arrange(desc(count))
But the output is not what I want.
   Sepal.Length Sepal.Width Petal.Length Petal.Width Species count
          <dbl>       <dbl>        <dbl>       <dbl> <fct>   <int>
 1          5.1         3.5          1.4         0.2 setosa      1  # Should be 1
 2          4.9         3            1.4         0.2 setosa      1  # Should be 2
 3          4.7         3.2          1.3         0.2 setosa      1  # Should be 2
 4          4.6         3.1          1.5         0.2 setosa      1  # Should be 2
 5          5           3.6          1.4         0.2 setosa      1  # Should be 2
 6          5.4         3.9          1.7         0.4 setosa      1  # Should be 1
 7          4.6         3.4          1.4         0.3 setosa      1  # Should be 2
 8          5           3.4          1.5         0.2 setosa      1  # Should be 2
 9          4.4         2.9          1.4         0.2 setosa      1  # Should be 2
10          4.9         3.1          1.5         0.1 setosa      1  # Should be 2
I can use case_when or if_else for the two columns, but the actual dataset has a lot more columns. So I'm looking for a dplyr solution where I don't have to type out all the columns.
CodePudding user response:
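Inside rowwise(), if_any() collapses all selected columns into a single TRUE/FALSE per row, so sum() can never exceed 1. Use across() to get one logical column per selected column, then count the TRUE values with rowSums():

library(tidyverse)

iris %>%
  mutate(
    # one TRUE/FALSE per Sepal column, summed across each row
    count = rowSums(across(contains("Sepal"), ~ between(.x, 1, 5)))
  )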
   Sepal.Length Sepal.Width Petal.Length Petal.Width Species count
1           5.1         3.5          1.4         0.2  setosa     1
2           4.9         3.0          1.4         0.2  setosa     2
3           4.7         3.2          1.3         0.2  setosa     2
4           4.6         3.1          1.5         0.2  setosa     2
5           5.0         3.6          1.4         0.2  setosa     2
6           5.4         3.9          1.7         0.4  setosa     1
7           4.6         3.4          1.4         0.3  setosa     2
8           5.0         3.4          1.5         0.2  setosa     2
9           4.4         2.9          1.4         0.2  setosa     2
10          4.9         3.1          1.5         0.1  setosa     2
EDIT:
With c_across. To my understanding, c_across has to be used with rowwise() to perform row-wise aggregation and calculation.
iris %>%
  rowwise() %>%
  mutate(count = sum(between(c_across(contains("Sepal")), 1, 5)))
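As a quick sanity check, the two approaches can be compared side by side. This is only a sketch: the object names vectorised and rowwise_counts are made up for illustration, and ungroup() is added just to drop the row-wise grouping before comparing.

```r
library(dplyr)

# Vectorised version: across() yields one logical column per Sepal column,
# and rowSums() counts the TRUE values in each row
vectorised <- iris %>%
  mutate(count = rowSums(across(contains("Sepal"), ~ between(.x, 1, 5))))

# Row-wise version: c_across() gathers the Sepal values of the current row
# into one vector, which between() and sum() then reduce to a single count
rowwise_counts <- iris %>%
  rowwise() %>%
  mutate(count = sum(between(c_across(contains("Sepal")), 1, 5))) %>%
  ungroup()

# Both approaches should agree on every row
all(vectorised$count == rowwise_counts$count)

# First six counts, matching the output shown above: 1 2 2 2 2 1
head(vectorised$count)
```

On a dataset with many rows the vectorised version is likely the better default, since rowwise() evaluates the expression once per row.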