This question arose while working on another question: Replace list names if they exist
I have this manipulated iris dataset with two vectors:
new_name <- c("new_setoas", "new_virginica")
to_select <- c("setosa", "virginica")
iris %>%
group_by(Species) %>%
slice(1:2) %>%
mutate(Species = as.character(Species))
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
<dbl> <dbl> <dbl> <dbl> <chr>
1 5.1 3.5 1.4 0.2 setosa
2 4.9 3 1.4 0.2 setosa
3 7 3.2 4.7 1.4 versicolor
4 6.4 3.2 4.5 1.5 versicolor
5 6.3 3.3 6 2.5 virginica
6 5.8 2.7 5.1 1.9 virginica
I would like to replace the values in Species that appear in one vector (to_select) with the corresponding values from another vector (new_name).
When I do:
new_name <- c("new_setoas", "new_virginica")
to_select <- c("setosa", "virginica")
iris %>%
group_by(Species) %>%
slice(1:2) %>%
mutate(Species = as.character(Species)) %>%
mutate(Species = ifelse(Species %in% to_select, new_name, Species))
# I get:
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
<dbl> <dbl> <dbl> <dbl> <chr>
1 5.1 3.5 1.4 0.2 new_setoas
2 4.9 3 1.4 0.2 **new_virginica** # should be new_setoas
3 7 3.2 4.7 1.4 versicolor
4 6.4 3.2 4.5 1.5 versicolor
5 6.3 3.3 6 2.5 **new_setoas** # should be new_virginica
6 5.8 2.7 5.1 1.9 new_virginica
I know this happens because of recycling, but I don't know how to avoid it!
CodePudding user response:
We may use recode. Instead of grouping and then modifying the group column afterwards, the recoding can be done in the group_by step itself.
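library(dplyr)
# vectors from the question
new_name <- c("new_setoas", "new_virginica")
to_select <- c("setosa", "virginica")
iris %>%
  # setNames() pairs each old value with its replacement; !!! splices the pairs into recode()
  group_by(Species = recode(as.character(Species),
                            !!!setNames(new_name, to_select))) %>%
  slice(1:2)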
Output:
# A tibble: 6 × 5
# Groups: Species [3]
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
<dbl> <dbl> <dbl> <dbl> <chr>
1 5.1 3.5 1.4 0.2 new_setoas
2 4.9 3 1.4 0.2 new_setoas
3 7 3.2 4.7 1.4 versicolor
4 6.4 3.2 4.5 1.5 versicolor
5 6.3 3.3 6 2.5 new_virginica
6 5.8 2.7 5.1 1.9 new_virginica
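To see why the ifelse attempt in the question misfires, here is a small base R sketch on a plain character vector. The vector species is just a stand-in for the Species column after slice(1:2), and lookup and hit are names made up for this illustration. ifelse recycles its yes argument to the length of the test, so the replacement is picked by row position rather than by the value being replaced. A named lookup vector picks by value instead, which is roughly the same pairing that setNames(new_name, to_select) hands to recode.

```r
new_name  <- c("new_setoas", "new_virginica")
to_select <- c("setosa", "virginica")

# stand-in for the Species column after slice(1:2)
species <- c("setosa", "setosa", "versicolor", "versicolor",
             "virginica", "virginica")

# what ifelse() effectively does with `yes = new_name`:
# the two names are recycled to length 6 and then picked by position
recycled <- rep(new_name, length.out = length(species))
wrong <- ifelse(species %in% to_select, recycled, species)

# lookup by value: old name -> new name
lookup <- setNames(new_name, to_select)
hit <- species %in% names(lookup)
species[hit] <- unname(lookup[species[hit]])
species
```

Here wrong reproduces the alternating result from the question, while species ends up with each old value mapped to its own replacement.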
CodePudding user response:
A solution with match is more complicated than akrun's solution, but here it goes.
suppressPackageStartupMessages(
library(dplyr)
)
new_name <- c("new_setoas", "new_virginica")
to_select <- c("setosa", "virginica")
iris %>%
group_by(Species) %>%
slice(1:2) %>%
mutate(Species = as.character(Species)) %>%
mutate(i_new = match(Species, to_select)) %>%
mutate(Species = ifelse(is.na(i_new), Species, new_name[i_new])) %>%
select(-i_new)
#> # A tibble: 6 × 5
#> # Groups: Species [3]
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 5.1 3.5 1.4 0.2 new_setoas
#> 2 4.9 3 1.4 0.2 new_setoas
#> 3 7 3.2 4.7 1.4 versicolor
#> 4 6.4 3.2 4.5 1.5 versicolor
#> 5 6.3 3.3 6 2.5 new_virginica
#> 6 5.8 2.7 5.1 1.9 new_virginica
Created on 2022-11-04 with reprex v2.0.2
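If the match step is needed in more than one place, it could be wrapped in a small helper. This is only a sketch: replace_values is a hypothetical name, not a function from dplyr or base R, and it assumes x, from and to are character vectors with from and to of equal length.

```r
# hypothetical helper: replace each value of `x` found in `from`
# with the value at the same position in `to`
replace_values <- function(x, from, to) {
  stopifnot(length(from) == length(to))
  i <- match(x, from)           # position of each x in `from`, NA if absent
  ifelse(is.na(i), x, to[i])    # keep x where there is no match
}

new_name  <- c("new_setoas", "new_virginica")
to_select <- c("setosa", "virginica")

result <- replace_values(
  c("setosa", "versicolor", "virginica", "setosa"),
  to_select, new_name
)
result
```

In the pipeline above, the two mutate calls and the select could then presumably be replaced by mutate(Species = replace_values(Species, to_select, new_name)).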