I want to change/recode the levels for a column based on a specific value of the same column and another column. As an example, let's use ggplot2::diamonds. In this scenario, I want to change the value of "Premium" in the cut column to "Perfect" if the column color is "D" and change the value of "Premium" to "Amazing" if column color is "J". This is my attempt:
df <- ggplot2::diamonds
unique(df$cut) #to look at the initial values
df$cut <- with(df, ifelse(cut == "Premium" & color == "D", "Perfect",
                   ifelse(cut == "Premium" & color == "J", "Amazing", cut)))
The issue is that when looking at the cut column afterwards, the other values have also been changed.
unique(df$cut)
[1] "5" "4" "2" "3" "1" "Perfect" "Amazing"
Can someone please tell me what I am doing wrong here? If there are other ways to do this, I would appreciate seeing those as well!
CodePudding user response:
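Using case_when() from dplyr, with cut converted to character in the fallback branch so that the labels (not the factor's integer codes) are kept: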
library(dplyr)

df <- df %>%
  mutate(cut = as.factor(case_when(
    cut == "Premium" & color == "D" ~ "Perfect",
    cut == "Premium" & color == "J" ~ "Amazing",
    TRUE ~ as.character(cut)
  )))
unique(df$cut)
Output:
[1] Ideal Premium Good Very Good Fair Perfect Amazing
Levels: Amazing Fair Good Ideal Perfect Premium Very Good
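As for what went wrong in the original attempt: ifelse() does not keep the attributes of its yes/no arguments, so a factor passed as the fallback value comes through as its underlying integer codes. The sketch below reproduces that on a small toy data frame (hypothetical data standing in for ggplot2::diamonds, with the same level order for cut) and shows that wrapping the fallback in as.character() is likely all the original code needed.

```r
# Toy stand-in for ggplot2::diamonds (hypothetical rows, same level order for cut)
cut_levels <- c("Fair", "Good", "Very Good", "Premium", "Ideal")
df <- data.frame(
  cut   = factor(c("Ideal", "Premium", "Premium", "Good", "Premium"),
                 levels = cut_levels, ordered = TRUE),
  color = c("E", "D", "J", "D", "E")
)

# ifelse() drops the factor attributes, so untouched rows become integer codes
broken <- with(df, ifelse(cut == "Premium" & color == "D", "Perfect", cut))
print(broken)  # "5" "Perfect" "4" "2" "4"

# Converting the fallback to character keeps the original labels
fixed <- with(df, ifelse(cut == "Premium" & color == "D", "Perfect",
                  ifelse(cut == "Premium" & color == "J", "Amazing",
                         as.character(cut))))
print(fixed)   # "Ideal" "Perfect" "Amazing" "Good" "Premium"
```

The result is a plain character vector; wrap it in factor() afterwards if a factor is needed.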
CodePudding user response:
I often use this base R method:
df$cut <- as.character(df$cut)
df$cut[df$color == "D" & df$cut == "Premium" ] <- "Perfect"
df$cut[df$color == "J" & df$cut == "Premium" ] <- "Amazing"
df$cut <- as.ordered(df$cut)
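But you have to convert the factor to character first. Otherwise, assigning a value that is not an existing level gives a warning ("invalid factor level, NA generated") and the affected entries become NA. Also note that as.ordered() on a character vector sorts the levels alphabetically, so the original Fair < Good < Very Good < Premium < Ideal ordering is lost.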
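If the original level ordering matters, one possible alternative is to keep cut as an ordered factor and register the new levels before assigning. This is a minimal sketch on a toy data frame (hypothetical data standing in for ggplot2::diamonds); appending "Perfect" and "Amazing" at the top of the ordering is an assumption, so reorder them as your use case requires.

```r
# Toy stand-in for ggplot2::diamonds (hypothetical rows, same level order for cut)
cut_levels <- c("Fair", "Good", "Very Good", "Premium", "Ideal")
df <- data.frame(
  cut   = factor(c("Ideal", "Premium", "Premium", "Good", "Premium"),
                 levels = cut_levels, ordered = TRUE),
  color = c("E", "D", "J", "D", "E")
)

# Add the new levels first; their position in the ordering is an assumption
levels(df$cut) <- c(levels(df$cut), "Perfect", "Amazing")

# Now the assignments are valid and no NAs are generated
df$cut[df$cut == "Premium" & df$color == "D"] <- "Perfect"
df$cut[df$cut == "Premium" & df$color == "J"] <- "Amazing"

print(df$cut)
```

Premium rows with other colors are left as they are, and the column stays an ordered factor throughout.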