I want to change values in my username variable, but only when they meet a condition set by the variable chatforum. For example, I want all instances of users called "Alex" from Canadian chatrooms to be relabeled as "AlexCA":
# mock dataset
library(tidyverse)
username <- c("Alex", "Alex", "Alex", "Alex")
id <- c(1001, 1002, 1003, 1001)
chatforum <- c("Canada", "U.S.", "U.K.", "Canada")
df <- cbind(username, id, chatforum)
df <- as_tibble(df)
glimpse(df)
df <- df %>% filter(chatforum == "Canada") %>%
  mutate(username = replace(username, username == "Alex", "AlexCA"))
Though the code above works, I want the entire dataset returned to me, with the changes I just made. Using filter returns a dataset with only the filtered rows, not the entire dataset.
I was advised to use if_else or case_when(), but this also changes other usernames (such as Alice) to AlexCA, when I only want the username "Alex" to change when chatforum == "Canada":
df <- df %>% mutate(username = if_else(chatforum=="Canada", "AlexCA", username))
Do you know how I can change the values in my username column based on the condition that the value is Alex and the chatforum value is equal to Canada?
CodePudding user response:
With case_when or ifelse, you can require multiple conditions to be met before the change is applied. So, if chatforum == "Canada" & username == "Alex", then we change the name to AlexCA.
library(tidyverse)
df %>%
  mutate(username = case_when(
    chatforum == "Canada" & username == "Alex" ~ "AlexCA",
    TRUE ~ username
  ))
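The same two-part condition also works with if_else, which the question tried with only the chatforum test. A minimal sketch, assuming the mock data from the question (rebuilt here with tibble() so the block runs on its own; the name result is just for illustration):

```r
library(tidyverse)

# Rebuild the question's mock data so this block is self-contained
df <- tibble(
  username  = c("Alex", "Alex", "Alex", "Alex"),
  id        = c(1001, 1002, 1003, 1001),
  chatforum = c("Canada", "U.S.", "U.K.", "Canada")
)

# Both tests must be TRUE for a row to be relabeled;
# every other row keeps its existing username
result <- df %>%
  mutate(username = if_else(chatforum == "Canada" & username == "Alex",
                            "AlexCA", username))
```

Because no rows are filtered out, result has the same number of rows as df.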
Or in base R:
df[df$chatforum == "Canada" & df$username == "Alex",]$username <- "AlexCA"
Output
  username id    chatforum
  <chr>    <chr> <chr>
1 AlexCA   1001  Canada
2 Alex     1002  U.S.
3 Alex     1003  U.K.
4 AlexCA   1001  Canada
But if you need to do this for a lot of countries, then you might want to create a key or add a new column with the abbreviation you want. For example, you could do something like this, where we create an abbreviation from the chatforum, then combine it with the username.
df %>%
  mutate(abrv = toupper(substr(str_replace_all(chatforum, "[[:punct:]]", ""), 1, 2))) %>%
  unite(username, c(username, abrv), sep = "")
#   username id    chatforum
#   <chr>    <chr> <chr>
# 1 AlexCA   1001  Canada
# 2 AlexUS   1002  U.S.
# 3 AlexUK   1003  U.K.
# 4 AlexCA   1001  Canada
Or instead of uniting after creating an abbreviation column, you could still use case_when for certain conditions.
df %>%
  mutate(abrv = toupper(substr(str_replace_all(chatforum, "[[:punct:]]", ""), 1, 2))) %>%
  mutate(username = case_when(
    chatforum == "Canada" & username == "Alex" ~ paste0(username, abrv),
    TRUE ~ username
  ))
#   username id    chatforum abrv
#   <chr>    <chr> <chr>     <chr>
# 1 AlexCA   1001  Canada    CA
# 2 Alex     1002  U.S.      US
# 3 Alex     1003  U.K.      UK
# 4 AlexCA   1001  Canada    CA
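If the suffix can't be derived from the first two letters of the forum name, the "key" idea mentioned earlier could be an explicit lookup table joined onto the data. This is only a sketch: the forum_key table, its abrv values, and the name keyed are assumptions made for illustration, not part of the original data.

```r
library(tidyverse)

# Rebuild the question's mock data so this block is self-contained
df <- tibble(
  username  = c("Alex", "Alex", "Alex", "Alex"),
  id        = c(1001, 1002, 1003, 1001),
  chatforum = c("Canada", "U.S.", "U.K.", "Canada")
)

# Hypothetical key: one row per forum with the suffix to append
forum_key <- tibble(
  chatforum = c("Canada", "U.S.", "U.K."),
  abrv      = c("CA", "US", "UK")
)

keyed <- df %>%
  left_join(forum_key, by = "chatforum") %>%
  # Relabel only Alex, and only when the forum has an entry in the key
  mutate(username = if_else(username == "Alex" & !is.na(abrv),
                            paste0(username, abrv), username)) %>%
  select(-abrv)
```

Forums missing from forum_key get NA for abrv after the join, so the !is.na(abrv) check leaves those usernames unchanged.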