Change values in a variable based on a conditional value in R

Time:03-13

I want to change values in my username variable but only when they meet a condition set from the variable chatforum. For example, I want all instances of users called "Alex" from Canadian chatrooms to be relabeled as "AlexCA":

# mock dataset
library(tidyverse)
username <- c("Alex", "Alex", "Alex", "Alex")
id <- c(1001, 1002, 1003, 1001)
chatforum <- c("Canada", "U.S.", "U.K.", "Canada")

df <- cbind(username, id, chatforum)
df <- as_tibble(df)
glimpse(df)

df <- df  %>% filter(chatforum=="Canada") %>% 
  mutate(username = replace(username, username == "Alex", "AlexCA"))

Though the code above works, I want the entire dataset returned to me, with the changes I just made. Using filter returns a dataset with only the filtered rows, not the entire dataset.

I was advised to use if_else() or case_when(), but my attempt relabels every user from a Canadian chatroom (for example, "Alice") as AlexCA, when I only want the username "Alex" to change when chatforum == "Canada":

df <- df %>% mutate(username = if_else(chatforum=="Canada", "AlexCA", username))

Do you know how I can change the values in my username column only when the username is "Alex" and the chatforum value is "Canada"?

CodePudding user response:

With case_when() or ifelse(), you can require several conditions to be met before the change is applied. So, if chatforum == "Canada" & username == "Alex", then we change the name to AlexCA; every other row keeps its username.

library(tidyverse)

df %>%
  mutate(username = case_when(
    chatforum == "Canada" & username == "Alex" ~ "AlexCA",
    TRUE ~ username
  ))

Or in base R:

df$username[df$chatforum == "Canada" & df$username == "Alex"] <- "AlexCA"

Output

  username id    chatforum
  <chr>    <chr> <chr>    
1 AlexCA   1001  Canada   
2 Alex     1002  U.S.     
3 Alex     1003  U.K.     
4 AlexCA   1001  Canada  
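
The same two-part condition also works with if_else(), which the question mentions. The following is a minimal sketch rather than a definitive version. It rebuilds the mock data with data.frame() so it runs on its own, and it adds one "Alice" row that is not in the question's mock data (a hypothetical row, included only to show that other Canadian usernames are left alone).

```r
library(dplyr)

# Mock data from the question, plus one hypothetical "Alice" row
# (not in the original data) to show that non-Alex users are untouched
df <- data.frame(
  username  = c("Alex", "Alex", "Alex", "Alex", "Alice"),
  id        = c(1001, 1002, 1003, 1001, 1004),
  chatforum = c("Canada", "U.S.", "U.K.", "Canada", "Canada"),
  stringsAsFactors = FALSE
)

# Both conditions must hold for a row to be relabeled;
# every other row keeps its original username
result <- df %>%
  mutate(username = if_else(
    chatforum == "Canada" & username == "Alex",
    "AlexCA",
    username
  ))

result$username
# "AlexCA" "Alex" "Alex" "AlexCA" "Alice"
```

Because the pipeline returns a new data frame, assign it back (df <- df %>% ...) if you want to keep the change.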

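But if you need to do this for a lot of countries, then you might want to create a key or add a new column with the abbreviation you want. For example, you could do something like this, where we build an abbreviation from chatforum (strip the punctuation, take the first two characters, and upper-case them), then combine it with the username. Note that this appends the abbreviation to every username, not only "Alex".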

df %>%
  mutate(abrv = toupper(substr(str_replace_all(chatforum, "[[:punct:]]", ""), 1, 2))) %>%
  unite(username, c(username, abrv), sep = "")

#  username id    chatforum
#  <chr>    <chr> <chr>    
#1 AlexCA   1001  Canada   
#2 AlexUS   1002  U.S.     
#3 AlexUK   1003  U.K.     
#4 AlexCA   1001  Canada   

Or instead of uniting after creating an abbreviation column, you could still use case_when for certain conditions.

df %>%
  mutate(abrv = toupper(substr(str_replace_all(chatforum, "[[:punct:]]", ""), 1, 2))) %>%
  mutate(username = case_when(
    chatforum == "Canada" & username == "Alex" ~ paste0(username, abrv),
    TRUE ~ username
  ))

#  username id    chatforum abrv 
#  <chr>    <chr> <chr>     <chr>
#1 AlexCA   1001  Canada    CA   
#2 Alex     1002  U.S.      US   
#3 Alex     1003  U.K.      UK   
#4 AlexCA   1001  Canada    CA   
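
The "key" idea mentioned above can also be written as an explicit lookup, which helps when the first two letters of a forum name would not give the suffix you want. This is only a sketch under stated assumptions: suffix_key and its contents are hypothetical names and values chosen for illustration, and the data frame is rebuilt here so the block runs on its own.

```r
library(dplyr)

# Mock data from the question, rebuilt so this block is self-contained
df <- data.frame(
  username  = c("Alex", "Alex", "Alex", "Alex"),
  id        = c(1001, 1002, 1003, 1001),
  chatforum = c("Canada", "U.S.", "U.K.", "Canada"),
  stringsAsFactors = FALSE
)

# Hypothetical key: chatforum value -> suffix to append
suffix_key <- c("Canada" = "CA", "U.S." = "US", "U.K." = "UK")

keyed <- df %>%
  mutate(
    # Look up each forum in the key; forums missing from the key give NA.
    # unname() drops the names carried over by named-vector indexing.
    suffix   = unname(suffix_key[chatforum]),
    # Append the suffix only where the key has an entry
    username = if_else(is.na(suffix), username, paste0(username, suffix))
  ) %>%
  select(-suffix)

keyed$username
# "AlexCA" "AlexUS" "AlexUK" "AlexCA"
```

To restrict the relabeling to particular users, the username == "Alex" test from the case_when() examples can be added to the if_else() condition.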