How can I rename multiple strings in the same column with another name in R

Time:01-09

The following names are in a column. I want to retain just five distinct names and replace all the rest with a single label such as "Other". How do I go about that?

df <- data.frame(names = c('Marvel Comics','Dark Horse Comics','DC Comics','NBC - Heroes','Wildstorm',
                           'Image Comics',NA,'Icon Comics',
                           'SyFy','Hanna-Barbera','George Lucas','Team Epic TV','South Park',
                           'HarperCollins','ABC Studios','Universal Studios','Star Trek','IDW Publishing',
                           'Shueisha','Sony Pictures','J. K. Rowling','Titan Books','Rebellion','Microsoft',
                           'J. R. R. Tolkien'))

CodePudding user response:

If I am understanding you correctly, use %in% and ifelse. Here, I chose the first five names as an example. I also created it in a new column, but you could just overwrite the column as well or create a vector:

df <- data.frame(names = c('Marvel Comics','Dark Horse Comics','DC Comics','NBC - Heroes','Wildstorm',
                           'Image Comics',NA,'Icon Comics',
                           'SyFy','Hanna-Barbera','George Lucas','Team Epic TV','South Park',
                           'HarperCollins','ABC Studios','Universal Studios','Star Trek','IDW Publishing',
                           'Shueisha','Sony Pictures','J. K. Rowling','Titan Books','Rebellion','Microsoft',
                           'J. R. R. Tolkien'))

fivenamez <- c('Marvel Comics','Dark Horse Comics','DC Comics','NBC - Heroes','Wildstorm')

df$names_transformed <- ifelse(df$names %in% fivenamez, df$names, "Other")

# names names_transformed
# 1      Marvel Comics     Marvel Comics
# 2  Dark Horse Comics Dark Horse Comics
# 3          DC Comics         DC Comics
# 4       NBC - Heroes      NBC - Heroes
# 5          Wildstorm         Wildstorm
# 6       Image Comics             Other
# 7               <NA>             Other
# 8        Icon Comics             Other
# 9               SyFy             Other

If you want to keep NA values as NA, just use df$names_transformed <- ifelse(df$names %in% fivenamez | is.na(df$names), df$names, "Other")
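To see the two variants side by side, here is a minimal sketch on a shortened version of the data. The names keep, with_na_as_other and with_na_kept are made up for illustration, and the sketch assumes names is a character column (with a factor column you would probably want as.character() first):

```r
# Shortened example data; assumes a character column
df <- data.frame(names = c('Marvel Comics', 'Image Comics', NA, 'DC Comics'),
                 stringsAsFactors = FALSE)

# Hypothetical list of names to keep
keep <- c('Marvel Comics', 'DC Comics')

# NA %in% keep is FALSE, so missing values are recoded to "Other"
with_na_as_other <- ifelse(df$names %in% keep, df$names, "Other")

# Adding is.na() to the condition passes NA through unchanged
with_na_kept <- ifelse(df$names %in% keep | is.na(df$names), df$names, "Other")

print(with_na_as_other)
print(with_na_kept)
```

The only difference between the two results is the third element, which is "Other" in the first vector and NA in the second.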

CodePudding user response:

You can also use something like case_when() from dplyr. The following code keeps Marvel Comics, Dark Horse Comics, DC Comics, J. K. Rowling and George Lucas the same and changes all others to "Other". It is functionally the same as u/jpsmith's answer, but (in my humble opinion) offers a little more flexibility, because you can change multiple things more easily or give different comics the same name should you choose to do so.

library(dplyr)

df <- df %>% 
  mutate(new_names = case_when(names == 'Marvel Comics' ~ 'Marvel Comics',
                               names == 'Dark Horse Comics' ~ 'Dark Horse Comics',
                               names == 'DC Comics' ~ 'DC Comics', 
                               names == 'George Lucas' ~ 'George Lucas',
                               names == 'J. K. Rowling' ~ 'J. K. Rowling',
                               TRUE ~ "Other"))  # everything else, including NA
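As a rough sketch of the flexibility mentioned above, %in% can be used inside case_when() to give several names the same label. The grouping of 'Marvel Comics' and 'Icon Comics' under 'Marvel' is only an illustrative assumption, as is the explicit NA branch; the sketch also assumes dplyr is installed:

```r
library(dplyr)  # assumes dplyr is installed

# Shortened example data
df <- data.frame(names = c('Marvel Comics', 'Icon Comics', 'DC Comics', 'SyFy', NA),
                 stringsAsFactors = FALSE)

df <- df %>%
  mutate(new_names = case_when(
    # Illustrative grouping: two names collapsed into one label
    names %in% c('Marvel Comics', 'Icon Comics') ~ 'Marvel',
    names == 'DC Comics' ~ 'DC Comics',
    # Keep missing values missing instead of recoding them
    is.na(names) ~ NA_character_,
    # Everything that matched no earlier condition
    TRUE ~ 'Other'
  ))

print(df)
```

Conditions are checked in order and the first match wins, so the catch-all TRUE line has to come last.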