I have a column named "tourney_name", and one of its values is spelled incorrectly: it appears as Us Open instead of US Open. How do I change this? The unique names of the Grand Slam tournaments in the data set are shown below. I don't want to rename the column itself, only recode this one value. Also, this is my first question, so I'm sorry if I made a mistake.
Atp_together_68_21 %>%
filter(round == "F", tourney_level == "G") %>%
select(tourney_name) %>%
unique()
tourney_name
1 Roland Garros
2 Wimbledon
3 US Open
4 Australian Open
210 Us Open
CodePudding user response:
We can use purrr::modify_if()
library(dplyr)
library(purrr)
df %>% mutate(tourney_name = modify_if(tourney_name, ~ .x == "Us Open", ~ "US Open"))
CodePudding user response:
We can use stringr from tidyverse to replace any instances of the mis-capitalized word.
library(tidyverse)
df %>%
mutate(tourney_name = str_replace_all(tourney_name, fixed("Us Open"), "US Open"))
Or in base R:
df$tourney_name <- gsub("Us Open", "US Open", df$tourney_name, fixed = TRUE)
Output
tourney_name
1 Roland Garros
2 Wimbledon
3 US Open
4 Australian Open
210 US Open
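Note that str_replace_all() and gsub() rewrite the text wherever it occurs inside a value, so a longer string that merely contains "Us Open" would be altered as well. If only values that are exactly "Us Open" should change, a minimal base R sketch with logical indexing might look like this (the toy df is rebuilt here only so the snippet runs on its own):

```r
# Toy data mirroring the column from the question
df <- data.frame(
  tourney_name = c("Roland Garros", "Wimbledon", "US Open",
                   "Australian Open", "Us Open"),
  stringsAsFactors = FALSE
)

# which() returns the row positions that match exactly and skips any NA values
rows_to_fix <- which(df$tourney_name == "Us Open")

# Overwrite only those rows; every other value is left untouched
df$tourney_name[rows_to_fix] <- "US Open"

unique(df$tourney_name)
```

This leaves four distinct names, with both spellings collapsed into "US Open".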
Data
df <- structure(list(tourney_name = c("Roland Garros", "Wimbledon",
"US Open", "Australian Open", "Us Open")), class = "data.frame", row.names = c("1",
"2", "3", "4", "210"))
CodePudding user response:
Just to offer an alternative, in data.table you could do it by reference:
library(data.table)
setDT(df)
df[tourney_name == "Us Open", tourney_name := "US Open"]
Note that running this line of code will not print anything: by default the result of := is returned invisibly, and df is modified in place, as if you had typed df <- df %>% ... . If you then run df in the console, you will see that df was changed.
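To see the by-reference behaviour for yourself, here is a small self-contained sketch. It assumes the data.table package is installed, and the object name dt is just an illustrative stand-in for your own table:

```r
library(data.table)

# Toy table mirroring the column from the question
dt <- data.table(
  tourney_name = c("Roland Garros", "Wimbledon", "US Open",
                   "Australian Open", "Us Open")
)

# := updates the matching rows in place; nothing is printed here
dt[tourney_name == "Us Open", tourney_name := "US Open"]

# Appending [] (or simply typing dt) prints the updated table
dt[]

# The misspelled value is gone without any reassignment of dt
unique(dt$tourney_name)
```

Because the update happens in place, there is no need to assign the result back to dt.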