I have a tibble in R with 3 columns. I am trying to rename the values in tourney_name, but some tournament names are repeated, so I am not sure how to deal with this.
federer_final_short <- federer_final %>%
mutate(tourney_name = str_replace_all(tourney_name, fixed("Basel"), "Basel (2014)")) %>%
mutate(tourney_name = str_replace_all(tourney_name, fixed("Sydney"), "Sydney (2002)")) %>%
mutate(tourney_name = str_replace_all(tourney_name, fixed("Halle"), "Halle (2017)")) %>%
mutate(tourney_name = str_replace_all(tourney_name, fixed("Rotterdam"), "Rotterdam (2018)")) %>%
mutate(tourney_name = str_replace_all(tourney_name, fixed("Munich"), "Munich (2003)")) %>%
mutate(tourney_name = str_replace_all(tourney_name, fixed("Bangkok"), "Bangkok (2004)")) %>%
mutate(tourney_name = str_replace_all(tourney_name, fixed("Halle"), "Halle (2004)")) %>%
mutate(tourney_name = str_replace_all(tourney_name, fixed("Basel"), "Basel (2007)")) %>%
mutate(tourney_name = str_replace_all(tourney_name, fixed("Doha"), "Doha (2005)")) %>%
mutate(tourney_name = str_replace_all(tourney_name, fixed("Miami Masters"), "Miami (2019)"))
Ideally, what I want is as follows:
tourney_name minutes
Basel (2014) 52
Sydney (2002) 53
Halle (2017) 53
and so on.
I understand that this tibble is small enough that I could just create my own tibble from scratch, but I would like to learn how to do this.
CodePudding user response:
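Rather than hard-coding each replacement, extract the year from the second column (tourney_date) and append it to the tournament name. With a dplyr approach (using lubridate and stringr for the date and string helpers):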
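library(dplyr)
library(lubridate)
library(stringr)
df %>%
  mutate(
    # parse the date column, then append its year to the tournament name
    tourney_date = ymd(tourney_date),
    tourney_name = str_c(tourney_name, " (", year(tourney_date), ")")
  )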
# A tibble: 10 × 3
# Rowwise:
   tourney_name     tourney_date minutes
   <chr>            <date>         <dbl>
 1 Basel (2014)     2014-10-20        52
 2 Sydney (2002)    2002-01-07        53
 3 Halle (2017)     2017-06-19        53
 4 Rotterdam (2018) 2018-02-12        55
 5 Munich (2003)    2003-04-28        56
 6 Bangkok (2004)   2004-09-27        57
 7 Halle (2004)     2004-06-07        57
 8 Basel (2007)     2007-10-22        61
 9 Doha (2005)      2005-01-03        63
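For anyone who wants to try this without the original data, here is a minimal self-contained sketch. The names federer_sample and federer_labelled are made up for illustration, the three rows are taken from the output above, and the raw date format (YYYYMMDD stored as character) is an assumption, since the original input is not shown.

```r
library(dplyr)
library(lubridate)
library(stringr)

# Hypothetical sample data: three rows copied from the output above.
# Assumption: the raw dates are YYYYMMDD character strings.
federer_sample <- tibble(
  tourney_name = c("Basel", "Halle", "Halle"),
  tourney_date = c("20141020", "20170619", "20040607"),
  minutes      = c(52, 53, 57)
)

federer_labelled <- federer_sample %>%
  mutate(
    # parse the date, then build the label from each row's own year
    tourney_date = ymd(tourney_date),
    tourney_name = str_c(tourney_name, " (", year(tourney_date), ")")
  )

federer_labelled$tourney_name
```

Because the label is built row by row from that row's date, the two Halle rows get different years. The chained str_replace_all() calls in the question cannot do that, since each call rewrites every matching row at once.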
CodePudding user response:
You can also use data.table:
# load data.table
library(data.table)
# convert your tibble to a data.table
setDT(federer_final)
# create the new names and put them in a new variable
federer_final[, newName := paste0(tourney_name, " (", substr(tourney_date, 1, 4), ")")]
If you want to keep only one tourney_name column with the updated values, then:
federer_final[, tourney_name := paste0(tourney_name, " (", substr(tourney_date, 1, 4), ")")]
will do the trick.
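A small runnable sketch of the same idea, in case it helps. The table federer_dt is a hypothetical stand-in for the real data, and storing the dates as YYYYMMDD integers is an assumption. substr() works here because the first four characters of the date are the year, whether the date looks like 20141020 or 2014-10-20.

```r
library(data.table)

# Hypothetical sample data; the integer YYYYMMDD date format is an assumption.
federer_dt <- data.table(
  tourney_name = c("Basel", "Basel", "Doha"),
  tourney_date = c(20141020L, 20071022L, 20050103L),
  minutes      = c(52, 61, 63)
)

# substr() converts the integer date to character, so characters 1-4 are the year
federer_dt[, tourney_name := paste0(tourney_name, " (", substr(tourney_date, 1, 4), ")")]

federer_dt$tourney_name
```

Note that := modifies the table by reference, so no reassignment is needed.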