Rename a character in R based upon certain conditions

Time:12-31

I have a tibble in R with 3 columns. I am trying to rename the values in tourney_name, but some tournament names are repeated, and I am not sure how to handle that.

This is the federer_final tibble.

Here is what I tried:

federer_final_short <- federer_final %>%
  mutate(tourney_name = str_replace_all(tourney_name, fixed("Basel"), "Basel (2014)")) %>% 
  mutate(tourney_name = str_replace_all(tourney_name, fixed("Sydney"), "Sydney (2002)")) %>%
  mutate(tourney_name = str_replace_all(tourney_name, fixed("Halle"), "Halle (2017)")) %>%
  mutate(tourney_name = str_replace_all(tourney_name, fixed("Rotterdam"), "Rotterdam (2018)")) %>%
  mutate(tourney_name = str_replace_all(tourney_name, fixed("Munich"), "Munich (2003)")) %>%
  mutate(tourney_name = str_replace_all(tourney_name, fixed("Bangkok"), "Bangkok (2004)")) %>%
  mutate(tourney_name = str_replace_all(tourney_name, fixed("Halle"), "Halle (2004)")) %>%
  mutate(tourney_name = str_replace_all(tourney_name, fixed("Basel"), "Basel (2007)")) %>%
  mutate(tourney_name = str_replace_all(tourney_name, fixed("Doha"), "Doha (2005)")) %>%
  mutate(tourney_name = str_replace_all(tourney_name, fixed("Miami Masters"), "Miami (2019)"))
  

Ideally, what I want is as follows:

tourney_name    minutes
Basel (2014)         52
Sydney (2002)        53
Halle (2017)         53

and so on.

I understand that this tibble is small enough that I could just create my own tibble from scratch, but I would like to learn how to do this.

CodePudding user response:

Rather than hard-coding each replacement, extract the year from the second column (tourney_date) and append it to the tournament name. With a dplyr approach (plus lubridate and stringr):

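library(dplyr)      # mutate() and %>%
library(lubridate)  # ymd() and year()
library(stringr)    # str_c()

# df stands for the tibble from the question (federer_final)
df %>%
  mutate(
    tourney_date = ymd(tourney_date),
    tourney_name = str_c(tourney_name, " (", year(tourney_date), ")")
  )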

# A tibble: 10 × 3
# Rowwise: 
   tourney_name         tourney_date minutes
   <chr>                <date>         <dbl>
 1 Basel (2014)         2014-10-20        52
 2 Sydney (2002)        2002-01-07        53
 3 Halle (2017)         2017-06-19        53
 4 Rotterdam (2018)     2018-02-12        55
 5 Munich (2003)        2003-04-28        56
 6 Bangkok (2004)       2004-09-27        57
 7 Halle (2004)         2004-06-07        57
 8 Basel (2007)         2007-10-22        61
 9 Doha (2005)          2005-01-03        63
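
As for why the chained replacements in the question cannot work: str_replace_all() rewrites every match in the column, so it cannot tell the two Halle (or Basel) rows apart, and a later replacement of the same name then matches again inside the text that was already renamed. A minimal sketch of that effect, using just the two Halle rows (the vector name x is made up for illustration):

```r
library(stringr)

# Two rows that share the same tournament name (x is a hypothetical stand-in
# for the tourney_name column)
x <- c("Halle", "Halle")

# First replacement: both rows are rewritten, not only the 2017 one
x <- str_replace_all(x, fixed("Halle"), "Halle (2017)")
x
# [1] "Halle (2017)" "Halle (2017)"

# Second replacement: "Halle" still matches inside "Halle (2017)"
x <- str_replace_all(x, fixed("Halle"), "Halle (2004)")
x
# [1] "Halle (2004) (2017)" "Halle (2004) (2017)"
```

Building the label from each row's own date avoids this, because the year comes from the row itself rather than from a lookup on the name.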

CodePudding user response:

You can also use data.table:

# load data.table
library(data.table)

# convert your tibble to a data.table
setDT(federer_final)

# create the new names and put them in a new variable
federer_final[, newName := paste0(tourney_name, " (", substr(tourney_date, 1, 4), ")")]

If you want to keep only one tourney_name column with the updated values, then:

federer_final[, tourney_name := paste0(tourney_name, " (", substr(tourney_date, 1, 4), ")")]

will do the trick.
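
The same idea can also be sketched in base R, without any packages. The data frame below is rebuilt from a few of the rows shown in this thread; the name federer_demo is made up, and storing tourney_date as a "YYYYMMDD" string is an assumption about the original data.

```r
# Sample rows taken from the thread; federer_demo is a hypothetical name and
# the "YYYYMMDD" date format is an assumption about the original tibble
federer_demo <- data.frame(
  tourney_name = c("Basel", "Sydney", "Halle", "Halle", "Basel"),
  tourney_date = c("20141020", "20020107", "20170619", "20040607", "20071022"),
  minutes      = c(52, 53, 53, 57, 61),
  stringsAsFactors = FALSE
)

# The first four characters of the date are the year; paste0() works row by
# row, so repeated names such as Halle and Basel each get their own year
federer_demo$tourney_name <- paste0(
  federer_demo$tourney_name,
  " (", substr(federer_demo$tourney_date, 1, 4), ")"
)

federer_demo
```

If tourney_date is already a Date column, format(tourney_date, "%Y") could be used in place of substr().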
