I have a dataset of cricket matches with a range of columns. In one of the columns, "MatchID", the values look like T20 # xxxx. To perform numerical analysis, I would like to remove the "T20 #" part. This is an example of how the data looks now:
19/10/2017 T20 # 1
26/10/2017 T20 # 2
28/10/2017 T20 # 3
The desired output would be:
19/10/2017 1
26/10/2017 2
28/10/2017 3
Any tips?
CodePudding user response:
You can use gsub to remove the part of the string matched by a regular expression:
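# The pattern must match the data exactly: "T20 #", with no space inside "T20".
# as.numeric() ignores the leftover leading space and returns a number.
df$MatchID = as.numeric(gsub("T20 #", "", df$MatchID))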
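For anyone who wants to try this on the sample rows from the question, here is a minimal base R sketch. It assumes the data frame is called df and has character columns named Date and MatchID; those names are taken from the question and the printed output, not from the real dataset. It uses sub() rather than gsub() because the prefix occurs only once per value.

```r
# Example data rebuilt from the question (object and column names are assumed).
df <- data.frame(
  Date = c("19/10/2017", "26/10/2017", "28/10/2017"),
  MatchID = c("T20 # 1", "T20 # 2", "T20 # 3"),
  stringsAsFactors = FALSE
)

# Remove the leading "T20 #" plus any whitespace after it,
# then convert the remaining digits to a number.
df$MatchID <- as.numeric(sub("^T20 #\\s*", "", df$MatchID))

str(df$MatchID)
```

Anchoring the pattern with ^ means only a prefix at the start of the value is removed, which should be safer if other text ever appears later in the column.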
CodePudding user response:
Another possible solution:
library(tidyverse)

df %>%
  # Drop "T20" first; otherwise parse_number() would pick up the 20 in "T20"
  mutate(MatchID = str_remove(MatchID, "T20") %>% parse_number())
#> Date MatchID
#> 1 19/10/2017 1
#> 2 26/10/2017 2
#> 3 28/10/2017 3
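The reason for removing "T20" before parsing is that parse_number() keeps only the first number it finds in each string. A small sketch on a plain character vector shows the difference; the vector x is made up from the sample rows in the question.

```r
library(readr)    # parse_number()
library(stringr)  # str_remove()

# Sample values copied from the question.
x <- c("T20 # 1", "T20 # 2", "T20 # 3")

# Parsing directly: the first number in each string is the 20 from "T20".
direct <- parse_number(x)

# Removing "T20" first leaves " # 1", " # 2", " # 3",
# so the first number is now the match ID.
cleaned <- parse_number(str_remove(x, "T20"))

print(direct)
print(cleaned)
```

If the column could contain other formats, it is probably worth checking a few values by hand before relying on this.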