How to remove specific characters from string in a column in R?

I've got the following data.

df <- data.frame(Name = c("TOMTom Catch",
                          "BIBill Ronald",
                          "JEFJeffrey Wilson",
                          "GEOGeorge Sic",
                          "DADavid Irris"))

How do I clean the data in the Name column?

I've tried nchar and substring, but some names need the first two characters removed whereas others need the first three.

CodePudding user response:

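We can use a regex lookahead. The pattern ^[A-Z]+ matches the run of capitals at the start of the string, and (?=[A-Z]) requires one more capital to follow without consuming it, so the prefix is removed and the capital that begins the actual name stays. Lookarounds require perl = TRUE.

gsub("^[A-Z]+(?=[A-Z])", "", df$Name, perl = TRUE)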
#> [1] "Tom Catch"      "Bill Ronald"    "Jeffrey Wilson" "George Sic"    
#> [5] "David Irris"
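
To keep the cleaned values, the result can be assigned back to the column. The sketch below also shows a capture-group variant that may work without perl = TRUE. It assumes every real name starts with a capital followed by a lowercase letter, which holds for the sample data but is not guaranteed in general. The name cleaned_alt is illustrative only.

```r
df <- data.frame(Name = c("TOMTom Catch",
                          "BIBill Ronald",
                          "JEFJeffrey Wilson",
                          "GEOGeorge Sic",
                          "DADavid Irris"))

# Capture-group variant (assumption: the name starts with capital + lowercase).
# The leading capitals are dropped and the captured start of the name is put back.
cleaned_alt <- sub("^[A-Z]+([A-Z][a-z])", "\\1", df$Name)

# Lookahead version from the answer, written back to the column.
# The pattern is anchored with ^, so sub() and gsub() behave the same here.
df$Name <- sub("^[A-Z]+(?=[A-Z])", "", df$Name, perl = TRUE)

print(df$Name)
print(identical(cleaned_alt, df$Name))
```

A name that is itself all capitals (for example initials such as "JJ") would not fit the assumption, so it is worth spot-checking the output on the full data.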