Removing characters in brackets of indefinite length from the end of column names in R

Time:11-28

I have some column names in a df as follows:

column1 (-)
column2 (unwantedstring)
column3
column4 (4)

Note that some columns do not have unwanted brackets on the end and should be kept the same.

I want to remove the trailing brackets and anything inside them (along with the space before the opening bracket) to get:

column1
column2
column3
column4

Am I on the right track with the below?

df <- df %>%
  rename_with(~str_remove(.x, " \(*\)$"))

Any help would be appreciated

CodePudding user response:

You can use sub() here for a base R option:

x <- c("column1 (-)", "column2 (unwantedstring)", "column3", "column4 (4)")
output <- sub("\\s+\\(.*\\)$", "", x)
output

[1] "column1" "column2" "column3" "column4"
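To apply this to the column names themselves, here is a minimal sketch, assuming a small stand-in data frame (the values and the placeholder names a to d are made up for illustration). The attempt in the question fails for two reasons: backslashes inside an R string have to be doubled, and `\\(*` means "zero or more literal opening brackets", not "an opening bracket followed by anything", which is what `\\(.*` expresses.

```r
# Hypothetical data frame; only the column names matter here.
df <- data.frame(a = 1, b = 2, c = 3, d = 4)
names(df) <- c("column1 (-)", "column2 (unwantedstring)", "column3", "column4 (4)")

# Remove a trailing " (...)" from each name.
# Names without a trailing bracket group do not match and are left unchanged.
names(df) <- sub("\\s+\\(.*\\)$", "", names(df))

names(df)
# [1] "column1" "column2" "column3" "column4"
```

The same pattern should also work in the dplyr/stringr pipeline from the question, i.e. rename_with(~ str_remove(.x, "\\s+\\(.*\\)$")). Note that `.*` is greedy, so a name containing several bracket groups would be cut at the first one; replacing `.*` with `[^()]*` restricts the match to the last group only.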

CodePudding user response:

You can also use the following solution, although it is a bit more complicated:

trimws(regmatches(x, regexpr("\\([^()]*\\)(*SKIP)(*FAIL)|(?<!\\()[^()]*(?!\\))", x, perl = TRUE)))

[1] "column1" "column2" "column3" "column4"

Thanks to Tim Biegeleisen for the data.
