I would appreciate your help.
I want to rename variables in R. The variable names are "CACRAGE", "CACRAA1", etc.
I want to remove the "AC" in the middle, changing the names to "CRAGE", "CRAA1".
I've tried the following expression and it does not work. Please help me!
gsub(pattern = '^CAC([A-Z]{3,}', replacement = '^C([A-Z]{3,}', colnames(milwaukee_test), fixed = TRUE)
Thank you.
CodePudding user response:
Why not just replace "CAC" with "C" if it occurs at the beginning of the name?
milwaukee_test <- data.frame(CACRAGE = 1:3, CACRAA1 = 2:4)
names(milwaukee_test) <- sub(pattern = '^CAC', "C", colnames(milwaukee_test))
milwaukee_test
#>   CRAGE CRAA1
#> 1     1     2
#> 2     2     3
#> 3     3     4
Created on 2022-08-20 with reprex v2.0.2
CodePudding user response:
You can use
gsub(pattern = '^CAC(?=[A-Z]{3})', replacement = 'C', colnames(milwaukee_test), perl = TRUE)
Note
- You need perl = TRUE, not fixed = TRUE, in order to use regex and lookarounds in it (fixed = TRUE makes the pattern match as a literal string).
- ^CAC(?=[A-Z]{3}) matches CAC at the start of the string only when it is immediately followed by three uppercase ASCII letters.
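To make the difference concrete, here is a minimal sketch comparing the lookahead with a capture-group version of the pattern from the question. The vector x and the extra name "CAC12" are hypothetical, added only to show a name that neither pattern touches. The data frame is the one from the first answer. Note that sub() and gsub() only return the modified names, so the result has to be assigned back.

```r
# Hypothetical example names; "CAC12" is included only to show a non-match.
x <- c("CACRAGE", "CACRAA1", "CAC12")

# Lookahead: only "CAC" is consumed, so the replacement is just "C".
via_lookahead <- sub("^CAC(?=[A-Z]{3})", "C", x, perl = TRUE)

# Capture group: the letters are consumed too, so put them back with \\1.
via_group <- sub("^CAC([A-Z]{3,})", "C\\1", x)

via_lookahead
via_group

# sub() returns a new character vector; assign it back to rename the columns.
milwaukee_test <- data.frame(CACRAGE = 1:3, CACRAA1 = 2:4)
names(milwaukee_test) <- sub("^CAC(?=[A-Z]{3})", "C", names(milwaukee_test), perl = TRUE)
names(milwaukee_test)
```

Since each name can match ^CAC at most once, sub() and gsub() should behave the same here; sub() is used only because a single replacement per name is all that is needed.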