I want to rearrange the column names.
For example, the column names are 2009sum, 2010sum, 2011sum and so on. I want to change the names to sum2009, sum2010, sum2011.
I have tried the following code in R, but it's not working.
colnames(dataframe) <- gsub("(\\W+)(\\w+)", "\\2\\1", colnames(dataframe))
CodePudding user response:
Or you can use this one:
vec <- c("2009sum", "2010sum", "2011sum")
gsub("(^[0-9]+)([[:alpha:]]+$)", "\\2\\1", vec, perl = TRUE)
[1] "sum2009" "sum2010" "sum2011"
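To apply the same idea to a data frame directly, here is a minimal sketch. The data frame `df` and its `id` column are made up for illustration. Because the pattern is anchored at both ends, names that are not digits followed by letters are left unchanged.

```r
# Hypothetical data frame; `df` and the `id` column are assumptions for illustration
df <- data.frame(id = 1:2, `2009sum` = c(10, 20), `2010sum` = c(30, 40),
                 check.names = FALSE)

# Swap leading digits and trailing letters; non-matching names pass through unchanged
colnames(df) <- sub("^([0-9]+)([[:alpha:]]+)$", "\\2\\1", colnames(df))

colnames(df)
#> [1] "id"      "sum2009" "sum2010"
```

Note that `check.names = FALSE` is needed when building the example, otherwise `data.frame()` would rewrite names that start with a digit.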
CodePudding user response:
Here are a couple more ways:
x <- data.frame(matrix(ncol=3))
nam <- c("2009sum", "2010sum", "2011sum")
a <- gsub(pattern = "\\d{4}", replacement = "", x = nam)
a
#> [1] "sum" "sum" "sum"
b <- gsub(pattern = ".*(\\d{4}).*", replacement = "\\1", x = nam)
b
#> [1] "2009" "2010" "2011"
colnames(x) <- paste0(a, b)
x
#> sum2009 sum2010 sum2011
#> 1 NA NA NA
colnames(x) <- sprintf("%s%s", a, b)
x
#> sum2009 sum2010 sum2011
#> 1 NA NA NA
Created on 2021-09-22 by the reprex package (v2.0.1)
CodePudding user response:
This should work for you:
colnames(dataframe) <- paste0("sum", as.numeric(gsub("([0-9]+).*$", "\\1", names(dataframe))))
This renames your columns by pasting "sum" and the numeric component of your column names together, in that order.
Which gives us:
[1] "sum2009" "sum2010" "sum2011"
Dput:
structure(list(`2009sum` = 1, `2010sum` = 1, `2011sum` = 1), class = "data.frame", row.names = c(NA,
-1L))
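To try this out, the dput output can be assigned back to a data frame. The sketch below splits the one-liner into two steps and leaves out as.numeric(), since paste0() accepts the character digits directly; the intermediate name `years` is introduced only for illustration.

```r
# Rebuild the example data from the dput output above
dataframe <- structure(list(`2009sum` = 1, `2010sum` = 1, `2011sum` = 1),
                       class = "data.frame", row.names = c(NA, -1L))

# Keep only the leading digits of each name, then prepend the fixed prefix "sum"
years <- gsub("([0-9]+).*$", "\\1", names(dataframe))
colnames(dataframe) <- paste0("sum", years)

names(dataframe)
#> [1] "sum2009" "sum2010" "sum2011"
```

Because the prefix is hard-coded, this only suits data where every column name ends in "sum"; a column with a different suffix would also be renamed to "sum" plus its digits.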
CodePudding user response:
You're actually not that far from the right (and most elegant) solution, which uses two backreferences:
colnames(dataframe) <- sub("(\\d+)(sum)", "\\2\\1", colnames(dataframe))
Why does colnames(dataframe) <- gsub("(\\W+)(\\w+)", "\\2\\1", colnames(dataframe)) not work?
A couple of things here:
gsub is possible but not necessary; sub suffices as you have just one match per string
\\W is a negative character class for anything that is neither a letter, a number, nor an underscore - wrong because it's the four digits at the beginning of the string that you want to match
\\w (with lower-case 'w') is a positive character class that matches exactly what \\W does not match, i.e., it matches letters, numbers and the underscore - wrong because what you want to match are letters only - not numbers
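The points above can be checked on a small vector. This is only a sketch; `nam` is sample input mirroring the column names from the question.

```r
nam <- c("2009sum", "2010sum", "2011sum")

# \\W matches only non-word characters; the names contain none, so nothing changes
attempt <- gsub("(\\W+)(\\w+)", "\\2\\1", nam)
attempt
#> [1] "2009sum" "2010sum" "2011sum"

# \\w matches digits as well as letters, so it cannot separate the two parts
grepl("^\\w+$", nam)
#> [1] TRUE TRUE TRUE

# \\d+ captures the digits and the literal "sum" captures the letters
fixed <- sub("(\\d+)(sum)", "\\2\\1", nam)
fixed
#> [1] "sum2009" "sum2010" "sum2011"
```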