I have a big df with colnames like this:
dput(head(colnames(count)[c(2,3,4,7,8)]))
c("A001", "A002", "A004", "A008", "A009")
I want to subtract 1 from the number part and keep the letter. The letter is not constant: in other columns it will be B, C, etc. The result should look like this:
c("A000", "A001", "A003", "A007", "A008")
So far I have tried this, which handles the subtraction but doesn't keep the letter:
as.numeric(str_extract(colnames(count), "[0-9]+")) - 1
c("0", "1", "3", "7", "8")
Thanks for the help
CodePudding user response:
One base R option:
x = c("A001", "A002", "B004", "C008", "D009")
sapply(
strsplit(x, "(?<=[A-Z])", perl = TRUE),
\(x) sprintf("%s%03d", x[1], as.numeric(x[2]) - 1)
)
# [1] "A000" "A001" "B003" "C007" "D008"
CodePudding user response:
You can use gsub to extract the numbers and letters and operate on them, then sprintf to format your string:
vec <- c("A001", "A002", "B004", "A008", "D009")
sprintf("%s%03d", gsub("\\d", "", vec), as.integer(gsub("\\D", "", vec)) - 1L)
# [1] "A000" "A001" "B003" "A007" "D008"
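To apply this to the column names themselves, something like the following might work. It is only a sketch: the data frame here is a made-up stand-in for count, decrement_suffix is a hypothetical helper name, and it assumes every target column is one capital letter followed by exactly three digits.

```r
# Toy stand-in for the asker's data frame; the values are made up for illustration.
count <- data.frame(id = 1:2, A001 = 0, A002 = 0, B004 = 0)

# Hypothetical helper: subtract 1 from the numeric part and keep the letter prefix.
# Assumes a fixed width of three digits, as in the question.
decrement_suffix <- function(x) {
  prefix <- gsub("\\d", "", x)                   # drop digits, keep the letter
  number <- as.integer(gsub("\\D", "", x)) - 1L  # drop non-digits, then subtract 1
  sprintf("%s%03d", prefix, number)              # re-pad the number to three digits
}

# Only rename columns that match "one capital letter + three digits",
# so columns such as "id" are left untouched.
target <- grepl("^[A-Z][0-9]{3}$", colnames(count))
colnames(count)[target] <- decrement_suffix(colnames(count)[target])

colnames(count)
# [1] "id"   "A000" "A001" "B003"
```

Note that a name whose number is already 000 would come out with a minus sign (for example "A-01"), so you may want to check for that case first.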