I have been reading about renaming columns using gsub and sub. I want to take column names such as 20220427_209944540823_SC835404_12.RCC and truncate them to SC835404_12. What am I missing in my code? Currently, I am getting 209944540823_SC835404_12.
colnames(expr_counts) <- gsub(c(".RCC"), "", sub("^[^_]*_", "", colnames(expr_counts)))
CodePudding user response:
You can use
colnames(expr_counts) <- sub(".*_(.*_\\d+)\\.RCC$", "\\1", colnames(expr_counts))
Details:
.* - any zero or more chars, as many as possible
_ - an underscore
(.*_\d+) - Group 1: any zero or more chars, as many as possible, then _ and one or more digits
\.RCC - a literal .RCC string
$ - end of string.
The replacement is the \1 backreference, which replaces the whole match with the Group 1 value.
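To see the difference between the two approaches, here is a minimal sketch on a plain character vector. The vector rcc_names and its second entry are made up for illustration (only the first name comes from the question), so adjust them to your real colnames(expr_counts).

```r
# Hypothetical sample names; only the first one appears in the question
rcc_names <- c("20220427_209944540823_SC835404_12.RCC",
               "20220427_209944540823_SC835405_03.RCC")

# Attempt from the question: "^[^_]*_" matches only the first
# underscore-delimited prefix, so just the date part is removed
first_try <- gsub(".RCC", "", sub("^[^_]*_", "", rcc_names))
print(first_try)
# "209944540823_SC835404_12" "209944540823_SC835405_03"

# Suggested pattern: the leading greedy .* consumes everything up to the
# underscore before the last two fields, and Group 1 keeps those fields
trimmed <- sub(".*_(.*_\\d+)\\.RCC$", "\\1", rcc_names)
print(trimmed)
# "SC835404_12" "SC835405_03"
```

Names that do not end in .RCC, or that have no digits after the last underscore, do not match the pattern, and sub() returns them unchanged.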