Rename column names with gsub and sub

Time:06-16

I have been reading about renaming column names using gsub and sub. I want to take column names such as 20220427_209944540823_SC835404_12.RCC and truncate them to SC835404_12. What am I missing in my code? Currently, I am getting 209944540823_SC835404_12.

colnames(expr_counts) <- gsub(c(".RCC"), "", sub("^[^_]*_", "", colnames(expr_counts)))

CodePudding user response:

You can use

colnames(expr_counts) <- sub(".*_(.*_\\d+)\\.RCC$", "\\1", colnames(expr_counts))

Details:

  • .* - any zero or more chars as many as possible
  • _ - an underscore
  • (.*_\d+) - Group 1: any zero or more chars as many as possible, _, one or more digits
  • \.RCC - a literal .RCC string
  • $ - end of string.

The replacement is the \1 backreference that replaces the whole match with the Group 1 value.
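A quick way to check this is to run both versions on a plain character vector. This is only a sketch: nms stands in for colnames(expr_counts), and the second name is made up for illustration. It also shows why the original attempt falls short: sub("^[^_]*_", "", ...) removes only the first underscore-delimited field, so the 209944540823_ prefix survives.

```r
# nms is a stand-in for colnames(expr_counts); the second name is hypothetical.
nms <- c("20220427_209944540823_SC835404_12.RCC",
         "20220427_209944540823_SC835405_03.RCC")

# Original attempt: the inner sub() strips only the first "<chars>_" field.
attempt <- gsub(".RCC", "", sub("^[^_]*_", "", nms))
print(attempt[1])
# [1] "209944540823_SC835404_12"

# Suggested pattern: keep only "<chars>_<digits>" right before ".RCC".
cleaned <- sub(".*_(.*_\\d+)\\.RCC$", "\\1", nms)
print(cleaned)
# [1] "SC835404_12" "SC835405_03"
```

Names that do not end in _<digits>.RCC do not match the pattern, and sub leaves them unchanged.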
