Replacing numbers from alphanumeric strings

Time:03-08

I have a large dataset with two kinds of labels. The first has the form 'numeric_alphanumeric_alpha' and the second has the form 'alphanumeric_alpha'. I need to strip the numeric prefix from the first kind so that it matches the second. I know how to remove numbers from alphanumeric data (as below), but this would also remove numbers that I need.

gsub('[0-9]+', '', x)

Below is an example of the two kinds of labels I encounter

c('12345_F24R2_ABC', 'r87R2_DEFG')

Below is the desired output

c('F24R2_ABC', 'r87R2_DEFG')

CodePudding user response:

A simple regex can do it. ^ matches the start of the string, \\d matches any digit, and + means one or more occurrences of the preceding element.

gsub("^\\d+_", "", c('12345_F24R2_ABC', 'r87R2_DEFG'), perl = TRUE)

[1] "F24R2_ABC"  "r87R2_DEFG"
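
To make the role of the ^ anchor concrete, here is a minimal sketch comparing the unanchored pattern from the question with an anchored one. It assumes x holds the two example labels; the name stripped is only illustrative, and sub() is used instead of gsub() on the assumption that an anchored pattern can match at most once per string.

```r
# Example labels taken from the question
x <- c('12345_F24R2_ABC', 'r87R2_DEFG')

# Unanchored: every run of digits is removed, including the ones to keep
unanchored <- gsub('[0-9]+', '', x)
print(unanchored)

# Anchored: only a leading run of digits followed by an underscore is removed
stripped <- sub('^[0-9]+_', '', x)
print(stripped)
```

Labels that do not start with digits followed by an underscore are returned unchanged, which is why 'r87R2_DEFG' survives intact.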

CodePudding user response:

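Your code, slightly modified:

^[0-9]* .... matches zero or more digits at the start of the string

\\_ .... matches a literal underscore (the underscore is not a special character, so the escape is not required)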

gsub('^[0-9]*\\_', '', x)
[1] "F24R2_ABC"  "r87R2_DEFG"
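
One difference between the two answers is the quantifier: * allows zero digits, while + requires at least one. The sketch below shows where that matters, using a hypothetical label '_ABC' that does not appear in the question; whether such labels exist in the real data is an assumption.

```r
# '_ABC' is a made-up edge case: a label that starts with an underscore
labels <- c('12345_F24R2_ABC', 'r87R2_DEFG', '_ABC')

# Zero or more digits: the leading underscore of '_ABC' is also stripped
with_star <- sub('^[0-9]*_', '', labels)
print(with_star)

# One or more digits: '_ABC' is left untouched
with_plus <- sub('^[0-9]+_', '', labels)
print(with_plus)
```

For the two labels in the question both patterns give the same result, so the choice only matters if labels with a bare leading underscore can occur.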
Tags: r