Regex ignore special character with greedy

Time:12-21

I used the following regex to catch 10 numbers and letters:

/[a-zA-Z0-9]{10}/g

It works fine if the 10 characters are only numbers and letters.

e.g. input: 12345xcdw034342
it catches 12345xcdw0

But when the input contains special characters or spaces, it doesn't catch the first 10, e.g.
123}456712234324Zz3 or 123}45 71223AB3
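
For reference, here is a small sketch of what the original pattern returns for these inputs. The helper name contiguousMatches is made up for illustration. Because {10} requires ten adjacent alphanumerics, any other character breaks the run:

```javascript
// Hypothetical helper: returns every run of 10 adjacent alphanumerics, or null.
function contiguousMatches(input) {
  return input.match(/[a-zA-Z0-9]{10}/g);
}

console.log(contiguousMatches('12345xcdw034342'));     // [ '12345xcdw0' ]
console.log(contiguousMatches('123}456712234324Zz3')); // [ '4567122343' ]
console.log(contiguousMatches('123}45 71223AB3'));     // null
```

So the first problem input does produce a match, but only from the run that starts after the }, and the second has no run of ten at all.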

It should catch 10 numbers and letters regardless of any other characters in between.

Any help would be gratefully appreciated.

CodePudding user response:

You can use

/[a-zA-Z0-9](?:[^a-zA-Z0-9]*[a-zA-Z0-9]){9}/g

See the regex demo. Details:

  • [a-zA-Z0-9] - an alphanumeric
  • (?:[^a-zA-Z0-9]*[a-zA-Z0-9]){9} - nine occurrences of zero or more non-alphanumeric chars followed by an alphanumeric char.
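
Note that the match itself still contains the skipped characters. A minimal sketch of how this could be used in JavaScript, assuming you want only the ten alphanumerics back (the helper name firstTenAlnum is made up for illustration):

```javascript
// Hypothetical helper: finds the first span holding 10 alphanumerics,
// then strips whatever non-alphanumerics were skipped inside it.
function firstTenAlnum(input) {
  const m = input.match(/[a-zA-Z0-9](?:[^a-zA-Z0-9]*[a-zA-Z0-9]){9}/);
  if (m === null) return null; // fewer than 10 alphanumerics in the input
  return m[0].replace(/[^a-zA-Z0-9]/g, '');
}

console.log(firstTenAlnum('12345xcdw034342'));     // '12345xcdw0'
console.log(firstTenAlnum('123}456712234324Zz3')); // '1234567122' (raw match: '123}4567122')
console.log(firstTenAlnum('123}45 71223AB3'));     // '1234571223' (raw match: '123}45 71223')
```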

CodePudding user response:

You can do it, but not without some extra processing.

As you have not specified what language you're using, I'll use JavaScript since it is quite universal, but the same logic should apply in any language.

Here are the options I can think of, given testString = "12@34{56A789BDE":

  1. Match everything up to the tenth alphanumeric character, and then remove the special characters from the resulting string
testString.match(/(\w.*?){10}/)[0].replaceAll(/\W/g, '')
// results '123456A789'
// explanation: each \w is followed by a lazy .*? so it does not matter whether the alphanumeric has a non-alphanumeric right next to it; we then clean the result by removing \W, which means non-alphanumeric
  2. Match every alphanumeric character individually and then join the first ten to make a result string
testString.match(/\w/g).slice(0,10).join('')
// results '123456A789'
// explanation: with the g flag, match returns an array with one element per \w match (note the lowercase); we take the first 10 with slice and join them
  3. Remove the special characters from your string and then take the first ten
testString.replaceAll(/\W/g,'').match(/\w{10}/)[0]
// results '123456A789'
// explanation: we replace \W, which means non-alphanumeric characters, with '' to delete them, then we match the first ten
// note: \w also matches the underscore, so use [a-zA-Z0-9] and [^a-zA-Z0-9] instead if underscores should be skipped too
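
The one-liners above throw if the input has fewer than ten matching characters (match returns null), and \w counts the underscore as alphanumeric. A minimal sketch of the third option with both points handled, assuming null is an acceptable "not found" value (the name firstAlnum and the count parameter are made up for illustration):

```javascript
// Hypothetical helper: strips everything except ASCII letters and digits,
// then returns the first `count` characters, or null if there are too few.
function firstAlnum(input, count = 10) {
  const cleaned = input.replace(/[^a-zA-Z0-9]/g, '');
  return cleaned.length >= count ? cleaned.slice(0, count) : null;
}

console.log(firstAlnum('12@34{56A789BDE'));     // '123456A789'
console.log(firstAlnum('1_2_3_4_5_6_7_8_9_0')); // '1234567890' (underscores skipped)
console.log(firstAlnum('ab cd'));               // null
```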