Regex to pick out key information between words/characters

Time:11-12

I have a string as follows:

players: 2-8

Using regex, how would I match the 2 and the 8 without matching everything else (i.e. 'players: ' and the '-')?

I have tried:

players:\s*([^.]+|\S+)

However, this matches the entire phrase, and it relies on a '.' to mark the end of the string, which might not always be present.

It'd be much better if I could use the '-' to match the numbers, but I also need it to be looking ahead from 'players' as I will be using this to know that the data is correct for a given variable.

I'm using Python, if that's important.

Thanks!!

CodePudding user response:

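The pattern players:\s*([^.]+|\S+) uses a single capture group that matches either 1+ characters other than a dot, or 1+ non-whitespace characters. Combined, those two alternatives can match any character, so the group captures everything after players: instead of the individual numbers.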
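A quick check with the built-in re module illustrates this. It is only a sketch: it assumes the quantifiers in the attempted pattern are `+`, and the longer sample string is made up for illustration.

```python
import re

# Assumed form of the attempted pattern (quantifiers taken to be "+").
attempt = re.compile(r"players:\s*([^.]+|\S+)")

m = attempt.search("players: 2-8")
whole = m.group(0)   # the full match, including the "players: " prefix
group = m.group(1)   # the capture group: the range as one string

# Hypothetical longer input: [^.]+ keeps going until it reaches a dot.
longer = attempt.search("players: 2-8 age: 10+. Other text").group(1)

print(repr(whole), repr(group), repr(longer))
```

The capture group holds the range as one string and, on longer input, runs on until the next dot, so it never isolates the two numbers.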


To get only the numbers as matches, you could use the regex module from PyPI, which supports the \G anchor and \K:

(?:\bplayers:\s+|\G(?!^))-?\K\d+

The pattern matches:

  • (?: Non-capturing group
    • \bplayers:\s+ A word boundary to prevent a partial word match, then match players: and 1+ whitespace chars
    • | Or
    • \G(?!^) Assert the current position at the end of the previous match (but not at the start of the string) to continue matching
  • ) Close the non-capturing group
  • -?\K Match an optional - and forget what has been matched so far
  • \d+ Match 1+ digits


import regex

s = "players: 2-8"
pattern = r"(?:\bplayers:\s+|\G(?!^))-?\K\d+"
print(regex.findall(pattern, s))

Output

['2', '8']

You could also use an approach with 2 capture groups and the built-in re module:

import re

s = "players: 2-8"
pattern = r"\bplayers:\s+(\d+)-(\d+)\b"
print(re.findall(pattern, s))

Output

[('2', '8')]
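If the numbers are needed as integers, and the label should confirm that the data belongs to the right variable, the two-group idea can be wrapped in a small helper. This is a minimal sketch using only the standard library; the name parse_players and the group names low and high are hypothetical and not part of the answer above.

```python
import re

# Hypothetical helper built on the two-capture-group pattern; named groups
# are used only for readability.
PLAYERS_RE = re.compile(r"\bplayers:\s+(?P<low>\d+)-(?P<high>\d+)\b")


def parse_players(text):
    """Return (low, high) as ints, or None when no 'players: N-M' is found."""
    m = PLAYERS_RE.search(text)
    if m is None:
        return None
    return int(m.group("low")), int(m.group("high"))


print(parse_players("players: 2-8"))   # (2, 8)
print(parse_players("age: 10-99"))     # None
```

Returning None when the label is missing lets the caller tell "no players field" apart from a valid range, rather than getting an empty list from findall.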