Extract out data from the second identifier in regex instead of the first

Time:11-14

I'm trying to obtain the value (john) from the second name ID and put it in the first capturing group, but I can't seem to get it. I'm able to get the first one but not the second.

For example:

Survey Result:13:11:2021:14:15:22 Street Survey Result

Target 1:
    
    name ID: andy
    country name: china
    city name: beijing

Target 2:

    name ID: john
    country name: thailand
    city name: bangkok

I'm able to extract the name andy using the following regex filter: name ID: \s(.*?)\s

I saw an example that uses a similar filter, \s(.*?).(.*?)\s, but it does not seem to work.

I also tried using {1} to skip the first non-capturing group, but then I get 'name ID' as group 1 and 'andy' as group 2.

CodePudding user response:

In your pattern you have an extra \s at the end, which can match after andy because there is an empty line between the 2 name ID parts.

You can omit the \s, and you don't have to make the dot non-greedy. To require at least one non-whitespace character, so that the group cannot be an empty string, start the capture with \S.

To capture the last value in repeating lines with the same format, you can use a capture group in a repeating non capture group:

(?:^name ID:\s+(\S.*\s*))+
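
As a rough illustration of the repeated-group idea, here is a minimal Python sketch. It assumes the name ID lines are consecutive and start at the beginning of a line, which differs from the indented layout in the question; the sample string and the variable names are made up for the example.

```python
import re

# Hypothetical sample: consecutive "name ID" lines starting at column 0.
text = "name ID: andy\nname ID: john\n"

# The outer non-capturing group repeats once per line. Group 1 is overwritten
# on every repetition, so it ends up holding the value from the last line.
pattern = re.compile(r"(?:^name ID:\s+(\S.*\s*))+", re.MULTILINE)

match = pattern.search(text)
# Group 1 also swallows trailing whitespace (the \s* inside it), so strip it.
last_name = match.group(1).strip() if match else None
print(last_name)  # john
```

In Python's re module a repeated capture group keeps only its last iteration, which is what makes this work; other engines may behave differently.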

If there is more data in between, you can match the last occurrence using a negative lookahead:

\bname ID:(?![\s\S]*\bname ID:)\s*(\S+)

The pattern matches:

  • \bname ID: Match literally
  • (?! Negative lookahead
    • [\s\S]*\bname ID: Optionally match any character and then name ID:
  • ) Close lookahead
  • \s* Match optional whitespace chars
  • (\S+) Capture group 1, match 1+ non-whitespace chars
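
The lookahead variant can be tried out with a short Python sketch like the one below. The sample text is modeled on the question, and its exact indentation and the variable names are assumptions made for the example.

```python
import re

# Sample text modeled on the question; the indentation is an assumption.
text = """Survey Result:13:11:2021:14:15:22 Street Survey Result

Target 1:

    name ID: andy
    country name: china
    city name: beijing

Target 2:

    name ID: john
    country name: thailand
    city name: bangkok
"""

# The negative lookahead rejects any "name ID:" that is followed, anywhere
# later in the text, by another "name ID:", so only the last one can match.
pattern = re.compile(r"\bname ID:(?![\s\S]*\bname ID:)\s*(\S+)")

match = pattern.search(text)
last_name = match.group(1) if match else None
print(last_name)  # john
```

Note that this picks the last occurrence, not specifically the second one; with exactly two targets, as in the question, the two coincide.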

CodePudding user response:

Use the pattern name ID:\s(\w+). There will be two matches, each with its name in group 1.

Here is the link https://regex101.com/r/AfLb49/1
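
For instance, in Python (an assumption, since the question does not say which tool runs the regex), the two matches could be collected and the second one picked by index. The sample text and variable names are illustrative.

```python
import re

# Sample text modeled on the question; the indentation is an assumption.
text = """Target 1:

    name ID: andy
    country name: china
    city name: beijing

Target 2:

    name ID: john
    country name: thailand
    city name: bangkok
"""

# With exactly one capture group, re.findall returns a list of the group 1
# values, one per match, in order of appearance.
names = re.findall(r"name ID:\s(\w+)", text)

# Pick the second match if it exists.
second_name = names[1] if len(names) > 1 else None
print(second_name)  # john
```

This keeps the pattern simple and moves the "which occurrence" decision into the calling code rather than into the regex.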
