I am trying to use a regex to find the location in a document. However, I am having trouble capturing only the location part of the text.
For example, for the text: LOCATION 03-ED-50-39.5/48.7 DIVISION HIGHWAY ROAD 44 CONTRACT ITEMS, we would only want LOCATION 03-ED-50-39.5/48.7.
Currently, I have the following code:
LOCATION\s+(\d+)
We know that the location string starts with a digit, ends with a digit, and contains no spaces. Is there a way to capture the entire word/string that comes right after LOCATION? Any help would be much appreciated. Thanks!
CodePudding user response:
Like this, using \K and GNU grep:
grep -oP '^LOCATION\s+\K\S+' file
With Perl:
perl -lne 'print for /^LOCATION\s+\K\S+/' file
With Python (using a positive look-behind):
>>> import re
>>> s = 'LOCATION 03-ED-50-39.5/48.7 DIVISION HIGHWAY ROAD 44 CONTRACT ITEMS'
>>> # Python's re needs a fixed-width look-behind, so \s (not \s+) is used here
>>> pattern = r'(?<=LOCATION\s)\S+'
>>> matches = re.finditer(pattern, s)
>>> for match in matches:
...     print(match.group())
...
03-ED-50-39.5/48.7
Output
03-ED-50-39.5/48.7
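Python's built-in re module only accepts fixed-width look-behinds and does not support \K, so a look-behind cannot absorb a variable amount of whitespace after LOCATION. A minimal sketch of one alternative is a plain capture group; the names LOCATION_RE and extract_location are made up for this illustration, and the extra spaces in the sample string are an assumption to show the variable-width case:

```python
import re

# Capture group instead of a look-behind: \s+ can match any run of whitespace.
LOCATION_RE = re.compile(r'LOCATION\s+(\S+)')

def extract_location(text):
    """Return the token that follows LOCATION, or None if there is none."""
    match = LOCATION_RE.search(text)
    return match.group(1) if match else None

s = 'LOCATION   03-ED-50-39.5/48.7 DIVISION HIGHWAY ROAD 44 CONTRACT ITEMS'
print(extract_location(s))  # 03-ED-50-39.5/48.7
```

The group(1) call returns only the captured token, which gives the same effect as \K in the grep and Perl versions.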
The regular expression matches as follows:
Node | Explanation |
---|---|
^ | the beginning of the string |
LOCATION | 'LOCATION' |
\s+ | whitespace (\n, \r, \t, \f, and " ") (1 or more times (matching the most amount possible)) |
\K | resets the start of the match (what is Kept) as a shorter alternative to using a look-behind assertion; see perlmonks "look arounds" and "Support of \K in regex" |
\S+ | non-whitespace (all but \n, \r, \t, \f, and " ") (1 or more times (matching the most amount possible)) |
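The question also states that the location starts and ends with a digit. As a hedged sketch, the same node-by-node breakdown can be written as a commented Python pattern with re.VERBOSE, tightened to enforce that constraint. The names STRICT_LOCATION_RE and extract_strict_location are hypothetical, and a capture group is used because Python's re has no \K:

```python
import re

STRICT_LOCATION_RE = re.compile(r'''
    ^LOCATION   # literal LOCATION at the beginning of the string
    \s+         # whitespace, 1 or more times
    (           # capture group: the part we want to keep
      \d        # first character must be a digit
      \S*       # non-whitespace, 0 or more times
      \d        # last character must be a digit
    )
    (?!\S)      # token ends here: whitespace or end of string follows
''', re.VERBOSE)

def extract_strict_location(text):
    """Return the digit-delimited token after LOCATION, or None."""
    match = STRICT_LOCATION_RE.search(text)
    return match.group(1) if match else None

s = 'LOCATION 03-ED-50-39.5/48.7 DIVISION HIGHWAY ROAD 44 CONTRACT ITEMS'
print(extract_strict_location(s))  # 03-ED-50-39.5/48.7
```

Note that this stricter pattern requires a token of at least two characters, so a single-digit location would not match; whether that matters depends on the real data.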