I have a file and I only want to find the lines that contain "here". Each of these lines holds multiple string and integer values (see the example below). I only want the first integer of each line that matches the pattern.
I have created a solution that uses a bash script, but is there a simpler method I am missing? I was hoping something like grep -w here -Eo [0-9] file
would work. However, when I try that, grep treats everything that comes after "here" as a file name.
STEP 1 STAGE 1 here other info
foo
bar
STEP 2 STAGE 1 here other info
more
foo
bar
STEP 3 STAGE 1 here other info
For this file the desired output would be
1
2
3
CodePudding user response:
Another variant, using GNU grep with -P for Perl-compatible regular expressions (if supported):
grep -oP "^\D*\K\d+(?=.*\bhere\b)" file
The pattern matches:
^ Start of the line
\D* Match optional non-digits
\K Forget what is matched so far
\d+ Match 1+ digits
(?=.*\bhere\b) Positive lookahead, assert the word here to the right
Output
1
2
3
CodePudding user response:
This simpler awk should work for you:
awk '/ here / {sub(/^[^0-9]+/, ""); print $1+0}' file
1
2
3
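Adding 0 to a field makes awk treat it as a number, so trailing non-digit characters on the first field are dropped. A minimal sketch of that effect, where coerce_first_field is a hypothetical helper name and the input line with "7," is made up for illustration:

```shell
# coerce_first_field is a hypothetical helper name used only for this demo.
# It reads lines on stdin and, for lines containing " here ", prints the raw
# first field (after stripping leading non-digits) next to its numeric value.
coerce_first_field() {
  awk '/ here / {sub(/^[^0-9]+/, ""); print $1, $1+0}'
}

printf '%s\n' 'STEP 7, STAGE 1 here other info' | coerce_first_field
# prints: 7, 7
```

The first column is the raw field "7," and the second is the same field after numeric coercion.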
CodePudding user response:
With GNU awk, you could try the following code. Written and tested with your shown samples.
awk '
match($0,/(^|[[:space:]]+)([0-9]+)[[:space:]]+.*here /,arr){
  print arr[2]
}
' Input_file
Explanation: GNU awk's match function is called with the regex (^|[[:space:]]+)([0-9]+)[[:space:]]+.*here
so it only succeeds on lines where the keyword here follows an integer. The regex has 2 capturing groups, and their values are stored in an array named arr at indexes 1 and 2 respectively. When the match succeeds, the 2nd element of that array is printed, which is the required value (the integer of the line).
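The three-argument form of match is a GNU awk extension. For other awks, a minimal sketch along the same lines could rely on the standard two-argument match, which sets RSTART and RLENGTH. The function name first_int_posix and the temporary sample file are assumptions for illustration, and unlike the regex above this version takes the first run of digits anywhere on a line containing " here ":

```shell
# first_int_posix is a hypothetical name; it takes the input file as $1.
first_int_posix() {
  awk '
    # Keep only lines containing " here ", then locate the first run of digits.
    / here / && match($0, /[0-9]+/) {
      # match() sets RSTART and RLENGTH to the position and length of the match.
      print substr($0, RSTART, RLENGTH)
    }
  ' "$1"
}

# Recreate the sample from the question in a temporary file and run the function.
sample=$(mktemp)
printf '%s\n' 'STEP 1 STAGE 1 here other info' 'foo' 'bar' \
  'STEP 2 STAGE 1 here other info' 'more' 'foo' 'bar' \
  'STEP 3 STAGE 1 here other info' > "$sample"
first_int_posix "$sample"
rm -f "$sample"
```

On the sample from the question this prints 1, 2 and 3 on separate lines.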
CodePudding user response:
grep is not the right command for this. I'd use sed:
sed -n '/ here /s/[^0-9]*\([0-9]*\).*/\1/p' file
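One edge case: because [0-9]* can match zero digits, the command above prints an empty line for a line that contains " here " but no number. A hedged variant, assuming you would rather skip such lines, requires at least one digit so the p flag only fires when a number was found. The name first_int_sed and the "nothing here either" line are made up for illustration:

```shell
# first_int_sed is a hypothetical name; it takes the input file as $1.
first_int_sed() {
  # [0-9][0-9]* requires at least one digit, so the substitution (and its p flag)
  # only succeeds on lines that actually contain a number.
  sed -n '/ here /s/[^0-9]*\([0-9][0-9]*\).*/\1/p' "$1"
}

sample=$(mktemp)
printf '%s\n' 'STEP 1 STAGE 1 here other info' 'nothing here either' \
  'STEP 2 STAGE 1 here other info' > "$sample"
first_int_sed "$sample"
rm -f "$sample"
```

This prints 1 and 2 and silently skips the line without digits.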