I'm working on the above stated regex problem which is pretty simple: (0x90\s*){4,}
My question is: how can I extend this to any hexadecimal number that appears 4 or more times in a row?
Suppose I have the following in a file:
0x90 0x90 0x90 0x90
0x80 0x80 0x80 0x80 0x80 0x80 0x80 0x80
0x10 0x10 0x10
I want to write a regex that will match the first 2 lines, since 0x90 appears more than 3 times and 0x80 appears more than 3 times on its own line. 0x10 appears exactly 3 times, so there shouldn't be any match for it.
Here is what I've attempted:
(0(x|x)[0-9a-fA-F]+\s*)\1{3,}
The way it's written seems to work, but only if there is at least one space at the end of the line. For example, this regex will match the first line 0x90 0x90 0x90 0x90 only if there is a space after the last 0x90. I thought that issue was taken care of by the \s*?
CodePudding user response:
The reason it only matches the example data when there is a space at the end is that the backreference \1 refers to group 1, which has also matched a space.
(0(x|x)[0-9a-fA-F]+\s*)\1{3,}
^^^^^^^^^^^^^^^^^^^^^^^^
        group 1
Note that the capture group (x|x) matches either x or x, which is the same as (x). If the intent was to accept an uppercase X as well, [xX] would do that.
Instead, you can capture only the number in group 1, (0x[0-9a-fA-F]+), without the space, and then repeat a whitespace character followed by a backreference to group 1 three or more times:
(0x[0-9a-fA-F]+)([[:space:]]\1){3,}
# Test strings; the last one has a trailing space
arr=(
  "0x90 0x90 0x90 0x90"
  "0x80 0x80 0x80 0x80 0x80 0x80 0x80 0x80"
  "0x10 0x10 0x10"
  "0x90 0x90 0x90 0x90 "
)
# Group 1 holds only the number; the separator sits outside it
pattern='(0x[0-9a-fA-F]+)([[:space:]]\1){3,}'
for i in "${arr[@]}"
do
  if [[ "$i" =~ $pattern ]]
  then
    echo "Match: ${BASH_REMATCH[0]}"
  else
    echo "No match: $i"
  fi
done
Output
Match: 0x90 0x90 0x90 0x90
Match: 0x80 0x80 0x80 0x80 0x80 0x80 0x80 0x80
No match: 0x10 0x10 0x10
Match: 0x90 0x90 0x90 0x90
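To see the difference side by side, here is a minimal sketch that runs the question's pattern (space captured inside group 1) and the adjusted pattern against the same strings. The helper name `check` and the variable names are only for illustration, and it assumes bash on a system whose regex library accepts backreferences in extended regular expressions (glibc does).

```shell
#!/usr/bin/env bash
# Question's approach: the whitespace is part of group 1,
# so every repetition matched by \1 must end with the same whitespace.
with_space='(0x[0-9a-fA-F]+[[:space:]]*)\1{3,}'
# Adjusted approach: group 1 is only the number, the separator is outside it.
without_space='(0x[0-9a-fA-F]+)([[:space:]]\1){3,}'

# check PATTERN TEXT: prints "match" or "no match" (hypothetical helper)
check() {
  if [[ $2 =~ $1 ]]; then
    echo "match"
  else
    echo "no match"
  fi
}

check "$with_space"    "0x90 0x90 0x90 0x90"    # no trailing space in the input
check "$with_space"    "0x90 0x90 0x90 0x90 "   # trailing space in the input
check "$without_space" "0x90 0x90 0x90 0x90"
check "$without_space" "0x10 0x10 0x10"
```

The first call fails because the last 0x90 has no space after it, so it cannot equal what group 1 captured; moving the separator out of the group removes that requirement.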
CodePudding user response:
Extended Regex with grep:
grep -E '(0x([[:xdigit:]]){2}[[:space:]]?){4,}'
And since this question is tagged bash:
#!/usr/bin/env bash
while read -r line || [ "$line" ]; do
  [[ $line =~ (0x([[:xdigit:]]){2}[[:space:]]?){4,} ]] && printf '%s\n' "$line"
done < inputfile
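One caveat: the pattern above accepts any four two-digit hex values in a row, so a line like 0x10 0x20 0x30 0x40 also matches. If the same value has to repeat, a backreference is needed. The following is a sketch only, using a basic regular expression (where POSIX defines \1) rather than -E; the function name `find_repeated` is made up for the example.

```shell
# find_repeated [FILE...]: print lines where one two-digit hex value
# occurs 4 or more times in a row, separated by whitespace.
# Reads standard input when no file is given.
find_repeated() {
  # \(0x..\)                  group 1: the hex value, no whitespace
  # \([[:space:]]\{1,\}\1\)   whitespace followed by the same value again
  # \{3,\}                    at least 3 more times (4 or more in total)
  grep '\(0x[[:xdigit:]]\{2\}\)\([[:space:]]\{1,\}\1\)\{3,\}' "$@"
}

printf '%s\n' \
  '0x90 0x90 0x90 0x90' \
  '0x10 0x20 0x30 0x40' \
  '0x10 0x10 0x10' | find_repeated
```

With that sample input, only the 0x90 line is printed.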