I am fairly new to bash scripting and am trying to echo only the lines that match a specific format. I have this code so far:
LINE=1
while read -r CURRENT_LINE
do
if [[ $CURRENT_LINE == ??-?-??? ]]
then
echo "$LINE: $CURRENT_LINE"
fi
((LINE++))
done < "./new-1.txt"
The text file contains number sequences, one per line. Some match the format "12-3-456", while others use different formats, such as "123-89203-9420" or "123-456-7890". I can't understand why the if statement inside the while loop does not evaluate to true on lines that match the format. I've tried using * as well, but it gives me incorrect results.
Here are the contents of the text file new-1.txt. I want the script to output "Line 1: 11-1-111", but it doesn't output anything.
11-1-111
222-22-2222
333-33-3333
444-444-4444
555-555-5555
CodePudding user response:
Maybe try using the bash regex operator (=~), e.g.
LINE=1
while read -r CURRENT_LINE
do
if [[ $CURRENT_LINE =~ [0-9][0-9]-[0-9]-[0-9][0-9][0-9] ]]
then
echo "$LINE: $CURRENT_LINE"
fi
((LINE++))
done < new-1.txt
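The pattern in the loop above is not anchored, so it also matches when the format appears inside a longer sequence such as 123-4-5678. A minimal sketch of an anchored variant follows; the function names `matches_format` and `print_matches` and the third sample line are made up for illustration and are not part of the original post.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds only when the whole line has the form NN-N-NNN.
matches_format() {
    # Keep the regex in a variable and expand it unquoted on the right of =~.
    local re='^[0-9]{2}-[0-9]-[0-9]{3}$'
    [[ $1 =~ $re ]]
}

# Hypothetical helper: number the lines read from stdin and print the matches.
print_matches() {
    local line_no=1 line
    while IFS= read -r line; do
        if matches_format "$line"; then
            echo "$line_no: $line"
        fi
        ((line_no++))
    done
}

# Sample data; 123-4-5678 contains the format but is longer than the format.
printf '%s\n' 11-1-111 222-22-2222 123-4-5678 | print_matches
```

To run it against the file from the question, `print_matches < new-1.txt` should work the same way.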
Or, if you are open to alternatives, using nl (number the lines) and sed:
nl -n ln new-1.txt | sed -n '/[[:digit:]]\{2\}-[[:digit:]]-[[:digit:]]\{3\}/p'
Or with GNU grep:
grep -noP "[0-9][0-9]-[0-9]-[0-9][0-9][0-9]" new-1.txt
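If only whole-line matches are wanted, grep's `-x` option may be a simpler fit than anchoring by hand. The sketch below assumes a temporary file standing in for new-1.txt, and the third sample line is invented to show a near miss.

```shell
# Recreate some sample input in a temporary file (stand-in for new-1.txt).
tmp=$(mktemp)
printf '%s\n' 11-1-111 222-22-2222 123-4-5678 > "$tmp"

# -n prefixes each match with its line number,
# -x requires the whole line to match the pattern,
# -E enables extended regular expressions so {2} and {3} work unescaped.
result=$(grep -nxE '[0-9]{2}-[0-9]-[0-9]{3}' "$tmp")
echo "$result"

rm -f "$tmp"
```

Note that grep separates the line number from the text with a bare colon, so the output format differs slightly from the loop's "1: 11-1-111".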
CodePudding user response:
In regex parlance, ? makes the preceding character or group optional, i.e. it may occur at most once, and zero occurrences are also allowed.
However, == is not the regex matching operator; inside [[ ]] it performs glob (pattern) matching, where ? stands for exactly one arbitrary character. The regex matching operator is =~.
So changing your if clause to the following should do the job:
[[ $CURRENT_LINE =~ ^[0-9]{2}-[0-9]{1}-[0-9]{3}$ ]]
Here:
- The ^ anchors the match to the beginning of the line and $ to the end, so the whole line has to match the pattern.
- [0-9] denotes a range, here any digit from zero to nine.
- The {n} mandates that the preceding character/selection match exactly n times.
- The pattern is left unquoted: bash treats a quoted right-hand side of =~ as a literal string rather than a regex.
Note: You can also use the more verbose [[:digit:]] instead of [0-9].
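To make the difference between == and =~ concrete, here is a small sketch. It also illustrates one possible reason a correct-looking pattern can fail: a trailing carriage return on each line, as found in files with Windows line endings. The question does not show the file's raw bytes, so that part is only an assumption, and `strip_cr` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical helper: remove one trailing carriage return, if present.
strip_cr() {
    local s=${1%$'\r'}
    printf '%s' "$s"
}

clean='11-1-111'
dos=$'11-1-111\r'   # the same line as read from a file with CRLF endings (assumed case)

# Glob match with ==: each ? stands for exactly one arbitrary character.
[[ $clean == ??-?-??? ]] && echo "glob matches the clean line"
[[ $dos == ??-?-??? ]] || echo "glob rejects the line with a trailing CR"

# Anchored regex with =~ (unquoted): matches once the CR is stripped.
re='^[0-9]{2}-[0-9]{1}-[0-9]{3}$'
[[ $dos =~ $re ]] || echo "regex rejects the line with a trailing CR"
[[ $(strip_cr "$dos") =~ $re ]] && echo "regex matches after stripping the CR"
```

Running `od -c new-1.txt | head` is one way to check whether the file actually contains `\r` characters.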