I have a text file containing:
A 25 27 50
B 35 75
C 75 78
D 99 88 76
I want to delete the lines that do not have a fourth field (the third pair of digits). Expected output:
A 25 27 50
D 99 88 76
I know that awk would be the best tool for this task, but I'm wondering what the problem is with my sed command, since I expected it to work:
sed -E '/^[ABCD] ([0-9][0-9]) \1$/d' text.txt
This uses POSIX ERE with a back-reference (\1) to refer to the previous pattern enclosed in parentheses.
I have also tried this command instead:
sed -E '/^[ABCD] ([0-9][0-9]) [0-9][0-9]$/d' text.txt
But it seems to delete only the first occurrence of what I want. I would appreciate further explanation of:
- why the back-reference doesn't work as expected.
- what is going on with the first occurrence in the second attempt. Should I include a global option? If so, how? I already tried adding it at the end alongside /d (for delete), but it didn't work.
CodePudding user response:
This is much easier with awk:
awk 'NF == 4' file
A 25 27 50
D 99 88 76
This awk command uses the default field separator (spaces or tabs) and checks the condition NF == 4, so that only lines with exactly 4 fields are printed.
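To see what NF holds for each line, here is a minimal sketch that recreates the sample data from the question in a temporary file (the variable names `sample`, `counts`, and `kept` are only illustrative) and prints the field count next to each line:

```shell
# Recreate the sample data from the question in a temporary file.
sample=$(mktemp)
printf '%s\n' 'A 25 27 50' 'B 35 75' 'C 75 78' 'D 99 88 76' > "$sample"

# Prefix every line with its field count (NF).
counts=$(awk '{ print NF ": " $0 }' "$sample")
printf '%s\n' "$counts"

# A bare condition with no action block prints each line for which it is true.
kept=$(awk 'NF == 4' "$sample")
printf '%s\n' "$kept"

rm -f "$sample"
```

The first command reports 4 fields for the A and D lines and 3 for the B and C lines, which is why only A and D pass the NF == 4 test.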
With sed it would be (assuming no leading or trailing spaces on any line):
sed -nE '/^[^[:blank:]]+([[:blank:]]+[^[:blank:]]+){3}$/p' file
A 25 27 50
D 99 88 76
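As for the two questions about the original attempts: a back-reference such as \1 matches the exact text the group captured, not the group's pattern a second time, so the first command can only delete a line like "B 35 35". Also, d already runs on every line the address matches; g is a flag of the s command, not of d. The sketch below assumes GNU sed (back-references with -E are an extension rather than part of POSIX ERE), and its variable names are only illustrative:

```shell
# \1 must repeat the captured text: "35 35" matches, "35 75" does not.
repeated=$(printf 'B 35 35\n' | sed -E '/^[ABCD] ([0-9][0-9]) \1$/d')
different=$(printf 'B 35 75\n' | sed -E '/^[ABCD] ([0-9][0-9]) \1$/d')
printf 'repeated=[%s] different=[%s]\n' "$repeated" "$different"

# To repeat the pattern itself, use an interval such as {2} on the group.
# d applies to every matching line, so no "global" flag is needed.
result=$(printf '%s\n' 'A 25 27 50' 'B 35 75' 'C 75 78' 'D 99 88 76' |
  sed -E '/^[ABCD]( [0-9][0-9]){2}$/d')
printf '%s\n' "$result"
```

Here `repeated` ends up empty because that line was deleted, while `different` still holds "B 35 75". The interval version deletes both three-field lines from the sample.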
CodePudding user response:
This might work for you (GNU sed):
sed -En 's/\S+/&/4p' file
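This turns off implicit printing (-n) and turns on extended regexps (-E).
It substitutes the 4th field with itself and prints the result, so only lines on which that substitution succeeds (lines with at least four fields) are printed.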