sed regular expression part

I am trying to delete every line of a file whose last field is a number of 80000 or greater.

For example

Jennifer Cowan:548-834-2348:583 Laurel Ave., Kingsville, TX 83745:10/1/35:58900
Jon DeLoach:408-253-3122:123 Park St., San Jose, CA 04086:7/25/53:85100

When I run sed, the command should delete only the Jon DeLoach line.

So far I have tried:

sed '/:0*[1-9][0-9]{5,}|:0*[8-9][0-9]{4,}/d' datebook.txt

since

egrep ':0*[1-9][0-9]{5,}|:0*[8-9][0-9]{4,}' datebook.txt

returns all the lines that end in a number of 80000 or greater.

However, the sed command does not delete anything. I found out this is because the regular expression I wrote,

 ':0*[1-9][0-9]{5,}|:0*[8-9][0-9]{4,}'

only works with egrep, not with grep.

I am new to regular expressions and am confused about how to convert the pattern from egrep syntax to grep syntax.

CodePudding user response：

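It's an awkward question, but you could tweak your existing attempt. By default sed uses basic regular expressions, where the interval braces have to be escaped as \{ and \} and there is no portable | alternation, so each alternative becomes its own delete command:

sed '/:0*[8-9][0-9]\{4\}$/d; /:0*[1-9][0-9]\{5,\}$/d' file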

I'm not sure what else you can do with sed (it's pretty fragile); does that solve your problem?
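
If it helps, here is a minimal sketch that runs the basic (BRE) and extended (ERE) forms side by side on the two sample records from the question. The scratch file from mktemp and the variable names are only for illustration, and it assumes your sed accepts -E (GNU and BSD sed both do); I have not tried it against your full datebook.txt.

```shell
# Scratch copy of the two sample records from the question.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
Jennifer Cowan:548-834-2348:583 Laurel Ave., Kingsville, TX 83745:10/1/35:58900
Jon DeLoach:408-253-3122:123 Park St., San Jose, CA 04086:7/25/53:85100
EOF

# BRE (default sed): intervals are written \{m,n\}, and each alternative
# is given its own delete command instead of using |.
bre_out=$(sed -e '/:0*[89][0-9]\{4\}$/d' -e '/:0*[1-9][0-9]\{5,\}$/d' "$tmp")

# ERE (sed -E): same syntax as egrep, so {m,n}, | and ( ) work unescaped.
ere_out=$(sed -E '/:0*([89][0-9]{4}|[1-9][0-9]{5,})$/d' "$tmp")

printf '%s\n' "$bre_out"
printf '%s\n' "$ere_out"
rm -f "$tmp"
```

Both commands anchor the number with $ so that only the last field is tested, not some other colon-delimited value earlier in the line.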

CodePudding user response：

Regular expressions do not understand numeric values, and if you do it like this it's going to be a nightmare to maintain.

You are dealing with data that is in fields, so it is a good task for awk.

You want awk to go through all the lines and then print out the ones where the fifth field is less than 80000.

awk -F":" '$5 < 80000' datebook.txt

The -F":" says that a colon is the field separator. $5 means the fifth field.
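
As a rough check, this sketch runs the awk command on a scratch file holding the two records from the question plus one made-up record ("Test Person", not from the real data) with a six-digit value. It also shows one way to write the result back to the file, since awk prints to standard output rather than editing in place; the temporary file names are assumptions for the example.

```shell
# Scratch file: two records from the question plus one hypothetical record.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
Jennifer Cowan:548-834-2348:583 Laurel Ave., Kingsville, TX 83745:10/1/35:58900
Jon DeLoach:408-253-3122:123 Park St., San Jose, CA 04086:7/25/53:85100
Test Person:000-000-0000:1 Example Rd., Nowhere, ZZ 00000:1/1/70:100000
EOF

# Print only the lines whose fifth colon-separated field is below 80000.
# The comparison is numeric, so 100000 is correctly treated as larger.
awk_out=$(awk -F: '$5 < 80000' "$tmp")
printf '%s\n' "$awk_out"

# To update the file itself, write to a second file and move it into place.
awk -F: '$5 < 80000' "$tmp" > "$tmp.new" && mv "$tmp.new" "$tmp"
file_out=$(cat "$tmp")
rm -f "$tmp"
```

Because awk compares the field as a number, there is no need to enumerate digit counts or leading zeros the way the regular expression has to.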