egrep between 2 ranges in same column csv

Time:11-24

I'm not sure how to match two separate sets of values in the same column. Say I have a CSV file of all the Titanic passengers and I want to extract the people aged 20 to 29 and 40 to 49, and the people who spoke English AND another language, say French. Since both values are in the same column, this is quite challenging. egrep does not seem to have an AND operator, only OR, so I'm struggling to find a way to do it.

What I tried was something like this (the CSV is comma-separated; the 3rd column is age and the 8th is language).

(I know there may be easier solutions with sed, awk, etc., but I need to do it with egrep for training purposes.)

egrep "^.*,.*,[2-0][0-9],.*,.*,[eng.*]" titanic-passengers.csv

thanks in advance.

CodePudding user response:

You should use [^,]* to match a single column. .* will match across multiple columns.
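The difference is easy to see on a tiny file. The sketch below uses two made-up rows (the names, fares and column layout are assumptions for illustration, not the real Titanic data) and shortens the pattern to the first three columns:

```shell
# Hypothetical sample: column 3 is age, column 6 is a fare.
sample=$(mktemp)
cat > "$sample" <<'EOF'
1,Smith,22,m,3,7.25,C1,english
2,White,63,f,1,25,C6,english
EOF

# '.*' can swallow commas, so the age test drifts to a later column:
# the 63-year-old is matched through her fare of 25.
loose=$(grep -E '^.*,.*,[24][0-9],' "$sample")

# '[^,]*' stops at the next comma, so only the real 3rd column is tested.
strict=$(grep -E '^[^,]*,[^,]*,[24][0-9],' "$sample")

printf 'loose:\n%s\n\nstrict:\n%s\n' "$loose" "$strict"
rm -f "$sample"
```

The loose pattern returns both rows, while the strict one returns only the 22-year-old.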

To match 20-29 use 2[0-9]; to match 40-49 use 4[0-9]. You can then combine them with [24][0-9].
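A quick way to check the combined class is to run it over a list of numbers. In this sketch, ^ and $ stand in for the commas that delimit the column in the real pattern, and seq is assumed to be available:

```shell
# Ages 1..120, one per line; keep those matching the combined class.
combined=$(seq 1 120 | grep -E '^[24][0-9]$')

# The same test written as an explicit alternation of the two ranges.
spelled_out=$(seq 1 120 | grep -E '^(2[0-9]|4[0-9])$')

# Print the matches on one line: 20..29 followed by 40..49.
echo $combined
```

Both forms select the same twenty ages. The delimiters matter: without them, [24][0-9] would also match inside 120 or 245.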

You don't need to put [] around the language. Brackets match a single character that is any one of the characters inside them, so [eng.*] matches just one e, n, g, . or *.

grep -E '^[^,]*,[^,]*,[24][0-9],[^,]*,[^,]*,[^,]*,[^,]*,eng' titanic-passengers.csv

CodePudding user response:

maybe this one?

grep -E '^[^,]*,[^,]*,[24][0-9],[^,]*,[^,]*,[^,]*,[^,]*,[^,]*( english|english )[^,]*' titanic-passengers.csv

@Barmar already explained the other patterns well, so I'll explain only the "language" part.

To make sure the column contains at least one language besides English, require a space before or after the word english (this assumes the languages in the column are separated by spaces). The OR operator is written (pattern1|pattern2).
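Here is a minimal sketch of the space trick on made-up data. The rows, the lowercase language names and the space-separated language list are all assumptions for illustration; the real file may be laid out differently:

```shell
# Hypothetical sample: column 3 is age, column 8 is a space-separated language list.
sample=$(mktemp)
cat > "$sample" <<'EOF'
1,Smith,22,m,3,7.25,C1,english
2,Jones,35,f,1,71.28,C2,english french
3,Brown,45,f,2,13.00,C3,french english
4,Davis,29,m,3,8.05,C4,french
5,Evans,41,m,1,52.00,C5,english german
EOF

# Age 20-29 or 40-49 in column 3, and "english" next to a space in column 8,
# which means at least one other word shares that column.
both=$(grep -E '^[^,]*,[^,]*,[24][0-9],[^,]*,[^,]*,[^,]*,[^,]*,[^,]*( english|english )[^,]*' "$sample")

printf '%s\n' "$both"
rm -f "$sample"
```

Only rows 3 and 5 are printed. Row 1 has the right age but speaks English only, row 2 is outside both age ranges, and row 4 does not list English.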
