I have a data file (file.txt) that contains the lines below:
123 pro=tegs, ETA=12:00, team=xyz,user1=tom,dom=dby.com
345 pro=rbs, team=abc,user1=chan,dom=sbc.int,ETA=23:00
456 team=efg, pro=bvy,ETA=22:00,dom=sss.co.uk,user2=lis
I'm expecting to get the first column ($1) only if the ETA= number is greater than 15. Here, only the first column of the 2nd and 3rd lines is expected:
345
456
I tried cat file.txt | awk -F [,TPF=]' '{print $1}'
but it prints the whole line that has ETA at the end.
CodePudding user response:
I would harness GNU AWK for this task in the following way. Let the content of file.txt be
123 pro=tegs, ETA=12:00, team=xyz,user1=tom,dom=dby.com
345 pro=rbs, team=abc,user1=chan,dom=sbc.int,ETA=23:00
456 team=efg, pro=bvy,ETA=02:00,dom=sss.co.uk,user2=lis
then
awk 'substr($0,index($0,"ETA=")+4,2)+0>15{print $1}' file.txt
gives output
345
Explanation: I use the string functions index, to find where ETA= is, and substr, to get the 2 characters after ETA=. 4 is added because ETA= is 4 characters long and index gives its start position. I add 0 to convert the result to a number, then compare it with 15. Disclaimer: this solution assumes every row has ETA= followed by exactly 2 digits.
(tested in GNU Awk 5.0.1)
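If some rows might lack ETA=, the same index/substr idea can be guarded by checking the position first. The following is only a sketch: eta_after_15 is a hypothetical helper name, and the sample rows (including the one-digit hour and the row without ETA=) are made up for illustration.

```shell
# Hypothetical helper: reads lines on stdin and prints the first field
# when the hour after ETA= is greater than 15.
eta_after_15() {
  awk '{
    p = index($0, "ETA=")   # 0 when the line contains no ETA=
    # Take up to 2 characters after ETA=; +0 keeps only the leading digits.
    if (p && substr($0, p + 4, 2) + 0 > 15) print $1
  }'
}

# Illustrative input, not the original file.txt.
printf '%s\n' \
  '123 pro=tegs, ETA=12:00, team=xyz' \
  '345 pro=rbs, team=abc,ETA=23:00' \
  '456 team=efg, pro=bvy,ETA=9:30' \
  '789 team=efg, pro=bvy' | eta_after_15
```

With this input only 345 is printed: 9:30 converts to 9, and the row without ETA= is skipped because index returns 0.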
CodePudding user response:
It's unclear why you think your attempt would do anything of the sort. Your attempt uses a completely different field separator and does not compare anything against the number 15.
You'll also want to get rid of the useless use of cat.
When you specify a column separator with -F, that changes what the first column $1 actually means; it is then everything before the first occurrence of the separator. You will probably want to split the line separately to obtain the first space-separated column.
awk -F 'ETA=' '$2+0 > 15 { split($0, n, /[ \t]+/); print n[1] }' file.txt
The value in $2 will be the data after the first separator (and up until the next one). Adding 0 forces a numeric conversion, which simply ignores any non-numeric text after the number at the beginning of the field; without it, awk would compare the field to 15 as a string. So for example, on the first line, we are literally converting 12:00, team=xyz,user1=tom,dom=dby.com but it effectively checks if 12 is larger than 15 (which is obviously false).
When the condition is true, we split the original line $0 into the array n on sequences of whitespace, and then print the first element of this array.
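One caveat worth checking on your own awk: a field such as 9:00,dom=... does not look like a number as a whole, so comparing it directly against 15 is a string comparison unless it is coerced with +0. A small sketch of the difference, using a made-up line with a one-digit hour:

```shell
# Illustrative line with a one-digit hour (not from the original file.txt).
line='456 team=efg, pro=bvy,ETA=9:00,dom=sss.co.uk'

# Compared as strings: "9:00,..." sorts after "15", so the line matches.
as_string=$(printf '%s\n' "$line" | awk -F 'ETA=' '$2 > 15 { print "match" }')

# Coerced with +0: the leading 9 is compared as a number, so nothing matches.
as_number=$(printf '%s\n' "$line" | awk -F 'ETA=' '$2+0 > 15 { print "match" }')

printf 'string: [%s] number: [%s]\n' "$as_string" "$as_number"
```

For zero-padded two-digit hours, as in the sample data, both forms give the same answer; they only diverge on values like 9:00.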
CodePudding user response:
Using awk
$ awk -F"[=, ]" '{for (i=1;i<NF;i++) if ($i=="ETA") if ($(i+1) > 15) print $1}' input_file
345
456
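To see why the loop can find ETA as a field of its own, it may help to print how the separator [=, ] splits a line. This is just a sketch: show_fields is a hypothetical helper name and the input line is shortened for illustration.

```shell
# Hypothetical helper: prints each field with its index, using the same
# single-character separator class as above.
show_fields() {
  awk -F'[=, ]' '{ for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }'
}

printf '%s\n' '345 pro=rbs, team=abc,ETA=23:00' | show_fields
```

Each key and each value becomes its own field, so the field right after ETA holds the time. Note that a comma followed by a space yields an empty field between them, since every separator character splits on its own.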
CodePudding user response:
With your shown samples, please try the following GNU awk code. It uses the match function of GNU awk with the regex (^[0-9]+).*ETA=([0-9]+):[0-9]+, which creates 2 capturing groups and saves their values into the array arr. It then checks whether the 2nd element of arr is greater than 15 and, if so, prints the 1st element of arr, as required.
awk '
match($0,/(^[0-9]+).*ETA=([0-9]+):[0-9]+/,arr) && arr[2]>15{
print arr[1]
}
' Input_file
CodePudding user response:
Using awk, you could match ETA= followed by 1 or more digits. Then take the match without the ETA= part, check whether the number is greater than 15, and print the first field.
awk 'match($0, /ETA=[0-9]+/) {
if(substr($0, RSTART+4, RLENGTH-4) > 15) print $1
}' file
Output
345
456
If the first field should start with a number:
awk '/^[0-9]/ && match($0, /ETA=[0-9]+/) {
if(substr($0, RSTART+4, RLENGTH-4) > 15) print $1
}' file