I have an "example_file" like below,
a2022.1 a2022.1 80
a2022.1 a2022.2 90
a2022.1 a2023.1 80
a2022.2 a2022.2 90
a2022.2 a2023.1 40
a2022.3 a2022.1 50
b20225.1 a2022.1 80
and I would like to select lines which has the third column > 80 and also has the 1st column (before the dot) different from the 2nd column (before the dot).
So the desired result would be
a2022.1 a2023.1 80
b20225.1 a2022.1 80
So I already managed to exclude the lines with 3rd column <80 and with the same 1st and 2nd column by using the line below
awk '($3 > 80 && $1!=$2)' example_file
and I tried awk '($3 > 80 && $1!~$2)' example_file
, but it does not exclude the line
a2022.1 a2022.2 90
I thought it could be easy, but I can't figure it out. Could somebody help? Thanks!
CodePudding user response:
You may use this awk
:
awk -F '[.[:blank:]]' '$NF >= 80 && $1 != $3' file
a2022.1 a2023.1 80
b20225.1 a2022.1 80
Here -F '[.[:blank:]]'
sets input field separator to a dot or whitespace character. This allows us to use parts of first column as $1
and $2
and second column as $3
and $4
.
Alternatively using gnu-awk
:
awk '$3 >= 80 &&
gensub(/\..*/, "", "1", $1) != gensub(/\..*/, "", "1", $2)' file
a2022.1 a2023.1 80
b20225.1 a2022.1 80
CodePudding user response:
1st solution: With your shown samples, please try following awk
code. Using split
function of awk
here to get exact values from fields and compare them accordingly.
awk '
{
split($1,arr1,".")
split($2,arr2,".")
}
$3>=80 && arr1[1]!=arr2[1]
' Input_file
2nd solution: With GNU awk
using it's match
function please try following. Using regex ^.([^.]*)\..\s .([^.]*)\..\s (\S )
for getting values as per requirement into the array(through capturing groups) and comparing them as per conditions in further condition.
awk '
match($0,/^.([^.]*)\..\s .([^.]*)\..\s (\S )/,arr) && (arr[3]>=80 && (arr[1]!=arr[2]))
' Input_file
CodePudding user response:
Alternatively, you can use rquery (https://github.com/fuyuncat/rquery/releases) to do a straight forward query.
$./rq -q "p d/ /r | s @raw | f @3>=80 and substr(@1,0,instr(@1,'.'))!=substr(@2,0,instr(@2,'.'))" samples/example_file
a2022.1 a2023.1 80
b20225.1 a2022.1 80