Assigning a variable inside awk for use outside awk

Time:05-11

I am using ksh on AIX.

I have a file with multiple comma delimited fields. The value of each field is read into a variable inside the script.

The last field in the file may contain multiple | delimited values. I need to test each value and keep the first one that doesn't begin with R, then stop testing the values.

Sample value of $principal_diagnosis0: R65.20|A41.9|G30.9|F02.80

I've tried: echo $principal_diagnosis0 | awk -F"|" '{for (i = 1; i<=NF; i++) {if ($i !~ "R"){echo $i; primdiag = $i}}}' but I get this message: awk: Field $i is not correct.

My goal is to have a variable that I can use outside of the awk statement that gets assigned the first non-R code (in this case it would be A41.9).

echo $principal_diagnosis0 | awk -F"|" '{for (i = 1; i<=NF; i++) {if ($i !~ "R"){print $i}}}' gets me the output: A41.9 G30.9 F02.80

So I know it's reading the values and evaluating properly. But I need to stop after the first match and be able to use that value outside of awk.

Thanks!

CodePudding user response:

you can make FS and OFS do all the hard work:

echo "${principal_diagnosis0}" |
mawk NF=NF FS='^(R[^|]+[|])+|[|].+$' OFS=

A41.9
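To see why this works, it can help to print the fields that the regex FS produces. The following is only a sketch: it uses plain awk rather than mawk, and the variable name fields is made up for illustration.

```shell
principal_diagnosis0='R65.20|A41.9|G30.9|F02.80'

# Print each field in brackets so that empty fields are visible.
# The FS regex treats the leading run of R codes and everything
# from the second remaining "|" onward as separators.
fields=$(printf '%s\n' "$principal_diagnosis0" |
  awk -F'^(R[^|]+[|])+|[|].+$' '{
    for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i
  }')

printf '%s\n' "$fields"
```

With gawk or mawk this should show A41.9 as field 2, with empty fields on either side. Rebuilding the record with an empty OFS (which is what NF=NF triggers) then joins those fields into just the code.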

——————————————————————————————————————————

another slightly different variation of the same concept, overwriting fields but leaving OFS as is:

gawk -F'^.*R[^|]+[|]|[|].+$' '$--NF=$--NF'

A41.9

this works because, when you break it out:

gawk -F'^.*R[^|]+[|]|[|].+$' '
                       { print NF
} $(_=--NF)=$(__=--NF) { print _, __, NF, $0 }'

3
1 2 1 A41.9

you'll notice you start with NF = 3, and the two subsequent decrements make it equivalent to $1 = $2,

but since the final NF is now reduced to just 1, it prints the value once instead of 2 copies of it

which means you can also make it $0 = $2, like so:

gawk -F'^.*R[^|]+[|]|[|].+$' '$-_=$--NF'

A41.9

——————————————————————————————————————————

a 3rd variation, this time using RS instead of FS:

mawk NR==2 RS='^.*R[^|]+[|]|[|].+$'

A41.9

——————————————————————————————————————————

and if you REALLY don't wanna mess with FS/OFS/RS, use gsub() instead:

nawk 'gsub("^.*R[^|]+[|]|[|].+$",_)'
 
A41.9

CodePudding user response:

To answer your specific question:

$ principal_diagnosis0='R65.20|A41.9|G30.9|F02.80'

$ foo=$(echo "$principal_diagnosis0" | awk -v RS='|' '/^[^R]/{sub(/\n/,""); print; exit}')

$ echo "$foo"
A41.9

The above will work with any awk. If you have GNU awk, you can do it more briefly:

foo=$(echo "$principal_diagnosis0" | awk -v RS='[|\n]' '/^[^R]/{print; exit}')
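
If you would rather stay close to the loop from the question, here is a minimal sketch of the same idea. It assumes the result should land in a shell variable named primdiag (the name used inside the original awk attempt). Three things change from that attempt: echo is not an awk statement, so print is used; the test is anchored with ^ so only codes that begin with R are skipped; and exit stops at the first hit. A variable assigned inside awk is never visible to the shell, so the value is captured with command substitution instead.

```shell
principal_diagnosis0='R65.20|A41.9|G30.9|F02.80'

# Command substitution captures whatever awk prints.
primdiag=$(printf '%s\n' "$principal_diagnosis0" |
  awk -F'|' '{
    for (i = 1; i <= NF; i++) {
      if ($i !~ /^R/) {   # skip only codes that start with R
        print $i
        exit              # stop after the first non-R code
      }
    }
  }')

echo "$primdiag"
```

For the sample value this prints A41.9. If every code begins with R, awk prints nothing and primdiag ends up empty, so the script may want to test for that case before using it.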