Assigning a variable inside awk for use outside awk

I am using ksh on AIX.

I have a file with multiple comma delimited fields. The value of each field is read into a variable inside the script.

The last field in the file may contain multiple | delimited values. I need to test each value and keep the first one that doesn't begin with R, then stop testing the values.

Sample value of $principal_diagnosis0: R65.20|A41.9|G30.9|F02.80

I've tried: echo $principal_diagnosis0 | awk -F"|" '{for (i = 1; i<=NF; i++) {if ($i !~ "R"){echo $i; primdiag = $i}}}' but I get this message: awk: Field $i is not correct.

My goal is to have a variable that I can use outside of the awk statement that gets assigned the first non-R code (in this case it would be A41.9).

echo $principal_diagnosis0 | awk -F"|" '{for (i = 1; i<=NF; i++) {if ($i !~ "R"){print $i}}}' gets me the output: A41.9 G30.9 F02.80

So I know it's reading the values and evaluating properly. But I need to stop after the first match and be able to use that value outside of awk.

Thanks!

CodePudding user response：

you can make FS and OFS do all the hard work :

echo "${principal_diagnosis0}" |

mawk NF=NF FS='^(R[^|]+[|])+|[|].+$' OFS=

A41.9

——————————————————————————————————————————

another slightly different variation of the same concept, overwriting fields but leaving OFS as is :

gawk -F'^.*R[^|]+[|]|[|].+$' '$--NF=$--NF'

A41.9

this works, because when you break it out :

gawk -F'^.*R[^|]+[|]|[|].+$' '
                       { print NF }
$(_=--NF)=$(__=--NF)   { print _, __, NF, $0 }'

3
1 2 1 A41.9

You'll notice that NF starts at 3, and the two successive decrements make the expression equivalent to $1 = $2.

Because NF ends up reduced to 1, the record is rebuilt with a single field, so the value is printed once instead of twice.
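To see where NF = 3 comes from, it can help to dump the fields that the regex FS produces. This is only a rough sketch, assuming an awk that accepts a regex field separator (the answer uses gawk at this point); the variable name fields is made up for illustration:

```shell
# Show each field that the regex FS yields for the sample value.
fields=$(printf '%s\n' 'R65.20|A41.9|G30.9|F02.80' |
  awk -F'^.*R[^|]+[|]|[|].+$' '{
    for (i = 1; i <= NF; i++)
      printf "%d=[%s]\n", i, $i     # brackets make empty fields visible
  }')

echo "$fields"
```

The leading R65.20| and the trailing |G30.9|F02.80 are each consumed as a separator, so the fields on either side of A41.9 should come out empty, which is why the wanted value sits in $2.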

This means you can also write it as $0 = $2 :

gawk -F'^.*R[^|]+[|]|[|].+$' '$-_=$--NF'

A41.9

——————————————————————————————————————————

a 3rd variation, this time using RS instead of FS :

mawk NR==2 RS='^.*R[^|]+[|]|[|].+$'

A41.9

——————————————————————————————————————————

and if you REALLY don't wanna mess with FS/OFS/RS, use gsub() instead :

nawk 'gsub("^.*R[^|]+[|]|[|].+$",_)'
 
A41.9

CodePudding user response：

To answer your specific question:

$ principal_diagnosis0='R65.20|A41.9|G30.9|F02.80'

$ foo=$(echo "$principal_diagnosis0" | awk -v RS='|' '/^[^R]/{sub(/\n/,""); print; exit}')

$ echo "$foo"
A41.9

The above will work with any awk. If you have GNU awk, you can do it more briefly:

foo=$(echo "$principal_diagnosis0" | awk -v RS='[|\n]' '/^[^R]/{print; exit}')
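
The loop from the question can also be kept almost as written. awk runs as a separate process, so a variable assigned inside it (such as primdiag = $i) is never visible to the shell; the value has to be printed and captured with command substitution. A minimal sketch, assuming the sample value from the question and reusing its primdiag name for the shell variable:

```shell
principal_diagnosis0='R65.20|A41.9|G30.9|F02.80'

# Split on a literal "|" and stop at the first field that does not start with R.
primdiag=$(printf '%s\n' "$principal_diagnosis0" |
  awk -F'|' '{
    for (i = 1; i <= NF; i++)
      if ($i !~ /^R/) { print $i; exit }   # exit stops awk after the first hit
  }')

echo "$primdiag"
```

Two details differ from the original attempt: the pattern is anchored as /^R/, so only codes that begin with R are skipped (an unanchored "R" would also reject a code with an R anywhere in it), and print replaces echo, which is not an awk statement. If every code begins with R, primdiag is left empty.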