putting result of awk (multi-line variable) to another awk output

Time:12-19

I just posted a question about using grep on a multi-line shell variable ("grep multiline shell variable from output of executable file"), but I then realized that what I need is slightly different.

What I am trying to do is this. I have a grep/awk result (I'll call it result1):

blahblah ID1 blahblah aaa
blahblah ID2 blahblah bbb
blahblah ID3 blahblah ccc
...
blahblah ID(m) blahblah mmm
blahblah ID(n) blahblah nnn

And I have another awk result, taken from the output of an executable (run | awk ~~~), which I'll call result2:

ID1 (some sentence 1)
ID2 (some sentence 2)
ID3 (some sentence 3)
...
IDn (some sentence n)

I'm trying to take the IDs (ID1 to IDn) and the last field (aaa to nnn) from result1 and append that last field to the matching line of result2. What I want to end up with looks like this:

ID1 (sentence) aaa
ID2 (sentence) bbb
...
IDn (sentence) nnn

I somehow managed to get

ID1 aaa
ID2 bbb

from result1, so I now have only the IDs that also appear in result2. However, I have no idea how to split these pairs and attach each value to the matching line of result2 (ID1 with aaa, ID2 with bbb, and so on), so that I get

ID1 (sentence) aaa
ID2 (sentence) bbb
...
IDn (sentence) nnn

something like this.

CodePudding user response:

Like this?

$ head f1.txt  f2.txt 
==> f1.txt <==
blahblah ID1 blahblah aaa
blahblah ID2 blahblah bbb
blahblah ID3 blahblah ccc
blahblah ID(n) blahblah nnn

==> f2.txt <==
ID1 (some sentence 1)
ID2 (some sentence 2)
ID3 (some sentence 3)
IDn (some sentence n)

$ paste -d' ' f2.txt <(awk '{print $NF}' f1.txt)
ID1 (some sentence 1) aaa
ID2 (some sentence 2) bbb
ID3 (some sentence 3) ccc
IDn (some sentence n) nnn

Note that this relies on the assumption (which I have made) that the lines of the two files match up one to one, i.e. line k of f1.txt and line k of f2.txt carry the same ID.
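
If that line-by-line correspondence is in doubt, it can be checked before pasting. The following is only a minimal sketch: the sample data is made up to mirror the files above, and it assumes the ID is the 2nd field of f1.txt and the 1st field of f2.txt (as in the sample, but not necessarily in real data). It uses plain POSIX sh, so paste reads the awk output from stdin via "-" instead of using process substitution.

```shell
# Hypothetical sample data mirroring f1.txt and f2.txt above.
dir=$(mktemp -d)
printf '%s\n' 'blahblah ID1 blahblah aaa' 'blahblah ID2 blahblah bbb' > "$dir/f1.txt"
printf '%s\n' 'ID1 (some sentence 1)' 'ID2 (some sentence 2)' > "$dir/f2.txt"

# Assumption: the ID is the 2nd field in f1.txt and the 1st field in f2.txt.
awk '{print $2}' "$dir/f1.txt" > "$dir/ids1"
awk '{print $1}' "$dir/f2.txt" > "$dir/ids2"

if cmp -s "$dir/ids1" "$dir/ids2"; then
    # Same IDs in the same order: pasting line by line is safe.
    merged=$(awk '{print $NF}' "$dir/f1.txt" | paste -d' ' "$dir/f2.txt" -)
else
    merged=''
    echo 'IDs differ or are out of order; paste would pair the wrong lines' >&2
fi
rm -rf "$dir"
printf '%s\n' "$merged"
```

When the check fails, a lookup keyed on the ID is the safer route than paste.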

CodePudding user response:

Assumptions:

  • result1 has space-separated columns, and the strings aaa ... nnn are in the last column.
  • IDn in result1 consists of the literal string ID followed by digits.
  • IDn in result2 is located in the first column.

Then would you please try:

awk '
    NR==FNR {                                # first file (result1) only
        if (match($0, /ID[0-9]+/)) {         # find IDn anywhere in the line
            id = substr($0, RSTART, RLENGTH)
            a[id] = $NF                      # remember the last field for this ID
        }
        next
    }
    {
        print $0, a[$1]                      # result2: append the stored value
    }
' result1 result2

  • The NR==FNR { .. ; next} block is an idiom for code that is executed only for the first file argument (result1 in this case).
  • The function match($0, /ID[0-9]+/) returns a nonzero value (the match position) if the record contains the string ID followed by one or more digits, and sets the awk variables RSTART and RLENGTH to the starting position and the length of the match, respectively.
  • substr($0, RSTART, RLENGTH) extracts the substring IDn, where n is the digits.
  • a[id] = $NF associates the last field (e.g. aaa) with the id.
  • The {print $0, a[$1]} block is executed for result2 only.
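
Since the question is about multi-line shell variables rather than files, here is a sketch of how the same two-file idiom might be fed from variables. The variable contents are made-up sample data, and the temporary file plus the "-" (stdin) argument are just one possible way to hand both inputs to awk in plain POSIX sh. The lines of result1 are deliberately out of order to show that the lookup goes by ID, not by line position.

```shell
# Hypothetical sample data; in practice these would hold the grep/awk results.
result1='blahblah ID3 blahblah ccc
blahblah ID1 blahblah aaa
blahblah ID2 blahblah bbb'
result2='ID1 (some sentence 1)
ID2 (some sentence 2)
ID3 (some sentence 3)'

# awk needs two separate inputs: result1 goes to a temp file, result2 to stdin.
tmp=$(mktemp)
printf '%s\n' "$result1" > "$tmp"

merged=$(printf '%s\n' "$result2" | awk '
    NR==FNR {                        # first input: the temp file (result1)
        if (match($0, /ID[0-9]+/))
            a[substr($0, RSTART, RLENGTH)] = $NF
        next
    }
    { print $0, a[$1] }              # second input: stdin (result2)
' "$tmp" -)

rm -f "$tmp"
printf '%s\n' "$merged"
```

The output follows the order of result2. An ID that is missing from result1 simply gets an empty value appended, and the NR==FNR test misfires if result1 is empty, so both cases may need extra handling with real data.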