Putting the result of awk (a multi-line variable) into another awk output

I just posted a question about using grep on a multi-line shell variable ("grep multiline shell variable from output of executable file"), but I realized that what I need is slightly different.

Here is what I am trying to do. I have a grep/awk result (I'll call it result1):

blahblah ID1 blahblah aaa
blahblah ID2 blahblah bbb
blahblah ID3 blahblah ccc
...
blahblah ID(m) blahblah mmm
blahblah ID(n) blahblah nnn

And I have another awk result, taken from the output of an executable (run | awk ~~~), which I'll call result2:

ID1 (some sentence 1)
ID2 (some sentence 2)
ID3 (some sentence 3)
...
IDn (some sentence n)

I'm trying to take the IDs (ID1~IDn) and the last field (aaa~nnn) from result1 and append that last field to the matching line of result2. What I want looks like this:

ID1 (sentence) aaa
ID2 (sentence) bbb
...
IDn (sentence) nnn

I managed to get

ID1 aaa
ID2 bbb

from result1, restricted to the IDs that also appear in result2. But I have no idea how to split this up and attach each value to the matching line of result2 (ID1 with aaa, ID2 with bbb, and so on) so that I get

ID1 (sentence) aaa
ID2 (sentence) bbb
...
IDn (sentence) nnn

something like this.

CodePudding user response：

Like this?

$ head f1.txt  f2.txt 
==> f1.txt <==
blahblah ID1 blahblah aaa
blahblah ID2 blahblah bbb
blahblah ID3 blahblah ccc
blahblah ID(n) blahblah nnn

==> f2.txt <==
ID1 (some sentence 1)
ID2 (some sentence 2)
ID3 (some sentence 3)
IDn (some sentence n)

$ paste -d' ' f2.txt <(awk '{print $NF}' f1.txt)
ID1 (some sentence 1) aaa
ID2 (some sentence 2) bbb
ID3 (some sentence 3) ccc
IDn (some sentence n) nnn

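Note that this relies on the assumption (which I have made) that the records line up between the two files, i.e. line k of f1.txt and line k of f2.txt refer to the same ID. paste joins lines purely by position and never looks at the IDs themselves.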
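
Since the question mentions results held in multi-line shell variables rather than files, here is a minimal sketch of the same paste idea applied to variables. The variable names result1 and result2 come from the question; the sample contents, the temporary file, and the variable name combined are assumptions made for illustration. It still assumes the two results are in the same order.

```shell
# Sample data standing in for the two results (contents are illustrative).
result1='blahblah ID1 blahblah aaa
blahblah ID2 blahblah bbb
blahblah ID3 blahblah ccc'
result2='ID1 (some sentence 1)
ID2 (some sentence 2)
ID3 (some sentence 3)'

# paste reads files, so result2 goes into a temporary file and the
# last column of result1 arrives on standard input (the "-" operand).
tmp=$(mktemp)
printf '%s\n' "$result2" > "$tmp"
combined=$(printf '%s\n' "$result1" | awk '{print $NF}' | paste -d' ' "$tmp" -)
rm -f "$tmp"

printf '%s\n' "$combined"
```

In bash, the temporary file could be replaced by process substitution, as in the command above; the version here sticks to plain sh features.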

CodePudding user response：

Assumptions:

result1 has space-separated columns, and the strings aaa ... nnn are in the last column.
IDn in result1 consists of the literal string ID followed by digits.
IDn in result2 are located in the first column.

Then would you please try:

awk '
    NR==FNR {
        if (match($0, /ID[0-9]+/)) {
            id = substr($0, RSTART, RLENGTH)
            a[id] = $NF
        }
        next
    }
    {
        print $0, a[$1]
    }
' result1 result2

The NR==FNR { .. ; next} block is an idiom that is executed only for the file given as the first argument (result1 in this case).
The function match($0, /ID[0-9]+/) returns a nonzero (true) value if the record contains the string ID followed by one or more digits, and sets the awk variables RSTART and RLENGTH to the starting position and the length of the match, respectively.
substr($0, RSTART, RLENGTH) extracts the substring IDn where n is the digits.
a[id] = $NF associates the last part (e.g. aaa) to the id.
The {print $0, a[$1]} block is executed for result2 only.
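
To illustrate that this lookup works by ID rather than by line position, here is a small sketch that runs the same logic on sample data in which result1 is deliberately out of order. The sample contents, the temporary file, and the variable name merged are assumptions for illustration; the pattern is written as ID[0-9]+ so that it matches one or more digits, as described above.

```shell
# Sample data (illustrative); result1 is intentionally not in ID order.
result1='blahblah ID2 blahblah bbb
blahblah ID1 blahblah aaa
blahblah ID3 blahblah ccc'
result2='ID1 (some sentence 1)
ID2 (some sentence 2)
ID3 (some sentence 3)'

# awk needs result1 as its first input, so store it in a temporary file
# and pass result2 on standard input (the "-" operand).
tmp=$(mktemp)
printf '%s\n' "$result1" > "$tmp"
merged=$(printf '%s\n' "$result2" | awk '
    NR==FNR {                       # first input: build the ID -> last-field map
        if (match($0, /ID[0-9]+/))
            a[substr($0, RSTART, RLENGTH)] = $NF
        next
    }
    { print $0, a[$1] }             # second input: append the value for this ID
' "$tmp" -)
rm -f "$tmp"

printf '%s\n' "$merged"
```

An ID that appears in result2 but not in result1 would simply get an empty value appended, since a[$1] is then an empty string.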