I just posted a question about using grep on a multi-line shell variable ("grep multiline shell variable from output of executable file"), but I then realized that what I need is slightly different.
What I am trying to do is this. I have a grep/awk result (I'll call it result1):
blahblah ID1 blahblah aaa
blahblah ID2 blahblah bbb
blahblah ID3 blahblah ccc
...
blahblah ID(m) blahblah mmm
blahblah ID(n) blahblah nnn
And I have another awk result, taken from the output of an executable (run | awk ~~~), which I'll call result2:
ID1 (some sentence 1)
ID2 (some sentence 2)
ID3 (some sentence 3)
...
IDn (some sentence n)
I'm trying to take the IDs (ID1 to IDn) and the last field (aaa to nnn) from result1 and append that last field to the matching line of result2. What I want to produce looks like this:
ID1 (sentence) aaa
ID2 (sentence) bbb
...
IDn (sentence) nnn
I somehow managed to get
ID1 aaa
ID2 bbb
from result1, restricted to the IDs that also appear in result2. However, I have no idea how to split these pairs and attach each value to the matching line of result2 (ID1 with aaa, ID2 with bbb, and so on), so that I get
ID1 (sentence) aaa
ID2 (sentence) bbb
...
IDn (sentence) nnn
or something along those lines.
CodePudding user response:
Like this?
$ head f1.txt f2.txt
==> f1.txt <==
blahblah ID1 blahblah aaa
blahblah ID2 blahblah bbb
blahblah ID3 blahblah ccc
blahblah ID(n) blahblah nnn
==> f2.txt <==
ID1 (some sentence 1)
ID2 (some sentence 2)
ID3 (some sentence 3)
IDn (some sentence n)
$ paste -d' ' f2.txt <(awk '{print $NF}' f1.txt)
ID1 (some sentence 1) aaa
ID2 (some sentence 2) bbb
ID3 (some sentence 3) ccc
IDn (some sentence n) nnn
Note that this relies on the assumption (which I have made) that the records appear in the same order in both files, so that line k of f1.txt and line k of f2.txt carry the same ID.
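If the two files cannot be assumed to be in the same order, one possible alternative (not part of the answer above) is to match on the ID with join instead of pasting line by line. The following is a minimal sketch under stated assumptions: the sample data is made up, the ID is taken to be the second column of f1.txt, and temporary files are used instead of `<(...)` so that it also runs in a plain POSIX sh.

```shell
# Hypothetical sample data; f2.txt is deliberately in a different order.
tmp=$(mktemp -d)
printf '%s\n' 'blahblah ID1 blahblah aaa' \
              'blahblah ID2 blahblah bbb' \
              'blahblah ID3 blahblah ccc' > "$tmp/f1.txt"
printf '%s\n' 'ID3 (some sentence 3)' \
              'ID1 (some sentence 1)' \
              'ID2 (some sentence 2)' > "$tmp/f2.txt"

# join needs both inputs sorted on the join field (field 1 by default).
sort -k1,1 "$tmp/f2.txt" > "$tmp/f2.sorted"

# Assumption: the ID is column 2 of f1.txt. Keep only "ID last-field".
awk '{print $2, $NF}' "$tmp/f1.txt" | sort -k1,1 > "$tmp/f1.keyed"

# Output is: join field, rest of the f2 line, then the value from f1.
joined=$(join "$tmp/f2.sorted" "$tmp/f1.keyed")
printf '%s\n' "$joined"

rm -rf "$tmp"
```

With this sample data the output comes out sorted by ID rather than in the original order of f2.txt, which may or may not matter for your use case. Lines whose ID appears in only one of the files are dropped by default.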
CodePudding user response:
Assumptions:
- result1 has space-separated columns, and the strings aaa ... nnn are in the last column.
- IDn in result1 consists of the literal string ID followed by digits.
- IDn in result2 is located in the first column.
Then would you please try:
awk '
NR==FNR {
    if (match($0, /ID[0-9]+/)) {
        id = substr($0, RSTART, RLENGTH)
        a[id] = $NF
    }
    next
}
{
    print $0, a[$1]
}
' result1 result2
- The NR==FNR { .. ; next} block is an idiom that is executed only for the file given as the first argument (result1 in this case).
- The function match($0, /ID[0-9]+/) returns true if a substring of the record matches the string ID followed by digits, setting the awk variables RSTART and RLENGTH to the starting position and the length of the match, respectively.
- substr($0, RSTART, RLENGTH) extracts the substring IDn, where n is the digits.
- a[id] = $NF associates the last part (e.g. aaa) with the id.
- The {print $0, a[$1]} block is executed for result2 only.
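To see the lookup approach end to end, here is a small self-contained sketch on made-up data. The file contents, the multi-digit ID10, and the unmatched ID99 are my own assumptions for illustration. It also differs from the script above in one respect: it checks `$1 in a` before printing, so a line of result2 whose ID is missing from result1 is printed unchanged instead of with a trailing space.

```shell
# Hypothetical sample data standing in for the two command outputs.
tmp=$(mktemp -d)
printf '%s\n' 'blahblah ID1 blahblah aaa' \
              'blahblah ID2 blahblah bbb' \
              'blahblah ID10 blahblah jjj' > "$tmp/result1"
printf '%s\n' 'ID1 (some sentence 1)' \
              'ID2 (some sentence 2)' \
              'ID10 (some sentence 10)' \
              'ID99 (no match)' > "$tmp/result2"

merged=$(awk '
    # First file: remember the last field under the ID found in the line.
    NR==FNR {
        if (match($0, /ID[0-9]+/))
            a[substr($0, RSTART, RLENGTH)] = $NF
        next
    }
    # Second file: append the stored value only if the ID is known.
    { print $0 (($1 in a) ? " " a[$1] : "") }
' "$tmp/result1" "$tmp/result2")
printf '%s\n' "$merged"

rm -rf "$tmp"
```

Because the lookup is keyed on the ID, the order of lines in result1 does not matter, and the output keeps the order of result2.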