Compare two files and update one with other in shell script-CodePudding

I have two files which are having same key column and based on that I have to update one file with other. Could you please help me to write logic for this in shell script?

Example:

src file:

 ------- ------- 
|call_id|call_nm|
 ------- ------- 
|    100|     QC|
|    105|     XY|
|    110|     NM|
|    115|     AB|
 ------- -------

lkp file:

 ------- ------- 
|call_id|call_nm|
 ------- ------- 
|    100|     QZ|
|    105|     XY|
|    106|     XZ|
|    115|     JQ|
 ------- -------

In above example I have to match both the files and update src_File based on the value in lkp_file. We have key_column - call_id.

First check key_column and then update the value in src_file. The final output should be like this-

src file:

 ------- ------- 
|call_id|call_nm|
 ------- ------- 
|    100|     QZ|
|    105|     XY|
|    110|     NM|
|    115|     JQ|
 ------- -------

You can see value of 100 & 115 have changed based on lkp_file values.

CodePudding user response：

Try this(back up your original file before exec this command):

grep -Eo '[0-9]{3,3}|[A-Z]{2,2}' lkp | xargs -n2 | xargs -I {} bash -c 'o=$(echo {}); n=(${o}); id=${n[0]}; nm=${n[1]}; sed -i "s/\($id|\ *\).*|/\1$nm|/" src'

'[0-9]{3,3}|[A-Z]{2,2}': extract id and nm
xargs -n2: create the pairs like 100 QZ
-I {} bash -c: handle the pairs to id and nm variable for sed command
"s/$$id|\ *$.*|/\1$nm|/": only suitable for your file format like | 100| QC|, just replace the nm before |, or you could use the pattern what you want such as [A-Z]

CodePudding user response：

Suggesting awk script:

`script.awk`

BEGIN     {FS = OFS = "|"} # set awk field seperator and output field seperator to |
NR == 1   {footer = $0} # read footer
NR < 4    {header = header ORS $0; next} # read and ignore header
FNR == NR {arr[$2] = $3; next} # read array from 1st file
$2 in arr {arr[$2] = $3} # update arra if found id in file2
END {  # output the formated array
  print header;
  asorti(arr,arrSorted); # sort array indexes in arrSorted
  for (i in arrSorted) { # for each element in sorted array
    if (arrSorted[i] == "") continue; # ignore empty element
    print OFS arrSorted[i] OFS arr[arrSorted[i]] OFS; # print current element
  }
  print footer;
}

running `script.awk`

awk script.awk src_file.txt lkp_file.txt

output:

 ------- ------- 
|call_id|call_nm|
 ------- ------- 
|    100|     QZ|
|    105|     XY|
|    110|     NM|
|    115|     JQ|
 ------- -------

Notice the output is correct and sorted in id order.

script.awk

running script.awk

output:

`script.awk`

running `script.awk`