Home > front end >  sed using array search and replace last column on csv file
sed using array search and replace last column on csv file

Time:11-22

I'm looking a solution with sed command to search and replace strings on last column of csv file and here the search patterns I'm calling from an array. Below script looks for 3rd and 4th column which causes a mismatch in the output.

Here i need your help how i can tell sed to look only on the last column.

file1.txt

QCQP
TXTT
QCQT
YYTH

file2.txt

TTYY
JPEK
QCQC
TTYE

Original output.csv

ID,Name,Class,Context,Code
1,jack,6,QCQT,QCQP
2,john,5,QCQP,TXTT
3,jake,3,TTXX,QCQT
4,jone,3,TXTT,YYTH

Below is my script which I used for this setup, but here this sed command search for all occurrence instead of looking for the last column separated by comma.

filein=file1.txt
fileout=file2.txt
pre=$(cat $filein)
post=$(cat $fileout)
typeset -p pre post

for (( i=0; i<${#pre[@]};   i )); do
sed -i -e 's/'"${pre[$i]}"'/'"${post[$i]}"'/g' output.csv
done

Expected result

output.csv

ID,Name,Class,Context,Code
1,jack,6,QCQT,TTYY
2,john,5,QCQP,JPEK
3,jake,3,TTXX,QCQC
4,jone,3,TXTT,TTYE

Using awk command I'm able to figure out similiar occurance, but the below works with a single variable, also not with comma seperator but with array this fails.

awk -F "," '{gsub(c,d,$(NF)); print}' c=$a d=$b file.txt

Please advise

Thanks Jay

CodePudding user response:

I'd use for this. With GNU awk:

gawk '
  BEGIN {FS = OFS = ","}
  ARGIND == 1 {f1[FNR] = $1; next}
  ARGIND == 2 {map[f1[FNR]] = $1; next}
  {$NF = map[$NF]; print}
' file1.txt file2.txt original.csv
ID,Name,Class,Context,
1,jack,6,QCQT,TTYY
2,john,5,QCQP,JPEK
3,jake,3,TTXX,QCQC
4,jone,3,TXTT,TTYE

But with sed, you can dynamically build up a sed program using file1 and file, and than apply that to the original csv

sed "$(paste -d " " file1.txt file2.txt | sed 's/^/s:,/; s/ /$:,/; s/$/:/')" original.csv

Execute that piece-by-piece to see how it all fits together.

CodePudding user response:

This sed should work:

$ cat f
ID,Name,Class,Context,Code
1,jack,6,QCQT,QCQP
2,john,5,QCQP,TXTT
3,jake,3,TTXX,QCQT
4,jone,3,TXTT,YYTH

$ sed -r 's/,(QCQP|TXTT|QCQT|YYTH)$/,aaa/' f f
ID,Name,Class,Context,Code
1,jack,6,QCQT,aaa
2,john,5,QCQP,aaa
3,jake,3,TTXX,aaa
4,jone,3,TXTT,aaa
ID,Name,Class,Context,Code
1,jack,6,QCQT,aaa
2,john,5,QCQP,aaa
3,jake,3,TTXX,aaa
4,jone,3,TXTT,aaa

But IMHO awk is better for this task.

  • Related