I am trying to rearrange the rows in a csv based on the key given in 5th column of my data. My data looks like this (test.csv):
Col A,Col B,Col C,Col D,Col E
A,Data 1,Category 1,Name 1,C
B,Data 2,Category 2,Name 2,C
C,Data 3,Category 3,Name 3,C
D,Data 4,Category 4,Name 4,C
E,Data 5,Category 5,Name 5,C
F,Data 6,Category 6,Name 6,C
I am trying to rearrange it so that the row containing the key value in first column is at top (in this case the key value is C)
Desired output :
Col A,Col B,Col C,Col D,Col E
C,Data 3,Category 3,Name 3,C
A,Data 1,Category 1,Name 1,C
B,Data 2,Category 2,Name 2,C
D,Data 4,Category 4,Name 4,C
E,Data 5,Category 5,Name 5,C
F,Data 6,Category 6,Name 6,C
I have written the below code and also getting the desired result doing so i am generating two temporary files , just wondering if there is a better solution :
sed 1d test.csv > input.csv
key=`awk -F"," -v 'OFS=,' '{ print $5}' input.csv | uniq`
awk -F"," -v 'OFS=,' '{if($1 == "'$key'") print}' input.csv > temp.csv
cat temp.csv input.csv > temp2.csv
awk '!seen[$0] ' temp2.csv > output.csv
sed -i '1iCol A,Col B,Col C,Col D,Col E' output.csv
Please help !
CodePudding user response:
Using GNU sed
$ sed -Ez 's/^([^\n]*\n)(.*\n)(([[:alpha:]])[^\n]*\4\n)/\1\3\2/woutput.csv' input_file
$ cat output.csv
Col A,Col B,Col C,Col D,Col E
C,Data 3,Category 3,Name 3,C
A,Data 1,Category 1,Name 1,C
B,Data 2,Category 2,Name 2,C
D,Data 4,Category 4,Name 4,C
E,Data 5,Category 5,Name 5,C
F,Data 6,Category 6,Name 6,C
CodePudding user response:
You could do something like this:
awk -F ',' '
!seen[$0] {
if ( $1 == $5 || NR == 1 )
print
else
arr[ n] = $0
}
END { for (i = 1; i <= n; i ) print arr[i] }
' input.csv > output.csv
That'll bufferize the non-matching lines and output them at the end.
output
Col A,Col B,Col C,Col D,Col E
C,Data 3,Category 3,Name 3,C
A,Data 1,Category 1,Name 1,C
B,Data 2,Category 2,Name 2,C
D,Data 4,Category 4,Name 4,C
E,Data 5,Category 5,Name 5,C
F,Data 6,Category 6,Name 6,C