Home > Blockchain >  Rearrange rows of a csv based on a column value
Rearrange rows of a csv based on a column value

Time:11-30

I am trying to rearrange the rows in a csv based on the key given in 5th column of my data. My data looks like this (test.csv):

Col A,Col B,Col C,Col D,Col E
A,Data 1,Category 1,Name 1,C
B,Data 2,Category 2,Name 2,C
C,Data 3,Category 3,Name 3,C
D,Data 4,Category 4,Name 4,C
E,Data 5,Category 5,Name 5,C
F,Data 6,Category 6,Name 6,C

I am trying to rearrange it so that the row containing the key value in first column is at top (in this case the key value is C)

Desired output :

Col A,Col B,Col C,Col D,Col E
C,Data 3,Category 3,Name 3,C
A,Data 1,Category 1,Name 1,C
B,Data 2,Category 2,Name 2,C
D,Data 4,Category 4,Name 4,C
E,Data 5,Category 5,Name 5,C
F,Data 6,Category 6,Name 6,C

I have written the below code and also getting the desired result doing so i am generating two temporary files , just wondering if there is a better solution :

sed 1d test.csv > input.csv 
key=`awk -F"," -v 'OFS=,' '{ print $5}' input.csv | uniq`
awk -F"," -v 'OFS=,' '{if($1 == "'$key'") print}' input.csv > temp.csv
cat temp.csv input.csv > temp2.csv

awk '!seen[$0]  ' temp2.csv > output.csv

sed -i '1iCol A,Col B,Col C,Col D,Col E' output.csv

Please help !

CodePudding user response:

Using GNU sed

$ sed -Ez 's/^([^\n]*\n)(.*\n)(([[:alpha:]])[^\n]*\4\n)/\1\3\2/woutput.csv' input_file
$ cat output.csv
Col A,Col B,Col C,Col D,Col E
C,Data 3,Category 3,Name 3,C
A,Data 1,Category 1,Name 1,C
B,Data 2,Category 2,Name 2,C
D,Data 4,Category 4,Name 4,C
E,Data 5,Category 5,Name 5,C
F,Data 6,Category 6,Name 6,C

CodePudding user response:

You could do something like this:

awk -F ',' '
    !seen[$0]   {
        if ( $1 == $5 || NR == 1 )
            print
        else
            arr[  n] = $0
    }
    END { for (i = 1; i <= n; i  ) print arr[i] }
' input.csv > output.csv

That'll bufferize the non-matching lines and output them at the end.

output

Col A,Col B,Col C,Col D,Col E
C,Data 3,Category 3,Name 3,C
A,Data 1,Category 1,Name 1,C
B,Data 2,Category 2,Name 2,C
D,Data 4,Category 4,Name 4,C
E,Data 5,Category 5,Name 5,C
F,Data 6,Category 6,Name 6,C
  • Related