Home > Software engineering >  Using awk and sort last column in descending order in Linux
Using awk and sort last column in descending order in Linux

Time:02-21

I have a file contains both name and numbers like : data.csv

2016,Bmw,M2,2 Total score:24

1998,Subaru,Legacy,23 Total score:62

2012,Volkswagen,Golf,59 Total score:28

2001,Dodge,Viper,42 Total score:8

2014,Honda,Accord,83 Total score:112

2015,Chevy,Camaro,0 Total score:0

2008,Honda,Accord,88 Total score:48

Total score is last column I did :

awk -F","  'NR>1{{for(i=4;i<=6;  i)printf $i""FS }
                                         {sum=0; for(g=8;g<=NF;g  )
                                         sum =$g
                                         print $i,"Total score:"sum ; print ""}}' data.csv

when i try

 awk -F","  'NR>1{{for(i=4;i<=6;  i)printf $i""FS }
                                          {sum=0; for(g=8;g<=NF;g  )
                                         sum =$g
                                          print $i,"Total score:"sum ; print "" | "sort -k1,2n"}}' data.csv

It gave me error, I only want to sort total score column, is there anything I did it wrong? Any helps are appreciated

CodePudding user response:

First, assuming there are really not blank lined in between each line of data in data.csv, all you need is sort, you don't need awk at all. For example, since there is only ':' before the total score you want to sort descending by, you can use:

sort -t: -k2,2rn data.csv

Where -t: tells sort to use ':' as the field separator and then the KEYDEF -k2,2rn tell sort to use the 2nd field (what's after the ':' to sort by) and the rn says use a reverse numeric sort on that field.

Example Use/Output

With your data (without blank lines) in data.csv, you would have:

$ sort -t: -k2,2rn data.csv
2014,Honda,Accord,83 Total score:112
1998,Subaru,Legacy,23 Total score:62
2008,Honda,Accord,88 Total score:48
2012,Volkswagen,Golf,59 Total score:28
2016,Bmw,M2,2 Total score:24
2001,Dodge,Viper,42 Total score:8
2015,Chevy,Camaro,0 Total score:0

Which is your sort by Total score in descending order. If you want ascending order, just remove the r from -k2,2rn.

If you do have blank lines, you can remove them before the sort with sed -i '/^$/d' data.csv.

Sorting by number Before "Total score"

If you want to sort by the number that begins the XX Total score: yy field (e.g. XX), you can use sort with the field separator being a ',' and then your KEYDEF would be -k4.1,4.3rn which just says sort using the 4th field character 1 through character 3 by the same reverse numeric, e.g.

sort -t, -k4.1,4.3rn data.csv

Example Use/Output

In this case, sorting by the number before Total score in descending order would result in:

$ sort -t, -k4.1,4.3rn data.csv
2008,Honda,Accord,88 Total score:48
2014,Honda,Accord,83 Total score:112
2012,Volkswagen,Golf,59 Total score:28
2001,Dodge,Viper,42 Total score:8
1998,Subaru,Legacy,23 Total score:62
2016,Bmw,M2,2 Total score:24
2015,Chevy,Camaro,0 Total score:0

After posting the original solution I noticed it was ambiguous as which of the numbers on the 4th field you intended to sort on. In either case, here are both solutions. Let me know if you have further questions.

  • Related