I have a file contains both name and numbers like : data.csv
2016,Bmw,M2,2 Total score:24
1998,Subaru,Legacy,23 Total score:62
2012,Volkswagen,Golf,59 Total score:28
2001,Dodge,Viper,42 Total score:8
2014,Honda,Accord,83 Total score:112
2015,Chevy,Camaro,0 Total score:0
2008,Honda,Accord,88 Total score:48
Total score is last column I did :
awk -F"," 'NR>1{{for(i=4;i<=6; i)printf $i""FS }
{sum=0; for(g=8;g<=NF;g )
sum =$g
print $i,"Total score:"sum ; print ""}}' data.csv
when i try
awk -F"," 'NR>1{{for(i=4;i<=6; i)printf $i""FS }
{sum=0; for(g=8;g<=NF;g )
sum =$g
print $i,"Total score:"sum ; print "" | "sort -k1,2n"}}' data.csv
It gave me error, I only want to sort total score column, is there anything I did it wrong? Any helps are appreciated
CodePudding user response:
First, assuming there are really not blank lined in between each line of data in data.csv
, all you need is sort
, you don't need awk
at all. For example, since there is only ':'
before the total score you want to sort descending by, you can use:
sort -t: -k2,2rn data.csv
Where -t:
tells sort to use ':'
as the field separator and then the KEYDEF -k2,2rn
tell sort to use the 2nd field (what's after the ':'
to sort by) and the rn
says use a reverse numeric sort on that field.
Example Use/Output
With your data (without blank lines) in data.csv
, you would have:
$ sort -t: -k2,2rn data.csv
2014,Honda,Accord,83 Total score:112
1998,Subaru,Legacy,23 Total score:62
2008,Honda,Accord,88 Total score:48
2012,Volkswagen,Golf,59 Total score:28
2016,Bmw,M2,2 Total score:24
2001,Dodge,Viper,42 Total score:8
2015,Chevy,Camaro,0 Total score:0
Which is your sort by Total score
in descending order. If you want ascending order, just remove the r
from -k2,2rn
.
If you do have blank lines, you can remove them before the sort with sed -i '/^$/d' data.csv
.
Sorting by number Before "Total score"
If you want to sort by the number that begins the XX Total score: yy
field (e.g. XX
), you can use sort with the field separator being a ','
and then your KEYDEF would be -k4.1,4.3rn
which just says sort using the 4th field character 1 through character 3 by the same reverse numeric, e.g.
sort -t, -k4.1,4.3rn data.csv
Example Use/Output
In this case, sorting by the number before Total score
in descending order would result in:
$ sort -t, -k4.1,4.3rn data.csv
2008,Honda,Accord,88 Total score:48
2014,Honda,Accord,83 Total score:112
2012,Volkswagen,Golf,59 Total score:28
2001,Dodge,Viper,42 Total score:8
1998,Subaru,Legacy,23 Total score:62
2016,Bmw,M2,2 Total score:24
2015,Chevy,Camaro,0 Total score:0
After posting the original solution I noticed it was ambiguous as which of the numbers on the 4th field you intended to sort on. In either case, here are both solutions. Let me know if you have further questions.