I have a space-delimited list in which the first column (the name) contains a varying number of spaces. I want to reverse sort this by the first number that appears after the name string. I need to do this using bash commands.
Example:
Pontiac Firebird 19.0 6 250.0 100.0 3282. 15.0 71 US
Pontiac J2000 SE Hatchback 31.0 4 112.0 85.00 2575. 16.2 82 US
Oldsmobile Delta 88 Royale 12.0 8 350.0 160.0 4456. 13.5 72 US
Oldsmobile Omega 11.0 8 350.0 180.0 3664. 11.0 73 US
AMC Gremlin 20.0 6 232.0 100.0 2914. 16.0 75 US
AMC Gremlin 21.0 6 199.0 90.00 2648. 15.0 70 US
Pontiac Lemans V6 21.5 6 231.0 115.0 3245. 15.4 79 US
Would turn into:
Oldsmobile Omega 11.0 8 350.0 180.0 3664. 11.0 73 US
Oldsmobile Delta 88 Royale 12.0 8 350.0 160.0 4456. 13.5 72 US
Pontiac Firebird 19.0 6 250.0 100.0 3282. 15.0 71 US
AMC Gremlin 20.0 6 232.0 100.0 2914. 16.0 75 US
AMC Gremlin 21.0 6 199.0 90.00 2648. 15.0 70 US
Pontiac Lemans V6 21.5 6 231.0 115.0 3245. 15.4 79 US
Pontiac J2000 SE Hatchback 31.0 4 112.0 85.00 2575. 16.2 82 US
I've tried sort -nr to see what happens, and it reverse sorts the list, but only with respect to its alphabetical order. I want to sort based on all values.
The trick is that I must keep it space delimited. What's the best way to do this using bash?
CodePudding user response:
"I must keep it space delimited"
You mean, the result has to be space delimited again, right? During processing, you can transform the input however you like.
Assuming you know a character that never otherwise appears in your file, use sed to delimit the value you want to sort by with that character, then sort by that value, then remove the additional delimiters again. (This process is basically a Schwartzian transform.)
Here we use the bell character \a to delimit the key for sorting. It is very unlikely that this character appears in a text file.
sed -E 's/ ([0-9]+\.[0-9]+) / \a\1\a /' | sort -t $'\a' -k2,2n | tr -d \\a
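As a minimal sketch of the decorate-sort-undecorate pipeline described above, here it is wrapped in a shell function and run on the sample data. The function name sort_by_first_decimal is hypothetical. It assumes the sort key is the first number with digits on both sides of the dot, and that the input contains no BEL characters. The BEL is passed in through a shell variable instead of sed's \a escape, so the sketch does not depend on that escape being supported, and LC_ALL=C makes sort read the dot as the decimal point.

```shell
# Decorate: wrap the first decimal number in BEL characters.
# Sort:     compare numerically on the BEL-delimited second field.
# Undecorate: delete the BEL characters again.
# Assumption: the input contains no BEL characters of its own.
sort_by_first_decimal() {
  bel=$(printf '\a')
  sed -E "s/ ([0-9]+\.[0-9]+) / ${bel}\1${bel} /" |
    LC_ALL=C sort -t "$bel" -k2,2n |
    tr -d "$bel"
}

# Demo on the sample data from the question.
sort_by_first_decimal <<'EOF'
Pontiac Firebird 19.0 6 250.0 100.0 3282. 15.0 71 US
Pontiac J2000 SE Hatchback 31.0 4 112.0 85.00 2575. 16.2 82 US
Oldsmobile Delta 88 Royale 12.0 8 350.0 160.0 4456. 13.5 72 US
Oldsmobile Omega 11.0 8 350.0 180.0 3664. 11.0 73 US
AMC Gremlin 20.0 6 232.0 100.0 2914. 16.0 75 US
AMC Gremlin 21.0 6 199.0 90.00 2648. 15.0 70 US
Pontiac Lemans V6 21.5 6 231.0 115.0 3245. 15.4 79 US
EOF
```

For descending order, change -k2,2n to -k2,2nr.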
CodePudding user response:
Here's a short Ruby program:
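ruby -e '
  puts IO.readlines(ARGV.shift, chomp: true)
    .map {|line|
      fields = line.split
      # Rejoin the name (everything before the last 8 fields) into one column.
      [fields[0..(fields.size - 9)].join(" ")] + fields[-8 .. -1]
    }
    .sort_by {|row| row[1].to_f}   # compare numerically, not as strings
    .map {|row| row.join(" ")}
    .join("\n")
' file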
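The Ruby program relies on every line ending in exactly eight space-separated data fields, so the sort key is always the eighth field from the end. The same idea can be sketched in shell, under that same assumption and the further assumption that the input contains no tab characters. The function name sort_by_eighth_from_end is hypothetical.

```shell
# Prefix each line with its eighth-from-last field and a tab, sort
# numerically on that prefix, then cut the prefix off again.
# Assumptions: every data line ends in exactly 8 data fields, and the
# input contains no tabs. Lines with fewer than 8 fields are skipped.
sort_by_eighth_from_end() {
  tab=$(printf '\t')
  awk 'NF >= 8 { print $(NF - 7) "\t" $0 }' |
    LC_ALL=C sort -t "$tab" -k1,1n |
    cut -f2-
}
```

Because $0 is printed unchanged, the original spacing of each line is preserved. It can be called as sort_by_eighth_from_end < file.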
CodePudding user response:
I would use GNU AWK for this as follows. Let the content of file.txt be
Pontiac Firebird 19.0 6 250.0 100.0 3282. 15.0 71 US
Pontiac J2000 SE Hatchback 31.0 4 112.0 85.00 2575. 16.2 82 US
Oldsmobile Delta 88 Royale 12.0 8 350.0 160.0 4456. 13.5 72 US
Oldsmobile Omega 11.0 8 350.0 180.0 3664. 11.0 73 US
AMC Gremlin 20.0 6 232.0 100.0 2914. 16.0 75 US
AMC Gremlin 21.0 6 199.0 90.00 2648. 15.0 70 US
Pontiac Lemans V6 21.5 6 231.0 115.0 3245. 15.4 79 US
then
awk 'BEGIN{FPAT="[0-9]*[.][0-9]*";PROCINFO["sorted_in"]="@ind_num_asc"}{arr[$1]=$0}END{for(i in arr){print arr[i]}}' file.txt
output
Oldsmobile Omega 11.0 8 350.0 180.0 3664. 11.0 73 US
Oldsmobile Delta 88 Royale 12.0 8 350.0 160.0 4456. 13.5 72 US
Pontiac Firebird 19.0 6 250.0 100.0 3282. 15.0 71 US
AMC Gremlin 20.0 6 232.0 100.0 2914. 16.0 75 US
AMC Gremlin 21.0 6 199.0 90.00 2648. 15.0 70 US
Pontiac Lemans V6 21.5 6 231.0 115.0 3245. 15.4 79 US
Pontiac J2000 SE Hatchback 31.0 4 112.0 85.00 2575. 16.2 82 US
Explanation: I inform GNU AWK that a field is 0 or more digits followed by a literal dot ([.]) followed by 0 or more digits (note: I assume that the first number always contains a dot and that the name column never does), and that array traversal should use @ind_num_asc, i.e. treat indices as numbers in ascending order, which is one of the Predefined Array Scanning Orders. For each line I add a pair to the array whose key is the first number ($1) and whose value is the whole line ($0). After going through all lines, I print the values from array arr in the order given by the selected array traversal.
(tested in gawk 4.2.1)
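One caveat of storing lines with arr[$1]=$0: the array holds one value per key, so two lines that share the same first number would leave only the last one in the output. A sketch that keeps such lines, and swaps the array for a plain sort pipeline, is below. It extracts the same key with match() instead of FPAT, assumes the input contains no tabs, and uses sort -s (stable sort, available in GNU and BSD sort but not required by POSIX) so that lines with equal keys stay in input order. The function name sort_keep_duplicates is hypothetical.

```shell
# Extract the first "digits . digits" token as the key, prefix it to the
# line with a tab, sort numerically and stably on it, then drop the prefix.
# Assumptions: no tabs in the input; lines without such a token are skipped.
sort_keep_duplicates() {
  tab=$(printf '\t')
  awk 'match($0, /[0-9]*[.][0-9]*/) {
         print substr($0, RSTART, RLENGTH) "\t" $0
       }' |
    LC_ALL=C sort -s -t "$tab" -k1,1n |
    cut -f2-
}
```

It can be called as sort_keep_duplicates < file.txt.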