I have the following file called st.txt:
Item Type Amount Date
Petrol expense -160 2020-01-23
Electricity expense -200 2020-03-24
Electricity expense -200 2020-04-24
Trim line expense -50 2020-05-30
Martha Burns income 150 2021-03-11
Highbury shops income 300 2021-03-14
I want to sort the data by date and print all data except the first line. The following command works:
awk -F '\t' 'NR>1{print $4"\t"$1"\t"$2"\t"$3}' st.txt | sort -t"-" -n -k1 -k2 -k3
The output then is:
2020-01-23 Petrol expense -160
2020-03-24 Electricity expense -200
2020-04-24 Electricity expense -200
2020-05-30 Trim line expense -50
2021-03-11 Martha Burns income 150
2021-03-14 Highbury shops income 300
How can I write this command without rearranging the columns, so that the date field stays at $4? I tried the following, but it does not work:
awk -F '\t' 'NR>1{print $0}' st.txt | sort -t"-" -n -k 4,1 -k 4,2 -k 4,3
The dates are not sorted with this command.
The output should be:
Petrol expense -160 2020-01-23
Electricity expense -200 2020-03-24
Electricity expense -200 2020-04-24
Trim line expense -50 2020-05-30
Martha Burns income 150 2021-03-11
Highbury shops income 300 2021-03-14
CodePudding user response:
With GNU awk:
awk -F '\t' 'NR>1{a[$4]=$0} END{PROCINFO["sorted_in"] = "@ind_str_asc"; for(i in a){print a[i]}}' file
Output:
Petrol expense -160 2020-01-23
Electricity expense -200 2020-03-24
Electricity expense -200 2020-04-24
Trim line expense -50 2020-05-30
Martha Burns income 150 2021-03-11
Highbury shops income 300 2021-03-14
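One caveat with the array approach: a[$4]=$0 keeps only the last row seen for each date, so rows that share a date would be lost. The sample data has no duplicate dates, so it does not show up here. If that matters, a minimal sketch of a decorate-sort-undecorate pipeline that needs no GNU-specific features could look like the following. The function name sort_keep_dupes is made up for this illustration, and tab-separated input is assumed:

```shell
# sort_keep_dupes FILE (hypothetical helper, not part of any tool):
# print FILE without its header, ordered by the date in field 4,
# keeping every row even when several rows share the same date.
#
# Steps in the pipeline below:
#   1. awk "decorates" each data row with its date and line number.
#   2. sort orders by date as text, then by line number numerically,
#      so rows with equal dates keep their input order.
#   3. cut removes the two helper columns again.
sort_keep_dupes() {
    tab=$(printf '\t')    # portable way to get a literal tab character
    awk -F '\t' 'NR > 1 { print $4 "\t" NR "\t" $0 }' "$1" |
        sort -t "$tab" -k1,1 -k2,2n |
        cut -f 3-
}
```

It would be called as sort_keep_dupes st.txt. The columns of each printed row stay in their original order, with the date still in the fourth field.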
CodePudding user response:
Assuming the fields in your input file are tab-separated as your code suggests they are:
$ tail -n +2 file | sort -t$'\t' -k4
Petrol expense -160 2020-01-23
Electricity expense -200 2020-03-24
Electricity expense -200 2020-04-24
Trim line expense -50 2020-05-30
Martha Burns income 150 2021-03-11
Highbury shops income 300 2021-03-14
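This works without -n because dates in YYYY-MM-DD form sort correctly as plain text, so there is no need to split them on "-". A small self-contained sketch of the same idea, written so it also runs in shells that lack the $'\t' syntax, is below. The function name sort_by_date and the temporary sample file are only for illustration, and -k4,4 is used to limit the key to the fourth field explicitly:

```shell
# sort_by_date FILE (hypothetical helper name):
# print FILE without its header line, ordered by the date in the
# 4th tab-separated field.
sort_by_date() {
    tab=$(printf '\t')    # portable way to get a literal tab character
    # tail -n +2 starts output at line 2, which drops the header.
    # -k4,4 restricts the sort key to field 4 only.
    tail -n +2 "$1" | sort -t "$tab" -k4,4
}

# Example with a few rows from the question, deliberately out of order.
sample=$(mktemp)
printf 'Item\tType\tAmount\tDate\n'              >  "$sample"
printf 'Martha Burns\tincome\t150\t2021-03-11\n' >> "$sample"
printf 'Petrol\texpense\t-160\t2020-01-23\n'     >> "$sample"
printf 'Trim line\texpense\t-50\t2020-05-30\n'   >> "$sample"
sort_by_date "$sample"
rm -f "$sample"
```

Since the date is the last field, -k4 and -k4,4 behave the same here. The explicit end position only matters if more columns are added after the date later.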