I have a scenario where I have below data in file which need to be sorted based on date column , first line is the headers it should not be sorted
NAME|AGE|COURSE|DATES
v1|31|MC|12 JUL 2019
v2|33|MB|4 JUL 2019
v3|12|GG|13 JUL 2019
v4|21|JJ|7 JUL 2019
My code :
sort -n -k k4 /d/file.txt
This above code does not sort my data
Expected Output :
NAME|AGE|COURSE|DATES
v4|21|JJ|7 JUL 2019
v2|33|MB|4 JUL 2019
v1|31|MC|12 JUL 2019
v3|12|GG|13 JUL 2019
CodePudding user response:
The way do to this is with Command Grouping where you can extract the header from an input stream, print it, and consume the remaining data:
{
IFS= read -r header
echo "$header"
sort ...
} < file.txt
However, sorting dates with that format is tricky. Here's how you have to do it so the output is sorted chronologically. This assumes GNU sort:
$ cat file.txt # I added a couple of extra records
NAME|AGE|COURSE|DATES
v1|31|MC|12 JUL 2019
v2|33|MB|4 JUL 2019
v3|12|GG|13 JUL 2019
v4|21|JJ|7 JUL 2019
11|22|33|1 JUL 2020
aa|bb|cc|10 AUG 2019
$ {
IFS= read -r header
echo "$header"
sort -t'|' -n -s -k4 | sort -M -s -k 2,2 | sort -n -s -k 3,3
} < file.txt
NAME|AGE|COURSE|DATES
v2|33|MB|4 JUL 2019
v4|21|JJ|7 JUL 2019
v1|31|MC|12 JUL 2019
v3|12|GG|13 JUL 2019
aa|bb|cc|10 AUG 2019
11|22|33|1 JUL 2020
That uses the GNU sort "stable" option so you sort first by day, then by month, then by year.
CodePudding user response:
Borrowing @glennjackman's sample input, this will work using any versions of the mandatory Unix tools awk, sort, and cut:
$ awk '
BEGIN { FS="|"; OFS="\t" }
{
split($NF,d," ")
mthNr = (index("JANFEBMARAPRMAYJUNJULAUGSEPOCTNOVDEC",d[2]) 2)/3
print (NR>1), d[3], mthNr, d[1], NR, $0
}
' file.txt |
sort -k1,1n -k2,2n -k3,3n -k4,4n -k5,5n |
cut -f6-
NAME|AGE|COURSE|DATES
v2|33|MB|4 JUL 2019
v4|21|JJ|7 JUL 2019
v1|31|MC|12 JUL 2019
v3|12|GG|13 JUL 2019
aa|bb|cc|10 AUG 2019
11|22|33|1 JUL 2020