Here's the full pipeline I'm developing so far:
cut -d ";" -f 1,2,3,5 merge_of_raw_data.csv | sort -t";" -k4 -r | cut -d ";" -f 1-3 | uniq
The last cut command produces something like this:
1 2 3
2 3 4
3 4 5
1 2 3
1 2 5
1 3 3
But then I would like to keep only the lines that are unique on the first two fields.
Using:
sort -k1,1 -k2,2 --unique
doesn't do what I want, because I need to keep the first occurrence of each pair, as the input is already sorted by date.
The expected output for this example would be:
1 2 3
2 3 4
3 4 5
1 3 3
CodePudding user response:
Running the following on the output of your last cut produces the lines you expect:
cat input.txt | sort -k1,1 -k2,2 --unique
However, the order differs from your expected output:
1 2 3
1 3 3
2 3 4
3 4 5
The output is sorted by the first column, then by the second, so the original date order is lost.
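If the original order has to be preserved, one option is to skip the second sort and let awk drop any line whose first two fields have already been seen. The following is only a sketch: input.txt and deduped.txt are hypothetical file names, and the sample data is space-separated as in the question (the real pipeline output is probably still semicolon-delimited, in which case awk would need -F';').

```shell
# Sample data standing in for the output of the last cut (assumed space-separated).
printf '%s\n' '1 2 3' '2 3 4' '3 4 5' '1 2 3' '1 2 5' '1 3 3' > input.txt

# Print a line only the first time its (field 1, field 2) pair appears.
# seen[$1,$2]++ is 0 on first sight, so the negation is true and the line prints;
# later lines with the same pair see a non-zero count and are skipped.
awk '!seen[$1,$2]++' input.txt > deduped.txt

cat deduped.txt
```

On the sample above this keeps 1 2 3, 2 3 4, 3 4 5 and 1 3 3, in input order. In the full pipeline, assuming the fields are still separated by semicolons at that point, the final uniq could be replaced with awk -F';' '!seen[$1,$2]++', since identical lines necessarily share their first two fields.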