I have a semicolon-separated file with rows like the following, where the 3rd column holds several space-separated values that I need to sort:
file: h1.csv
Class S101-T1;3343-1-25310;3344-1-25446 3345-1-25691 3348-1-27681 3347-1-28453
Class S101-T2;3343-2-25310;3344-2-25446 3345-2-25691
Class S101-T1;3343-3-25310;3345-3-25691 3343-3-25314
Class S101-T2;3343-4-25310;3345-4-25691 3343-4-25314 3344-4-25314
Class S102-T1;3343-5-25310;3344-5-25446 3345-5-25691
So, expected output is:
Class S101-T1;3343-1-25310;3344-1-25446 3345-1-25691 3347-1-28453 3348-1-27681
Class S101-T2;3343-2-25310;3344-2-25446 3345-2-25691
Class S101-T1;3343-3-25310;3343-3-25314 3345-3-25691
Class S101-T2;3343-4-25310;3343-4-25314 3344-4-25314 3345-4-25691
Class S102-T1;3343-5-25310;3344-5-25446 3345-5-25691
My idea was to capture the 3rd column with awk, then sort it, and finally print the output, but I have only managed to capture the column. I have not succeeded in sorting it or in printing the desired output.
Here's the code I've got so far...
cat h1.csv | awk -F';' '{ gsub(" ","\n",$3); print $0 }'
I have also tried these (and some others that gave errors):
cat h1.csv | awk -F';' '{ gsub(" ","\n",$3); print $3 | "sort -u" }'
cat h1.csv | awk -F';' '{ gsub(" ","\n",$3); sort -u; print $3 }'
So, is it possible to do this, and if so, how? Any help is appreciated. Thanks.
CodePudding user response:
One option is to split the 3rd column on spaces and sort the resulting values with asort(), which is a GNU awk extension.
Then join the first 2 fields and the sorted values back together.
awk '
BEGIN{FS=OFS=";"}
{
  n=split($3, a, " ")
  asort(a)
  res = $1 OFS $2 OFS
  for (i = 1; i <= n; i++) {
    # Add a space only between values, so the field has no leading space
    res = res (i > 1 ? " " : "") a[i]
  }
  print res
}' file
Output
Class S101-T1;3343-1-25310;3344-1-25446 3345-1-25691 3347-1-28453 3348-1-27681
Class S101-T2;3343-2-25310;3344-2-25446 3345-2-25691
Class S101-T1;3343-3-25310;3343-3-25314 3345-3-25691
Class S101-T2;3343-4-25310;3343-4-25314 3344-4-25314 3345-4-25691
Class S102-T1;3343-5-25310;3344-5-25446 3345-5-25691
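Since asort() is only available in GNU awk, the script above will not run under other awk implementations such as mawk. A minimal sketch of the same idea in POSIX awk, swapping asort() for a hand-written insertion sort, might look like this. The function name sort_col3 is made up for illustration, and the values are compared as strings, which matches numeric order only while the leading numbers all have the same width (as they do in the sample).

```shell
# Hypothetical helper: sort the space-separated values in field 3 of each row.
# Reads the files given as arguments, or stdin if there are none.
sort_col3() {
  awk '
  BEGIN { FS = OFS = ";" }
  NF >= 3 {
    n = split($3, a, " ")
    # Insertion sort on a[1..n]; concatenating "" forces a string comparison.
    for (i = 2; i <= n; i++) {
      v = a[i]
      for (j = i - 1; j >= 1 && ("" a[j]) > ("" v); j--)
        a[j + 1] = a[j]
      a[j + 1] = v
    }
    # Rejoin the sorted values with single spaces.
    res = ""
    for (i = 1; i <= n; i++)
      res = res (i > 1 ? " " : "") a[i]
    $3 = res
  }
  { print }' "$@"
}

# Demo on the first sample row.
printf '%s\n' 'Class S101-T1;3343-1-25310;3344-1-25446 3345-1-25691 3348-1-27681 3347-1-28453' | sort_col3
```

With the file from the question this would be called as sort_col3 h1.csv.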
CodePudding user response:
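With GNU awk and the samples shown, you could try the following awk code. Note that it keys each value on its leading number (3344, 3345, etc.), so two values in the same row that share a leading number would collapse into one.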
awk '
BEGIN{
FS=OFS=";"
PROCINFO["sorted_in"] = "@val_num_asc"
}
{
nf=val=""
delete value
num=split($NF,arr," ")
for(i=1;i<=num;i++){
split(arr[i],arr2,"-")
value[arr2[1]]=arr[i]
}
for(i in value){
nf=(nf?nf " ":"")value[i]
}
$NF=nf
}
1
' Input_file
Explanation: the same code with detailed comments.
awk ' ##Start of the awk program.
BEGIN{ ##Start of the BEGIN section.
FS=OFS=";" ##Set both FS and OFS to ; here.
PROCINFO["sorted_in"] = "@val_num_asc" ##Make for(i in array) loops visit elements in ascending numeric order of their values (GNU awk only).
}
{
nf=val="" ##Reset the variables for each row.
delete value ##Clear the value array left over from the previous row.
num=split($NF,arr," ") ##Split the last field on spaces into arr; num is the number of pieces.
for(i=1;i<=num;i++){ ##Loop over every element of arr.
split(arr[i],arr2,"-") ##Split the current element on - so that arr2[1] holds its leading number, eg: 3344, 3345 etc.
value[arr2[1]]=arr[i] ##Store the whole element in value, indexed by its leading number.
}
for(i in value){ ##Loop over value in the sorted order set in BEGIN.
nf=(nf?nf " ":"")value[i] ##Append each element to nf, separated by a space.
}
$NF=nf ##Replace the last field with the sorted list in nf.
}
1 ##Print the current (edited) line.
' Input_file ##Name of the input file.
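For comparison, the question's original idea of handing the third column to sort can also be made to work, provided sort is run once per row rather than once for the whole file. A rough sketch in plain POSIX shell follows. It assumes exactly three semicolon-separated fields per row and no glob characters (such as *) in the values, and sort_col3_sh is a hypothetical name chosen for illustration.

```shell
# Hypothetical helper: read rows from stdin and sort field 3 row by row.
sort_col3_sh() {
  while IFS=';' read -r c1 c2 c3; do
    # $c3 is deliberately left unquoted so the shell splits it on spaces.
    # sort -n orders by the leading number; paste rejoins the lines with spaces.
    sorted=$(printf '%s\n' $c3 | LC_ALL=C sort -n | paste -s -d ' ' -)
    printf '%s;%s;%s\n' "$c1" "$c2" "$sorted"
  done
}

# Demo on the third sample row.
sort_col3_sh <<'EOF'
Class S101-T1;3343-3-25310;3345-3-25691 3343-3-25314
EOF
```

On the real file this would be run as sort_col3_sh < h1.csv. It starts a sort process for every row, so the awk versions are likely to be faster on large inputs.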