How to rearrange the columns using awk?

Time:08-09

I have a file with 120 columns. Here is part of it, showing 12 columns.

A1      B1     C1      D1       A2      B2     C2      D2       A3      B3      C3      D3     
4       4       5       2       3       3       2       1       9       17      25      33
5       6       4       6       8       2       3       5       3       1       -1      -3
7       8       3       10      13      1       4       9       -3      -15     -27     -39
9       10      2       14      18      0       5       13      -9      -31     -53     -75
11      12      1       18      23      -1      6       17      -15     -47     -79     -111
13      14      0       22      28      -2      7       21      -21     -63     -105    -147
15      16      -1      26      33      -3      8       25      -27     -79     -131    -183
17      18      -2      30      38      -4      9       29      -33     -95     -157    -219
19      20      -3      34      43      -5      10      33      -39     -111    -183    -255
21      22      -4      38      48      -6      11      37      -45     -127    -209    -291

I would like to rearrange it by bringing all A columns together (A1 A2 A3 A4) and similarly all Bs (B1 B2 B3 B4), Cs (C1 C2 C3 C4), Ds (D1 D2 D3 D4) together.

I am looking to print the columns as

A1 A2 A3 A4 B1 B2 B3 B4 C1 C2 C3 C4 D1 D2 D3 D4
 

My script is:

#!/bin/sh
sed -i '1d' input.txt
for i in {1..4};do
    j=$(( 1 + $(( 3 * $((  i - 1 )) ))  ))
    awk '{print $'$j'}' input.txt >> output.txt
done
for i in {1..4};do
    j=$(( 2 + $(( 3 * $((  i - 1 )) ))  ))
    awk '{print $'$j'}' input.txt >> output.txt
done
for i in {1..4};do
    j=$(( 3 + $(( 3 * $((  i - 1 )) ))  ))
    awk '{print $'$j'}' input.txt >> output.txt
done

It is printing all in one column.

CodePudding user response:

Is it just A,B,C,D,A,B,C,D all the way across? Something like this should work:

awk '{
    for (i=0; i<4; ++i) {  # i=0:A, i=1:B, etc.
       for (j=0; 4*j+i<NF; ++j) {
         printf "%s%s", $(4*j+i+1), OFS;
       }
    }
    print ""
}'
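
To see the loop above in action, here is a minimal self-contained sketch run on the first two lines of the sample. The function name regroup and the variable n (columns per group, assumed to be 4) are illustrative and not part of the answer. This variant steps through the fields with j += n and omits the trailing separator, but the column order it produces is the same.

```shell
#!/bin/sh
# regroup is a hypothetical helper name. It assumes the columns repeat
# as A B C D A B C D ... with n columns per group (n=4 here).
regroup() {
    awk -v n=4 '{
        sep = ""
        for (i = 1; i <= n; ++i)            # i = position inside a group
            for (j = i; j <= NF; j += n) {  # same position in every group
                printf "%s%s", sep, $j
                sep = OFS
            }
        print ""
    }' "$@"
}

# Header plus the first data row from the question.
printf '%s\n' \
    'A1 B1 C1 D1 A2 B2 C2 D2 A3 B3 C3 D3' \
    '4 4 5 2 3 3 2 1 9 17 25 33' | regroup
```

On that input it prints:

A1 A2 A3 B1 B2 B3 C1 C2 C3 D1 D2 D3
4 3 9 4 3 17 5 2 25 2 1 33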

CodePudding user response:

Here are two generic solutions that do not hard-code field numbers: the header names in Input_file can appear in any order, and the columns are sorted by them automatically. Written and tested in GNU awk with the shown samples.

1st solution: Read every line and store each field under its header name, then in the END block traverse the headers sorted by value and print the columns in that order.

awk '
FNR==1{
  for(i=1;i<=NF;i++){
     arrInd[i]=$i
  }
  next
}
{
  for(i=1;i<=NF;i++){
     value[FNR,arrInd[i]]=$i
  }
}
END{
  PROCINFO["sorted_in"]="@val_num_asc"
  for(i in arrInd){
     printf("%s%s",arrInd[i],i==length(arrInd)?ORS:OFS)
  }
  for(i=2;i<=FNR;i++){
     for(k in arrInd){
        printf("%s%s",value[i,arrInd[k]],k==length(arrInd)?ORS:OFS)
     }
  }
}
'   Input_file

Or, if you want the output in tabular format, make a small tweak to the above solution:

awk '
BEGIN { OFS="\t" }
FNR==1{
  for(i=1;i<=NF;i++){
    arrInd[i]=$i
  }
  next
}
{
  for(i=1;i<=NF;i++){
    value[FNR,arrInd[i]]=$i
  }
}
END{
  PROCINFO["sorted_in"]="@val_num_asc"
  for(i in arrInd){
    printf("%s%s",arrInd[i],i==length(arrInd)?ORS:OFS)
  }
  for(i=2;i<=FNR;i++){
    for(k in arrInd){
       printf("%s%s",value[i,arrInd[k]],k==length(arrInd)?ORS:OFS)
    }
  }
}
' Input_file | column -t -s $'\t'


2nd solution: Same concept as the 1st solution, but the header and the data rows are printed from a single loop in the END block, with a condition selecting the header on the first pass.

awk '
BEGIN { OFS="\t" }
FNR==1{
  for(i=1;i<=NF;i++){
    arrInd[i]=$i
  }
  next
}
{
  for(i=1;i<=NF;i++){
    value[FNR,arrInd[i]]=$i
  }
}
END{
  PROCINFO["sorted_in"]="@val_num_asc"
  for(i=1;i<=FNR;i++){
    if(i==1){
       for(k in arrInd){
          printf("%s%s",arrInd[k],k==length(arrInd)?ORS:OFS)
       }
    }
    else{
       for(k in arrInd){
          printf("%s%s",value[i,arrInd[k]],k==length(arrInd)?ORS:OFS)
       }
    }
  }
}
' Input_file | column -t -s $'\t'
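
The solutions above rely on PROCINFO["sorted_in"], which is specific to GNU awk. Where only a POSIX awk is available, the same header-driven idea can be sketched as follows. This is not the answer's code: it swaps the gawk traversal order for an explicit insertion sort on the header names, and the function name reorder_by_header is hypothetical. It compares headers as plain strings, so a name like A10 would sort before A2.

```shell
#!/bin/sh
# reorder_by_header is a hypothetical helper name. It reorders columns by
# header name using only POSIX awk features (no PROCINFO).
reorder_by_header() {
    awk '
    NR == 1 {
        # order[k] holds the input field number printed in output position k.
        for (i = 1; i <= NF; i++) { name[i] = $i; order[i] = i }
        # Insertion sort of the field numbers by header name.
        # Appending "" forces a string comparison.
        for (i = 2; i <= NF; i++) {
            f = order[i]
            for (j = i - 1; j >= 1 && (name[order[j]] "") > (name[f] ""); j--)
                order[j + 1] = order[j]
            order[j + 1] = f
        }
        n = NF
    }
    {
        # The header line and every data row use the same sorted order.
        for (k = 1; k <= n; k++)
            printf "%s%s", $(order[k]), (k < n ? OFS : ORS)
    }' "$@"
}

# Headers deliberately out of order to show the sorting.
printf '%s\n' 'B1 A1 B2 A2' '1 2 3 4' | reorder_by_header
```

On that input it prints:

A1 A2 B1 B2
2 4 1 3

Because the order is fixed after the header line, each row is printed as it is read and the file is not held in memory.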