Aligning rows and columns of a file but skipping certain lines, using Perl

I have a file my_file.txt. File content:

$cat my_file.txt

unrelated first line
unrelated 2nd line

---------------------------------------------------------------------------------
Type Bird   Mammal   Fish   Reptile  Amphibian 
---------------------------------------------------------------------------------
T1   Age   Weight  Age   Weight   Age   Weight   Age   Weight   Age   Weight
    2   2.5            4 4.5           6  6.5              8  8.5        10  10.5
---------------------------------------------------------------------------------
T2   Age   Weight  Age   Weight   Age   Weight   Age   Weight   Age   Weight
         3500  3.3               134  4.4           59  5.5          6 6.6         7  7.7

I want to print it like this:

unrelated first line
unrelated 2nd line

---------------------------------------------------------------------------------
Type    Bird          Mammal        Fish         Reptile      Amphibian 
---------------------------------------------------------------------------------
T1   Age   Weight   Age  Weight   Age  Weight   Age Weight   Age   Weight
     2       2.5     4   4.5       6    6.5      8   8.5      10    10.5
---------------------------------------------------------------------------------
T2   Age   Weight   Age  Weight   Age  Weight  Age   Weight  Age   Weight
     3500   3.3     134    4.4     59    5.5    6     6.6     7     7.7

Basically:

I want to leave the first column (Type, T1, T2) untouched.
I want to leave the first 3 lines of the file untouched.

Note: this file is the output of another script that I have no control over, so my only option is to post-process my_file.txt. I also cannot copy-paste from my work computer, sorry. Any help is greatly appreciated. Thanks!

CodePudding user response：

It is a little hard to guess the real intention from the example alone, but I assume you want the numbers to line up better with the headings above them, or alternatively to adjust the amount of whitespace between the 10 numbers on each numeric line and leave every other line alone.

If that is the case, try a plain printf() with predefined field widths; you will need to find the right amount of whitespace between the conversions. Something like this (assuming @nums contains 10 numbers):

printf("    %2d     %2.2f    %2d  %2.2f      %2d  %2.2f   %4d  %2.2f      %2d    %2.2f\n", @nums);
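To make the effect of the field widths concrete, here is a minimal sketch that formats the two data rows from the question. The helper name format_row and the widths (%5d for Age, %6.1f for Weight, one decimal place) are assumptions chosen for illustration, not values taken from the original script; adjust them until the numbers sit under the headings.

```perl
use strict;
use warnings;

# Hypothetical layout: 4 leading spaces, then five "Age Weight" pairs.
# The widths below are illustrative assumptions; tune them to your header.
my $FORMAT = '    ' . join('  ', ('%5d %6.1f') x 5) . "\n";

# Format exactly 10 numbers (5 Age/Weight pairs) as one aligned row.
sub format_row {
    my @nums = @_;
    return sprintf($FORMAT, @nums);
}

print format_row(2, 2.5, 4, 4.5, 6, 6.5, 8, 8.5, 10, 10.5);
print format_row(3500, 3.3, 134, 4.4, 59, 5.5, 6, 6.6, 7, 7.7);
```

Because every conversion has a fixed width, both rows come out the same length regardless of how many digits each number has.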

For detecting and capturing 10 numbers on the same line, we could use some basic regular expressions. I guess some skeleton like the following would be enough.


use strict;
use warnings;

# A reusable regex pattern for capturing a number (integer or decimal)
my $numRegex = qr/\b [0-9]+ (?: \. [0-9]+ )? \b/x;

# Placeholder format with 10 conversions; tune the widths and spacing as needed
my $format = '    ' . join('   ', ('%4d %6.2f') x 5) . "\n";

while (my $line = <>) {
    # Detect whether the line contains only numbers separated by whitespace
    if ($line =~ m/^ (?: \s* $numRegex )+ \s* $/x) {
        # Extract all numbers (assuming the count matches the format)
        my @numbers = $line =~ m/( $numRegex )/gx;

        # Print those numbers in the desired format
        printf($format, @numbers);
    } else {
        # Other lines are not subject to reformatting
        print($line);
    }
}
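As a quick check of the detection step, the sketch below runs the same kind of pattern against a few lines shaped like those in the question. The sub names is_number_row and extract_numbers are hypothetical helpers introduced here for illustration only.

```perl
use strict;
use warnings;

# Integer or decimal number, delimited by word boundaries
my $num_re = qr/\b [0-9]+ (?: \. [0-9]+ )? \b/x;

# True (1) if the line holds only whitespace-separated numbers, else 0.
sub is_number_row {
    my ($line) = @_;
    return $line =~ m/^ (?: \s* $num_re )+ \s* $/x ? 1 : 0;
}

# Return every number found on the line, in order.
sub extract_numbers {
    my ($line) = @_;
    my @found = $line =~ m/( $num_re )/gx;
    return @found;
}

my @samples = (
    "unrelated 2nd line\n",
    "T1   Age   Weight  Age   Weight\n",
    "    2   2.5            4 4.5\n",
    "-----------------\n",
);

for my $line (@samples) {
    printf "%-5s %s", (is_number_row($line) ? 'data' : 'other'), $line;
}
```

Only the third sample is classified as a data row. The word boundaries keep "2nd" from being read as the number 2, which is why the unrelated lines pass through unchanged.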

CodePudding user response：

An idea using Perl formats:

#!/usr/bin/perl -a
use strict;
use warnings;

format div =
---------------------------------------------------------------------------------
.
format head1 =
@<<<<  @|||||||||||   @|||||||||||   @|||||||||||   @|||||||||||   @|||||||||||
@F
.
format head2 =
@<     @<<   @<<<<<   @<<   @<<<<<   @<<   @<<<<<   @<<   @<<<<<   @<<   @<<<<<
@F
.
format data =
     @>>>>  @>>>>>> @>>>>  @>>>>>> @>>>>  @>>>>>> @>>>>  @>>>>>> @>>>>  @>>>>>>
@F
.

   if ($. < 4)   { print }
elsif (/^---/)   { $~ = "div";   write }
elsif (/^Type/)  { $~ = "head1"; write }
elsif (/^T\d/)   { $~ = "head2"; write }
elsif (/^\s*\d/) { $~ = "data";  write }
else             { print }

-a fills @F from split input lines (since Perl 5.20 it also implies -n, which supplies the read loop; on older versions use -an)
$. is line number
$~ is current format
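To see what the picture lines do without setting up a full format, the sketch below renders single lines with formline(), the function that format and write are built on. The swrite helper follows the pattern shown in the perlform documentation; the picture used is the first Age/Weight pair of the data format above, without its leading spaces.

```perl
use strict;
use warnings;

# Render one picture line with formline() and return the text that
# accumulates in $^A (the format accumulator).
sub swrite {
    my ($picture, @values) = @_;
    $^A = "";
    formline($picture, @values);
    my $text = $^A;
    $^A = "";
    return $text;
}

# @<<< left-justifies, @>>> right-justifies, @||| centers. The field
# width is the length of the placeholder, including the leading "@".
print swrite('@>>>>  @>>>>>>', 2,    2.5), "\n";
print swrite('@>>>>  @>>>>>>', 3500, 3.3), "\n";
```

Both lines come out with the numbers right-aligned in a 5-character and a 7-character field, which is the same alignment the data format produces for each Age/Weight pair.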

With the input from the question, this outputs:

unrelated first line
unrelated 2nd line

---------------------------------------------------------------------------------
Type       Bird          Mammal          Fish         Reptile       Amphibian
---------------------------------------------------------------------------------
T1     Age   Weight   Age   Weight   Age   Weight   Age   Weight   Age   Weight
         2      2.5     4      4.5     6      6.5     8      8.5    10     10.5
---------------------------------------------------------------------------------
T2     Age   Weight   Age   Weight   Age   Weight   Age   Weight   Age   Weight
      3500      3.3   134      4.4    59      5.5     6      6.6     7      7.7