Match a float and multiply by 100

I would like to extract a float from a coverage report and multiply it by 100 to report it as a percentage.

I am able to match the float using sed or grep and then multiply it using awk, but I was wondering if there is a more elegant single-tool solution, maybe with perl or awk only?

The line looks like this:

<coverage line-rate="0.34869999999999995" branch-rate="0.2777" >

My current solution:

sed -n 's/^<coverage line-rate="\([0-9\.]*\)".*$/\1/p' a.txt | awk '{print (100 * $1)}'
34.87

CodePudding user response：

With your shown samples, please try the following awk program. Written and tested in GNU awk.

awk '
match($0,/coverage line-rate="([^"]*)"/,arr){
  printf("%0.2f\n",arr[1] * 100)
}
' Input_file

Explanation: awk's match function is called with the regex coverage line-rate="([^"]*)", which matches the string coverage line-rate=" up to the next occurrence of " and saves the capture group's value into the array arr (the three-argument form of match is a GNU awk extension). printf with the format %0.2f\n then prints the value multiplied by 100, rounded to 2 decimal places.
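
Since this answer was written for GNU awk, here is a rough sketch of the same idea for other awk implementations, using only the two-argument match together with RSTART, RLENGTH and substr. The function name line_rate_percent is made up for illustration, and the sketch assumes the attribute is written exactly as line-rate="...".

```shell
# Hypothetical helper (name is an assumption): prints line-rate as a
# percentage with two decimals, reading from files or from stdin.
line_rate_percent() {
  awk '
    # Two-argument match() sets RSTART and RLENGTH in any POSIX awk
    match($0, /line-rate="[^"]*"/) {
      # Skip the 11 characters of line-rate=" and drop the closing quote
      value = substr($0, RSTART + 11, RLENGTH - 12)
      printf "%.2f\n", value * 100
    }
  ' "$@"
}
```

It can be called as line_rate_percent Input_file, or fed from a pipe.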

CodePudding user response：

A way to extract a number and process (multiply) it, with Perl

my $num = 100 * ( /^<coverage line-rate="([0-9]*\.[0-9]*)/ )[0];

The match operator (/../) needs to be in list context in order to return the matches, hence the () around it, and then we pick the first (and only) match with [0]. Another way is shown below.

This easily fits into a one-liner but it's unclear to me what you want with it. If you need to process a file with lines like this, and merely print results, for example

perl -wnE'say 100 * (/^<coverage line-rate="([0-9]*\.[0-9]*)"/)[0]' file

or, perhaps cleaner

perl -wnE'say 100 * $_ for /^<coverage line-rate="([0-9]*\.[0-9]*)"/' file

Here the list context is provided by the for loop and then we print all matches (the one, in this case), suitably multiplied.

If you need to actually change the input string, as done in the question, but inside a program, and you know what precision you want

# Keep four decimal places
s/^<coverage line-rate="([0-9]*\.[0-9]*).*/sprintf("%.4f", 100*$1)/e;

or if you don't care what precision it is and you'd leave it to the interpreter

s/^<coverage line-rate="([0-9]*\.[0-9]*).*/100*$1/e;

In both cases the /e modifier makes the replacement part be evaluated as code, so we can do some processing there. In the first case I use sprintf to format the replacement string as desired while in the other case it's just multiplied.

There is a potential complication here: how many digits to keep. If it should be the same number as in the original, then we first need to detect how many there were (your example seems to use full precision, but I take it that's not a given)

use warnings;
use strict;
use feature 'say';

# I shorten the number for demonstration
my $str = shift // q(<coverage line-rate="0.348691" branch-rate="0.2777" >);

sub process_num {
    my ($number, $frac) = @_;

    $number *= 100;
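    # Multiplying by 100 moves two digits in front of the decimal point
    my $frac_len = length($frac) - 2;
    $frac_len = 0 if $frac_len < 0;  # guard against fractions shorter than two digits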

    return sprintf "%.${frac_len}f", $number;
}

$str =~ s/^<coverage line-rate="([0-9]*\.([0-9]*)).*/process_num($1, $2)/e;

say $str;

Running this without arguments prints 34.8691 in place of the original line with 0.348691.
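
For comparison with the awk answers in this thread, the same digit-preserving idea can be sketched in awk. This is only an illustration under assumptions: the function name keep_digits_percent is hypothetical, and line-rate is assumed to be the first quoted attribute on the coverage line.

```shell
# Hypothetical helper (name is an assumption): multiplies line-rate by 100
# and keeps the original significant digits after the decimal point.
keep_digits_percent() {
  awk -F '"' '
    /^<coverage / {
      dot = index($2, ".")
      # Digits after the dot, minus the two that move in front of it
      digits = (dot > 0) ? length($2) - dot - 2 : 0
      if (digits < 0) digits = 0
      fmt = "%." digits "f\n"
      printf fmt, $2 * 100
    }
  ' "$@"
}
```

With the shortened sample line used above, this should print 34.8691.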

CodePudding user response：

Use " as field separator:

awk -F '"' '{print $2*100}' file

Output:

34.87
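
Note that the one-liner above multiplies the second "-separated field on every input line, so it relies on line-rate being the first quoted value and on the file holding only such lines. A minimal sketch that restricts it to coverage lines follows; the function name coverage_percent is made up for illustration, and the first-attribute assumption still applies.

```shell
# Hypothetical helper (name is an assumption): same field-separator trick,
# limited to lines that start with the coverage tag.
coverage_percent() {
  awk -F '"' '
    # Assumes line-rate is the first quoted attribute on the line
    /^<coverage / { print $2 * 100 }
  ' "$@"
}
```

Other lines in the report are then skipped instead of producing stray numbers.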

CodePudding user response：

Here is a gnu-awk solution:

awk '/^<coverage /{
   print 100 * gensub(/.* line-rate="([0-9.]+)".*/, "\\1", "1")}' file

34.87