Extracting different types of data from a file and saving these values in variables

Time:07-11

I have a file like

water
{
    nu              1.69e-3;
    rho             8;
}
vapour
{
    rho             2;
}
right
{
    type            zeroGradient 6;
    value           uniform (40 10 0);  

}

left
{
    value           uniform (0 5 0);    
}

and I want to extract the values 1.69e-3, 8, 2, 40, and 5 from it and save each of them in a separate variable. To extract 8 and 2 and save them in variables, I use the following commands:

rhol=`grep rho file | awk '{print $NF}' | sed -e 's/;//g' | head -1`
rhog=`grep -m2 rho file | awk '{print $NF}' | sed -e 's/;//g' | tail -n1`

But I am having trouble extracting the scientific-notation value 1.69e-3 and the two other values, 40 and 5.

CodePudding user response:

This is what I think you want from looking at your sample:

  • for lines that do not contain the word uniform: the number on the line (what precedes the ;).
  • for lines that contain uniform: the number(s) between the parentheses, if not equal to 0.

You can do this with awk:

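# Lines containing "uniform": print every nonzero number inside ( ).
/uniform/ {
    numbers = gensub(/.*\((.*)\);/, "\\1", "g", $0)
    n = split(numbers, numbersarray, " ")
    for (i = 1; i <= n; i++) {        # indexed loop keeps the input order
        if (numbersarray[i] != 0) {
            print numbersarray[i]
        }
    }
    next                              # do not fall through to the rule below
}
# Other lines ending in ; (trailing blanks allowed), except zeroGradient lines.
/;[ \t]*$/ && ! /zeroGradient/ {
    t = length($2)
    print substr($2, 1, t - 1)        # drop the trailing ;
}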

Save this to file.awk and run it with awk -f file.awk inputfile.txt. Your awk must be GNU awk, because gensub is a gawk extension.


For lines with /uniform/

  • extract the numbers, i.e. whatever appears between ( ) on the line. gensub is used here; it works similarly to sed 's///'. In the regex, \( and \) match the literal parentheses in the text, while the unescaped ( ) define the capture group that is referenced later as \\1.
  • then split the numbers string into an array.
  • loop over the array; any number that != 0 is printed to the screen.

gensub details:

  • /.*\((.*)\);/
  • // delimit a pattern
  • .* will match anything (0 or more of any char)
  • \( the parenthesis after uniform
  • \); the closing parenthesis with the ;
  • (.*) anything between the parentheses, captured as group 1 so the replacement can refer to it as \\1.

For lines that end with ; but do not contain zeroGradient

  • print the second field in the line, removing the ; at the end.

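For the sample above, the numbers printed are shown below. Note that 10 in the right block is also nonzero, so it is printed as well; dropping it would need an extra rule.

1.69e-3
8
2
40
10
5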

CodePudding user response:

Here's a simple Awk parser which outputs assignments you can eval. You will want to make very sure the output is what you expect before you actually do that.

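awk 'BEGIN { insec=0 }
# A lone word outside any block is a section name.
/[^{]/ && NF==1 && !insec { sec=$1; next }
/[{]/ && sec { insec=1; next }
/[}]/ && insec { sec=""; insec=0; next }
# Inside a block: skip blank lines, strip the trailing ; and print name_key=value.
insec && NF && !/^[ \t]*value/ { sub(/;[ \t]*$/, ""); printf "%s_%s=%s\n", sec, $1, $NF }
insec && /^[ \t]*value/ {printf "%s=%s\n", sec, $4 }' file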

Save this as a script; running it on the file should produce something like

water_nu=1.69e-3
water_rho=8
vapour_rho=2
right_type=6
right=10
left=5

The value for right is clearly different from what you want, but you have not explained how we would know how to parse that, so I leave it to you to implement that logic.
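
As an illustration only, one rule that happens to give 40 and 5 for the sample is "take the first nonzero component of the vector". The question does not state this rule, so treat it as an assumption; the helper name first_nonzero is hypothetical as well.

```shell
# Hypothetical rule (not stated in the question): keep the first nonzero component.
first_nonzero() {
    printf '%s\n' "$1" | awk '{
        gsub(/[();]/, " ")            # turn "(40 10 0);" into plain fields
        for (i = 3; i <= NF; i++)     # fields 1 and 2 are "value" and "uniform"
            if ($i + 0 != 0) { print $i; exit }
    }'
}

right=$(first_nonzero '    value           uniform (40 10 0);  ')
left=$(first_nonzero '    value           uniform (0 5 0);    ')
echo "right=$right left=$left"
```

With the two value lines from the question this sets right to 40 and left to 5. If the real rule is different (for example, always the first or always the second component), only the loop needs to change.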

Once the results are what you want, you'd run it like

eval "$(yourscript)"

to have the shell evaluate the assignments it outputs.

As ever, be very paranoid before actually using eval. There is probably a much better way to do what you ask; I would suggest implementing the rest of the logic in Awk too if feasible.
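
One way to avoid eval altogether is to read the key=value lines in a loop and assign only the keys you expect. The sketch below uses a literal string as a stand-in for the parser output; the variable names nu, rhol and rhog are assumptions (the latter two are borrowed from the question).

```shell
# Stand-in for the parser output; in practice: assignments=$(yourscript)
assignments='water_nu=1.69e-3
water_rho=8
vapour_rho=2
right=10
left=5'

# Split each line at the first "=" and accept only known keys.
while IFS='=' read -r key val; do
    case $key in
        water_nu)   nu=$val ;;
        water_rho)  rhol=$val ;;
        vapour_rho) rhog=$val ;;
        right)      right=$val ;;
        left)       left=$val ;;
        *)          printf 'ignoring unexpected key: %s\n' "$key" >&2 ;;
    esac
done <<EOF
$assignments
EOF

echo "nu=$nu rhol=$rhol rhog=$rhog right=$right left=$left"
```

Because the loop reads from a here-document rather than a pipe, it runs in the current shell and the variables remain set afterwards. Nothing from the input file is ever executed as shell code.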
