get the first 2 numbers after text in only specific lines for multiplication

Time:01-28

I have a file that I am thinning out so that it contains only the data I need. Some lines contain numbers that I need to either grab and put in another file so I can multiply them, or multiply in place, with the output going to a .csv. It might also help to put the data into proper columns.

This is a sample of the lines; I am going to do this on 42,000 lines, give or take. And yes, that is a Trumpf machine. :)

ELQADDXP.DAT-*test ADDXP 20GA ASTM A1011 0
ELQADDXP.DAT- 7.75000 14.00000
ELQADDXP.DAT- TRUMP 59.6517 0 3 4
ELQADDXQ.DAT-*1140242-0 ADDXQ 20GA ASTM A1011
ELQADDXQ.DAT- 7.75000 14.00000
ELQADDXQ.DAT- TRUMP 59.6517 0 3 4
ELQADDXR.DAT-*1140242-0A ADDXR 16GA ASTM A1011 0
ELQADDXR.DAT- 7.75000 14.00000
ELQADDXR.DAT- TRUMP 59.6517 0 3 4
ELQADDXS.DAT-*1139977-0 ADDXS 16GA ASTM A1011
ELQADDXS.DAT- 4.00000 8.64848
ELQADDXS.DAT- TRUMP 24.1015 0 3 4
ELQADDXT.DAT-*1137679-0 ADDXT 16GA ASTM A1011
ELQADDXT.DAT- 8.00000 15. .
ELQADDXT.DAT- TRUMP 71.1517 0 3 4
ELQADDXU.DAT-*1139617-0 ADDXU 11GA ASTM A1011
ELQADDXU.DAT- 6.37500 7.63330
ELQADDXU.DAT- TRUMP 30.1449 1 3 1044 0
ELQADDXV.DAT-*1140569-0 ADDXV 11GA ASTM A1011
ELQADDXV.DAT- 6.94190 35.50000
ELQADDXV.DAT- TRUMP 168.3770 1 3 1060 0
ELQADDXW.DAT-*1075665-9 ADDXW 11GA ASTM A1011 0
ELQADDXW.DAT- 10.60339 36.74345
ELQADDXW.DAT- TRUMP 335.6440 1 3 1060 0

On the lines with only two numbers, the numbers need to be multiplied together, and I need the result included in the .csv.

I tried grep -A1 - but this gets more than I need, since - is in every line. I also tried find . -regex '.*/[0-9] \myfile, but I don't need the other numbers. I assume there is an easy way that I just have not discovered yet.

I need all of the other data for the csv file, but I would like it to look something like this:

ELQADDXP.DAT-*test ADDXP 20GA ASTM A1011 0
ELQADDXP.DAT- 7.75000 14.00000 108.500
ELQADDXP.DAT- TRUMP 59.6517

CodePudding user response:

As indicated by Barmar, awk is best for what you are trying to do, and it is very straightforward (modified).

#!/bin/bash

input="input.txt"

cat >"${input}" <<"EnDoFiNpUt"
ELQADDXP.DAT-*test ADDXP 20GA ASTM A1011 0
ELQADDXP.DAT- 7.75000 14.00000
ELQADDXP.DAT- TRUMP 59.6517 0 3 4
ELQADDXQ.DAT-*1140242-0 ADDXQ 20GA ASTM A1011
ELQADDXQ.DAT- 7.75000 14.00000
ELQADDXQ.DAT- TRUMP 59.6517 0 3 4
ELQADDXR.DAT-*1140242-0A ADDXR 16GA ASTM A1011 0
ELQADDXR.DAT- 7.75000 14.00000
ELQADDXR.DAT- TRUMP 59.6517 0 3 4
ELQADDXS.DAT-*1139977-0 ADDXS 16GA ASTM A1011
ELQADDXS.DAT- 4.00000 8.64848
ELQADDXS.DAT- TRUMP 24.1015 0 3 4
ELQADDXT.DAT-*1137679-0 ADDXT 16GA ASTM A1011
ELQADDXT.DAT- 8.00000 15. .
ELQADDXT.DAT- TRUMP 71.1517 0 3 4
ELQADDXU.DAT-*1139617-0 ADDXU 11GA ASTM A1011
ELQADDXU.DAT- 6.37500 7.63330
ELQADDXU.DAT- TRUMP 30.1449 1 3 1044 0
ELQADDXV.DAT-*1140569-0 ADDXV 11GA ASTM A1011
ELQADDXV.DAT- 6.94190 35.50000
ELQADDXV.DAT- TRUMP 168.3770 1 3 1060 0
ELQADDXW.DAT-*1075665-9 ADDXW 11GA ASTM A1011 0
ELQADDXW.DAT- 10.60339 36.74345
ELQADDXW.DAT- TRUMP 335.6440 1 3 1060 0
EnDoFiNpUt

awk '{
    # A dimension line has exactly two numbers after the name field.
    # One sample line ("15. .") carries a stray "." as a fourth field.
    if( NF == 3 || ( NF == 4 && $4 == "." ) ){
        printf("%s %.5f %.5f %.5f\n", $1, $2, $3, $2*$3 ) ;
    }else{
        print $0 ;
    }
}' "${input}"

The output (modified) looks like this:

ELQADDXP.DAT-*test ADDXP 20GA ASTM A1011 0
ELQADDXP.DAT- 7.75000 14.00000 108.50000
ELQADDXP.DAT- TRUMP 59.6517 0 3 4
ELQADDXQ.DAT-*1140242-0 ADDXQ 20GA ASTM A1011
ELQADDXQ.DAT- 7.75000 14.00000 108.50000
ELQADDXQ.DAT- TRUMP 59.6517 0 3 4
ELQADDXR.DAT-*1140242-0A ADDXR 16GA ASTM A1011 0
ELQADDXR.DAT- 7.75000 14.00000 108.50000
ELQADDXR.DAT- TRUMP 59.6517 0 3 4
ELQADDXS.DAT-*1139977-0 ADDXS 16GA ASTM A1011
ELQADDXS.DAT- 4.00000 8.64848 34.59392
ELQADDXS.DAT- TRUMP 24.1015 0 3 4
ELQADDXT.DAT-*1137679-0 ADDXT 16GA ASTM A1011
ELQADDXT.DAT- 8.00000 15.00000 120.00000
ELQADDXT.DAT- TRUMP 71.1517 0 3 4
ELQADDXU.DAT-*1139617-0 ADDXU 11GA ASTM A1011
ELQADDXU.DAT- 6.37500 7.63330 48.66229
ELQADDXU.DAT- TRUMP 30.1449 1 3 1044 0
ELQADDXV.DAT-*1140569-0 ADDXV 11GA ASTM A1011
ELQADDXV.DAT- 6.94190 35.50000 246.43745
ELQADDXV.DAT- TRUMP 168.3770 1 3 1060 0
ELQADDXW.DAT-*1075665-9 ADDXW 11GA ASTM A1011 0
ELQADDXW.DAT- 10.60339 36.74345 389.60513
ELQADDXW.DAT- TRUMP 335.6440 1 3 1060 0
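
Since the question asks for a .csv, the same field-count test can be adapted to emit comma-separated fields. This is only a sketch under assumptions: to_csv is a hypothetical function name, fields are assumed to contain no commas or quotes, and rows keep their differing column counts rather than being padded to a fixed width.

```shell
# Hypothetical helper: reads the thinned-out lines from files or stdin
# and writes them comma-separated, adding the product on dimension lines.
to_csv() {
    awk -v OFS=',' '
    NF == 3 || (NF == 4 && $4 == ".") {
        # Dimension line: name, both numbers, and their product.
        printf("%s,%.5f,%.5f,%.5f\n", $1, $2, $3, $2 * $3)
        next
    }
    {
        $1 = $1   # assigning to a field makes awk rejoin the record with OFS
        print
    }
    ' "$@"
}
```

It could be used as to_csv input.txt > output.csv. A line such as ELQADDXP.DAT- 7.75000 14.00000 would come out as ELQADDXP.DAT-,7.75000,14.00000,108.50000.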

Also, if the field count alone might match the wrong lines (for example, a TRUMP line that has been trimmed to three fields), you can add extra conditions, such as:

if( NF == 3 && $0 !~ /ASTM/ && $0 !~ /TRUMP/ ){
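
An alternative to listing keywords is to check that fields 2 and 3 actually look like numbers. The following is a rough sketch, not the answer's own code: mul_dims is a hypothetical name, and the pattern only accepts plain unsigned decimals such as 7.75000 or 15. (the four-field "15. ." line would still need the NF == 4 test shown earlier).

```shell
# Hypothetical helper: multiply fields 2 and 3 only when both are numeric.
mul_dims() {
    awk '
    # Exactly three fields, and fields 2 and 3 are unsigned decimals.
    NF == 3 && $2 ~ /^[0-9]+\.?[0-9]*$/ && $3 ~ /^[0-9]+\.?[0-9]*$/ {
        printf("%s %.5f %.5f %.5f\n", $1, $2, $3, $2 * $3)
        next
    }
    # Everything else passes through unchanged.
    { print }
    ' "$@"
}
```

With this guard, a three-field line like ELQADDXP.DAT- TRUMP 59.6517 is printed as-is, because TRUMP does not match the numeric pattern.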

CodePudding user response:

I went a different route and used an awk script:

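{
    fc = substr($0, 1, 1)        # first character of the current line
    if (fc == "@")
    {
        getline                  # read the next line into $0 and re-split the fields
        print $1" "$3" "$4
        getline
        print $2" "$3, $2 * $3   # the two numbers and their product
        getline                  # plain getline (not "getline p") so $3 below comes from the new line
        print $3
    }
}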

gives

$ awk -f grabq.awk ELQADDXT.DAT
*1137679-0 16GA ASTM
8.00000 15.00000 120
71.1517

I just need a line to remove the * at the beginning.
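
One possible way to drop the leading * is awk's sub() function applied to the first field. This is a minimal sketch: strip_star is a hypothetical name, and it assumes the * is always the first character of field 1 and that fields are separated by single spaces.

```shell
# Hypothetical helper: remove one leading "*" from the first field of each line.
strip_star() {
    awk '{
        sub(/^\*/, "", $1)   # delete a "*" at the start of field 1, if present
        print
    }' "$@"
}
```

For example, piping the line *1137679-0 16GA ASTM through strip_star would print 1137679-0 16GA ASTM. The same sub() call could instead be placed just before the first print in the script above.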
