I have an Excel file with 40 components, which I converted (online) to a txt file so that I can process it with command-line tools. I want to extract the part numbers (6- or 7-digit numbers) from it. Some of them follow a specific pattern. I want to extract them and save them in a txt file. My input file:
list.txt
Product number 1 ac162049-2/slid||product|1971904|pgrid|119732683897|ptaid 1
Product number 2, its accessories 1-82/pcrid|5194541117|pkw|product|3418376|-SHOPPING 10
Product number 3 dip-40/dp/9761446 2
Expected output:
productnumber.txt
1971904
3418376
9761446
My code:
grep -Po '/\K.[0-9]+[1-9]' hardware\ components_prashant.txt > serialnumber.txt
Present output:
9761446
CodePudding user response:
From looking at your sample data, I believe the column delimiter is the pipe?
Assuming part number is column 1, QTY is column 8, you can do this to get it out
awk -F'|' '{ print $1, $8 }' list.txt > quantity.txt
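Before relying on fixed column positions, it may help to check how many pipe-separated fields each line actually has. A minimal sketch against the three sample lines from the question (the scratch directory `workdir` is a hypothetical stand-in for wherever the file lives):

```shell
#!/bin/sh
# Recreate the sample input from the question in a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/list.txt" <<'EOF'
Product number 1 ac162049-2/slid||product|1971904|pgrid|119732683897|ptaid 1
Product number 2, its accessories 1-82/pcrid|5194541117|pkw|product|3418376|-SHOPPING 10
Product number 3 dip-40/dp/9761446 2
EOF

# Print each line number followed by its count of pipe-separated fields.
counts=$(awk -F'|' '{ print NR ": " NF }' "$workdir/list.txt")
printf '%s\n' "$counts"
```

If the counts differ from line to line, a fixed field number such as `$8` will not point at the same kind of value on every row.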
CodePudding user response:
Is it just any six-or-seven digits with non-alphanumerics before and behind?
grep -Eo '\b[0-9]{6,7}\b' productnumber.txt
1971904
3418376
9761446
With -E (extended pattern matching), \b is a "word boundary"; c.f. this tutorial. You could also use \< and \>, as I did below.
[...] is a character class matching anything in the given set. A dash (-) indicates a range, so [0-9] is anything from zero to nine, inclusive. {...} specifies length limits, so {6,7} says a series of digits no less than six and no more than seven.
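As a minimal sketch of how those pieces behave together, the following recreates the three sample lines from the question and writes the matches to productnumber.txt. It assumes a grep that supports \b (GNU grep does); the scratch directory `workdir` is a hypothetical stand-in.

```shell
#!/bin/sh
# Recreate the sample input from the question in a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/list.txt" <<'EOF'
Product number 1 ac162049-2/slid||product|1971904|pgrid|119732683897|ptaid 1
Product number 2, its accessories 1-82/pcrid|5194541117|pkw|product|3418376|-SHOPPING 10
Product number 3 dip-40/dp/9761446 2
EOF

# -E: extended regex, -o: print only the matched text.
# The \b on each side rejects longer digit runs (119732683897)
# and digits glued to letters (ac162049).
grep -Eo '\b[0-9]{6,7}\b' "$workdir/list.txt" > "$workdir/productnumber.txt"

cat "$workdir/productnumber.txt"
```

The word boundaries do the filtering here: without them, {6,7} would also match the first seven digits of a longer number.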
If you wanted the fields you mentioned before: (...) is a capture group (stored for reuse as \1, \2, ...), and ^ at the start of a character class negates it, so:
sed -E 's/^ *([^0-9]+[0-9]+).*\<([0-9]{6,7})\>.* ([0-9]+) *$/\1|\2|\3/' productnumber.txt
Product number 1|1971904|1
Product number 2|3418376|10
Product number 3|9761446|2
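If \< and \> are not available in your sed, roughly the same three fields can be pulled out with awk. This is only a sketch under assumptions that hold for the sample lines but are not confirmed for the full file: the name is always the first three words, the quantity is always the last word, and the part number is the last standalone run of 6 or 7 digits. The `workdir` path is hypothetical.

```shell
#!/bin/sh
# Recreate the sample input from the question in a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/list.txt" <<'EOF'
Product number 1 ac162049-2/slid||product|1971904|pgrid|119732683897|ptaid 1
Product number 2, its accessories 1-82/pcrid|5194541117|pkw|product|3418376|-SHOPPING 10
Product number 3 dip-40/dp/9761446 2
EOF

result=$(awk '{
    name = $1 " " $2 " " $3      # assumption: name is the first three words
    sub(/,$/, "", name)          # drop a trailing comma, as in "2,"
    qty = $NF                    # assumption: quantity is the last word

    # Split the line into word tokens, then keep the last token that is
    # all digits and 6 or 7 characters long.
    part = ""
    n = split($0, tok, /[^A-Za-z0-9_]+/)
    for (i = 1; i <= n; i++)
        if (tok[i] ~ /^[0-9]+$/ && length(tok[i]) >= 6 && length(tok[i]) <= 7)
            part = tok[i]

    print name "|" part "|" qty
}' "$workdir/list.txt")

printf '%s\n' "$result"
```

Checking the token length explicitly avoids depending on {6,7} interval support, which some older awk implementations lack.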