I need to extract the lines of text without header and upto the line ,,,,,Total
INPUT
Sr.No ,Description of Goods ,HSN ,,City ,Unit ,,Unit Price ,Amount (in BP)
1 ,29550762R3C0-Top- ST22E360 ,84439990 ,,1.00 ,Numbers ,,5000.00 ,5000.00
2 ,29550762R3C0-Bot- ST22E361 ,84439990 ,,1.00 ,Numbers ,,5000.00 ,5000.00
3 ,2945486609-G2 - Bot Stencil - ST22F199 ,84439990 ,,1.00 ,Numbers ,,5000.00 ,5000.00
4 ,2945457600-S1 - Top Stencil - ST22F224 ,84439990 ,,1.00 ,Numbers ,,5000.00 ,5000.00
5 ,2945457600-G2 - Bot Stencil - ST22F225 ,84439990 ,,1.00 ,Numbers ,,5000.00 ,5000.00
,,,,,Total ,,,25000.00 IN
,,,,,Discount ,,,0.00 3
,,,,Total Amount ,Before Tax ,,,"₹ 25,000.00 "
,,,,,IGST 0% ,,,₹ 0.00
,,,,Total Amount ,After Tax: ,,,"₹ 25,000.00 "
EXPECTED OUTPUT
1 ,29550762R3C0-Top- ST22E360 ,84439990 ,,1.00 ,Numbers ,,5000.00 ,5000.00
2 ,29550762R3C0-Bot- ST22E361 ,84439990 ,,1.00 ,Numbers ,,5000.00 ,5000.00
3 ,2945486609-G2 - Bot Stencil - ST22F199 ,84439990 ,,1.00 ,Numbers ,,5000.00 ,5000.00
4 ,2945457600-S1 - Top Stencil - ST22F224 ,84439990 ,,1.00 ,Numbers ,,5000.00 ,5000.00
5 ,2945457600-G2 - Bot Stencil - ST22F225 ,84439990 ,,1.00 ,Numbers ,,5000.00 ,5000.00
I have tried with: =REGEXEXTRACT(A10,"(\n.*)")
, The number of lines are not known in advance and can vary from cell to cell. Some may have one and some have 5 to 6.
CodePudding user response:
With your shown samples Only, please try following solutions. Here is the
Formula in A3
:
=QUERY(FLATTEN(SPLIT(A1,CHAR(10))),"where Col1 matches '^\d. '")
CodePudding user response:
I would simply match the lines which start with digits,space,comma till the end of line:
/^\d *,.*$/gm
^ Start of line
\d One or more digits
* Zero or more spaces
, Comma
.* Any character
$ End of line
/gm Flags: Global; Multi line