There are many, many questions and quality answers on Stackoverflow regarding how to extract numbers from a string, however but I am not able to find for the one looking for, so i have a spreadsheet where in a column descriptions are there along with specific codes in each cell.
These are 6-digit codes which should be extracted to a separate column. I can do that with the help of filterxml function if there is only 1 code inside a cell. But what formula should I use if there are more than 1 code in each cell?
Refer below, I need to extract both codes.
Advance payment based on contract # 990|33 dated 19.04.22 Invoice # HF 450|39 324500 526 324719 965
Using this formula
=FILTERXML("<t><s>"&SUBSTITUTE(C3," ","</s><s>")&"</s></t>","//s[string-length()=6]")
so it needs to be 324500 & 324719 but it spills anything with string length of 6
CodePudding user response:
You may try anyone of the following,
• Formula used in cell B3
=AGGREGATE(15,6,MID(SUBSTITUTE(A1," ","x"),SEQUENCE(LEN(A1)-5),6) 0,{1,2})
Or, If you are using O365 & presently in Office Insiders, Beta Channel Version, then may try this as well
• Formula used in cell B2
=AGGREGATE(14,6,--TEXTSPLIT(A1," "),{2,1})
CodePudding user response:
One way would be to change the xPath argument to include a parameter for only numeric data.
//s[number(.)=. and string-length()=6]
eg:
=FILTERXML("<t><s>"&SUBSTITUTE(A1," ","</s><s>")&"</s></t>","//s[number(.)=. and string-length()=6]")
If any of the 6-digit numbers might have one or more leading zero's, then use the TEXT
function also:
=TEXT(FILTERXML("<t><s>"&SUBSTITUTE(A1," ","</s><s>")&"</s></t>","//s[number(.)=. and string-length()=6]"),"000000")