RegEx Match Digits only between 2 strings across multiple lines

Time:12-04

I have the text file below:

data:<SupplierParty 
data:xmlns="xxx">
data:  <cbc:CustomerAssignedAccountID schemeID="vendor-id">
data:    20750
data:  </cbc:CustomerAssignedAccountID>
data:  <cbc:AdditionalAccountID schemeID="cashflow:v1">151</cbc:AdditionalAccountID>

data:<SupplierParty 
data:xmlns="xxx">
data:  <cbc:CustomerAssignedAccountID schemeID="vendor-id">
data:    20751
data:  </cbc:CustomerAssignedAccountID>
data:  <cbc:AdditionalAccountID schemeID="cashflow:v1">151</cbc:AdditionalAccountID>

data:<SupplierParty 
data:xmlns="xxx">
data:  <cbc:CustomerAssignedAccountID schemeID="vendor-id">
data:    20752
data:  </cbc:CustomerAssignedAccountID>
data:  <cbc:AdditionalAccountID schemeID="cashflow:v1">151</cbc:AdditionalAccountID>

I only want to extract these values from the file:

20750
20751
20752

The closest I got was:

(?<=vendor-id"\>)(.*?)(?=\<\/cbc:CustomerAssignedAccountID)

But this extracts:

data:    20751
data:  

I want digits only.

How do I do this?

CodePudding user response:

I don't know which language you are using, but you can try the regex below:

(data:\s*<cbc:.*?>\s*)data:\s*(\d+)\s*(?=data:\s*</cbc:.*?>)

These are the matches:

data:  <cbc:CustomerAssignedAccountID schemeID="vendor-id">
data:    20750
data:  <cbc:CustomerAssignedAccountID schemeID="vendor-id">
data:    20751
data:  <cbc:CustomerAssignedAccountID schemeID="vendor-id">
data:    20752

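The parentheses () create capture groups.

(\d+)  this group gives you the number you need

I don't know which language you are using, but you can easily extract the number by reading that group. It is group 2 in the pattern above: the first pair of parentheses is group 1, and the lookahead (?=...) does not capture.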
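
Since the question doesn't name a language, here is a minimal sketch in Python (my assumption, not necessarily what the asker uses) of reading that group. The `record` template and the `vendor_ids` name are made up for illustration; the sample text mirrors the file from the question.

```python
import re

# One record from the question, with a placeholder for the vendor id.
record = (
    'data:<SupplierParty \n'
    'data:xmlns="xxx">\n'
    'data:  <cbc:CustomerAssignedAccountID schemeID="vendor-id">\n'
    'data:    {vendor_id}\n'
    'data:  </cbc:CustomerAssignedAccountID>\n'
    'data:  <cbc:AdditionalAccountID schemeID="cashflow:v1">151</cbc:AdditionalAccountID>\n'
)
# Rebuild the three-record sample file, records separated by a blank line.
text = "\n".join(record.format(vendor_id=v) for v in (20750, 20751, 20752))

# Same regex as in the answer, with the + quantifier on \d.
pattern = re.compile(
    r'(data:\s*<cbc:.*?>\s*)data:\s*(\d+)\s*(?=data:\s*</cbc:.*?>)'
)

# Group 1 is the opening-tag line; group 2 holds only the digits.
vendor_ids = [m.group(2) for m in pattern.finditer(text)]
print(vendor_ids)  # ['20750', '20751', '20752']
```

The AdditionalAccountID value (151) is not picked up, because it sits on the same line as its tags and is not preceded by its own `data:` prefix.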

CodePudding user response:

I'd do it like this:

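vendor-id">[^<]*?(\d+)

The matches will be in capture group 1.
The ? after [^<]* is important: it makes that part non-greedy, so it stops at the first digit and leaves the whole number for (\d+). Without it, [^<]* would consume most of the digits and the group would capture only the last one.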

https://regex101.com/r/e3eR6y/1
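
To see the effect of the non-greedy quantifier, here is a small Python sketch (Python is my assumption, the question doesn't name a language). It runs the pattern with and without the `?` on a cut-down version of the sample data; the variable names `lazy` and `greedy` are just for illustration.

```python
import re

# Cut-down sample: only the lines around the vendor id, as in the question.
text = "\n".join(
    'data:  <cbc:CustomerAssignedAccountID schemeID="vendor-id">\n'
    f'data:    {vendor_id}\n'
    'data:  </cbc:CustomerAssignedAccountID>\n'
    for vendor_id in (20750, 20751, 20752)
)

# Non-greedy: [^<]*? skips as little as possible before the digits start.
lazy = re.findall(r'vendor-id">[^<]*?(\d+)', text)

# Greedy: [^<]* runs up to the next '<' and backtracks only far enough
# to hand a single digit to (\d+).
greedy = re.findall(r'vendor-id">[^<]*(\d+)', text)

print(lazy)    # ['20750', '20751', '20752']
print(greedy)  # ['0', '1', '2']
```

With one capture group in the pattern, `re.findall` returns the group contents directly, so no extra step is needed to get at group 1.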
