I have two types of text from which I need to extract the name:
1 #PRESTIGE COOKTOP 3B 6995.00/
3 #PRESTIGE SS DLX ALPHA 2250.00/
With the following expression I am able to extract the name from the first one but not from the second one:
(?'SrNo'\d+)\s+(?'Itemname'#([A-Z\s.*-]*)[\d]{1}[A-Z\s.*-]*)\s+(?'MRP'[0-9.]*)
CodePudding user response:
You can use
(?'SrNo'\d+)\s+(?'Itemname'#(\D*?)(?:\s+(\d[A-Z\s.*-]*?))?)\s+(?'MRP'\d[\d.]*)
See the regex demo. Details:
(?'SrNo'\d+) - Group "SrNo": one or more digits
\s+ - one or more whitespaces (use \h+ if you want to stay on the same line)
(?'Itemname'#(\D*?)(?:\s+(\d[A-Z\s.*-]*?))?) - Group "Itemname": "#", Group 3: zero or more non-digit chars as few as possible, then an optional occurrence of one or more whitespaces and then Group 4: a digit and then zero or more uppercase letters, whitespace, ".", "*" or "-" chars as few as possible (maybe ".*" should be removed from the class if you meant to match any text, or just replace [...] here with .*? to match any text)
\s+ - one or more whitespaces (use \h+ if you want to stay on the same line)
(?'MRP'\d[\d.]*) - Group "MRP": a digit and then zero or more digits or dots.
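To check the pattern against both sample lines, here is a minimal sketch in Python. It rests on a few assumptions: the question does not say which regex engine is in use, Python's re module writes named groups as (?P<name>...) instead of (?'name'...), and the helper name parse_line is made up for illustration.

```python
import re

# Same pattern as above, with named groups in Python's (?P<name>...) spelling.
PATTERN = re.compile(
    r"(?P<SrNo>\d+)\s+"
    r"(?P<Itemname>#(\D*?)(?:\s+(\d[A-Z\s.*-]*?))?)\s+"
    r"(?P<MRP>\d[\d.]*)"
)

SAMPLES = [
    "1 #PRESTIGE COOKTOP 3B 6995.00/",
    "3 #PRESTIGE SS DLX ALPHA 2250.00/",
]


def parse_line(line):
    """Return the named groups as a dict, or None when the line does not match."""
    match = PATTERN.search(line)
    return match.groupdict() if match else None


results = [parse_line(sample) for sample in SAMPLES]
for result in results:
    print(result)
```

Both lines should yield an Itemname, "#PRESTIGE COOKTOP 3B" for the first and "#PRESTIGE SS DLX ALPHA" for the second. Note that Python's re does not support \h, so \s+ is kept here.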
CodePudding user response:
You have to remove ]{1}[ so that the two character classes merge into one. Otherwise the first of them, [\d]{1}, requires a digit at that position. In the first string 3B supplies it, but in the second string ALPHA does not start with a digit, so the match fails.
(?'SrNo'\d+)\s+(?'Itemname'#([A-Z\s.*-]*)[\d]{1}[A-Z\s.*-]*)\s+(?'MRP'[0-9.]*)
                                            ^^^^^
The updated pattern:
(?'SrNo'\d+)\s+(?'Itemname'#([A-Z\s.*-]*)[\dA-Z\s.*-]*)\s+(?'MRP'[0-9.]*)
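A rough way to see the difference is to run the original and the updated pattern side by side. The sketch below uses Python as an assumption (the question does not name the engine), so the named groups are written as (?P<name>...), and the names ORIGINAL, UPDATED and item_name are only illustrative.

```python
import re

# Original pattern: [\d]{1} forces a digit right after the first character class.
ORIGINAL = re.compile(
    r"(?P<SrNo>\d+)\s+(?P<Itemname>#([A-Z\s.*-]*)[\d]{1}[A-Z\s.*-]*)\s+(?P<MRP>[0-9.]*)"
)

# Updated pattern: the two classes are merged, so the digit becomes optional.
UPDATED = re.compile(
    r"(?P<SrNo>\d+)\s+(?P<Itemname>#([A-Z\s.*-]*)[\dA-Z\s.*-]*)\s+(?P<MRP>[0-9.]*)"
)

FIRST = "1 #PRESTIGE COOKTOP 3B 6995.00/"
SECOND = "3 #PRESTIGE SS DLX ALPHA 2250.00/"


def item_name(pattern, line):
    """Return the Itemname group, or None when the pattern does not match."""
    match = pattern.search(line)
    return match.group("Itemname") if match else None


for line in (FIRST, SECOND):
    print(item_name(ORIGINAL, line), "|", item_name(UPDATED, line))
```

With the original pattern the second line gives no match, while the updated pattern extracts a name from both lines.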