regex: Match null or space from start as optional

Time:04-15

I want to treat a leading null or whitespace at the start of a line as optional. The lines are as follows:

 Date       Description  Amount
 
 null 12/05/2016 Asian Paints 2,150.65

   13/05/2016 Nerolac GEB 5.86 22,512.65 Cr

 14/05/2016 Hydra 12,412

The regex that I used is:

regex_null = re.compile(r"^(?:null)?\s+(\d{2}/\d{2}/\d{4})\s+(.*?)\s+(\d[\d,]*\.\d{2}\s+(?:Cr)?)$", re.M)

And what I got is:

 null 12/05/2016 Asian Paints 2,150.65

     13/05/2016 Nerolac GEB 5.86 22,512.65 Cr

So the null is not optional. It is currently considered compulsory. Can you please help me with this?

CodePudding user response:

You may use this regex with optional groups:

^\s*(?:null)?\s*(\d{2}/\d{2}/\d{4})\s+(.*?)\s+(\d[\d,]*(?:\.\d{2})?(\s+Cr)?)$

RegEx Details:

  • ^\s*(?:null)?\s*: Match optional null with 0 or more whitespaces on both sides
  • (\d{2}/\d{2}/\d{4}): Match date string in capture group #1
  • \s+: Match 1+ whitespaces
  • (.*?): Match 0 or more characters (lazily) in capture group #2
  • \s+: Match 1+ whitespaces
  • (\d[\d,]*: Start capture group #3 and match a digit followed by 0 or more digit/comma characters
  • (?:\.\d{2})?: Match optional dot and 2 digits
  • (\s+Cr)?): Match optional 1+ whitespaces followed by Cr, then close capture group #3
  • $: End of line
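
As a quick check, the pattern above can be run against the three sample lines from the question. This is only a minimal sketch: the variable names text, pattern, and rows are made up for illustration, and re.M is assumed so that ^ and $ anchor at every line.

```python
import re

# Sample lines from the question; leading whitespace and "null" vary per line.
text = """ null 12/05/2016 Asian Paints 2,150.65

   13/05/2016 Nerolac GEB 5.86 22,512.65 Cr

 14/05/2016 Hydra 12,412"""

# Pattern from the answer above, compiled in multiline mode.
pattern = re.compile(
    r"^\s*(?:null)?\s*(\d{2}/\d{2}/\d{4})\s+(.*?)\s+(\d[\d,]*(?:\.\d{2})?(\s+Cr)?)$",
    re.M,
)

# Keep only date, description, and amount (groups 1 to 3).
rows = [(m.group(1), m.group(2), m.group(3)) for m in pattern.finditer(text)]
for row in rows:
    print(row)
```

All three lines match. Note that on the second line the intermediate value 5.86 ends up inside the description group, because only the last number on the line is captured as the amount.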

CodePudding user response:

You may apply a regex pattern in multiline mode that makes the first, sixth, and seventh values in the line (the leading null, the second amount, and the trailing Cr) optional.

import re

inp = """ null 12/05/2016 Asian Paints 2,150.65

   13/05/2016 Nerolac GEB 5.86 22,512.65 Cr

 14/05/2016 Hydra 12,412"""

# Groups: null, date, description, first amount, second amount, trailing word (e.g. Cr)
lines = re.findall(r'^\s*(null)?\s*(\d{1,2}/\d{1,2}/\d{4}) (\w+(?: \w+)*) (\d{1,3}(?:,\d{3})*(?:\.\d+)?)?(?: (\d{1,3}(?:,\d{3})*(?:\.\d+)?))?(?: (\w+))?', inp, flags=re.M)
print(lines)

This prints:

[('null', '12/05/2016', 'Asian Paints', '2,150.65', '', ''),
 ('', '13/05/2016', 'Nerolac GEB', '5.86', '22,512.65', 'Cr'),
 ('', '14/05/2016', 'Hydra', '12,412', '', '')]
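
If positional tuples are hard to read, the same pattern can be written with named groups and collected into dictionaries. This is a sketch only: the group names (null, date, desc, amount1, amount2, suffix) are hypothetical labels chosen here and are not part of the original answer.

```python
import re

inp = """ null 12/05/2016 Asian Paints 2,150.65

   13/05/2016 Nerolac GEB 5.86 22,512.65 Cr

 14/05/2016 Hydra 12,412"""

# Same pattern as above, split across adjacent string literals and
# with each capture group given an illustrative name.
pattern = re.compile(
    r'^\s*(?P<null>null)?\s*(?P<date>\d{1,2}/\d{1,2}/\d{4}) (?P<desc>\w+(?: \w+)*)'
    r' (?P<amount1>\d{1,3}(?:,\d{3})*(?:\.\d+)?)?'
    r'(?: (?P<amount2>\d{1,3}(?:,\d{3})*(?:\.\d+)?))?'
    r'(?: (?P<suffix>\w+))?',
    flags=re.M,
)

# default='' mirrors what re.findall returns for groups that did not participate.
records = [m.groupdict(default='') for m in pattern.finditer(inp)]
for record in records:
    print(record)
```

Each record then exposes the optional fields by name, for example records[1]['suffix'] for the trailing Cr.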