Say I have strings like these:
1. ACTNOWQUICK3 1234.56 1234.98 HYE912630964589376 PLUTO THEATRE OTHER WUN Cool Beans KIng
2. Cash WithdrawalATM 50.00 ABC 1111.22 23523455A
3. ACTNOWQUICK 76.53 653.24 HYE91234234589376 WiN OTHR JOHNKLING
I need to extract a pattern from each string so that I get everything before the first numeric value, everything after the second one, and the two numeric values themselves. Note that it's guaranteed that there will be only 2 numeric int/decimal values in the string, each with a space before and after.
This is what I have tried, but it's not giving me the expected output:
pattern = '(.*)([0-9]*[,.][0-9]*).*([0-9]*[,.][0-9]*)(.*)'
What I expected:
1. "ACTNOWQUICK3", 1234.56, 1234.98, "HYE912630964589376 PLUTO THEATRE OTHER WUN Cool Beans KIng"
2. "Cash WithdrawalATM", 50.00, 1111.22, "23523455A"
3. "ACTNOWQUICK", 76.53, 653.24, "HYE91234234589376 WiN OTHR JOHNKLING"
CodePudding user response:
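You're using a greedy quantifier: the first (.*) grabs as much of the string as it can and leaves only fragments for the number groups. As Michael recommends, just change the first two .* to lazy by adding a ? after each. Also add a literal space after the first group and before the last one.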
pattern = '(.*?) ([0-9]+[,.][0-9]+).*?([0-9]+[,.][0-9]+) (.*)'
This works because a lazy quantifier repeats as few times as possible, so the first group stops right before the first number instead of swallowing part of it.
Test here: https://regex101.com/r/PVR6bd/1
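As a rough sketch of how the lazy pattern could be applied to the three sample strings, assuming Python's re module (the question does not name the language). The names PATTERN and split_line are made up for this illustration, and treating a comma as a decimal separator is an assumption.

```python
import re

# Lazy prefix, two decimal numbers, then everything after the second number.
PATTERN = re.compile(r'(.*?) ([0-9]+[,.][0-9]+).*?([0-9]+[,.][0-9]+) (.*)')

def split_line(line):
    """Return (prefix, first_number, second_number, suffix), or None if no match."""
    m = PATTERN.match(line)
    if m is None:
        return None
    prefix, first, second, suffix = m.groups()
    # Assumption: a comma, if present, is a decimal separator, so normalise it.
    to_float = lambda s: float(s.replace(',', '.'))
    return prefix, to_float(first), to_float(second), suffix

samples = [
    "ACTNOWQUICK3 1234.56 1234.98 HYE912630964589376 PLUTO THEATRE OTHER WUN Cool Beans KIng",
    "Cash WithdrawalATM 50.00 ABC 1111.22 23523455A",
    "ACTNOWQUICK 76.53 653.24 HYE91234234589376 WiN OTHR JOHNKLING",
]

for s in samples:
    print(split_line(s))
```

Note that this pattern requires a decimal separator in both numbers, so a plain integer such as 50 would not match. If integers can occur, the fractional part would have to be made optional, and the pattern would then need re-testing against tokens like ACTNOWQUICK3.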