I am trying to use a regex to capture the text on the line that follows a specific label. My raw data looks like this:
Total current charges (please see Current account details) $38,414.69
ID Number
1001166UNBEB
ACCOUNT SUMMARY
SVL0
BALANCE OVERDUE - PLEASE PAY IMMEDIATELY $42,814.80
I want to get the ID Number
My attempt is here:
ID_num = re.compile(r'[^ID Number[\r\n]+([^\r\n]+)]{12}')
The ID number is always 12 characters long and always appears on the line after "ID Number",
which is why I am specifying the length in my expression and trying to match the characters after that label.
But this is not working as desired.
Would anyone help me, please?
CodePudding user response:
Your regex is not working because of the [ ] at the beginning of the pattern; square brackets define a character set. So replace them with ( ).
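Your pattern would then look like: r'^ID Number[\r\n]+([^\r\n]{12})' (compile it with re.MULTILINE so that ^ matches at the start of each line, and keep {12} inside the group so that all 12 characters are captured).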
But you can simplify your pattern to: ID Number[\s]+(\w+)
\r\n is matched by \s, and digits and letters (plus the underscore) are matched by \w.
import re
s = """
Total current charges (please see Current account details) $38,414.69
ID Number
1001166UNBEB
ACCOUNT SUMMARY
SVL0
BALANCE OVERDUE - PLEASE PAY IMMEDIATELY $42,814.80
"""
print(re.findall(r"ID Number[\s]+(\w+)", s))
# ['1001166UNBEB']
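If you also want to enforce the 12-character length mentioned in the question, a stricter variant could look roughly like the sketch below. The helper name extract_id and the constant ID_PATTERN are made up for illustration, and the sketch assumes the ID consists only of letters, digits, or underscores and sits on the line directly after "ID Number".

```python
import re

# Hypothetical names for illustration; the 12-character length is taken from the question.
# re.MULTILINE lets ^ match at the start of every line, not only at the start of the string.
# [ \t]*\r?\n requires an actual line break (LF or CRLF) after the label.
# (\w{12})\b captures exactly 12 word characters and rejects longer tokens.
ID_PATTERN = re.compile(r"^ID Number[ \t]*\r?\n(\w{12})\b", re.MULTILINE)


def extract_id(text):
    """Return the 12-character ID found on the line after 'ID Number', or None."""
    match = ID_PATTERN.search(text)
    return match.group(1) if match else None


sample = """
Total current charges (please see Current account details) $38,414.69
ID Number
1001166UNBEB
ACCOUNT SUMMARY
"""

print(extract_id(sample))
```

Using re.search with a single capture group returns one value (or None) instead of a list, which may be more convenient when each document contains only one ID.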