I need to extract only the numbers that contain a decimal point from the following string. I used the re module but ran into trouble with the commas: a number may contain no commas, or more than one. Another problem is that the decimal numbers are followed directly by words (e.g. 1,513,971.63Savings). The string was extracted from PDF files, so I can't change the format.
sample string:
Date: 01-Mar-2022BETKA Br (0225)LIABILITIESCUSTOMER DEPOSITS 19,858,700.86Current Deposit12102010010165 350,745,799.38Saving Deposits12102010050170 174,381.98SB Bidhaba Bhata12102010060171 1,125,990.66SB Bayaska Bhata12102010070172 131,647.15SB Pratibandhy
expected output:
19,858,700.86
350,745,799.38
174,381.98
1,125,990.66
131,647.15
Can anyone help?
CodePudding user response:
I guess you missed the 174,381.98. If so, use the pattern (\d+(?:[,.]\d+)+) to fetch the expected numbers: one or more digits, followed by one or more groups of a comma or dot and more digits.
import re
string = """Date: 01-Mar-2022BETKA Br (0225)LIABILITIESCUSTOMER DEPOSITS 19,858,700.86Current Deposit12102010010165 350,745,799.38Saving Deposits12102010050170 174,381.98SB Bidhaba Bhata12102010060171 1,125,990.66SB Bayaska Bhata12102010070172 131,647.15SB Pratibandhy"""
print(*re.findall(r"(\d+(?:[,.]\d+)+)", string), sep="\n")
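A pattern that accepts either a comma or a dot in every position would also match a comma-grouped integer such as 1,234, which has no decimal point. Since the question asks only for numbers with a decimal point, a stricter variant may be useful. The following is a minimal sketch, not part of the original answer: the names DECIMAL_NUMBER, extract_decimals, and to_amounts are hypothetical, and it assumes the dot is always the decimal separator and commas are only thousands separators.

```python
import re
from decimal import Decimal

# Hypothetical stricter pattern (an assumption, not the answer's pattern):
# digits, optional comma-separated digit groups, then a mandatory ".digits" part.
DECIMAL_NUMBER = re.compile(r"\d+(?:,\d+)*\.\d+")

def extract_decimals(text):
    """Return every substring that looks like a number with a decimal point."""
    return DECIMAL_NUMBER.findall(text)

def to_amounts(matches):
    """Strip the commas and convert each match to an exact Decimal."""
    return [Decimal(m.replace(",", "")) for m in matches]

string = """Date: 01-Mar-2022BETKA Br (0225)LIABILITIESCUSTOMER DEPOSITS 19,858,700.86Current Deposit12102010010165 350,745,799.38Saving Deposits12102010050170 174,381.98SB Bidhaba Bhata12102010060171 1,125,990.66SB Bayaska Bhata12102010070172 131,647.15SB Pratibandhy"""

matches = extract_decimals(string)
print(*matches, sep="\n")
```

Because the pattern does not require a word boundary after the fraction, trailing text such as "Savings" does not prevent a match. The long account numbers are skipped because they contain no dot.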