Python regex for getting string with special characters

I want to extract specific names from a string.

The string is entered by the user, for example:

"price_to_earning   current_price * 0.8"

It could even be

"price_to_earning*current_price 0.8"

"price_to_earning *current_price/0.8"

How can I extract just "price_to_earning" and "current_price" from the above?

Currently, I'm using

    words = re.findall(r"\b\S+", raw_query)

but it gets

['price_to_earning', 'current_price', '0.8']

what I want is

['price_to_earning', 'current_price']

CodePudding user response：

Why not use a regex that matches runs of word characters other than digits, e.g. [^\d\W]+ ?

have a look at the demo here https://regex101.com/r/EbNQvm/1
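
A minimal sketch of that suggestion, assuming the pattern is applied with a + quantifier through re.findall. The helper name extract_names is made up for illustration and is not part of the original answer.

```python
import re

def extract_names(raw_query):
    # [^\d\W]+ matches one or more characters that are neither digits
    # nor non-word characters, i.e. runs of letters and underscores.
    return re.findall(r"[^\d\W]+", raw_query)

queries = [
    "price_to_earning   current_price * 0.8",
    "price_to_earning*current_price 0.8",
    "price_to_earning *current_price/0.8",
]
for q in queries:
    print(extract_names(q))
```

Each of the three inputs should print ['price_to_earning', 'current_price']. Note that a name containing a digit would be split at the digit, since digits are excluded from the character class.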

CodePudding user response：

You can specify the characters you want to keep and replace everything else with a space, for example:

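import re

s1 = "price_to_earning   current_price * 0.8"
s2 = "price_to_earning*current_price 0.8"
s3 = "price_to_earning *current_price/0.8"
for s in [s1, s2, s3]:
    # Replace anything that is not a letter or underscore with a space, then split on whitespace
    print(re.sub(r'[^a-zA-Z_]', ' ', s).split())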

Output

['price_to_earning', 'current_price']
['price_to_earning', 'current_price']
['price_to_earning', 'current_price']

CodePudding user response：

You can try matching only runs of letters and underscores:

words = re.findall(r"[a-zA-Z_]+", raw_query)
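
If the names may also contain digits after the first character (the question does not say whether they can), a variant is to match identifier-shaped tokens instead. This is only a sketch under that assumption; find_identifiers and the sample name pe_ratio2 are hypothetical.

```python
import re

def find_identifiers(raw_query):
    # \b            : start at a word boundary, so the "e" in "1e5" is not picked up
    # [A-Za-z_]     : first character must be a letter or underscore
    # \w*           : followed by any number of letters, digits, or underscores
    # re.ASCII keeps \w and \b limited to ASCII word characters
    return re.findall(r"\b[A-Za-z_]\w*", raw_query, flags=re.ASCII)

print(find_identifiers("price_to_earning *current_price/0.8"))
print(find_identifiers("pe_ratio2*current_price + 1e5/0.8"))
```

The first call should give the same two names as the other answers. The second keeps pe_ratio2 whole and skips both numeric literals.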