I want to extract specific data from a string.
The string is entered by the user, for example
"price_to_earning current_price * 0.8"
It could even be
"price_to_earning*current_price 0.8"
or
"price_to_earning *current_price/0.8"
How can I extract just "price_to_earning" and "current_price"
from the strings above?
Currently, I'm using
words = re.findall(r"\b\S+", raw_query)
but it returns
['price_to_earning', 'current_price', '0.8']
What I want is
['price_to_earning', 'current_price']
CodePudding user response:
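Why not use a regex that matches words without digits, e.g. [^\d\W]+ ?
The class [^\d\W] matches any word character that is not a digit, so letters and underscores.
Have a look at the demo here: https://regex101.com/r/EbNQvm/1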
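As a rough sketch of how this could look in Python (the helper name extract_names is made up for illustration, and the + quantifier is assumed so that whole words are matched rather than single characters):

```python
import re

def extract_names(raw_query):
    # [^\d\W] matches one character that is neither a digit nor a
    # non-word character, i.e. a letter or an underscore.
    # The + extends the match to a whole run of such characters.
    return re.findall(r"[^\d\W]+", raw_query)

queries = [
    "price_to_earning current_price * 0.8",
    "price_to_earning*current_price 0.8",
    "price_to_earning *current_price/0.8",
]
for query in queries:
    print(extract_names(query))
```

Each of the three inputs from the question should give ['price_to_earning', 'current_price'], since digits, dots, and operators are never part of a match.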
CodePudding user response:
You can specify the characters you want to keep, and replace everything else with a space, for example:
import re
s1 = "price_to_earning current_price * 0.8"
s2 = "price_to_earning*current_price 0.8"
s3 = "price_to_earning *current_price/0.8"
for s in [s1, s2, s3]:
    # Replace anything that is not a letter or underscore with a space, then split on whitespace
    print(re.sub(r'[^a-zA-Z_]', ' ', s).split())
Output
['price_to_earning', 'current_price']
['price_to_earning', 'current_price']
['price_to_earning', 'current_price']
CodePudding user response:
You can try matching only letters and underscores:
words = re.findall(r"[a-zA-Z_]+", raw_query)
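One thing to watch for: a letters-and-underscore pattern cuts a name short if it contains a digit. If your field names can contain digits (this is an assumption, the question does not say so), a variant along these lines might work. The names NAME_PATTERN and find_names and the sample field "sma_50" are invented for the example:

```python
import re

# Hypothetical pattern: a letter or underscore followed by any word characters.
# The leading \b stops a match from starting inside a numeric token such as "1.5e3".
NAME_PATTERN = re.compile(r"\b[A-Za-z_]\w*")

def find_names(raw_query):
    # Return every identifier-like token, skipping plain numbers and operators.
    return NAME_PATTERN.findall(raw_query)

print(find_names("price_to_earning *current_price/0.8"))
print(find_names("sma_50 > current_price * 1.5e3"))
```

The first call should still give the two names from the question. The second keeps "sma_50" whole and skips the "e" in "1.5e3", because there is no word boundary between "5" and "e".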