I'm writing some Python code to extract all substrings within a string that start and end with double asterisks. For example, "**Hello** world. Here's **some** text" should return [**Hello**, **some**].
To achieve this I've turned to regex. I've created the following expression:
(?<!\*)\*{2}(?!\*)(.*?)+(?<!\*)\*{2}(?!\*)
This checks for a string between a pair of double asterisks. It uses a negative lookbehind and a negative lookahead to ensure that each delimiter is exactly 2 asterisks and not more. For example, *** text *** will not match.
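The lookaround claim can be checked in isolation by testing only the delimiter part of the pattern. This is a minimal sketch; the names `delimiter`, `two`, and `three` are made up for illustration:

```python
import re

# Only the delimiter: two asterisks that are neither preceded
# nor followed by another asterisk.
delimiter = re.compile(r"(?<!\*)\*{2}(?!\*)")

two = delimiter.findall("**text**")
three = delimiter.findall("*** text ***")

print(two)    # ['**', '**']
print(three)  # []
```

So the delimiter part on its own behaves as intended, and a run of three asterisks is rejected.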
The expression works when I check it in an online regex tester.
However, when I use re.findall() it returns a list of empty strings: ['', ''].
This is the code I am using:
import re

text = "**Hello** world. Here's **some** text"
bold_text = re.findall(r"(?<!\*)\*{2}(?!\*)(.*?)+(?<!\*)\*{2}(?!\*)", text)
print(bold_text)
Any help would be much appreciated as I am a little lost. Thank you :)
CodePudding user response:
Would you please try the following:
import re
text = "**Hello** world. Here's **some** text **foo*** ***bar**"
bold_text = re.findall(r"(?<!\*)\*\*([^*]+)\*\*(?!\*)", text)
print(bold_text)
Output:
['Hello', 'some']
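As for why the original attempt returns empty strings: when a capture group is repeated, re.findall() keeps only the group's last iteration. Assuming the question's pattern has a `+` directly after the `(.*?)` group, that last iteration is an empty match. A small sketch comparing the two variants (the names `broken` and `fixed` are only for illustration):

```python
import re

text = "**Hello** world. Here's **some** text"

# The capture group is repeated by "+", so only its final
# (empty) iteration is kept as the captured value.
broken = r"(?<!\*)\*{2}(?!\*)(.*?)+(?<!\*)\*{2}(?!\*)"

# Same pattern with the "+" after the group removed.
fixed = r"(?<!\*)\*{2}(?!\*)(.*?)(?<!\*)\*{2}(?!\*)"

broken_result = re.findall(broken, text)
fixed_result = re.findall(fixed, text)

print(broken_result)  # ['', '']
print(fixed_result)   # ['Hello', 'some']
```

Online testers usually highlight the whole match rather than the captured group, which may be why the pattern looked fine there.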
CodePudding user response:
I believe this regex should work, although it hardcodes the asterisks:
\*\*\b.+?\*\*
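A quick sketch of how this pattern behaves with re.findall(), assuming the quantifier after the dot is `+?`. Because there is no capture group, the full matches are returned with the asterisks included. The variable names are only for illustration:

```python
import re

pattern = re.compile(r"\*\*\b.+?\*\*")

text = "**Hello** world. Here's **some** text"
matches = pattern.findall(text)
print(matches)  # ['**Hello**', '**some**']

# Caveat: a third leading asterisk is not rejected, because the
# match can simply start at the second asterisk.
triple = pattern.findall("***bar**")
print(triple)  # ['**bar**']
```

If runs of three or more asterisks must be excluded, the lookarounds from the question are still needed around the delimiters.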