I'm trying to extract names that are enclosed in square brackets and that appear only after a given substring. In the example sentence shown below, the substring is "[A]."
"This is [A].[Alpha] and this is [A].[Beta] and this is [A].[Charlie] and so on"
I'm trying to generate a list of those names, i.e. Alpha, Beta and Charlie.
CodePudding user response:
\[A\]\.\[([^\]]*)\]
https://regex101.com/r/NF526r/1
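That should do the trick for you. I'm taking advantage of a negated character class: [^\]]* matches any run of characters other than a closing bracket, so each capture stops at the first "]" after the prefix.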
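To see why the negated class matters, here is a small comparison against a greedy .* capture. This is only a sketch on the example sentence: the variable names are made up for illustration, and the pattern escapes the dot so that it matches a literal "." only.

```python
import re

text = "This is [A].[Alpha] and this is [A].[Beta] and this is [A].[Charlie] and so on"

# Negated class: the capture cannot run past the first closing bracket.
negated = re.findall(r"\[A\]\.\[([^\]]*)\]", text)

# Greedy dot-star: the capture runs on to the last closing bracket in the text.
greedy = re.findall(r"\[A\]\.\[(.*)\]", text)

print(negated)
print(greedy)
```

The first call returns the three separate names, while the greedy version returns a single long string that swallows everything between the first "[A].[" and the last "]".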
Here is a demo in python:
import re
mystring = "This is [A].[Alpha] and this is [A].[Beta] and this is [A].[Charlie] and so on"
values = re.findall(r"\[A\]\.\[([^\]]*)\]", mystring)
print(values)
results:
['Alpha', 'Beta', 'Charlie']
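If the prefix is not always "[A].", the same idea can be wrapped in a small helper. The sketch below is one possible way to generalise it, not part of the answer above: names_after is a hypothetical function name, and re.escape is used so that brackets and dots in the prefix are matched literally.

```python
import re

def names_after(prefix, text):
    """Return every bracketed name that immediately follows `prefix` in `text`.

    `names_after` is a hypothetical helper introduced for illustration.
    """
    # re.escape makes characters such as "[", "]" and "." in the prefix literal.
    pattern = re.escape(prefix) + r"\[([^\]]*)\]"
    return re.findall(pattern, text)

# Made-up sample sentence with two different prefixes.
text = "This is [A].[Alpha] and this is [B].[Beta] and this is [A].[Charlie]"
print(names_after("[A].", text))
print(names_after("[B].", text))
```

Building the pattern from the escaped prefix avoids hand-writing backslashes for every new substring.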
CodePudding user response:
Try this:
df['col'] = df['col'].str.findall(r"\[A\]\.\[([^\]]*)\]")
df.explode('col')
       col
0    Alpha
0     Beta
0  Charlie
Here 'col' is the name of the column that holds your text; explode() then puts each extracted name on its own row.
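For completeness, a self-contained version with a throwaway DataFrame might look like the following. The second sample row is invented for illustration, and the sketch assumes pandas is installed. Note that explode() repeats the original index, which is how each name can be traced back to its source row.

```python
import pandas as pd

# Hypothetical sample data; only the first row comes from the question.
df = pd.DataFrame({
    "col": [
        "This is [A].[Alpha] and this is [A].[Beta] and this is [A].[Charlie] and so on",
        "Nothing here but [B].[Delta] and [A].[Echo]",
    ]
})

# Each cell becomes a list of the names that follow the "[A]." prefix.
df["col"] = df["col"].str.findall(r"\[A\]\.\[([^\]]*)\]")

# One row per name; the index of the source row is repeated.
exploded = df.explode("col")
print(exploded)
```

Names from the first sentence keep index 0 and the one from the second sentence keeps index 1, so a later reset_index() is only needed if a fresh running index is wanted.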