I have a pandas data frame. The column of my data frame that I am interested in contains strings. Within each string there is a section in brackets. It looks like so:
Some Data (More Info)
Some Data (More Info)
Some Data (More Info)
What I am trying to do is select the data that’s in between the brackets and stick it into a new column.
I have been playing around with split, but I can't get it to work because I am left with an extra ‘)’ at the end of the string.
Is there a way to select the data without the brackets, and without this little bracket left over?
I don't think I can split on spaces alone because the "Some Data" part has spaces in it.
I am splitting the data by:
df_split = df_abc['title'].str.split('(', expand=True)
CodePudding user response:
Use str.extract:
res = df_abc['title'].str.extract(r'\((.*?)\)')
print(res)
Output
0
0 More Info
1 More Info
2 More Info
As an alternative use a named capturing group, to obtain a column name:
res = df_abc['title'].str.extract(r'\((?P<text>.*?)\)')
print(res)
Output
text
0 More Info
1 More Info
2 More Info
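Since the goal in the question is a new column, the result can be assigned straight back to the data frame. Below is a minimal sketch, assuming sample data shaped like the question's; the column name 'info' is just a placeholder I picked. Passing expand=False makes str.extract return a Series when the pattern has a single capture group, which is convenient for column assignment.

```python
import pandas as pd

# Hypothetical sample data mirroring the 'title' column from the question.
df_abc = pd.DataFrame({"title": ["Some Data (More Info)"] * 3})

# With one capture group, expand=False returns a Series instead of a
# one-column DataFrame, so it can be assigned directly as a new column.
# 'info' is an illustrative column name, not one from the question.
df_abc["info"] = df_abc["title"].str.extract(r"\((.*?)\)", expand=False)
print(df_abc)
```

Rows whose title contains no bracketed section get NaN in the new column.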
It may also be worth taking a look at str.extractall for multiple occurrences of the pattern.
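As a rough sketch of that case, assuming some titles carry more than one bracketed section (the sample strings here are made up for illustration): str.extractall returns one row per match, indexed by the original row label plus a match counter, and rows without any match are left out of the result.

```python
import pandas as pd

# Made-up titles; the first one has two bracketed sections.
titles = pd.Series(["Some Data (More Info) (Extra)", "Other Data (Note)"])

# One row per match; the index is (original row, match number).
matches = titles.str.extractall(r"\((?P<text>.*?)\)")
print(matches)

# Optionally collapse the matches back to one string per original row.
joined = matches.groupby(level=0)["text"].agg(", ".join)
print(joined)
```

Joining with ", " is only one way to flatten the matches; keeping the MultiIndex result may be more useful if each occurrence needs separate handling.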