I have a pandas Series:
import pandas as pd

list_df = pd.Series(['KingsDuck',
'RangersIslandersDevils',
'Shark',
'Maple Leafs',
'Red Wing'])
display(list_df)
0 KingsDuck
1 RangersIslandersDevils
2 Shark
3 Maple Leafs
4 Red Wing
dtype: object
and I would like to insert a comma between a lowercase character and the uppercase character that follows it (e.g. 'KingsDuck' to 'Kings,Duck' and 'RangersIslandersDevils' to 'Rangers,Islanders,Devils').
I tested my regex with an online Python regex tool and it worked as intended: regextesting
However, when I tried the regex in my Jupyter Notebook, the output was not what I expected:
list_df.replace(r'(([a-z])([A-Z]))',r'\1,\2', regex=True)
0 KingsD,suck
1 RangersI,sslandersD,sevils
2 Shark
3 Maple Leafs
4 Red Wing
dtype: object
How do I go about this?
CodePudding user response:
You have too many groups; remove the outer parentheses. Your pattern has the shape ((a)(b)), so \1 is ab, \2 is a, and \3 is b. The replacement \1,\2 therefore writes ab,a instead of a,b.
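To see the group numbering for yourself, here is a minimal sketch using only the standard `re` module on a single string from the question. The variable names (`pattern`, `groups`, `wrong`, `right`) are just for illustration.

```python
import re

# Same pattern as in the question: an outer group wrapping two inner groups.
pattern = re.compile(r'(([a-z])([A-Z]))')
match = pattern.search('KingsDuck')

# Groups are numbered by the position of their opening parenthesis,
# so the outer group comes first.
groups = match.groups()
print(groups)  # ('sD', 's', 'D')

# \1,\2 expands to 'sD' + ',' + 's', which reproduces the unexpected output.
wrong = pattern.sub(r'\1,\2', 'KingsDuck')
# \2,\3 expands to 's' + ',' + 'D', which is what was intended.
right = pattern.sub(r'\2,\3', 'KingsDuck')
print(wrong)  # KingsD,suck
print(right)  # Kings,Duck
```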
list_df.replace(r'([a-z])([A-Z])',
r'\1,\2', regex=True)
Or if you really want to keep the external group:
list_df.replace(r'(([a-z])([A-Z]))',
r'\2,\3', regex=True)
Output:
0 Kings,Duck
1 Rangers,Islanders,Devils
2 Shark
3 Maple Leafs
4 Red Wing
dtype: object
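As an alternative that is not part of the answer above, you could avoid capture groups entirely by matching the zero-width position between a lowercase and an uppercase letter with lookarounds. The sketch below uses plain `re.sub` on a list so it runs without pandas; the names `boundary`, `teams`, and `result` are illustrative only.

```python
import re

# Zero-width match: the position right after a lowercase letter
# and right before an uppercase letter. Nothing is consumed,
# so there are no groups to number.
boundary = re.compile(r'(?<=[a-z])(?=[A-Z])')

teams = ['KingsDuck', 'RangersIslandersDevils', 'Shark', 'Maple Leafs', 'Red Wing']
result = [boundary.sub(',', name) for name in teams]
print(result)
# ['Kings,Duck', 'Rangers,Islanders,Devils', 'Shark', 'Maple Leafs', 'Red Wing']
```

The same pattern should also work with the Series directly, e.g. `list_df.replace(r'(?<=[a-z])(?=[A-Z])', ',', regex=True)`, though I have only checked it with `re.sub` here.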