One column of my dataframe contains MAC addresses without semicolons. I would like to add a semicolon to every MAC address after every 2nd character.
I was looking for a "split every nth character" option in pd.Series.str.split() so that I could split each MAC address into six parts and concatenate them afterwards, but according to the documentation, splitting on a number of characters is not available. There is a regex option, but regex is not a skill I have.
I assume there is an even easier solution than splitting and concatenating, but I have not come across it.
Help would be much appreciated, thank you.
mac_address
0 0003E6A584C2
1 0003E6A584CC
2 0003E6A584DA
3 0003E6A584DC
4 0003E6A584E4
CodePudding user response:
How about something like this?
N = 2
df['mac_address'] = df['mac_address'].str[:N] + ';' + df['mac_address'].str[N:]
Output:
>>> df
mac_address
0 00;03E6A584C2
1 00;03E6A584CC
2 00;03E6A584DA
3 00;03E6A584DC
4 00;03E6A584E4
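Note that the slicing above only inserts a single separator after the first N characters. One possible way to extend the same idea to every pair is to build all the slices and join them with Series.str.cat. This is only a sketch: it assumes every value is exactly 12 characters long, and the names LENGTH, parts and result are just illustrative.

```python
import pandas as pd

df = pd.DataFrame({"mac_address": ["0003E6A584C2", "0003E6A584CC"]})

N = 2
LENGTH = 12  # assumption: every MAC address has exactly 12 hex characters

s = df["mac_address"]
# One Series per chunk: characters 0-1, 2-3, ..., 10-11
parts = [s.str[i:i + N] for i in range(0, LENGTH, N)]
# Concatenate the chunks element-wise, with ';' between them
result = parts[0].str.cat(parts[1:], sep=";")
print(result.tolist())  # ['00;03;E6;A5;84;C2', '00;03;E6;A5;84;CC']
```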
CodePudding user response:
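You're right that regex can do this. The example uses '-' as the separator; substitute ';' in both the replacement and the strip call if that is what you need: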
df['mac_address'].replace(r'(\w{2})',r'\1-',regex=True).str.strip('-')
Output:
0 00-03-E6-A5-84-C2
1 00-03-E6-A5-84-CC
2 00-03-E6-A5-84-DA
3 00-03-E6-A5-84-DC
4 00-03-E6-A5-84-E4
Name: mac_address, dtype: object
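If you want semicolons as in the question, a variation on the same regex idea might look like the sketch below. It uses Series.str.replace with a negative lookahead so that no separator is added after the last pair, which avoids the extra strip step. The sample data and the name formatted are just for illustration.

```python
import pandas as pd

df = pd.DataFrame({"mac_address": ["0003E6A584C2", "0003E6A584DA"]})

# (\w{2}) captures two characters; (?!$) skips the pair at the very end,
# so no trailing ';' is produced and nothing needs to be stripped.
formatted = df["mac_address"].str.replace(r"(\w{2})(?!$)", r"\1;", regex=True)
print(formatted.tolist())  # ['00;03;E6;A5;84;C2', '00;03;E6;A5;84;DA']
```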
CodePudding user response:
Let us try findall with map ('..' means N = 2):
df.mac_address.str.findall('..').map(';'.join)
Out[368]:
0 00;03;E6;A5;84;C2
1 00;03;E6;A5;84;CC
2 00;03;E6;A5;84;DA
3 00;03;E6;A5;84;DC
4 00;03;E6;A5;84;E4
Name: mac_address, dtype: object
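One caveat with map(';'.join) is that it will raise if the column contains missing values, since findall leaves NaN in those rows. A possible alternative, sketched here with made-up sample data, is to use Series.str.join, which joins the lists and passes missing values through.

```python
import pandas as pd

# Hypothetical sample data with one missing value
df = pd.DataFrame({"mac_address": ["0003E6A584C2", None]})

# findall('..') returns a list of 2-character chunks per row;
# str.join(';') glues each list back together and leaves NaN rows as NaN.
result = df["mac_address"].str.findall("..").str.join(";")
print(result.iloc[0])  # 00;03;E6;A5;84;C2
```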