My question is about the data being passed to the re.split() function. I have the following data:
Name | Sport |
---|---|
John | Football;NBA,Tennis |
Mary | Squash,Tetris;MMA |
Scott | Cricket,Tennis |
Kim | Rugby,WNBA;Footy |
I am trying to split the strings using ';' and ',' as the delimiters. Initially, the data type of the Name and Sport columns is 'object'.
import numpy as np
import pandas as pd
import re
df = pd.read_excel(r'Filepath\sports.xlsx',sheet_name = 'data')
df[['Name','Sport']] = df[['Name','Sport']].astype('string')
print(df.dtypes)
df[['A']] = re.split(r';,',df['Sport'])
df
After converting the columns to string and then trying to split, I get the following error:
TypeError: expected string or bytes-like object
I tried using
df[['A']] = re.split(r';,',df['Sport'].astype('string'))
But the error still persists. Any suggestions?
CodePudding user response:
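re.split() expects a single string (or bytes-like object) as its second argument, not a pandas Series, which is why it raises the TypeError regardless of the column's dtype. Use the .str accessor instead, which applies the split to each row. Also note that the pattern r';,' matches only the literal two-character sequence ';,'. To split on either ';' or ',', use a character class:
df['A'] = df['Sport'].str.split(r'[;,]')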
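Here is a minimal sketch of the accessor approach. It assumes the data from the question is rebuilt inline rather than read from the Excel file. It uses the character class [;,] so that either delimiter matches, since a pattern of ';,' would only match the two characters in sequence. The names single, long_df and SportItem are illustrative only and not part of the original code.

```python
import re
import pandas as pd

# Sample data copied from the question (built inline, so no Excel file is needed).
df = pd.DataFrame({
    "Name": ["John", "Mary", "Scott", "Kim"],
    "Sport": ["Football;NBA,Tennis", "Squash,Tetris;MMA",
              "Cricket,Tennis", "Rugby,WNBA;Footy"],
}).astype("string")

# re.split works on one string at a time, not on a whole Series.
single = re.split(r"[;,]", "Football;NBA,Tennis")

# The .str accessor applies the split to every row.
# A multi-character pattern is interpreted as a regular expression.
df["A"] = df["Sport"].str.split(r"[;,]")

# Optional: one sport per row instead of a list per row.
long_df = df[["Name", "A"]].explode("A").rename(columns={"A": "SportItem"})

print(df)
print(long_df)
```

Each cell of column A then holds a list of sports. If separate columns are preferred over lists, passing expand=True to str.split is another option.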
I hope this resolves your problem.