Suppose I have a data frame df built like this:
import pandas as pd
d = {'Title': ['Elden Ring', 'Starcraft 2', 'Terraforming Mars'], 'Genre': ['Fantasy;Videogame', 'Videogame', 'Fantasy;Boardgame']}
df = pd.DataFrame(data=d)
It looks like this:
Title              Genre
Elden Ring         Fantasy;Videogame
Starcraft 2        Videogame
Terraforming Mars  Fantasy;Boardgame
My goal is to end up with a dataframe that looks like this:
Title              Genres                Fantasy  Videogame  Boardgame
Elden Ring         [Fantasy, Videogame]  1        1          0
Starcraft 2        [Videogame]           0        1          0
Terraforming Mars  [Fantasy, Boardgame]  1        0          1
What is the best way to go about this? I tried:
from sklearn.preprocessing import MultiLabelBinarizer
df = pd.DataFrame(data=d)
# Split each semicolon-delimited string into a list of genres
df.Genre = df.Genre.str.split(';')
binar = MultiLabelBinarizer()
# One row per title, one 0/1 column per genre
genre_labels = binar.fit_transform(df.Genre)
df[binar.classes_] = genre_labels
This gives me a dataframe:
Title              Genre                 Boardgame  Fantasy  Videogame
Elden Ring         [Fantasy, Videogame]  0          1        1
Starcraft 2        [Videogame]           0          0        1
Terraforming Mars  [Fantasy, Boardgame]  1          1        0
This gives me what I want, but it feels convoluted. Is there a cleaner way to do this?
CodePudding user response:
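You can also use Series.str.get_dummies. The snippet below assumes Genre holds strings of the form '[Fantasy, Videogame]'; for a column of actual Python lists, see the list case further down:
df.Genre.str.strip('[]').str.get_dummies(sep=', ')
   Boardgame  Fantasy  Videogame
0          0        1          1
1          0        0          1
2          1        1          0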
To append the indicator columns to the dataframe:
pd.concat([df, df.Genre.str.strip('[]').str.get_dummies(sep=', ')], axis=1)
               Title                 Genre  Boardgame  Fantasy  Videogame
0         Elden Ring  [Fantasy, Videogame]          0        1          1
1        Starcraft 2           [Videogame]          0        0          1
2  Terraforming Mars  [Fantasy, Boardgame]          1        1          0
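For reference, here is a minimal self-contained sketch of the snippet above. It assumes Genre holds plain strings that merely look like lists (e.g. '[Fantasy, Videogame]'); that is an assumption about the input, not something the question states. The names dummies and result are only illustrative.

```python
import pandas as pd

# Assumption: Genre values are strings that look like lists, not real lists.
df = pd.DataFrame({
    "Title": ["Elden Ring", "Starcraft 2", "Terraforming Mars"],
    "Genre": ["[Fantasy, Videogame]", "[Videogame]", "[Fantasy, Boardgame]"],
})

# Remove the surrounding brackets, then split on ", " and one-hot encode.
dummies = df["Genre"].str.strip("[]").str.get_dummies(sep=", ")

# Put the indicator columns next to the original columns.
result = pd.concat([df, dummies], axis=1)
print(result)
```

Note that the separator has to match the text exactly: ', ' here, but ';' for the semicolon-delimited strings in the question.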
If Genre starts out as a column of lists, join each list into a delimited string first:
df.Genre = df.Genre.str.join(';')
pd.concat([df, df.Genre.str.get_dummies(sep=';')], axis=1)
               Title              Genre  Boardgame  Fantasy  Videogame
0         Elden Ring  Fantasy;Videogame          0        1          1
1        Starcraft 2          Videogame          0        0          1
2  Terraforming Mars  Fantasy;Boardgame          1        1          0
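If you would rather keep the lists in Genre (as in the desired output in the question), one possible variation is to build the dummies from a joined copy instead of overwriting the column. A sketch, assuming Genre already holds Python lists; dummies and result are illustrative names:

```python
import pandas as pd

# Assumption: Genre is already a column of Python lists.
df = pd.DataFrame({
    "Title": ["Elden Ring", "Starcraft 2", "Terraforming Mars"],
    "Genre": [["Fantasy", "Videogame"], ["Videogame"], ["Fantasy", "Boardgame"]],
})

# Join each list into one ';'-delimited string on the fly, then one-hot encode.
# df["Genre"] itself is not modified.
dummies = df["Genre"].str.join(";").str.get_dummies(sep=";")

# Genre still contains lists; only the indicator columns are added.
result = pd.concat([df, dummies], axis=1)
print(result)
```

This only works if no genre name contains the chosen separator, so pick a delimiter that cannot occur in the data.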
CodePudding user response:
Series.str.get_dummies was designed specifically for this:
df = pd.concat([df, df['Genre'].str.get_dummies(';')], axis=1)
Output:
>>> df
               Title              Genre  Boardgame  Fantasy  Videogame
0         Elden Ring  Fantasy;Videogame          0        1          1
1        Starcraft 2          Videogame          0        0          1
2  Terraforming Mars  Fantasy;Boardgame          1        1          0
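Note that get_dummies returns its columns in sorted order. If the order from the question matters (Fantasy, Videogame, Boardgame, i.e. order of first appearance), one possible way to get it is to reorder the dummy columns before concatenating. A sketch, starting from the original semicolon-delimited strings; the names dummies, order, and result are illustrative:

```python
import pandas as pd

d = {
    "Title": ["Elden Ring", "Starcraft 2", "Terraforming Mars"],
    "Genre": ["Fantasy;Videogame", "Videogame", "Fantasy;Boardgame"],
}
df = pd.DataFrame(d)

# One 0/1 column per genre; columns come back alphabetically sorted.
dummies = df["Genre"].str.get_dummies(sep=";")

# Genres in order of first appearance across the rows.
order = df["Genre"].str.split(";").explode().unique().tolist()

# Reorder the indicator columns, then attach them to the original frame.
result = pd.concat([df, dummies[order]], axis=1)
print(result)
```

If you also want Genre stored as lists, you could assign df["Genre"].str.split(";") back to the column after building the dummies.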