Creating new columns from another pandas column

Time:05-13

I would like to create new columns from the genres column. Each row of the genres column contains one or more genres, and I would like one column per genre name, filled with 1 or 0 depending on whether the row has that genre.

[Image: First Image]

The dataframe should look like the image below.

[Image: Below]

I don't have any clue on this.

Using a one-hot encoder or the pandas get_dummies function directly didn't work; I got something like this:

[Image: Right here]

I don't need something like this

CodePudding user response:

It looks like the values in the Genre column were one-hot encoded. One-hot encoding is also referred to as creating dummy variables.

Pandas has a function, pd.get_dummies(), that should let you one-hot encode the Genre column. Pass in your data frame and use the columns parameter to select the Genre column.

See the function documentation and other options here: https://pandas.pydata.org/docs/reference/api/pandas.get_dummies.html
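
When a single cell holds several genres, pd.get_dummies() treats the whole cell as one category, which may be why the direct attempt in the question produced unwanted columns. A minimal sketch of one alternative, Series.str.get_dummies(), is below. It assumes the genres are stored as a single string separated by "|"; the sample data, the column names, and the separator are all assumptions, since the original dataframe is only shown as an image.

```python
import pandas as pd

# Hypothetical sample data; the real column name and separator may differ.
df = pd.DataFrame({
    "title": ["A", "B", "C"],
    "genres": ["Action|Comedy", "Drama", "Comedy|Drama"],
})

# Split each cell on "|" and build one 0/1 column per distinct genre.
genre_flags = df["genres"].str.get_dummies(sep="|")

# Attach the indicator columns to the original frame.
result = pd.concat([df, genre_flags], axis=1)
print(result)
```

If the cells hold Python lists of strings rather than delimited strings, joining them first with df["genres"].str.join("|") is one possible way to get to the same shape.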

CodePudding user response:

You can use pd.get_dummies as below:

import pandas as pd

df = pd.DataFrame({'country': ['Brazil', 'Australia', 'Canada', 'Brazil', 'Germany']})

# One indicator column per distinct country, named country_<value>
dummies = pd.get_dummies(df, prefix=['country'])
print(dummies)
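
For reference, here is a sketch of where CategoricalDtype could come into play with the same country data: declaring the full set of categories up front so that get_dummies produces a column for every declared category, even ones that do not occur in the data. The category list, including "France", is a made-up assumption for illustration.

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

df = pd.DataFrame({"country": ["Brazil", "Australia", "Canada", "Brazil", "Germany"]})

# Assumed full category list; "France" is hypothetical and absent from the data.
country_type = CategoricalDtype(
    categories=["Australia", "Brazil", "Canada", "France", "Germany"]
)
df["country"] = df["country"].astype(country_type)

# get_dummies emits one column per declared category, including unseen ones.
dummies = pd.get_dummies(df, prefix="country", dtype=int)
print(dummies)
```

This keeps the output columns stable across datasets that happen to contain different subsets of the categories.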