I have a column of genre lists that I need to split so that each genre ends up in its own row of the dataframe. I tried to use the str.extract() method but didn't get anywhere.
Column example:
| title | genres |
|-------|--------|
| Cowboy Bebop | ['Comedy', 'Dementia', 'Horror', 'Seinen'] |
Desired result:
| title | genres |
|-------|--------|
| Cowboy Bebop | 'Comedy' |
| Cowboy Bebop | 'Dementia' |
| Cowboy Bebop | 'Horror' |
| Cowboy Bebop | 'Seinen' |
CodePudding user response:
You need pandas.DataFrame.explode:
df = df.explode('genres').reset_index(drop=True)
Output:
>>> df
title genres
0 Cowboy Bebop Comedy
1 Cowboy Bebop Dementia
2 Cowboy Bebop Horror
3 Cowboy Bebop Seinen
Note that you might need to convert the values in the genres column to actual lists, because a value can look like a list but actually be a string. If so, run this before the above:
import ast
df['genres'] = df['genres'].apply(ast.literal_eval)
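Putting the two steps together, here is a minimal sketch. It assumes the genres column holds strings such as "['Comedy', 'Dementia']" (for example after reading a CSV); the sample frame is made up for illustration.

```python
import ast

import pandas as pd

# Hypothetical sample data: the genres value is a string that only looks like a list.
df = pd.DataFrame({
    "title": ["Cowboy Bebop"],
    "genres": ["['Comedy', 'Dementia', 'Horror', 'Seinen']"],
})

# Parse each string into a real Python list.
df["genres"] = df["genres"].apply(ast.literal_eval)

# Produce one row per genre, then renumber the index from 0.
df = df.explode("genres").reset_index(drop=True)
print(df)
```

If the column already contains real lists, skip the ast.literal_eval step; it raises an error when given a list instead of a string.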
CodePudding user response:
This will give you the desired result with .explode():
import pandas as pd

data = {'title' : ['Cowboy Bebop'], 'genres' : [['Comedy', 'Dementia', 'Horror', 'Seinen']]}
df = pd.DataFrame(data)
# One row per list element; the title is repeated for each genre
df = df.explode('genres')
df
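One detail worth knowing: explode() on its own repeats the original row label for every new row. The sketch below shows the difference, assuming pandas 1.1 or later for the ignore_index parameter; the variable names exploded and renumbered are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "title": ["Cowboy Bebop"],
    "genres": [["Comedy", "Dementia", "Horror", "Seinen"]],
})

# Plain explode: every new row keeps the label of the row it came from.
exploded = df.explode("genres")
print(exploded.index.tolist())

# ignore_index=True renumbers the rows, like reset_index(drop=True).
renumbered = df.explode("genres", ignore_index=True)
print(renumbered.index.tolist())
```

Repeated labels are harmless for display, but they can surprise you later when selecting rows with .loc, so renumbering is usually the safer default.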