Explode dataframe column into rows on numbering instead of comma

Time:02-18

I want to convert my dataframe from this:

Name Qualities
boba fet 1. Fighting 2. Running 3.swimming
enigma 1. Dodging bullets while running, cooking food 2. Sleep walking

to the format below:

Name Qualities
boba fet Fighting
boba fet Running
boba fet Swimming
enigma Dodging bullets while running, cooking food
enigma Sleep walking

Even if there is a comma in the text, it needs to be exploded into rows on the numbering, not on the comma.

I tried to do

df.assign(Qualities = df.Qualities.str.split('1.')).explode('Qualities') but didn't get the desired result.

CodePudding user response:

You could use a regex that matches a number followed by a period as the delimiter. Splitting this way leaves an empty string before the first item and trailing whitespace on some values, so strip the values and drop the empty rows.

import pandas as pd

df = pd.DataFrame({'Name': ['boba fet', 'enigma'],
 'Qualities': ['1. Fighting 2. Running 3.swimming',
  '1. Dodging bullets while running, cooking food 2. Sleep walking']})

df['Qualities'] = df.Qualities.str.split(r'\d+\.\s?')
df = df.explode('Qualities')
df['Qualities'] = df['Qualities'].str.strip()
print(df.loc[df['Qualities'].ne('')])
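
As a variation on the steps above, here is a minimal sketch that wraps them in a reusable function and resets the index afterwards. The name explode_numbered is hypothetical (it is not a pandas function), and the sketch assumes the numbering always looks like "1.", "2." and so on. Text containing decimals such as "3.5" would also be split by this pattern.

```python
import pandas as pd


def explode_numbered(df, column):
    """Hypothetical helper: split `column` on "1.", "2.", ... and explode into rows."""
    out = df.copy()
    # One or more digits, a literal period, then any whitespace.
    out[column] = out[column].str.split(r'\d+\.\s*')
    out = out.explode(column)
    out[column] = out[column].str.strip()
    # The piece before the first "1." is an empty string; drop those rows.
    out = out[out[column].ne('')]
    return out.reset_index(drop=True)


df = pd.DataFrame({'Name': ['boba fet', 'enigma'],
                   'Qualities': ['1. Fighting 2. Running 3.swimming',
                                 '1. Dodging bullets while running, cooking food 2. Sleep walking']})

result = explode_numbered(df, 'Qualities')
print(result)
```

The comma inside "Dodging bullets while running, cooking food" is left alone because the split only looks at the numbering. The values keep their original case, so "swimming" stays lowercase unless you change it yourself.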