I want to transform my dataframe from this:
Name | Qualities |
---|---|
boba fet | 1. Fighting 2. Running 3.swimming |
enigma | 1. Dodging bullets while running, cooking food 2. Sleep walking |
to the format below:
Name | Qualities |
---|---|
boba fet | Fighting |
boba fet | Running |
boba fet | Swimming |
enigma | Dodging bullets while running, cooking food |
enigma | Sleep walking |
Even if the text contains a comma, it should be exploded into rows only at the numbering.
I tried to do
df.assign(Qualities = df.Qualities.str.split('1.')).explode('Qualities')
but didn't get the desired result.
CodePudding user response:
You could split with a regex that uses a number followed by a period as the delimiter. This pattern leaves an empty string before the first item and some stray whitespace, so strip the values and drop the empty rows afterwards.
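import pandas as pd

df = pd.DataFrame({'Name': ['boba fet', 'enigma'],
                   'Qualities': ['1. Fighting 2. Running 3.swimming',
                                 '1. Dodging bullets while running, cooking food 2. Sleep walking']})
# Split on one or more digits, a literal period, and optional whitespace
df['Qualities'] = df.Qualities.str.split(r'\d+\.\s?')
# Turn each list element into its own row
df = df.explode('Qualities')
# Trim whitespace, then drop the empty strings left by the leading "1."
df['Qualities'] = df['Qualities'].str.strip()
print(df.loc[df['Qualities'].ne('')])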
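As an alternative sketch (not part of the answer above), you could capture the text after each number with `str.findall` instead of splitting on it, which avoids the empty strings and the separate strip step. The `pattern` and `result` names are just illustrative, and this assumes an item never contains a digit immediately followed by a period.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['boba fet', 'enigma'],
    'Qualities': ['1. Fighting 2. Running 3.swimming',
                  '1. Dodging bullets while running, cooking food 2. Sleep walking'],
})

# Capture the text that follows each "N." marker, stopping before the
# next marker or at the end of the string.
# Assumption: no item contains a digit run immediately followed by a period.
pattern = r'\d+\.\s*(.*?)\s*(?=\d+\.|$)'

result = (
    df.assign(Qualities=df['Qualities'].str.findall(pattern))  # list of items per row
      .explode('Qualities')                                    # one row per item
      .reset_index(drop=True)
)
print(result)
```

Commas inside an item are left alone because only the numbering is used to find the boundaries. The original casing is kept as well, so the third item stays `swimming` unless you add something like `.str.capitalize()`.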