My goal is to create a single column containing every individual word from the strings in the 'Name' column.
The code below achieves this, but I lose the column header at the line df = df['Name'].str.split(' ', expand=True). I would like to preserve the header, if possible, so that I can refer to it later in the script.
I also end up with a multi-level index, which is fine, but I would prefer to avoid it if there is a way.
Any help is greatly appreciated. Thank you.
import pandas as pd
data = {'Name':['Tom Wilson', 'nick snyder', 'krish moham', 'jack oconnell']}
df = pd.DataFrame(data)
df = df['Name'].str.split(' ', expand=True)
df = df.stack(dropna=True)
print(df)
CodePudding user response:
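Try this. Split each name into a list of words, then use explode() to give each word its own row; to_frame() turns the result back into a DataFrame that keeps the 'Name' header: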
import pandas as pd
data = {'Name': ['Tom Wilson', 'nick snyder', 'krish moham', 'jack oconnell']}
df = pd.DataFrame(data)
# split each name into a list of words, then put each word on its own row
df = df['Name'].str.split(' ').explode().to_frame()
print(df)
Prints:
       Name
0       Tom
0    Wilson
1      nick
1    snyder
2     krish
2     moham
3      jack
3  oconnell
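
If the repeated index values (0, 0, 1, 1, ...) are not wanted either, one option is to reset the index after exploding. This is a minimal sketch, assuming pandas 0.25 or later (where Series.explode is available); the variable name words is only illustrative:

```python
import pandas as pd

data = {'Name': ['Tom Wilson', 'nick snyder', 'krish moham', 'jack oconnell']}
df = pd.DataFrame(data)

# str.split() with no separator splits on any run of whitespace.
# explode() puts each word on its own row and keeps the Series name 'Name'.
# reset_index(drop=True) replaces the repeated row labels with 0..n-1.
words = df['Name'].str.split().explode().reset_index(drop=True).to_frame()

print(words)
print(words.columns.tolist())
```

The column is still called 'Name', so it can be referenced later as words['Name'].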