Splitting series of Strings into Dataframe-CodePudding

I have a big series of strings that I want to split into a dataframe.

The series looks like this:

s = pd.Series({"1":"Name=Marc-Age=48-Car=Tesla", 
    "2":"Name=Ben-Job=Pilot-Car=Porsche", 
    "3":"Name=Tom-Age=24-Car=Ford"})

I want to split this into a dataframe looking like this:

  Name  Age  Job   Car
1 Marc  48   Nan   Tesla
2 Ben   Nan  Pilot Porsche
3 Tom   24   Nan   Ford

I tried to split the strings first by "-" and then by "=" but don't understand how to continue after

df=s.str.split("-", expand=True)
for col in df.columns:
    df[col]=df[col].str.split("=")

I get this:

    `      0    1   2`
    `1  ['Name', 'Marc']    ['Age', '48']   ['Car', 'Tesla']`
    `2  ['Name', 'Ben'] ['Job', 'Pilot']    ['Car', 'Porsche']`
    `3  ['Name', 'Tom'] ['Age', '24']   ['Car', 'Ford']`

I don't know how to continue from here. I can't loop through the rows because my dataset is really big.

Can anyone help on how to go on from here?

CodePudding user response：

If you split then explode and split again you can then use a pivot.

import pandas as pd
s = pd.Series({"1":"Name=Marc-Age=48-Car=Tesla", 
    "2":"Name=Ben-Job=Pilot-Car=Porsche", 
    "3":"Name=Tom-Age=24-Car=Ford"})  

s = s.str.split('-').explode().str.split('=', expand=True).reset_index()
s = s.pivot(index='index', columns=0, values=1).reset_index(drop=True)

Output

   Age      Car    Job  Name
0   48    Tesla    NaN  Marc
1  NaN  Porsche  Pilot   Ben
2   24     Ford    NaN   Tom