Pandas - Create new rows and columns based on a column of dicts

Please consider the following simplified Pandas dataframe:

Name	Component
D800465	[{'component': 'comp1', 'version': '1.0.0'}, {'component': 'comp2', 'version': '15.2.5'}]
L932227	[{'component': 'comp1', 'version': '1.0.0'}, {'component': 'comp2', 'version': '15.2.5'}, {'component': 'comp3', 'version': '2.5'}]
L908041	[{'component': 'comp1', 'version': '1.0.0'}]
D797502	[{'component': 'comp1', 'version': '1.0.0'}]

As you can see, the 'Component' column contains lists of dictionaries whose length may vary. I want to do two things with this dataframe: create two new columns, 'ComponentName' and 'ComponentVersion', and create as many rows as needed, depending on the length of each list.

The expected output (with the same example as above) should look like this:

Name	ComponentName	ComponentVersion
D800465	comp1	1.0.0
D800465	comp2	15.2.5
L932227	comp1	1.0.0
L932227	comp2	15.2.5
L932227	comp3	2.5
L908041	comp1	1.0.0
D797502	comp1	1.0.0

How can I achieve this? Thanks a lot.

CodePudding user response:

You can use DataFrame.explode to get one row per list element, then convert the dictionaries to columns with pandas.json_normalize:

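import pandas as pd

# One row per list element; reset the index so it is unique again
df2 = df.explode('Component').reset_index(drop=True)
# Expand each dict into its own columns and attach them to 'Name'
df2 = df2[['Name']].join(pd.json_normalize(df2['Component'].tolist()))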

Output:

      Name component version
0  D800465     comp1   1.0.0
1  D800465     comp2  15.2.5
2  L932227     comp1   1.0.0
3  L932227     comp2  15.2.5
4  L932227     comp3     2.5
5  L908041     comp1   1.0.0
6  D797502     comp1   1.0.0
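
The output above keeps the dictionary keys ('component', 'version') as column names, while the question asks for 'ComponentName' and 'ComponentVersion'. A minimal end-to-end sketch follows, assuming the sample data from the question; the variable names `exploded`, `parts` and `result` are illustrative only, and the rename step is an addition that is not part of the original answer:

```python
import pandas as pd

# Sample data reconstructed from the question
df = pd.DataFrame({
    "Name": ["D800465", "L932227", "L908041", "D797502"],
    "Component": [
        [{"component": "comp1", "version": "1.0.0"},
         {"component": "comp2", "version": "15.2.5"}],
        [{"component": "comp1", "version": "1.0.0"},
         {"component": "comp2", "version": "15.2.5"},
         {"component": "comp3", "version": "2.5"}],
        [{"component": "comp1", "version": "1.0.0"}],
        [{"component": "comp1", "version": "1.0.0"}],
    ],
})

# One row per dictionary; a fresh index keeps the later join aligned
exploded = df.explode("Component").reset_index(drop=True)

# Expand the dicts into columns (passed as a plain list), then rename
# the columns to the names requested in the question
parts = pd.json_normalize(exploded["Component"].tolist()).rename(
    columns={"component": "ComponentName", "version": "ComponentVersion"}
)

# Attach the new columns to the matching 'Name' values by index
result = exploded[["Name"]].join(parts)
print(result.to_string(index=False))
```

Renaming after json_normalize keeps the expansion step generic: any extra keys in the dictionaries would still become columns, and only the two known keys are renamed.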