How to explode columns in a dataframe which has the complete list for every row which needs to be ex-CodePudding

I have a dataframe which looks something like this:

ex = [{"A": "Germany", "word": "Danke", "explode1": [["berlin", "frankfurt"], ["Paris"], ["London", "Glassgow"]], "explode2": [["food", "cars"], ["fashion"], ["pound", "cricket"]]}, 
  {"A": "French", "word": "merci", "explode1": [["berlin", "frankfurt"], ["Paris"], ["London", "Glassgow"]], "explode2": [["food", "cars"], ["fashion"], ["pound", "cricket"]]}, 
  {"A": "English", "word": "Thank you", "explode1": [["berlin", "frankfurt"], ["Paris"], ["London", "Glassgow"]], "explode2": [["food", "cars"], ["fashion"], ["pound", "cricket"]]}]

df = pd.DataFrame(ex)

A       word        explode1                                            explode2
Germany Danke       [[berlin, frankfurt], [Paris], [London, Glassgow]]  [[food, cars], [fashion], [pound, cricket]]
French  merci       [[berlin, frankfurt], [Paris], [London, Glassgow]]  [[food, cars], [fashion], [pound, cricket]]
English Thank you   [[berlin, frankfurt], [Paris], [London, Glassgow]]  [[food, cars], [fashion], [pound, cricket]]

I want to explode the dataframe based on the columns "explode1" and "explode2". If you notice they have the same number of list items ie:3, but the list contains all the items for the rows. So I want to explode the dataframe to look something like this:

A       word        explode1    explode2
Germany Danke       [berlin]    [food]
Germany Danke       [frankfurt] [cars]
French  merci       [Paris]     [fashion]
English Thank you   [London]    [pound]
English Thank you   [Glassgow]  [cricket]

How do I do this? Was looking at the pandas.explode option but how we handle the complete list in every row because for eg: I only want the first row to expand based on the first item of the list in the "explode1" and "explode2" column not on the others items in the list.

CodePudding user response：

explode won't really work since the list items have to merged by index first. A possible workflow it to do this first, then explode on to_explode, finally creating a new df using concat:

import pandas as pd 

ex = [{"A": "Germany", "word": "Danke", "explode1": [["berlin", "frankfurt"], ["Paris"], ["London", "Glassgow"]], "explode2": [["food", "cars"], ["fashion"], ["pound", "cricket"]]}, 
  {"A": "French", "word": "merci", "explode1": [["berlin", "frankfurt"], ["Paris"], ["London", "Glassgow"]], "explode2": [["food", "cars"], ["fashion"], ["pound", "cricket"]]}, 
  {"A": "English", "word": "Thank you", "explode1": [["berlin", "frankfurt"], ["Paris"], ["London", "Glassgow"]], "explode2": [["food", "cars"], ["fashion"], ["pound", "cricket"]]}]

df = pd.DataFrame(ex)

df['to_explode'] = df.apply(lambda row: list(zip(row['explode1'][row.name], row['explode2'][row.name])), axis=1)
df = df[['A', 'word', 'to_explode']].explode('to_explode').reset_index(drop=True)

result_df = pd.concat([df[['A', 'word']], pd.DataFrame(df['to_explode'].values.tolist(), columns=['explode1', 'explode2'])], axis=1)

Result:

	A	word	explode1	explode2
0	Germany	Danke	berlin	food
1	Germany	Danke	frankfurt	cars
2	French	merci	Paris	fashion
3	English	Thank you	London	pound
4	English	Thank you	Glassgow	cricket