I have a data frame (stored in a CSV file) with two columns, each containing variable-length lists in string format. I am providing a link to the Google Drive location where I have stored the CSV file for reference.
CodePudding user response:
IIUC, first clean up the data by removing the single quotes embedded inside strings (the code below targets a quote that directly precedes an "s"). Then use the yaml library with applymap to convert the string in each cell of the pandas DataFrame into an actual list. Lastly, call explode on the DataFrame twice, once for each column you want to expand.
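import yaml
import pandas as pd
# Load the CSV, using its first column as the index.
df = pd.read_csv('Downloads/nodes_list.csv', index_col=[0])
# Drop a quote that directly precedes an "s" so each cell parses cleanly.
df['Opp1'] = df['Opp1'].str.replace("[\'\"]s", 's', regex=True)
df['Opp2'] = df['Opp2'].str.replace("[\'\"]s", 's', regex=True)
# Parse every cell from a string into a real list.
# Note: on pandas >= 2.1, applymap is deprecated in favor of DataFrame.map.
df = df.applymap(yaml.safe_load)
# Explode both columns, then collapse each row into an [Opp1, Opp2] pair.
df_new = df.explode('Opp1').explode('Opp2').apply(list, axis=1)
df_new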
Output:
0 [KingdomofPoland, Georgia]
0 [GrandDuchyofLithuania, Georgia]
1 [NorthernYuanDynasty, Georgia]
2 [SpanishEmpire, ChechenRepublic]
2 [CaptaincyGeneralofChile, ChechenRepublic]
...
3411 [SyrianOpposition, SpanishEmpire]
3412 [UnitedStates, SpanishEmpire]
3412 [UnitedKingdom, SpanishEmpire]
3412 [SaudiArabia, SpanishEmpire]
3413 [Turkey, Russia]
Length: 31170, dtype: object
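To see what the double explode does without the CSV, here is a minimal sketch on a toy frame. The column names Opp1 and Opp2 come from the answer above; the cell values are made up for illustration, and ast.literal_eval from the standard library is swapped in for yaml.safe_load on the assumption that each cell is a valid Python list literal.

```python
import ast

import pandas as pd

# Hypothetical stand-in for the CSV: each cell is a list serialized as a string.
df = pd.DataFrame({
    "Opp1": ["['A', 'B']", "['C']"],
    "Opp2": ["['X']", "['Y', 'Z']"],
})

# Parse each string cell into a real Python list.
# (Assumption: cells are valid Python literals; the answer above uses yaml instead.)
for col in ["Opp1", "Opp2"]:
    df[col] = df[col].apply(ast.literal_eval)

# Exploding one column after the other yields every (Opp1, Opp2)
# combination within each original row; the original index is repeated.
pairs = df.explode("Opp1").explode("Opp2")

# Collapse each row into a two-element list.
result = pairs.values.tolist()
print(result)
```

Because explode keeps the original index, repeated index values in the result (as in the output above) show which pairs came from the same source row.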