Create a new column from two columns of a dataframe where rows of each column contains list in strin

Time:07-21

I have a data frame (stored in a CSV file) with two columns, each containing variable-length lists in string format. I am providing a link to the Google Drive folder where I have stored the CSV file for reference.

CodePudding user response:

IIUC, first clean up the data by removing the intra-string single quotes. Then use the yaml library with applymap to convert the string in each cell of the DataFrame into an actual list. Lastly, call explode on the DataFrame twice, once for each column you want to expand.

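import yaml
import pandas as pd

# Load the CSV, using its first column as the index
df = pd.read_csv('Downloads/nodes_list.csv', index_col=[0])

# Remove a quote that directly precedes an "s" (e.g. a possessive 's) so the strings parse cleanly
df['Opp1'] = df['Opp1'].str.replace(r"['\"]s", 's', regex=True)
df['Opp2'] = df['Opp2'].str.replace(r"['\"]s", 's', regex=True)

# Parse every cell from a string into a list (on pandas >= 2.1, df.map replaces the deprecated applymap)
df = df.applymap(yaml.safe_load)

# Expand each column in turn, then combine the two values of every row into one list
df_new = df.explode('Opp1').explode('Opp2').apply(list, axis=1)

df_new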

Output:

0                       [KingdomofPoland, Georgia]
0                 [GrandDuchyofLithuania, Georgia]
1                   [NorthernYuanDynasty, Georgia]
2                 [SpanishEmpire, ChechenRepublic]
2       [CaptaincyGeneralofChile, ChechenRepublic]
                           ...                    
3411             [SyrianOpposition, SpanishEmpire]
3412                 [UnitedStates, SpanishEmpire]
3412                [UnitedKingdom, SpanishEmpire]
3412                  [SaudiArabia, SpanishEmpire]
3413                              [Turkey, Russia]
Length: 31170, dtype: object
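
To see what the two explode calls compute for a single row, here is a minimal sketch in plain Python with no pandas. It makes several assumptions: the sample rows are made up for illustration (only some of the names come from the output above), parse_list is a hypothetical helper, and ast.literal_eval stands in for yaml.safe_load as the parser.

```python
import ast
import re
from itertools import product

# Hypothetical sample rows; the real data lives in nodes_list.csv.
rows = [
    ("['KingdomofPoland', 'GrandDuchyofLithuania']", "['Georgia']"),
    ("['Turkey']", "['Russia']"),
    ("['People's Army']", "['SpanishEmpire', 'Georgia']"),
]


def parse_list(text):
    """Turn a stringified list into a real list of strings."""
    # Same cleanup as in the answer: drop a quote that sits directly before an "s".
    cleaned = re.sub(r"['\"]s", "s", text)
    # Assumption: ast.literal_eval is used here instead of yaml.safe_load.
    return ast.literal_eval(cleaned)


pairs = []
for opp1_text, opp2_text in rows:
    opp1, opp2 = parse_list(opp1_text), parse_list(opp2_text)
    # Exploding Opp1 and then Opp2 yields every (Opp1, Opp2) pairing within the row.
    pairs.extend([a, b] for a, b in product(opp1, opp2))

for pair in pairs:
    print(pair)
```

Two caveats follow from this sketch. The cleanup pattern also strips the opening quote of any element that starts with a lowercase "s", so such values would need a narrower pattern. And an empty list produces no pairs here, whereas explode keeps the row and fills it with NaN.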