Home > Blockchain >  How to create a parent-child dataframe from an existing pandas dataframe of hierarchical structure a
How to create a parent-child dataframe from an existing pandas dataframe of hierarchical structure a

Time:03-24

How do you convert a given dataframe with a hierarchical structure and arbitrary shape (say, similar to the one below) into a new dataframe with a parent and child column?

Edit: Note that a constraint is that a child cannot be its own parent.

data = {'level1': ['A', 'A', 'B', 'B', 'C'],
        'level2': ['James', 'Robert', 'Patricia', 'Patricia', 'John'],
        'level3': ['Stockholm', 'Denver', 'Moscow', 'Moscow', 'Palermo'],
        'level4': ['red', 'Denver', 'yellow', 'purple', 'blue']
        }

df = pd.DataFrame(data)

  level1    level2     level3  level4
0      A     James  Stockholm     red
1      A    Robert     Denver  Denver
2      B  Patricia     Moscow  yellow
3      B  Patricia     Moscow  purple
4      C      John    Palermo    blue

Desired output is something like this:

       parent      child
0           A      James
1           A     Robert
2           B   Patricia
3           C       John
4       James  Stockholm
5      Robert     Denver
6    Patricia     Moscow
7        John    Palermo
8   Stockholm        red
9      Moscow     yellow
10     Moscow     purple
11    Palermo       blue

CodePudding user response:

What I can think of is using a for loop over the columns of the dataframe:

columns = df.columns
length = len(columns)
parent = []
children = []
for i in range(length):
  if i != length - 1 :
    parent  = df[columns[i]].to_list()
    children  = df[columns[i 1]].to_list()
newDf = pd.DataFrame({"parent":parent, "children":children}).drop_duplicates()
newDf[newDf["parent"] != newDf["children"]]

Output

       parent   children
0           A      James
1           A     Robert
2           B   Patricia
4           C       John
5       James  Stockholm
6      Robert     Denver
7    Patricia     Moscow
9        John    Palermo
10  Stockholm        red
12     Moscow     yellow
13     Moscow     purple
14    Palermo       blue
  • Related