Home > Enterprise >  Pandas python parent-child hierarchy get parent
Pandas python parent-child hierarchy get parent

Time:06-18

So I have the following dataframe

import pandas as pd
df = pd.DataFrame({'ID':    [ 0 ,  1 ,  2 ,  3 ], 
                  'name':   ['A', 'B', 'C', 'D'], 
                  'parent': [ '', 'A', 'A', 'B']})

Now as you see there is a parent-child hierarchy in here. However, unfortunately this is based on strings.

Is it possible to very effectively / neatly to create a new column with the actual numeric parent id?

E.g. the result would be;

df = pd.DataFrame({'ID':      [  0 ,  1 ,  2 ,  3 ], 
                  'name':     [ 'A', 'B', 'C', 'D'], 
                  'parent':   [  '', 'A', 'A', 'B'],
                  'parentID': [None,  0,   0,   1 ]})

CodePudding user response:

Use map on a crafted Series:

df['parentID'] = df['parent'].map(df.set_index('name')['ID'])

output:

   ID name parent  parentID
0   0    A              NaN
1   1    B      A       0.0
2   2    C      A       0.0
3   3    D      B       1.0

CodePudding user response:

You can simply do a merge of your dataframe with itself :

df.merge(
    df[["ID", "name"]].rename(columns={"ID" : "Parent ID", "name" : "parent"}),
    on="parent",
    how="left"
)
  • Related