Home > Blockchain >  Pandas dataframe from array inside dict that is inside a dict
Pandas dataframe from array inside dict that is inside a dict

Time:12-16

i have a dict like this:

{'hostname1': {1 : {'ip1': [a,b,c]}}, 'hostname2': {2 : {'ip2': [d,e,f]}}}

the macro view:

dict={{{}[]}} or dict[hostname][1][ip1][a,b,c]

I'm trying to transform this dict in a dataframe like this:

hostname | id | ip | letter

hostname1  1   ip1   a

hostname1  1   ip1   b

hostname1  1   ip1   c

hostname2  2   ip2   d

...

i'm using pandas: df = pd.DataFrame(dict)

CodePudding user response:

You can use as below flatten function:

def flatten(d):
    out = {}
    for key, val in d.items():
        if isinstance(val, dict):
            val = [val]
        if isinstance(val, list):
            for subdict in val:
                deeper = flatten(subdict).items()
                out.update({key   '_'   key2: val2 for key2, val2 in deeper})
        else:
            out[key] = val
    return out

Code:

d= {'hostname1': {1 : {'ip1': ['a','b','c']}}, 'hostname2': {2 : {'ip2': ['d','e','f']}}}
pd.DataFrame([k.split('_')[:-1] [v] for k, v in flatten(d).items()], columns=['hostname','id','ip','letter'])

Output:

    hostname    id  ip  letter
0   hostname1   1   ip1 a
1   hostname1   1   ip1 b
2   hostname1   1   ip1 c
3   hostname2   2   ip2 d
4   hostname2   2   ip2 e
5   hostname2   2   ip2 f

OR Using JSON

dic = pd.json_normalize(d, sep='_').to_dict(orient='records')[0]

df = pd.concat({k: pd.Series(v) for k, v in dic.items()}).reset_index(level = 0)

df[['hostname','id','ip']]=df.level_0.str.split('_', expand=True)

df.drop('level_0', axis=1)
  • Related