Convert dictionary with both nested and non-nested items into dataframe

I have the following dictionary:

{'instance_1': {'race': {'asian': 0,
   'black': 99,
   'white': 9},
  'dominant_race': 'black',
  'region': {'x': 0, 'y': 0, 'w': 100, 'h': 100}},
 'instance_2': {'race': {'asian': 0,
   'black': 0,
   'white': 89},
  'dominant_race': 'white',
  'region': {'x': 6, 'y': 12, 'w': 79, 'h': 79}}}

and would like to convert it into a pandas DataFrame such that each element within each item is its own column, like

item	asian	black	white	dominant_race	x	y	w	h
instance_1	0	99	9	black	0	0	100	100
instance_2	0	0	89	white	6	12	79	79

Using pd.DataFrame.from_dict() with orient='index' yields the following result:

pd.DataFrame.from_dict(predictions, orient='index')

            race                                 dominant_race  \
instance_1  {'asian': 0, 'black': 99....         black   
instance_2  {'asian': 0, 'black': 0.....         white   

                                            region  
instance_1    {'x': 0, 'y': 0, 'w': 100, 'h': 100}  
instance_2     {'x': 6, 'y': 12, 'w': 79, 'h': 79}

How do I convert the dictionary so that each element within 'race' and 'region' becomes its own column?

CodePudding user response：

.apply(pd.Series) expands the dictionaries stored in a column's cells into separate columns, one per key.
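
To see that expansion in isolation, here is a minimal sketch on a single column of dicts. The Series `s` and the name `expanded` are made up for illustration; the values are taken from the 'region' entries in the question.

```python
import pandas as pd

# A Series whose cells are dicts, like the 'region' column in the question
s = pd.Series(
    [{"x": 0, "y": 0}, {"x": 6, "y": 12}],
    index=["instance_1", "instance_2"],
)

# Each dict is turned into a row; its keys become the column labels
expanded = s.apply(pd.Series)
print(expanded)
```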

Assuming that the sample above is a good representation of your real data, you can first build the frame with pd.DataFrame(d).T, and then use pd.concat to combine the columns expanded from race and region with dominant_race:

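import pandas as pd

df = pd.DataFrame(d).T  # d being your sample dictionary

# Expand the dict-valued columns and place them side by side with dominant_race
res = pd.concat([df['race'].apply(pd.Series),
                 df['dominant_race'],
                 df['region'].apply(pd.Series)], axis=1)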

This gives:

res
Out[262]: 
            asian  black  white dominant_race  x   y    w    h
instance_1      0     99      9         black  0   0  100  100
instance_2      0      0     89         white  6  12   79   79
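
As a possible alternative (not the approach used in the answer above), pd.json_normalize can flatten the nested dictionaries in one call. The sketch below assumes the inner key names do not collide across 'race' and 'region', since it strips the "race." and "region." prefixes that json_normalize adds. The names `predictions`, `flat` and `wanted` are illustrative.

```python
import pandas as pd

# The sample dictionary from the question
predictions = {
    "instance_1": {
        "race": {"asian": 0, "black": 99, "white": 9},
        "dominant_race": "black",
        "region": {"x": 0, "y": 0, "w": 100, "h": 100},
    },
    "instance_2": {
        "race": {"asian": 0, "black": 0, "white": 89},
        "dominant_race": "white",
        "region": {"x": 6, "y": 12, "w": 79, "h": 79},
    },
}

# Flatten each record; nested keys come back as "race.asian", "region.x", ...
flat = pd.json_normalize(list(predictions.values()))

# Restore the outer keys as the index
flat.index = list(predictions.keys())

# Drop the "parent." prefix (assumes inner key names are unique)
flat.columns = [name.split(".")[-1] for name in flat.columns]

# Put the columns in the order asked for in the question
wanted = ["asian", "black", "white", "dominant_race", "x", "y", "w", "h"]
flat = flat[wanted]

# Optionally turn the index into a regular column named 'item'
flat = flat.rename_axis("item").reset_index()
print(flat)
```

If the index should stay as the row label, as in the concat result above, the last step can be left out.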