'dict of list of dict' to dataframe

Time:10-30

I have a dict of lists of dicts:

{
'Col1Name': [{'date': '2020', 'value': '1111'},
             {'date': '2019', 'value': '2222'},
             {'date': '2018', 'value': '3333'}],
'Col2Name': [{'date': '2020', 'value': '777'},
             {'date': '2018', 'value': '999'}]
}

How can I import it elegantly into a dataframe, bearing in mind there may be missing values?

The end result should look something like this (or transposed, it doesn't matter):

            2020      2019      2018
Col1Name    1111      2222      3333
Col2Name     777       nan       999

CodePudding user response:

Here is a way using pandas.concat and a small comprehension:

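import pandas as pd
# Build one single-row frame per key (dates as columns) and stack them with concat;
# droplevel(1) then removes the leftover 'value' index level.
pd.concat({c: pd.DataFrame(l).set_index('date').T
           for c, l in d.items()}).droplevel(1)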

Output:

date      2020  2019  2018
Col1Name  1111  2222  3333
Col2Name   777   NaN   999

Input:

d = {
'Col1Name': [{'date': '2020', 'value': '1111'},
             {'date': '2019', 'value': '2222'},
             {'date': '2018', 'value': '3333'}],
'Col2Name': [{'date': '2020', 'value': '777'},
             {'date': '2018', 'value': '999'}]
}
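
To see why the one-liner above works, it can help to split it into its three steps. The following is only a sketch of that breakdown; the names rows, stacked and result are made up here for illustration and are not part of the original answer.

```python
import pandas as pd

d = {
    'Col1Name': [{'date': '2020', 'value': '1111'},
                 {'date': '2019', 'value': '2222'},
                 {'date': '2018', 'value': '3333'}],
    'Col2Name': [{'date': '2020', 'value': '777'},
                 {'date': '2018', 'value': '999'}],
}

# Step 1: each list of records becomes a one-row frame whose columns are the dates.
# (rows is a hypothetical name used only in this sketch.)
rows = {name: pd.DataFrame(records).set_index('date').T
        for name, records in d.items()}

# Step 2: concat stacks the rows. The dict keys become the outer index level,
# and the leftover 'value' label becomes the inner level.
stacked = pd.concat(rows)

# Step 3: drop the inner 'value' level so only the dict keys remain as the index.
result = stacked.droplevel(1)

print(result)
```

Dates that are missing for a key are filled with NaN during the concat step, which is what covers the missing-values requirement.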

CodePudding user response:

Assuming the dictionary is named d (see Setup below), we can use a comprehension to reshape it into a nested dict of {date: value} mappings, then pass that to DataFrame.from_dict:

df = pd.DataFrame.from_dict(
    {k: {v['date']: v['value'] for v in lst} for k, lst in d.items()},
    orient='index'
)

df:

          2020  2019  2018
Col1Name  1111  2222  3333
Col2Name   777   NaN   999
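
As a sketch of what the comprehension produces before pandas gets involved, the intermediate dict can be built and inspected on its own. The name nested is an assumption made for this illustration.

```python
import pandas as pd

d = {
    'Col1Name': [{'date': '2020', 'value': '1111'},
                 {'date': '2019', 'value': '2222'},
                 {'date': '2018', 'value': '3333'}],
    'Col2Name': [{'date': '2020', 'value': '777'},
                 {'date': '2018', 'value': '999'}],
}

# Collapse each list of records into a single {date: value} mapping.
# (nested is a hypothetical name used only in this sketch.)
nested = {name: {rec['date']: rec['value'] for rec in records}
          for name, records in d.items()}

# nested is now:
# {'Col1Name': {'2020': '1111', '2019': '2222', '2018': '3333'},
#  'Col2Name': {'2020': '777', '2018': '999'}}

# With orient='index', the outer keys become rows and the inner keys become columns.
df = pd.DataFrame.from_dict(nested, orient='index')
print(df)
```

Note that if a list contained the same date twice, the inner dict comprehension would keep only the last value for that date.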

Remove orient='index' to get the transposed layout, with the dates as the index:

df = pd.DataFrame.from_dict(
    {k: {v['date']: v['value'] for v in lst} for k, lst in d.items()}
)

df:

     Col1Name Col2Name
2020     1111      777
2019     2222      NaN
2018     3333      999

Setup:

import pandas as pd

d = {
    'Col1Name': [{'date': '2020', 'value': '1111'},
                 {'date': '2019', 'value': '2222'},
                 {'date': '2018', 'value': '3333'}],
    'Col2Name': [{'date': '2020', 'value': '777'},
                 {'date': '2018', 'value': '999'}]
}
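
The values in d are strings, so the resulting frame holds strings plus NaN. If numeric values are wanted (an assumption; the question does not say so), one possible follow-up is to convert each column with pd.to_numeric. This is a sketch, and the name numeric is made up for the example.

```python
import pandas as pd

d = {
    'Col1Name': [{'date': '2020', 'value': '1111'},
                 {'date': '2019', 'value': '2222'},
                 {'date': '2018', 'value': '3333'}],
    'Col2Name': [{'date': '2020', 'value': '777'},
                 {'date': '2018', 'value': '999'}],
}

df = pd.DataFrame.from_dict(
    {k: {v['date']: v['value'] for v in lst} for k, lst in d.items()},
    orient='index'
)

# Convert every column from strings to numbers; existing NaN entries stay NaN.
# (numeric is a hypothetical name used only in this sketch.)
numeric = df.apply(pd.to_numeric)

print(numeric.dtypes)
print(numeric)
```

Columns that contain a missing value come out as floating point, since NaN is itself a float.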