Home > Enterprise >  Transform dictionary list into python dataframe
Transform dictionary list into python dataframe

Time:03-24

I have a file with several dictionaries and I want to turn it into a pandas dataframe, but I can't. When I try to get the first value to be in a column, like "Fuel station 1" and everything else is in a second column all together.

{'Fuel station 1': {'00404850000317': {'01-01-2019': {'DataDaVenda': '2019-01-01 19:04:22',
    'Descrição': 'GASOLINA COMUM',
    'Preço': 4.289544},
   '01-01-2020': {'DataDaVenda': '2020-01-01 19:18:09',
    'Descrição': 'GASOLINA C COMUM (b:2)',
    'Preço': 4.49},
   '01-01-2022': {'DataDaVenda': '2021-12-31 19:24:20',
    'Descrição': 'GASOLINA C COMUM (b:1)',
    'Preço': 6.49}}},
{'Fuel station 2': {'00404850000317': {'01-01-2019': {'DataDaVenda': '2019-01-01 19:04:22',
    'Descrição': 'GASOLINA COMUM',
    'Preço': 4.289544},
   '01-01-2021': {'DataDaVenda': '2021-01-01 18:48:55',
    'Descrição': 'GASOLINA C COMUM (b:1)',
    'Preço': 4.59},
   '01-01-2022': {'DataDaVenda': '2021-12-31 19:24:20',
    'Descrição': 'GASOLINA C COMUM (b:1)',
    'Preço': 6.49}}}

Desired output format:

output format

CodePudding user response:

Well your data is nested on multiple levels. So first of all you are going to have to transform it into a format that pandas can handle. One way would be a records format (list of dicts), where each of the keys which belong to multiple entries are their own fields:

import pandas

# slightly fixed your brackets
data = {
    'Fuel station 1': {'00404850000317': {
        '01-01-2019': {'DataDaVenda': '2019-01-01 19:04:22', 'Descrição': 'GASOLINA COMUM', 'Preço': 4.289544},
        '01-01-2020': {'DataDaVenda': '2020-01-01 19:18:09', 'Descrição': 'GASOLINA C COMUM (b:2)', 'Preço': 4.49},
        '01-01-2022': {'DataDaVenda': '2021-12-31 19:24:20', 'Descrição': 'GASOLINA C COMUM (b:1)', 'Preço': 6.49}}},
    'Fuel station 2': {'00404850000317': {
        '01-01-2019': {'DataDaVenda': '2019-01-01 19:04:22', 'Descrição': 'GASOLINA COMUM', 'Preço': 4.289544},
        '01-01-2021': {'DataDaVenda': '2021-01-01 18:48:55', 'Descrição': 'GASOLINA C COMUM (b:1)', 'Preço': 4.59},
        '01-01-2022': {'DataDaVenda': '2021-12-31 19:24:20', 'Descrição': 'GASOLINA C COMUM (b:1)', 'Preço': 6.49}}}}

To flat list of dicts:

reformatted_data = []
for fuel_st, v in data.items():
    for id_, v_ in v.items():
        for date, v__ in v_.items():
            reformatted_data.append({"fuel station": fuel_st, "id": id_, "date": date})
            for k___, v___ in v__.items():
                reformatted_data[-1][k___] = v___
df = pandas.DataFrame.from_records(reformatted_data)

Which returns:

print(df)
>      fuel station              id        date          DataDaVenda               Descrição     Preço
  0  Fuel station 1  00404850000317  01-01-2019  2019-01-01 19:04:22          GASOLINA COMUM  4.289544
  1  Fuel station 1  00404850000317  01-01-2020  2020-01-01 19:18:09  GASOLINA C COMUM (b:2)  4.490000
  2  Fuel station 1  00404850000317  01-01-2022  2021-12-31 19:24:20  GASOLINA C COMUM (b:1)  6.490000
  3  Fuel station 2  00404850000317  01-01-2019  2019-01-01 19:04:22          GASOLINA COMUM  4.289544
  4  Fuel station 2  00404850000317  01-01-2021  2021-01-01 18:48:55  GASOLINA C COMUM (b:1)  4.590000
  5  Fuel station 2  00404850000317  01-01-2022  2021-12-31 19:24:20  GASOLINA C COMUM (b:1)  6.490000
  • Related