Home > Software design >  Pandas: convert a dictionary of dataframes into a single one with keys as columns and fields as new
Pandas: convert a dictionary of dataframes into a single one with keys as columns and fields as new

Time:07-07

This question is related to my previous question, but now I want to know how to convert a dict of dataframes into a single dataframe where keys are columns and dataframe's fields are sub columns:

{'kp3r_busd':    source_id                 timestamp                  open                  high                   low                 close               volume
0       7905  2022-07-05T14:45:00.000Z  131.5500000000000000  131.8400000000000000  130.3800000000000000  130.9400000000000000  90.2900000000000000, 'mln_usdt':    source_id                 timestamp                 open                 high                  low                close              volume
0      19252  2022-07-05T13:00:00.000Z  24.5610000000000000  24.5610000000000000  24.5610000000000000  24.5610000000000000  0.0000000000000000}

The desired output looks like this:

  | kp3r_busd                         | mln_usdt
  | source_id timestamp open high low | source_id timestamp open hi low 
0 | 7905  2022-07-05T14:45:00.000Z  131.5500000000000000  131.8400000000000000  130.3800000000000000  130.9400000000000000  90.2900000000000000 | 19252  2022-07-05T13:00:00.000Z  24.5610000000000000  24.5610000000000000  24.5610000000000000  24.5610000000000000  0.0000000000000000} 

CodePudding user response:

You can create multi index for the columns, then concat them into a single row.

import pandas as pd

dict1 = {
    "exchange": "binance",
    "base": "kp3r",
    "quote": "busd",
    "resolution": 15,
    "ohlcv": [
        {
            "source_id": 7905,
            "timestamp": "2022-07-05T14:45:00.000Z",
            "open": "131.5500000000000000",
            "high": "131.8400000000000000",
            "low": "130.3800000000000000",
            "close": "130.9400000000000000",
            "volume": "90.2900000000000000"
        }]
}

dict2 = {
    "exchange": "kucoin",
    "base": "mln",
    "quote": "usdt",
    "resolution": 15,
    "ohlcv": [
        {
            "source_id": 19252,
            "timestamp": "2022-07-05T13:00:00.000Z",
            "open": "24.5610000000000000",
            "high": "24.5610000000000000",
            "low": "24.5610000000000000",
            "close": "24.5610000000000000",
            "volume": "0.0000000000000000"
        }]
}



df_list = []
for each in [dict1, dict2]:
    name = f'{each["base"]}_{each["quote"]}'
    
    
    y = [(name, col) for col in pd.DataFrame(each['ohlcv'])]
    multiCols = pd.MultiIndex.from_tuples(y)
    temp_df = pd.DataFrame(each['ohlcv'])
    temp_df.columns = multiCols
    
    
    df_list.append(temp_df)
    
df = pd.concat(df_list, axis=1)

Output:

print(df.to_string())
  kp3r_busd                                                                                                                                         mln_usdt                                                                                                                                  
  source_id                 timestamp                  open                  high                   low                 close               volume source_id                 timestamp                 open                 high                  low                close              volume
0      7905  2022-07-05T14:45:00.000Z  131.5500000000000000  131.8400000000000000  130.3800000000000000  130.9400000000000000  90.2900000000000000     19252  2022-07-05T13:00:00.000Z  24.5610000000000000  24.5610000000000000  24.5610000000000000  24.5610000000000000  0.0000000000000000
  • Related