This question is related to my previous question, but now I want to know how to convert a dict of dataframes into a single dataframe where keys are columns and dataframe's fields are sub columns:
{'kp3r_busd': source_id timestamp open high low close volume
0 7905 2022-07-05T14:45:00.000Z 131.5500000000000000 131.8400000000000000 130.3800000000000000 130.9400000000000000 90.2900000000000000, 'mln_usdt': source_id timestamp open high low close volume
0 19252 2022-07-05T13:00:00.000Z 24.5610000000000000 24.5610000000000000 24.5610000000000000 24.5610000000000000 0.0000000000000000}
The desired output looks like this:
| kp3r_busd | mln_usdt
| source_id timestamp open high low | source_id timestamp open hi low
0 | 7905 2022-07-05T14:45:00.000Z 131.5500000000000000 131.8400000000000000 130.3800000000000000 130.9400000000000000 90.2900000000000000 | 19252 2022-07-05T13:00:00.000Z 24.5610000000000000 24.5610000000000000 24.5610000000000000 24.5610000000000000 0.0000000000000000}
CodePudding user response:
You can create multi index for the columns, then concat them into a single row.
import pandas as pd
dict1 = {
"exchange": "binance",
"base": "kp3r",
"quote": "busd",
"resolution": 15,
"ohlcv": [
{
"source_id": 7905,
"timestamp": "2022-07-05T14:45:00.000Z",
"open": "131.5500000000000000",
"high": "131.8400000000000000",
"low": "130.3800000000000000",
"close": "130.9400000000000000",
"volume": "90.2900000000000000"
}]
}
dict2 = {
"exchange": "kucoin",
"base": "mln",
"quote": "usdt",
"resolution": 15,
"ohlcv": [
{
"source_id": 19252,
"timestamp": "2022-07-05T13:00:00.000Z",
"open": "24.5610000000000000",
"high": "24.5610000000000000",
"low": "24.5610000000000000",
"close": "24.5610000000000000",
"volume": "0.0000000000000000"
}]
}
df_list = []
for each in [dict1, dict2]:
name = f'{each["base"]}_{each["quote"]}'
y = [(name, col) for col in pd.DataFrame(each['ohlcv'])]
multiCols = pd.MultiIndex.from_tuples(y)
temp_df = pd.DataFrame(each['ohlcv'])
temp_df.columns = multiCols
df_list.append(temp_df)
df = pd.concat(df_list, axis=1)
Output:
print(df.to_string())
kp3r_busd mln_usdt
source_id timestamp open high low close volume source_id timestamp open high low close volume
0 7905 2022-07-05T14:45:00.000Z 131.5500000000000000 131.8400000000000000 130.3800000000000000 130.9400000000000000 90.2900000000000000 19252 2022-07-05T13:00:00.000Z 24.5610000000000000 24.5610000000000000 24.5610000000000000 24.5610000000000000 0.0000000000000000