Creating a multi index data frame

Time:06-20

I'm currently trying to create a multi-index DataFrame from an array of DataFrames. Each DataFrame holds the channels for one symbol, and the array looks like this:

[
                           open    high     low   close    volume
timestamp                                                         
2022-06-17 04:00:00+00:00  271.0  276.62  270.39  270.73  10947530,
                           open    high     low   close    volume
timestamp                                                         
2022-06-17 04:00:00+00:00  271.0  276.62  270.39  270.73  10947530,
]

and an array of symbols:

["symbol1", "symbol2"]

However, I now need to reorganise my data as described here: "Let us assume that our raw data raw_df is stored in a pd.DataFrame. There are n_timesteps rows representing different timesteps with the same time frequency but potentially with gaps (due to non-business days etc.). They are indexed by pd.DatetimeIndex. The columns are indexed by pd.MultiIndex where the first level represents the n_assets different assets. The second level then represents the n_channels channels (indicators) like volume or close price. For the rest of this page we will be using the below example."

This is what it should look like.

Any help would be greatly appreciated. Thank you very much!

CodePudding user response:

Use pd.concat to create your column MultiIndex as expected:

import pandas as pd
# keys become the outer column level (assets); each frame's own columns become the inner level (channels)
out = pd.concat(dfs, keys=symbols, names=['Asset', 'Channel'], axis=1)
print(out)

# Output
Asset                     symbol1                                   symbol2
Channel                      open    high     low   close    volume    open    high     low   close    volume
timestamp
2022-06-17 04:00:00+00:00   271.0  276.62  270.39  270.73  10947530   271.0  276.62  270.39  270.73  10947530

Input data

>>> dfs
[                            open    high     low   close    volume
 timestamp                                                         
 2022-06-17 04:00:00+00:00  271.0  276.62  270.39  270.73  10947530,
                             open    high     low   close    volume
 timestamp                                                         
 2022-06-17 04:00:00+00:00  271.0  276.62  270.39  270.73  10947530]

>>> symbols
['symbol1', 'symbol2']
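
For anyone who wants to reproduce this end to end, here is a minimal sketch. It assumes the input frames can be rebuilt from the single row shown in the question (the `row` and `index` helpers are made up for illustration), and it labels the outer column level 'Asset' and the inner one 'Channel' to match the layout quoted in the question. The last two lines show one way to pull data back out of the MultiIndex columns.

```python
import pandas as pd

# Hypothetical reconstruction of the input: one OHLCV frame per symbol,
# using the single row of values shown in the question.
index = pd.DatetimeIndex(["2022-06-17 04:00:00+00:00"], name="timestamp")
row = {"open": 271.0, "high": 276.62, "low": 270.39, "close": 270.73, "volume": 10947530}
dfs = [pd.DataFrame(row, index=index), pd.DataFrame(row, index=index)]
symbols = ["symbol1", "symbol2"]

# keys -> outer column level (assets); existing columns -> inner level (channels)
out = pd.concat(dfs, keys=symbols, names=["Asset", "Channel"], axis=1)

# All channels for a single asset (a plain DataFrame with open/high/low/close/volume)
symbol1 = out["symbol1"]

# A single channel across all assets (one column per symbol)
close = out.xs("close", level="Channel", axis=1)
```

If you would rather have channels on the outer level, out.swaplevel(axis=1).sort_index(axis=1) should flip the two levels.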