python - Dict of Dict of List using pandas

Time:01-19

I want to group a pandas DataFrame as shown below: a dictionary keyed by stock, where each stock maps to its own dictionary keyed by date and timestamp, and each of those entries holds the OHLC values for that period. Since this runs against a live market, the date and timestamp level needs to be a dictionary so that its values can be updated in place. The initial DataFrame has around 100,000 entries, and building the dict of dict of lists manually takes over 30 seconds.

Stock(Dict)     Day(Dict)               Open, High, Low, Close(List)    
A1              2023-01-01 09:15:00     100, 102, 99, 101
                2023-01-02 09:15:20     100, 102, 99, 101
B1              2023-01-01 09:15:00     100, 102, 99, 101
                2023-01-02 09:20:00     100, 102, 99, 101
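To make the target shape concrete, here is a small sketch of the same table written as a Python literal. The string keys and the updated values in the last line are assumptions for illustration only; in practice the keys would be whatever type the `day` column holds.

```python
# Hypothetical target structure, using the sample values from the table above.
target = {
    "A1": {
        "2023-01-01 09:15:00": [100, 102, 99, 101],  # [open, high, low, close]
        "2023-01-02 09:15:20": [100, 102, 99, 101],
    },
    "B1": {
        "2023-01-01 09:15:00": [100, 102, 99, 101],
        "2023-01-02 09:20:00": [100, 102, 99, 101],
    },
}

# A live update for an existing period overwrites that period's OHLC list
# (the new numbers here are made up for the example).
target["A1"]["2023-01-02 09:15:20"] = [100, 103, 99, 102]
```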

I am able to create the dict of dicts, but I can only populate it with one of the Open, High, Low, Close values (here, high):

g = df_symbols_all.set_index('day').groupby('symbol').apply(lambda x: x.high.to_dict()).to_dict()

If I try to pass all four values, the result no longer has a single date & timestamp dictionary per stock:

k = df_symbols_all.set_index('day').groupby('symbol').apply(lambda x: (x.open.to_dict(), x.high.to_dict(), x.low.to_dict(), x.close.to_dict())).to_dict()

Any help will be appreciated, thanks.

CodePudding user response:

You could first create a new column that combines 'open', 'high', 'low' and 'close' into a single list per row (this is fast, and the column order in the selection determines the order inside each list):

df_symbols_all['open_high_low_close'] = df_symbols_all[['open', 'high', 'low', 'close']].values.tolist()

and then perform the grouping as in your question:

g = df_symbols_all.set_index('day').groupby('symbol').apply(lambda x: x.open_high_low_close.to_dict()).to_dict()
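For reference, a self-contained sketch of the same idea on a tiny DataFrame. The sample rows are taken from the table in the question, and the column names are assumed to match it. The grouping step is written as a dict comprehension over `groupby`, a variation on the `apply` call above that should give the same nested dict.

```python
import pandas as pd

# Sample data mirroring the table in the question (assumed column names).
df_symbols_all = pd.DataFrame({
    "symbol": ["A1", "A1", "B1", "B1"],
    "day": pd.to_datetime([
        "2023-01-01 09:15:00", "2023-01-02 09:15:20",
        "2023-01-01 09:15:00", "2023-01-02 09:20:00",
    ]),
    "open": [100, 100, 100, 100],
    "high": [102, 102, 102, 102],
    "low": [99, 99, 99, 99],
    "close": [101, 101, 101, 101],
})

# Pack the four price columns into one list per row, in O, H, L, C order.
ohlc_cols = ["open", "high", "low", "close"]
df_symbols_all["open_high_low_close"] = df_symbols_all[ohlc_cols].values.tolist()

# Build {symbol: {day: [open, high, low, close]}}.
g = {
    symbol: group.set_index("day")["open_high_low_close"].to_dict()
    for symbol, group in df_symbols_all.groupby("symbol")
}

# Inner keys are pandas Timestamps, so a live update looks like this
# (the new values are invented for the example).
g["A1"][pd.Timestamp("2023-01-02 09:15:20")] = [100, 103, 99, 102]
```

Because the inner keys are `Timestamp` objects rather than strings, lookups and updates need a `pd.Timestamp` (or an equal `datetime`) as the key.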