How to turn an entire data frame into a dictionary

For example:

df = pd.DataFrame({'col1': [1, 2], 'col2': [0.5, 0.75]}, index=['row1', 'row2'])

>>> df
      col1  col2
row1     1  0.50
row2     2  0.75

How can I generate a dictionary with four entries, like

dict[(row1, col1)] = 1
dict[(row1, col2)] = 0.5
dict[(row2, col1)] = 2
dict[(row2, col2)] = 0.75

i.e., each key is a tuple whose first element is the index label and whose second element is the column name. My data frame is big, so a fast approach would help a lot. Thanks for the help!

CodePudding user response：

import pandas as pd
df = pd.DataFrame({'col1': [1, 2], 'col2': [0.5, 0.75]}, index=['row1', 'row2'])
df.to_dict()

CodePudding user response：

Pandas has DataFrame.to_dict().

d = df.to_dict()

print(d)
{'col1': {'row1': 1, 'row2': 2}, 'col2': {'row1': 0.5, 'row2': 0.75}}

print(d['col1']['row1'])
1
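As a side note, to_dict() also accepts an orient argument. The short sketch below, which reuses the df from the question, shows orient='index', which nests by row label first instead of by column name. The name by_row is only an illustration.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2], 'col2': [0.5, 0.75]}, index=['row1', 'row2'])

# orient='index' makes the outer keys the row labels and the inner keys the column names
by_row = df.to_dict(orient='index')

print(by_row['row1']['col2'])
```

The default orientation is column-first, which is why the output above is indexed as d['col1']['row1'].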

To get something like the example you posted, I would write a small helper function that populates the dictionary with a syntax close to the one you want.

from collections import defaultdict

d = defaultdict(dict)

def f(row, col, val):
    # d is only mutated here, never rebound, so no global statement is needed
    d[row][col] = val  # swap row/col if you prefer column-first nesting
    
f('row1', 'col1', 1)
f('row1', 'col2', 0.5)
f('row2', 'col1', 2)
f('row2', 'col2', 0.75)

print(dict(d))
{'row1': {'col1': 1, 'col2': 0.5}, 'row2': {'col1': 2, 'col2': 0.75}}

print(pd.DataFrame(d))
      row1  row2
col1   1.0  2.00
col2   0.5  0.75


print(pd.DataFrame(d).T)  # transpose the DataFrame if you prefer
      col1  col2
row1   1.0  0.50
row2   2.0  0.75

But honestly, I would just stick with the standard nested dictionary and its syntax; I don't see any advantage in doing something like that.
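That said, if the flat tuple-keyed dictionary from the question is really what is needed, one possible sketch is to stack the frame first and then call to_dict() on the resulting Series. This is not something the answers above propose; it relies on DataFrame.stack() and Series.to_dict(), and the names tuple_dict and tuple_dict_typed are only illustrations. Note that stacking columns of different dtypes usually upcasts the values to a common type (floats here), and depending on the pandas version, missing values may be dropped.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2], 'col2': [0.5, 0.75]}, index=['row1', 'row2'])

# stack() moves the column labels into an inner index level, producing a Series
# indexed by (row label, column name) pairs; to_dict() then uses those pairs as keys.
tuple_dict = df.stack().to_dict()

print(tuple_dict[('row2', 'col2')])

# Alternative without stack(): build the keys column by column, so each value
# keeps the type of its own column rather than being upcast to a common one.
tuple_dict_typed = {(row, col): val
                    for col, series in df.items()
                    for row, val in series.items()}

print(tuple_dict_typed[('row1', 'col1')])
```

Both variants yield four entries keyed as (index, column). Which one is faster on a big frame would have to be measured on the actual data.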