For example:
>>> df = pd.DataFrame({'col1': [1, 2], 'col2': [0.5, 0.75]}, index=['row1', 'row2'])
>>> df
      col1  col2
row1     1  0.50
row2     2  0.75
How can I generate a dictionary with four entries, like
d[('row1', 'col1')] = 1
d[('row1', 'col2')] = 0.5
d[('row2', 'col1')] = 2
d[('row2', 'col2')] = 0.75
i.e., each key is a tuple whose first element is the index label and whose second element is the column name. My data frame is big, so a fast approach would help a lot. Thanks for the help!
CodePudding user response:
import pandas as pd
df = pd.DataFrame({'col1': [1, 2], 'col2': [0.5, 0.75]}, index=['row1', 'row2'])
df.to_dict()
CodePudding user response:
Pandas has DataFrame.to_dict().
d = df.to_dict()
print(d)
{'col1': {'row1': 1, 'row2': 2}, 'col2': {'row1': 0.5, 'row2': 0.75}}
print(d['col1']['row1'])
1
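If you would rather have the row label as the outer key, to_dict also takes an orient argument. A minimal sketch, reusing the question's data (the name by_row is just for illustration):

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2], 'col2': [0.5, 0.75]}, index=['row1', 'row2'])

# orient='index' nests by row label first, then by column name
by_row = df.to_dict(orient='index')
print(by_row['row2']['col2'])  # 0.75
```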
To get something closer to the example you posted, I would write a small helper function that populates a nested dictionary, with a call syntax close to the one you want.
from collections import defaultdict
d = defaultdict(dict)
def f(row, col, val):
    global d
    d[row][col] = val  # swap row/col if you prefer
f('row1', 'col1', 1)
f('row1', 'col2', 0.5)
f('row2', 'col1', 2)
f('row2', 'col2', 0.75)
print(dict(d))
{'row1': {'col1': 1, 'col2': 0.5}, 'row2': {'col1': 2, 'col2': 0.75}}
print(pd.DataFrame(d))
      row1  row2
col1   1.0  2.00
col2   0.5  0.75
print(pd.DataFrame(d).T)  # transpose the df if you prefer
      col1  col2
row1   1.0  0.50
row2   2.0  0.75
But honestly, I would just stick with the standard nested dictionary and its syntax; I don't see any advantage in doing something like that.
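If the tuple-keyed dictionary from the question really is what you need, one possible approach (not taken from the answers above) is to stack the frame so that each value is indexed by a (row, column) pair, and then convert that Series to a dictionary. This is only a sketch: the names tuple_dict and tuple_dict_py are made up for illustration, and I have not timed it on a large frame.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2], 'col2': [0.5, 0.75]}, index=['row1', 'row2'])

# stack() moves the column labels into an inner index level, giving a Series
# indexed by (row label, column name); to_dict() then uses those tuples as keys.
# Caveat: stacking columns of different dtypes upcasts the values,
# so the integers in col1 come back as floats here.
tuple_dict = df.stack().to_dict()
print(tuple_dict[('row2', 'col2')])  # 0.75

# Pure-Python alternative built from the nested dict returned by to_dict();
# it avoids the upcast but loops over every cell in Python.
tuple_dict_py = {(row, col): val
                 for col, column in df.to_dict().items()
                 for row, val in column.items()}
print(tuple_dict_py[('row1', 'col1')])
```

Depending on the pandas version, stack() may silently drop missing values, so check the result if your frame contains NaN.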