I need a (key, row) index for my Pandas DataFrame, where the key is the id column of the DataFrame and the data is the row data.
The data is sparse - I only need to access data for a few keys, but I do not know ahead of time which keys I need to access.
I am currently doing this using iterrows:
pair_map = {}
for pair_id, data in df.iterrows():
    pair_map[pair_id] = data
However, for a very large number of rows (~100k-1M), this becomes slow. Is there a faster way to create a sparse key-to-row index for Pandas, so that access to an arbitrary row is fast? Even better would be an index that is sparse, with the data pulled out of Pandas on demand (though I do not think this is possible).
CodePudding user response:
Try this:
df.T.to_dict()
I don't know whether you can transpose a df into 1M columns, and if you are looking for a dict whose values are of type pd.Series, this is not the solution.
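To make the shape of the result concrete, here is a minimal sketch on a made-up three-row frame (the column names price and qty and the ID values are assumptions for illustration, not from the question). It also shows to_dict("index"), which as far as I know produces the same nested-dict layout without the transpose:

```python
import pandas as pd

# Hypothetical toy frame; the question's real df has ~100k-1M rows.
df = pd.DataFrame(
    {"price": [1.5, 2.5, 3.5], "qty": [10, 20, 30]},
    index=pd.Index([101, 102, 103], name="ID"),
)

# Transpose so each row becomes a column, then convert column by column.
by_transpose = df.T.to_dict()

# Same outer keys (the index labels), but no transposed copy is built.
by_orient = df.to_dict("index")

print(by_orient[102])
```

Either way each value is a plain dict of column name to cell value, not a pd.Series, and the whole frame is still converted up front.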
CodePudding user response:
I believe you want a dict with "ID" as the key and the row values as a list:
pair_map = df.set_index("ID").transpose().to_dict("list")
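As a rough sketch of how this looks on a small made-up frame (the price and qty columns and the ID values are assumptions, and I am assuming the IDs are unique), together with an on-demand alternative that keeps the data in Pandas and only pulls a row out when a key is requested:

```python
import pandas as pd

# Hypothetical toy frame standing in for the real data.
df = pd.DataFrame(
    {"ID": [101, 102, 103], "price": [1.5, 2.5, 3.5], "qty": [10, 20, 30]}
)

# Make ID the index once; assumes ID values are unique.
indexed = df.set_index("ID")

# Eager version: every row is converted to a list up front.
pair_map = indexed.transpose().to_dict("list")

# On-demand version: no dict is built, the row is fetched by label when needed.
row = indexed.loc[102]

print(pair_map[102])
print(row["price"])
```

With the on-demand version nothing is materialised for keys you never touch, which may suit the "only a few keys" access pattern in the question; whether it is fast enough for your data is something to measure.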