Add a CSV dataframe to a dictionary, separating by name

I am working on a time series/LSTM assignment and I have a stock dataset: https://www.kaggle.com/camnugent/sandp500

The dataset has about 500 companies, each with its own set of rows. I want to add the companies to a dictionary, with each company's name as the key.

This is what I have so far:

dataframe = pd.read_csv('all_stocks_5yr.csv', parse_dates=['date'])
dataframe['date'] = pd.to_datetime(dataframe['date'])

grouped_df = dataframe.groupby('Name')

for i in grouped_df:
    df_dict = grouped_df[i].to_dict

CodePudding user response:

This would solve your problem:

gp = dataframe.groupby("Name")
my_dict = {}  # This is the output you want: company name -> DataFrame of its rows
for name, group in gp:  # iterating a groupby yields (name, sub-DataFrame) pairs
    my_dict[name] = group  # each name appears exactly once, so no membership check is needed

print(my_dict)

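Another way to handle this is to iterate over the dataframe row by row. This is much slower than groupby on a large dataset, and each dictionary value ends up as a list of row Series rather than a DataFrame: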

my_dict = {}
for index, record in dataframe.iterrows():
    if record['Name'] in my_dict:
        my_dict[record['Name']].append(record)
    else:
        my_dict[record['Name']] = [record]

print(my_dict)
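
For the groupby approach, the loop can probably be collapsed into a dict comprehension. The sketch below is only an illustration: it uses a tiny made-up frame (`sample`) in place of all_stocks_5yr.csv, and the name `frames_by_name` and the `reset_index` call are my own choices, not something from the question.

```python
import pandas as pd

# Tiny stand-in for all_stocks_5yr.csv; the values are made up for illustration.
sample = pd.DataFrame({
    "date": pd.to_datetime(["2013-02-08", "2013-02-11", "2013-02-08", "2013-02-11"]),
    "close": [14.75, 14.46, 67.85, 68.56],
    "Name": ["AAL", "AAL", "AAPL", "AAPL"],
})

# Iterating a GroupBy yields (group key, sub-DataFrame) pairs,
# so a dict comprehension builds the name -> DataFrame mapping directly.
frames_by_name = {
    name: group.reset_index(drop=True)  # optional: restart each frame's index at 0
    for name, group in sample.groupby("Name")
}

print(list(frames_by_name))    # the company names used as keys
print(frames_by_name["AAPL"])  # all rows for one company
```

Each value is a regular DataFrame, so it can be sorted by date or sliced into windows for the LSTM without converting it back from a list.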