How to create a dict of dict of dict of dataframes from a list of file paths?

Time:04-03

I have a list of file paths that I want to convert into data frames.

Here is what the file names look like (full list below).

To keep things organized, I would like a dict whose keys are the dates and whose values are dicts keyed by name. Each name maps to a dict with the keys result, sales, and team, and each of those values is the dataframe read from the matching file.

I hope I explained it well.

2022-03-23_John_result_data.csv
2022-03-23_John_sales_data.csv
2022-03-23_John_team_data.csv
2022-03-23_Lisa_result_data.csv
2022-03-23_Lisa_sales_data.csv
2022-03-23_Lisa_team_data.csv
2022-03-23_Troy_result_data.csv
2022-03-23_Troy_sales_data.csv
2022-03-23_Troy_team_data.csv
2022-03-25_Bart_result_data.csv
2022-03-25_Bart_sales_data.csv
2022-03-25_Bart_team_data.csv

EDIT

Sorry for the edit, but assume a file name could also be '2022-03-23_John love [23]_result_data.csv'. I forgot to mention this case, where the name part can contain spaces.

CodePudding user response:

You could iterate over the file names and chain dict.setdefault calls (or use a defaultdict), e.g.:

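import pandas as pd

filenames = ['2022-03-23_John_result_data.csv']

dfs = {}
for filename in filenames:
    # Split on the first three underscores; the leftover 'data.csv' is discarded
    date, name, category, _ = filename.split('_', 3)
    # Create the intermediate dicts on demand, then store the dataframe
    dfs.setdefault(date, {}).setdefault(name, {})[category] = pd.read_csv(filename)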
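
As a quick check that the split also copes with the edited case (a name containing spaces and brackets), the parsing step can be tried on its own without reading any files. The helper name parse_filename is made up for this illustration, and the sketch assumes the name part never contains an underscore:

```python
def parse_filename(filename):
    """Return (date, name, category) from '<date>_<name>_<category>_data.csv'.

    Assumes the name itself contains no underscore; spaces and brackets are fine.
    """
    # maxsplit=3 yields four parts; the last one ('data.csv') is ignored
    date, name, category, _ = filename.split("_", 3)
    return date, name, category


print(parse_filename("2022-03-23_John love [23]_result_data.csv"))
# ('2022-03-23', 'John love [23]', 'result')
```

Because the split is on underscores only, spaces inside the name do not matter. A name that contains an underscore would shift the fields and would need a different rule.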

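Or using a defaultdict. Note that the inner level must be a defaultdict too, otherwise dfs[date][name] raises a KeyError:

from collections import defaultdict

dfs = defaultdict(lambda: defaultdict(dict))

Then your dfs.setdefault(...) line becomes:

dfs[date][name][category] = pd.read_csv(filename)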
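
Putting the defaultdict variant together, a minimal sketch might look like the following. The function name build_nested and its reader parameter are hypothetical, introduced only so the grouping logic can be tried without real CSV files. Taking Path(path).name is also an assumption, meant to cover paths that include a directory:

```python
from collections import defaultdict
from pathlib import Path


def build_nested(paths, reader):
    """Return nested[date][name][category] = reader(path) for every path."""
    # Two defaultdict levels so missing dates and names are created on demand
    nested = defaultdict(lambda: defaultdict(dict))
    for path in paths:
        # Parse only the file name, so directories in the path are ignored
        date, name, category, _ = Path(path).name.split("_", 3)
        nested[date][name][category] = reader(path)
    # Convert back to plain dicts so later lookups of missing keys raise KeyError
    return {date: dict(names) for date, names in nested.items()}


# With pandas imported as pd, the call would be:
# dfs = build_nested(filenames, pd.read_csv)
```

Converting to plain dicts at the end is a design choice: a defaultdict silently creates an entry when a misspelled date or name is looked up, which can hide mistakes later on.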