How to save the variables as different files in a for loop?

I have a list of file pathnames (the files are .xlsx, although the variable is called csv_files), and I am trying to save them as dataframes. How can I do it?

import pandas as pd
import os
import glob

# use glob to get all the Excel files in the folder
# (the list is called csv_files, but the pattern matches .xlsx)
path = "/Users/azmath/Library/CloudStorage/OneDrive-Personal/Projects/LESA/2022 HY/All"
csv_files = glob.glob(os.path.join(path, "*.xlsx"))

# loop over the list of files
for f in csv_files:

    # read the Excel file
    df = pd.read_excel(f)
    display(df)  # display() is available in Jupyter/IPython
    print()

The issue is that it only prints, and I don't know how to save. I would like to save all the dataframes as variables, preferably named after their files.

CodePudding user response：

By “save” I think you mean store dataframes in variables. I would use a dictionary for this instead of separate variables.

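import os

import pandas as pd

data = {}

for f in csv_files:
    # use the file name (without the directory part) as the dictionary key
    name = os.path.basename(f)

    # read the Excel file and store the dataframe under that key
    data[name] = pd.read_excel(f)
    display(data[name])
    print()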

Now all your dataframes are stored in the data dictionary, where you can iterate over them (and easily handle all of them together if needed). Each key in the dictionary is the basename of the input file, i.e. the file name including its .xlsx extension.
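
As a rough sketch of what iterating over such a dictionary could look like, the helper below collects the shape of every stored dataframe. The name describe_frames is made up for illustration and is not part of pandas; it only assumes that each value has a .shape attribute, as a dataframe does.

```python
def describe_frames(frames):
    """Return {name: shape} for every dataframe stored in the dictionary.

    `frames` is assumed to map file names to objects with a `.shape`
    attribute, such as pandas dataframes.
    """
    return {name: df.shape for name, df in frames.items()}
```

With the data dictionary from the answer, describe_frames(data) would give one (rows, columns) tuple per file, and a single dataframe can be looked up by its file name, e.g. data["report.xlsx"] (a hypothetical file name).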

Recall that dictionaries remember insertion order (guaranteed since Python 3.7), so the order in which the files were inserted is also preserved. glob does not guarantee any particular order for its results, so I'd recommend sorting the input files before parsing - this way you get a reproducible script and sequence of operations!
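
A minimal sketch of the sorting idea, under a few assumptions: the helper name load_frames and its reader parameter are hypothetical, and the keys here are the file stems (the name without the extension) rather than the full basenames, which may be closer to "named after their files". If two files in different folders share a stem, the later one overwrites the earlier one.

```python
from pathlib import Path


def load_frames(paths, reader=None):
    """Read every path in sorted order and return {file stem: dataframe}.

    `reader` defaults to pandas.read_excel; any callable that takes a
    path can be passed instead.
    """
    if reader is None:
        # Imported lazily so the helper can also be used with a custom reader.
        import pandas as pd
        reader = pd.read_excel

    frames = {}
    for p in sorted(paths):  # sorting makes the insertion order reproducible
        frames[Path(p).stem] = reader(p)
    return frames
```

Calling load_frames(csv_files) on the list from the question would then give a dictionary whose keys follow the sorted file paths.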

CodePudding user response：

Try this:

a = [pd.read_excel(file) for file in csv_files]

Then a will be a list of all your dataframes. If you want a dictionary instead of a list, keyed by the file path:

a = {file: pd.read_excel(file) for file in csv_files}