Reading all the CSV files in a folder with pandas (Python)

Time:02-23

How can I write a Python script that reads all the CSV files in a folder and dynamically assigns each dataset to a variable named after its file? For example, the file FTX_BTCUSD.csv in the specified folder_location should be loaded into a variable named FTX_BTCUSD. The files should be read with pd.read_csv(). How can I modify the function below to get the expected output?

import pandas as pd 

folder_location = 'C:\\Users\\local\\datasets'

def pandas_datasets():
    data= pd.read_csv(**dataset location**, low_memory=False)
    return data 

Folder csv files: (image not included)

Contents of FTX_BTCUSD.csv

unix,date,symbol,open,high,low,close,Volume,Volume USD
1644430740000.0,2022-02-09 18:19:00,BTC/USD,44079.0,44096.0,44076.0,44088.0,0.8561232648339684,37744.7625
1644430680000.0,2022-02-09 18:18:00,BTC/USD,44069.0,44079.0,44055.0,44079.0,3.9830549127702537,175569.0775
1644430620000.0,2022-02-09 18:17:00,BTC/USD,44074.0,44079.0,44055.0,44069.0,7.00427581973723,308671.4311
1644430560000.0,2022-02-09 18:16:00,BTC/USD,44077.0,44078.0,44056.0,44074.0,4.813299484957118,212141.3615
1644430500000.0,2022-02-09 18:15:00,BTC/USD,44033.0,44078.0,44033.0,44077.0,8.620666560791342,379973.12

Contents of Binance_BTCUSD.csv

Unix Timestamp,Date,Symbol,Open,High,Low,Close,Volume,Volume USDT,tradecount
1625531700000,2021-07-06 00:35:00,BTC/USDT,34039.99000000,34053.86000000,34023.50000000,34053.86000000,14.84518900,505301.96192877,271
1625531640000,2021-07-06 00:34:00,BTC/USDT,34025.88000000,34049.42000000,34020.11000000,34039.99000000,22.81862800,776568.09766191,404
1625531580000,2021-07-06 00:33:00,BTC/USDT,34050.32000000,34062.77000000,34017.25000000,34025.89000000,16.08606300,547582.96995436,418
1625531520000,2021-07-06 00:32:00,BTC/USDT,34014.66000000,34058.40000000,34001.64000000,34058.39000000,26.11695400,888581.95856325,593

CodePudding user response:

You can create variables through globals(), although this is not recommended. It is much better to use a dictionary whose keys are the file names.

Try:

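import os
import pandas as pd

data = {}  # maps each file name (without extension) to its DataFrame
for file in os.listdir(folder_location):
    if file.endswith(".csv"):
        # splitext removes only the trailing extension
        name = os.path.splitext(file)[0]
        data[name] = pd.read_csv(os.path.join(folder_location, file))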
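Since the question asks for a modified pandas_datasets() function, the same idea can be wrapped up as one. This is only a minimal sketch: it assumes pathlib for the directory listing (a swap for os.listdir, not something the answer uses), adds a folder_location parameter that the original function did not have, and carries over low_memory=False from the question.

```python
from pathlib import Path

import pandas as pd


def pandas_datasets(folder_location):
    """Return {file name without .csv: DataFrame} for every CSV in the folder."""
    datasets = {}
    # sorted() only makes the load order predictable; it is not required
    for csv_path in sorted(Path(folder_location).glob("*.csv")):
        # Path.stem drops the final extension: FTX_BTCUSD.csv -> FTX_BTCUSD
        datasets[csv_path.stem] = pd.read_csv(csv_path, low_memory=False)
    return datasets
```

Calling data = pandas_datasets(folder_location) would then give access to each dataset by name, for example data["FTX_BTCUSD"].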

If you absolutely must create dynamic variables named after the files, use the following instead:

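import os
import pandas as pd

for file in os.listdir(folder_location):
    if file.endswith(".csv"):
        # each file name (minus extension) becomes a module-level variable
        name = os.path.splitext(file)[0]
        globals()[name] = pd.read_csv(os.path.join(folder_location, file))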
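One practical catch with the globals() route is that a file name only works as a variable if it is a valid Python identifier. A name such as BTC-USD can be stored in globals() but can never be typed as a bare variable. The guard below is a rough sketch: bind_as_variables is a hypothetical helper (not part of pandas or of the answer above), and frames is assumed to be a dictionary like data from the dictionary approach.

```python
import keyword


def bind_as_variables(frames, namespace):
    """Copy entries whose keys are usable variable names into namespace.

    Returns the list of keys that were skipped.
    """
    skipped = []
    for name, frame in frames.items():
        # isidentifier() rejects names like "BTC-USD" or "2021_data";
        # iskeyword() rejects reserved words such as "class"
        if name.isidentifier() and not keyword.iskeyword(name):
            namespace[name] = frame
        else:
            skipped.append(name)
    return skipped
```

With this, skipped = bind_as_variables(data, globals()) would create the variables it safely can and report the rest, which remain reachable through the dictionary.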