Home > Back-end >  Import CSV files to database by date
Import CSV files to database by date

Time:11-08

I have list of CSV files named by date like :

2020-01-01, 2020-01-02... 2020-01-31

I need to import them into an SQLite database in batch. My script:

file_name = '2020-01-{}.csv'
test = pd.concat([pd.read_csv(file_name.format(i)) for i in range(1, 9)])
test.to_sql('test', conn, if_exists='append', index=False)

But I have to manually add first number of day:

file_name = '2020-01-0{}.csv', file_name = '2020-01-1{}.csv', file_name = '2020-01-2{}.csv'...

How can I import all files into database, just giving years and month, making it reusable (so that if I add a new CSV file, code only imports new file)?

CodePudding user response:

You can do this process by following the steps

  1. Make a list of all the files
  2. Run a loop and add them into db

So the working solution should be like

import pandas as pd


def read_csv_file(file_name) -> pd.DataFrame:
    """Read csv file and return a dataframe"""
    df = pd.read_csv(file_name)
    return df


def get_all_csv_files() -> list:
    """Get all csv files in the directory"""
    import glob

    files = glob.glob("*.csv")

    return files  # Output : ['2020-01-01.csv', ... '2020-01-31.csv']


for file in get_all_csv_files():
    """Loop through all CSV files and do something"""
    df = read_csv_file(file)
    print(df)
    # Do something with the dataframe here like adding into the database

    # To move them into another directory/folder Create a folder "Processed" in the same dir
    import shutil

    shutil.move(file, "Processed")
  • Related