Home > Back-end >  move multiple csv files by column value using python
move multiple csv files by column value using python

Time:10-13

I have thousands of csv files under uncategorised parent folder, all the files have one date column containing same date for all the rows. I want to check the date value of each files and move/copy to month wise folder using python.

I have tried key = df.iloc[0]['Date'] but not able to use key.endswith or key.contains

CodePudding user response:

Here I am looping through the files, reading the first row of the date column. I have created new folders in the directory previously, with the names being each month of the year. Once I have read the date, I convert it to words (e.g. April, May). I then look through folders in the directory and is the date name and folder name match, I move the file to the folder.

import os
import pandas as pd
import datetime

files = os.listdir()
for file in files:
    if ".csv" in file:
        df = pd.read_csv(file)
        dates = df['date']
        date = dates[0]
        date = datetime.datetime.strptime(date, "%d/%m/%y")
        date = date.strftime("%B")
        for folder in files:
            if date.lower() == folder.lower():
                os.rename(file, folder "\\" file)

CodePudding user response:

If all CSV files have the same structure, for example:

date, col_1, col_2, ..., col_n
20210923, 123, 456, ..., 999

You can try:

from os import mkdir
from os.path import join, exists, isdir
from glob import glob
from shutil import move
from datetime import datetime
import csv


FOLDER_CSV = 'data/csv'
FOLDER_MONTHS = 'data/csv_by_month'
DATE_FORMAT = '%Y%m%d'

if not exists(FOLDER_MONTHS):
    mkdir(FOLDER_MONTHS)
    
for file in glob(join(FOLDER_CSV, '*.csv')):
    with open(file, newline='') as ftx:
        reader = csv.reader(ftx)
        row = reader.__next__()
        row = reader.__next__()
    
    # Replaces the row[0] Index for index in the date column in the CSV
    month = datetime.strptime(row[0], DATE_FORMAT).strftime('%B').lower()
    
    if not exists(join(FOLDER_MONTHS, month)):
        mkdir(join(FOLDER_MONTHS, month))

    try:
        if isdir(join(FOLDER_MONTHS, month)):
            move(file, join(FOLDER_MONTHS, month))
    except OSError:
        print(f'{file} file is already in the {month} directory.')
  • Related