How to replace the header of all CSV files in a directory?

Time:06-13

I have a folder of CSV files, and I need to simply replace the current header (first row) of each CSV with a different header. As an example, every CSV has A, B, C, D, E as its first-row header, but I need to be able to change that to whatever I want, e.g. Apple, Orange, Lemon, Pear, Peach, or 1, j, er, fd, j5.

All the data in each CSV besides the header needs to be retained, and the replacement header will make the headers of all CSVs in the folder identical, as specified in the code.

import shutil
import glob

files = glob.glob("/home/robert/Testing/D1/*.csv")

for i in range(len(files)):
    from_file = open(files[i]) 

    to_file = open(files[i], mode="w")
    to_file.write("id,t,s,p,date,e")
    shutil.copyfileobj(from_file, to_file)

I used this code; however, it deleted all of the other data in the CSV files, which I needed to keep, and left only the headers.

CodePudding user response:

from pathlib import Path


def update_folder(folder: Path):
    for fname in folder.glob('*.csv'):
        with open(fname) as fin:
            lines = fin.readlines()  # element 0 is A,B,C...
        lines[0] = 'Apple,Orange,Lemon\n'  # swap in the new header
        with open(fname, 'w') as fout:
            fout.write(''.join(lines))  # write everything back


update_folder(Path('/home/robert/Testing/D1'))
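
As for why the code in the question wiped the files: opening a file with mode="w" truncates it immediately, before anything has been read, so the later shutil.copyfileobj call finds nothing to copy. A minimal sketch showing this on a throwaway file (the demo.csv name and the temporary directory are assumptions for illustration only):

```python
import tempfile
from pathlib import Path

# Hypothetical throwaway file, used only for this demonstration.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo = Path(tmp_dir) / "demo.csv"
    demo.write_text("A,B,C\n1,2,3\n")

    from_file = open(demo)              # opened for reading, nothing read yet
    to_file = open(demo, mode="w")      # the file is emptied right here
    size_after_open = demo.stat().st_size
    remaining = from_file.read()        # the old contents are already gone
    from_file.close()
    to_file.close()

print(size_after_open, repr(remaining))
```

Reading everything into memory first, or writing to a separate file, avoids the problem.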

CodePudding user response:

I would suggest using Python's tempfile module to create a temporary file with the changes in it; once they're made, it can simply be renamed to replace the original file. I would also use the csv module to read the original and write the updated version because it is fast, well debugged, and can handle many varieties of CSV.

Using the combination makes the task relatively easy:

import csv
import os
from pathlib import Path
from tempfile import NamedTemporaryFile


CSV_FOLDER = Path('/home/robert/Testing/D1')
NEW_HEADER = 'id,t,s,p,date,e'.split(',')

for filepath in CSV_FOLDER.glob('*.csv'):
    with open(filepath, 'r', newline='') as csv_file, \
         NamedTemporaryFile('w', newline='', dir=filepath.parent, delete=False) \
            as tmp_file:
        reader = csv.reader(csv_file)
        writer = csv.writer(tmp_file)
        next(reader, None)  # Skip the old header (no error if the file is empty).
        writer.writerow(NEW_HEADER)  # Replacement.
        writer.writerows(reader)  # Copy remaining rows of original file.

    os.replace(tmp_file.name, filepath)  # Replace original file with updated version.

print('CSV files updated')
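
If the csv module isn't needed, the same temp-file idea also works with the shutil.copyfileobj call from the question. The following is only a sketch under some assumptions: replace_header and sample.csv are hypothetical names, and it assumes the old header sits on a single physical line (no quoted field containing a newline).

```python
import os
import shutil
import tempfile
from pathlib import Path


def replace_header(csv_path: Path, new_header: str) -> None:
    """Replace the first line of csv_path with new_header (hypothetical helper)."""
    with open(csv_path, newline="") as src, \
         tempfile.NamedTemporaryFile("w", newline="", dir=csv_path.parent,
                                     delete=False) as dst:
        src.readline()                  # discard the old header line
        dst.write(new_header + "\n")    # the question's code omitted this newline
        shutil.copyfileobj(src, dst)    # stream the remaining lines unchanged
    os.replace(dst.name, csv_path)      # swap the updated file in


# Small self-check in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample = Path(tmp_dir) / "sample.csv"
    sample.write_text("A,B,C\n1,2,3\n4,5,6\n")
    replace_header(sample, "Apple,Orange,Lemon")
    result = sample.read_text()

print(result)
```

Because the data rows are copied as text rather than parsed, this never loads a whole file into memory, but it also won't catch malformed CSV the way csv.reader might.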
