Python - copying the contents of several files to one

Time:04-03

I need help. I have some files in a directory:

9000_1.txt
9000_2.txt
7000_1.txt
7000_2.txt
7000_3.txt

I would like to merge the contents of the files as follows:

9000.txt as the concatenation of 9000_1.txt and 9000_2.txt
7000.txt as the concatenation of 7000_1.txt, 7000_2.txt and 7000_3.txt
and so on.

So far I have:

import os
import re

folderPath = r'C:/Users/a/Desktop/OD'

if os.path.exists(folderPath):
    files = []
    for name in os.listdir(folderPath):
        if os.path.isfile(os.path.join(folderPath, name)):
            files.append(os.path.join(folderPath, name))
    print(files)

    for ii in files:
        current = os.path.basename(ii).split("_")[0]

Could anyone advise on a simple way to go about it?

CodePudding user response:

Sure - use glob.glob to conveniently find all matching files and our good friend collections.defaultdict to group the files up, and loop over those groups:

import glob
import os
import shutil
from collections import defaultdict


folder_path = os.path.expanduser("~/Desktop/OD")

# Gather files into groups
groups = defaultdict(set)

for filename in glob.glob(os.path.join(folder_path, "*.txt")):
    # Since `filename` will also contain the path segment,
    # we'll need `basename` to just take the filename,
    # and then we split it by the underscore and take the first part.
    prefix = os.path.basename(filename).split("_")[0]

    # defaultdict creates the empty set on first access, so we can just
    # add to it without checking whether the key exists yet.
    groups[prefix].add(filename)

# Process each group, in sorted order for sanity's sake.
for group_name, filenames in sorted(groups.items()):
    # Concoct a destination name based on the group name.
    dest_name = os.path.join(folder_path, f"{group_name}.joined")
    with open(dest_name, "wb") as outf:
        # Similarly, sort the filenames here so we always get the
        # same result.
        for filename in sorted(filenames):
            print(f"Adding {filename} to {dest_name}")
            with open(filename, "rb") as inf:
                # You might want to do something else such as
                # write line-by-line, but this will do a straight up
                # merge in sorted order.
                shutil.copyfileobj(inf, outf)

This outputs

Adding C:\Users\X/Desktop/OD\7000_1.txt to C:\Users\X/Desktop/OD\7000.joined
Adding C:\Users\X/Desktop/OD\7000_2.txt to C:\Users\X/Desktop/OD\7000.joined
Adding C:\Users\X/Desktop/OD\7000_3.txt to C:\Users\X/Desktop/OD\7000.joined
Adding C:\Users\X/Desktop/OD\9000_1.txt to C:\Users\X/Desktop/OD\9000.joined
Adding C:\Users\X/Desktop/OD\9000_2.txt to C:\Users\X/Desktop/OD\9000.joined
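
If you want the merged files named 9000.txt and 7000.txt as in the question, two details are worth handling. The sketch below is one possible variant, not part of the answer above: the function name merge_groups and the helper numeric_suffix are made up for illustration. It assumes every input is named prefix_number.txt. The glob pattern "*_*.txt" only matches names containing an underscore, so a previously written 9000.txt is not picked up as input on a second run. The parts are sorted by the number after the underscore, so 7000_10.txt comes after 7000_2.txt rather than before it, which is what a plain string sort would give.

```python
import glob
import os
import shutil
from collections import defaultdict


def numeric_suffix(path):
    """Sort key based on the part after the first underscore (hypothetical helper)."""
    stem = os.path.splitext(os.path.basename(path))[0]
    suffix = stem.split("_", 1)[1]
    # Numeric suffixes sort by value; anything else sorts after them, alphabetically.
    return (0, int(suffix), "") if suffix.isdecimal() else (1, 0, suffix)


def merge_groups(folder_path):
    """Merge <prefix>_<n>.txt files into <prefix>.txt and return the paths written."""
    groups = defaultdict(list)
    # "*_*.txt" skips files without an underscore, including earlier merged outputs.
    for path in glob.glob(os.path.join(folder_path, "*_*.txt")):
        prefix = os.path.basename(path).split("_")[0]
        groups[prefix].append(path)

    written = []
    for prefix, paths in sorted(groups.items()):
        dest = os.path.join(folder_path, f"{prefix}.txt")
        with open(dest, "wb") as outf:
            for path in sorted(paths, key=numeric_suffix):
                with open(path, "rb") as inf:
                    # Straight byte-for-byte append, as in the answer above.
                    shutil.copyfileobj(inf, outf)
        written.append(dest)
    return written
```

Calling merge_groups(os.path.expanduser("~/Desktop/OD")) would then write one merged .txt file per prefix into the same folder. Note that the parts are appended byte for byte, so if a part does not end with a newline, its last line runs into the first line of the next part.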