I need help. I have some files in a directory:
9000_1.txt
9000_2.txt
7000_1.txt
7000_2.txt
7000_3.txt
I would like to merge their contents as follows:
9000.txt as the concatenation of 9000_1.txt and 9000_2.txt
7000.txt as the concatenation of 7000_1.txt, 7000_2.txt and 7000_3.txt
and so on.
This is what I have so far:
import os
import re
folderPath = r'C:/Users/a/Desktop/OD'
if os.path.exists(folderPath):
    files = []
    for name in os.listdir(folderPath):
        if os.path.isfile(os.path.join(folderPath, name)):
            files.append(os.path.join(folderPath, name))
    print(files)
    for ii in files:
        current = os.path.basename(ii).split("_")[0]
Could anyone advise on a simple way to go about it?
CodePudding user response:
Sure - use glob.glob to conveniently find all matching files and our good friend collections.defaultdict to group the files up, then loop over those groups:
import glob
import os
import shutil
from collections import defaultdict

folder_path = os.path.expanduser("~/Desktop/OD")

# Gather files into groups
groups = defaultdict(set)
for filename in glob.glob(os.path.join(folder_path, "*.txt")):
    # Since `filename` will also contain the path segment,
    # we'll need `basename` to just take the filename,
    # and then we split it by the underscore and take the first part.
    prefix = os.path.basename(filename).split("_")[0]
    # Defaultdict takes care of "hydrating" sets, so we can just
    # add to them.
    groups[prefix].add(filename)

# Process each group, in sorted order for sanity's sake.
for group_name, filenames in sorted(groups.items()):
    # Concoct a destination name based on the group name.
    dest_name = os.path.join(folder_path, f"{group_name}.joined")
    with open(dest_name, "wb") as outf:
        # Similarly, sort the filenames here so we always get the
        # same result.
        for filename in sorted(filenames):
            print(f"Adding {filename} to {dest_name}")
            with open(filename, "rb") as inf:
                # You might want to do something else such as
                # write line-by-line, but this will do a straight up
                # merge in sorted order.
                shutil.copyfileobj(inf, outf)
This outputs:
Adding C:\Users\X/Desktop/OD\7000_1.txt to C:\Users\X/Desktop/OD\7000.joined
Adding C:\Users\X/Desktop/OD\7000_2.txt to C:\Users\X/Desktop/OD\7000.joined
Adding C:\Users\X/Desktop/OD\7000_3.txt to C:\Users\X/Desktop/OD\7000.joined
Adding C:\Users\X/Desktop/OD\9000_1.txt to C:\Users\X/Desktop/OD\9000.joined
Adding C:\Users\X/Desktop/OD\9000_2.txt to C:\Users\X/Desktop/OD\9000.joined
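One caveat with plain sorted() on filenames: the comparison is lexicographic, so 7000_10.txt would be merged before 7000_2.txt. If the parts can go past 9, a variant along these lines might help. It is only a sketch under a few assumptions of mine, not part of the code above: the function name merge_parts and the PART_RE pattern are hypothetical, every part is assumed to be named <prefix>_<number>.txt, and the result is written to <prefix>.txt as in the question. Because the glob only matches names containing an underscore, the merged <prefix>.txt files should not be picked up again on a second run.

```python
import re
import shutil
import tempfile
from collections import defaultdict
from pathlib import Path

# Hypothetical naming pattern: "<prefix>_<number>.txt", e.g. "7000_10.txt".
PART_RE = re.compile(r"^(?P<prefix>[^_]+)_(?P<index>\d+)\.txt$")


def merge_parts(folder):
    """Merge <prefix>_<n>.txt files in `folder` into <prefix>.txt.

    Returns a dict mapping each output path to its ordered list of inputs.
    """
    folder = Path(folder)

    # Group the parts by prefix, remembering the numeric suffix of each.
    groups = defaultdict(list)
    for path in folder.glob("*_*.txt"):
        match = PART_RE.match(path.name)
        if match and path.is_file():
            groups[match["prefix"]].append((int(match["index"]), path))

    merged = {}
    for prefix, parts in sorted(groups.items()):
        dest = folder / f"{prefix}.txt"
        # Sorting the (index, path) tuples orders parts numerically,
        # so part 2 comes before part 10.
        ordered = [path for _, path in sorted(parts)]
        with open(dest, "wb") as outf:
            for path in ordered:
                with open(path, "rb") as inf:
                    # Straight byte-for-byte concatenation, as above.
                    shutil.copyfileobj(inf, outf)
        merged[dest] = ordered
    return merged


if __name__ == "__main__":
    # Demo in a throwaway directory so no real files are touched.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("7000_1.txt", "7000_2.txt", "7000_10.txt"):
            Path(tmp, name).write_text(name + "\n")
        for dest, parts in merge_parts(tmp).items():
            print(dest.name, "<-", [p.name for p in parts])
```

To use it on the real folder, something like merge_parts(os.path.expanduser("~/Desktop/OD")) should work. Note that files whose names do not fit the assumed pattern are silently skipped, which may or may not be what you want.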