Loop through many files in 2 directories

Time:09-29

Given Folder 1 with A.txt and B.txt, and Folder 2 with A.txt and B.txt, how can I process them together, so that A.txt from Folder 1 is handled with A.txt from Folder 2, and so on? What I have so far loops through all of the second folder's files for every file in the first folder, which throws the pairing out of order. Some work is done on each pair, such as merging parts of the files together (that part is already done). My main question is how to walk through the 2 directories simultaneously and work on the matching files. Note that there are many files in Folder 1 and Folder 2, so I need an approach driven by the directory paths:

patha = "/folder1"
pathb = "/folder2"

import os, glob
for filename in glob.glob(os.path.join(patha, '*.txt')):
    for filenamez in glob.glob(os.path.join(pathb, '*.txt')):
        pass  # MY FUNCTION THAT DOES OTHER STUFF

CodePudding user response:

Could you improve your question, please?

CodePudding user response:

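Your for loops are nested, so they do not run concurrently: every file from the first folder gets combined with every file from the second. To step through both folders together, the loops have to be separate rather than nested.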
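
A small sketch of the difference, using made-up file names rather than real directories (the two lists below are assumptions for illustration). Nested loops visit every combination of the two lists, while walking both lists in step with zip gives one pair per position:

```python
# Hypothetical file lists standing in for the contents of the two folders.
files_a = ["folder1/A.txt", "folder1/B.txt"]
files_b = ["folder2/A.txt", "folder2/B.txt"]

# Nested loops: every file in files_a is combined with every file in files_b.
nested_pairs = [(a, b) for a in files_a for b in files_b]

# zip: the two lists are walked in step, one pair per position.
zipped_pairs = list(zip(files_a, files_b))

print(len(nested_pairs))  # 4 combinations
print(zipped_pairs)       # 2 pairs: A with A, B with B
```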

CodePudding user response:

Is zip what you're looking for?

import glob
import os

files_a = glob.glob(os.path.join(path_a, "*.txt"))
files_b = glob.glob(os.path.join(path_b, "*.txt"))
for file_a, file_b in zip(files_a, files_b):
    pass

CodePudding user response:

You could maybe do something like this:

from threading import Thread
import os,glob

def dir_iterate(path: str):
    for filename in glob.glob(os.path.join(path, '*.txt')):
        pass  # Other stuff ..


path1 = "./directory1"
path2 = "./directory2"
Thread(target = dir_iterate, args=(path1,)).start()
Thread(target = dir_iterate, args=(path2,)).start()
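
Note that the two threads above each walk one directory on their own, so they do not pair A.txt with A.txt. If the goal is to process matching files in parallel, one possible variation is to build the pairs first and hand each pair to a thread pool. This is only a sketch: process_pair and process_dirs are hypothetical names, the concatenation is a stand-in for the real work, and the temporary folders exist only so the example can be run as-is.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def process_pair(path_a, path_b):
    # Hypothetical stand-in for the real work: concatenate both files' contents.
    with open(path_a) as src_a, open(path_b) as src_b:
        return src_a.read() + src_b.read()


def process_dirs(dir_a, dir_b):
    # Pair files by name, keeping only names present in both directories.
    names = sorted(set(os.listdir(dir_a)) & set(os.listdir(dir_b)))
    paths_a = [os.path.join(dir_a, name) for name in names]
    paths_b = [os.path.join(dir_b, name) for name in names]
    with ThreadPoolExecutor() as pool:
        # map() returns results in the same order as the inputs.
        results = list(pool.map(process_pair, paths_a, paths_b))
    return dict(zip(names, results))


# Demo with throwaway folders (assumed contents, for illustration only).
with tempfile.TemporaryDirectory() as dir_a, tempfile.TemporaryDirectory() as dir_b:
    for name in ("A.txt", "B.txt"):
        with open(os.path.join(dir_a, name), "w") as f:
            f.write(f"{name} from folder 1\n")
        with open(os.path.join(dir_b, name), "w") as f:
            f.write(f"{name} from folder 2\n")
    merged = process_dirs(dir_a, dir_b)

print(merged["A.txt"])
```

Threads mostly help when the per-file work is dominated by I/O; for CPU-heavy merging a plain loop over the pairs may be just as fast.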

CodePudding user response:

You can open files with the same name in both folders simultaneously using context managers and do whatever needs to be done from both input streams:

import os

my_folders = ['Folder1', 'Folder2']

# list each folder once, then compare the file names
folder_1_files = set(os.listdir(my_folders[0]))
folder_2_files = set(os.listdir(my_folders[1]))
common_files = folder_1_files & folder_2_files
non_common_files = folder_1_files ^ folder_2_files

print(f'common_files: {common_files}')
print(f'files without matches: {non_common_files}')

for f_name in common_files:
    with open(os.path.join(my_folders[0], f_name)) as src_1:
        with open(os.path.join(my_folders[1], f_name)) as src_2:
            # do the stuff on both sources... for instance print first line of each:
            print(f'first line of src_1: {src_1.readline()}')
            print(f'first line of src_2: {src_2.readline()}')

Output

common_files: {'A.txt'}
files without matches: set()
first line of src_1: some txt

first line of src_2: text in folder 2's A
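
The ^ operator above reports unmatched names without saying which folder they came from. If that matters, set difference can split them per folder. A minimal sketch, with a hypothetical helper name (compare_names) and made-up file names in place of real os.listdir() results:

```python
def compare_names(names_1, names_2):
    """Return (common, only_in_1, only_in_2) as sorted lists of names."""
    set_1, set_2 = set(names_1), set(names_2)
    return sorted(set_1 & set_2), sorted(set_1 - set_2), sorted(set_2 - set_1)


# Made-up directory listings, standing in for os.listdir() on each folder.
common, only_1, only_2 = compare_names(
    ["A.txt", "B.txt", "C.txt"],
    ["A.txt", "B.txt", "D.txt"],
)
print(common)  # names present in both folders
print(only_1)  # names found only in the first folder
print(only_2)  # names found only in the second folder
```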

CodePudding user response:

This should work:

import glob
import os

files_a = sorted(glob.glob(os.path.join(path_a, "*.txt")))
files_b = sorted(glob.glob(os.path.join(path_b, "*.txt")))

for file_a, file_b in zip(files_a, files_b):
    pass  # Add code to concat
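
One caveat with this approach: zip pairs by position and stops silently at the end of the shorter list, so a file missing from one folder can shift or drop pairs. A hedged sketch of a check that could be run before the loop (pair_by_position is a hypothetical helper, not part of glob or os, and the paths are made up):

```python
import os


def pair_by_position(files_a, files_b):
    """Pair two lists of paths by sorted position, verifying the names line up."""
    files_a, files_b = sorted(files_a), sorted(files_b)
    if len(files_a) != len(files_b):
        raise ValueError(f"file counts differ: {len(files_a)} vs {len(files_b)}")
    for file_a, file_b in zip(files_a, files_b):
        if os.path.basename(file_a) != os.path.basename(file_b):
            raise ValueError(f"name mismatch: {file_a} vs {file_b}")
    return list(zip(files_a, files_b))


# Made-up paths; the first list is deliberately out of order.
pairs = pair_by_position(["f1/B.txt", "f1/A.txt"], ["f2/A.txt", "f2/B.txt"])
print(pairs)
```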