Iterate through files ending with various extensions

Time:07-26

I'm using this line of code to iterate through files ending with the .tar extension, using what I believe to be a regex character '*'.

for f in glob.glob('{}/{}/Compressed_Files/*.tar'.format(path, site_id)): 

How can I do this same thing but also include files ending in the .csv.gz extension? Using a regex "or" operator, maybe?

CodePudding user response:

glob patterns are shell-style wildcards, not regexes, and they have no "or" operator, so a single pattern can't match both extensions. Just combine two globs.

import glob

g1 = glob.glob('{}/{}/Compressed_Files/*.tar'.format(path, site_id))
g2 = glob.glob('{}/{}/Compressed_Files/*.csv.gz'.format(path, site_id))
for f in g1 + g2:
    ...  # process each matching file here

If there are lots of matches, it may be better to use glob.iglob(), which is an iterator. Then use itertools.chain() to combine them.
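A minimal sketch of that iglob() plus itertools.chain() approach might look like the following. The helper name iter_matches and its parameters directory and patterns are hypothetical, chosen only for illustration; the '*.csv.gz' pattern in the usage comment is taken from the question.

```python
import glob
import itertools
import os


def iter_matches(directory, patterns):
    """Return a lazy iterator over paths in `directory` matching any pattern.

    `iter_matches`, `directory`, and `patterns` are illustrative names,
    not part of the glob module.
    """
    # One iglob() iterator per pattern; nothing is scanned until iteration.
    iterators = (glob.iglob(os.path.join(directory, p)) for p in patterns)
    # chain.from_iterable() walks the iterators one after another.
    return itertools.chain.from_iterable(iterators)


# Example usage, assuming `path` and `site_id` are defined as in the question:
# for f in iter_matches('{}/{}/Compressed_Files'.format(path, site_id),
#                       ['*.tar', '*.csv.gz']):
#     print(f)
```

Because the result is an iterator, it can only be consumed once; wrap it in list() if the matches are needed more than once.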

CodePudding user response:

A generator avoids holding every result in memory at once, which helps when there may be many files:

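import glob

def glob_patterns(patterns: list[str]):
    # Yield matches one at a time instead of building a full list.
    for pattern in patterns:
        for path in glob.iglob(pattern):
            yield path

# Use a different loop variable so the outer `path` is not overwritten.
for f in glob_patterns([
    '{}/{}/Compressed_Files/*.tar'.format(path, site_id),
    '{}/{}/Compressed_Files/*.csv.gz'.format(path, site_id)
]):
    print(f)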