I'm using this line of code to iterate through files ending with the .tar extension, using what I believe to be a regex character '*'.
for f in glob.glob('{}/{}/Compressed_Files/*.tar'.format(path, site_id)):
How can I do this same thing but also include files ending in the csv.gz extension? Using a regex or operator maybe?
CodePudding user response:
glob uses shell-style wildcards rather than regular expressions, and it has no "or" operator, so a single pattern can't match both extensions. Just combine two globs.
import glob

g1 = glob.glob('{}/{}/Compressed_Files/*.tar'.format(path, site_id))
g2 = glob.glob('{}/{}/Compressed_Files/*.tar.gz'.format(path, site_id))
# glob.glob() returns lists, so + concatenates the two sets of matches
for f in g1 + g2:
    ...  # code
If there are lots of matches, it may be better to use glob.iglob(), which returns an iterator. Then use itertools.chain() to combine them.
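A minimal sketch of the iglob/chain idea, assuming the directory layout from the question. The helper name `iter_compressed` and its parameters `base_dir` and `patterns` are hypothetical, introduced only for illustration:

```python
import glob
import itertools
import os


def iter_compressed(base_dir, patterns=("*.tar", "*.tar.gz")):
    """Lazily yield paths in base_dir that match any of the glob patterns.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # One lazy iglob iterator per pattern; no directory is scanned
    # until the result is actually iterated.
    iterators = (glob.iglob(os.path.join(base_dir, p)) for p in patterns)
    # chain.from_iterable walks the iterators one after another.
    return itertools.chain.from_iterable(iterators)
```

It could then be used as `for f in iter_compressed('{}/{}/Compressed_Files'.format(path, site_id)):`. Since a glob pattern must match the whole file name, `*.tar` does not also pick up `.tar.gz` files, so the two patterns should not produce duplicates.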
CodePudding user response:
With a generator, the results are not all held in memory at once, which helps if there may be many files:
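import glob

def glob_patterns(patterns: list[str]):
    # Yield matches lazily, one pattern at a time
    for pattern in patterns:
        for match in glob.iglob(pattern):
            yield match

# Loop variable is named f so it does not overwrite the path used in format()
for f in glob_patterns([
    '{}/{}/Compressed_Files/*.tar'.format(path, site_id),
    '{}/{}/Compressed_Files/*.tar.gz'.format(path, site_id),
]):
    print(f)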