I'm trying to iterate over all the directories inside a directory and find all the .html files in them. So far I have this code:
import os
from bs4 import BeautifulSoup

def find_path():
    """
    :return: List
    """
    paths = []
    for filename in os.listdir(DIRECTORY):
        if filename.endswith('.html'):
            fname = os.path.join(DIRECTORY, filename)
            with open(fname, 'r') as f:
                soup = BeautifulSoup(f.read(), 'html.parser')
            path = soup.select_one('#tree > li > span').contents[-1]
            paths.append(path)
    return paths
But it only works if all the .html files are in one directory. I need to process every .html file in this directory, and each of its subdirectories also contains .html files that I need to access. So ideally, I need to open every subdirectory of my parent directory and save whatever I need from the .html files there. Is there a way to do this?
Thanks!
CodePudding user response:
You can use the sample snippet below; either #1 or #2 works:
import os

path = "."
for (root, dirs, files) in os.walk(path, topdown=True):
    for file in files:
        if file.endswith(".html"):
            print(root + "/" + file)  #1
            print(os.path.join(root, file))  #2
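If you prefer pathlib, the same recursive search can be sketched roughly as follows. This is only an alternative illustration, not part of the snippet above; the function name find_html_files is made up for the example.

```python
from pathlib import Path


def find_html_files(directory):
    """Return every .html file under directory, including subdirectories."""
    # rglob("*.html") matches at any depth below directory;
    # is_file() skips directories that happen to be named *.html.
    return sorted(p for p in Path(directory).rglob("*.html") if p.is_file())
```

Each returned item is a Path object, so it can be passed straight to open() in place of the fname string from the question.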
CodePudding user response:
os.walk() can help you:
import os

def find_path(dir_):
    for root, folders, names in os.walk(dir_):
        for name in names:
            if name.endswith(".html"):
                # Your code
                pass
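To make the "# Your code" step concrete, here is a minimal sketch that opens each matching file and collects one result per file. The function name find_paths and the extract parameter are hypothetical, introduced only for this illustration; extract stands for whatever you want to pull out of each file's contents.

```python
import os


def find_paths(dir_, extract):
    """Apply extract() to the text of every .html file under dir_."""
    results = []
    for root, _folders, names in os.walk(dir_):
        for name in names:
            if name.endswith(".html"):
                # root is the directory currently being walked,
                # so joining it with name gives a usable path.
                fname = os.path.join(root, name)
                with open(fname, "r") as f:
                    results.append(extract(f.read()))
    return results
```

With the code from the question, extract could be a small function that builds BeautifulSoup(html, 'html.parser') and returns soup.select_one('#tree > li > span').contents[-1], and dir_ would be DIRECTORY.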