I want to create a Python script that prints out a directory tree. I'm aware there is a lot of information on the topic and many ways to achieve it. Still, my problem is really about recursion.
To tackle the problem I chose an OOP approach: create a class Treenode that stores some properties and methods, call os.walk (yes, I know I could use pathlib or other libraries), and recursively create the parent-child relationships of folders and files.
First, the class Treenode. Properties: data, children, parent. Methods: add_child(); get_level(), to get the level of the parent/child relation in order to print it later; print_tree(), to actually print the tree (desired result shown below the code).
class Treenode:
    def __init__(self, data):
        self.data = data
        self.children = []
        self.parent = None

    def add_child(self, child):
        child.parent = self
        self.children.append(child)

    def get_level(self):
        # Count how many ancestors sit between this node and the root.
        level = 0
        p = self.parent
        while p:
            level += 1
            p = p.parent
        return level

    def print_tree(self):
        # Indent three spaces per level; the root gets no prefix.
        spaces = " " * self.get_level() * 3
        prefix = spaces + "|__" if self.parent else ""
        print(prefix + self.data)
        for child in self.children:
            child.print_tree()
Second, the problem: the function that creates the tree.
import os

def build_tree(dir_path):
    for root, dirs, files in os.walk(dir_path):
        if dir_path == root:
            for d in dirs:
                directory = Treenode(d)
                # `tree` is the global root node created in the main block.
                tree.add_child(directory)
            for f in files:
                file = Treenode(f)
                tree.add_child(file)
            working_directories = dirs
        else:
            for w in working_directories:
                build_tree(os.path.join(dir_path, w))
    return tree
Finally, the main method:
if __name__ == '__main__':
    tree = Treenode("C:/Level0")
    tree = build_tree("C:/Level0")
    tree.print_tree()
The output of this code would be:
C:/Level0
   |__Level1
   |__0file.txt
   |__Level2
   |__Level2b
   |__1file1.txt
   |__1file2.txt
   |__Level3
   |__2file1.txt
   |__LEvel4
   |__3file1.txt
   |__4file1.txt
   |__2bfile1.txt
The desired output should be:
C:/Level0
   |__Level1
      |__Level2
         |__Level3
            |__LEvel4
               |__4file1.txt
            |__3file1.txt
         |__2file1.txt
      |__Level2b
         |__2bfile1.txt
      |__1file1.txt
      |__1file2.txt
   |__0file.txt
The problem lies in tree.add_child(directory): every time the code gets there, it adds the new directory (or file) as a child of the same root tree, not of tree.children, tree.children.children, and so on. So that's the problem: how do I get that? The if/else statement in build_tree() is probably unnecessary; I was trying to work my way around it, but no luck.
I know it's a dumb problem, coming from a lack of proper study of algorithms and data structures. If you're willing to help, though, I'm here to learn ^^
CodePudding user response:
This will do what you want:
import os

def build_tree(parent, dir_path):
    child_list = os.listdir(dir_path)
    child_list.sort()
    for child in child_list:
        node = Treenode(child)
        parent.add_child(node)
        child_path = os.path.join(dir_path, child)
        # Recurse only into directories, attaching to the node just created.
        if os.path.isdir(child_path):
            build_tree(node, child_path)
Then, for your main code, use:
if __name__ == '__main__':
    root_path = "C:/Level0"
    tree = Treenode(root_path)
    build_tree(tree, root_path)
    tree.print_tree()
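To try the recursive approach without touching a real C:/Level0, here is a minimal sketch that runs it against a throwaway directory. The names Node, build and rendered are hypothetical stand-ins (Node is a compact version of Treenode that returns lines instead of printing), and the two-level layout is made up for illustration.

```python
import os
import tempfile

class Node:
    """Compact stand-in for Treenode (hypothetical) that renders to a list of lines."""
    def __init__(self, data, parent=None):
        self.data = data
        self.parent = parent
        self.children = []

    def level(self):
        # Count the ancestors between this node and the root.
        depth, p = 0, self.parent
        while p:
            depth += 1
            p = p.parent
        return depth

    def lines(self):
        prefix = " " * self.level() * 3 + "|__" if self.parent else ""
        out = [prefix + self.data]
        for child in self.children:
            out.extend(child.lines())
        return out

def build(parent, dir_path):
    """Attach every entry of dir_path to parent, recursing into subdirectories."""
    for name in sorted(os.listdir(dir_path)):
        node = Node(name, parent)
        parent.children.append(node)
        path = os.path.join(dir_path, name)
        if os.path.isdir(path):
            build(node, path)

with tempfile.TemporaryDirectory() as base:
    # Made-up layout: base/0file.txt, base/Level1/1file.txt, base/Level1/Level2/
    os.makedirs(os.path.join(base, "Level1", "Level2"))
    open(os.path.join(base, "Level1", "1file.txt"), "w").close()
    open(os.path.join(base, "0file.txt"), "w").close()
    root = Node("Level0")
    build(root, base)
    rendered = root.lines()

print("\n".join(rendered))
```

Because the entries are sorted by name, 0file.txt is listed before Level1 here, whereas the desired output in the question shows directories first. If that order matters, one option is to sort directories ahead of files before looping.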
The main change was to use os.listdir rather than os.walk. The problem with os.walk is that it recursively walks the entire directory tree, which doesn't work well with the recursive build_tree, which wants to operate on a single level at a time.
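The difference between the two calls can be seen on a small temporary directory. This is only an illustrative sketch: the helper names (make_sample_tree, walk_roots, listdir_names) and the folder layout are assumptions made up for the example.

```python
import os
import tempfile

def make_sample_tree(base):
    """Create base/Level1/Level2 with one file in each directory (made-up layout)."""
    level1 = os.path.join(base, "Level1")
    level2 = os.path.join(level1, "Level2")
    os.makedirs(level2)
    for folder, name in [(base, "0file.txt"), (level1, "1file.txt"), (level2, "2file.txt")]:
        open(os.path.join(folder, name), "w").close()

def walk_roots(base):
    """Return every directory a single os.walk call visits, relative to base."""
    return sorted(os.path.relpath(root, base) for root, _dirs, _files in os.walk(base))

def listdir_names(base):
    """Return the names os.listdir reports for base itself, nothing deeper."""
    return sorted(os.listdir(base))

with tempfile.TemporaryDirectory() as base:
    make_sample_tree(base)
    visited = walk_roots(base)
    top_level = listdir_names(base)

print(visited)    # a single os.walk call reaches all three directories
print(top_level)  # os.listdir sees only the immediate children of base
```

Calling os.walk again inside a recursive function therefore re-walks subtrees that the outer call is already going to visit, which is why the nodes in the question end up attached in the wrong place.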
CodePudding user response:
You can use os.walk, but then don't use recursion, as you don't want to repeat the call to os.walk: one call already gives you all the data you need. Instead, use a dictionary to keep track of the hierarchy:
import os

def build_tree(dir_path):
    # Map each full path to its node so children can look up their parent.
    helper = {dir_path: Treenode(dir_path)}
    for root, dirs, files in os.walk(dir_path, topdown=True):
        for item in dirs + files:
            node = helper[os.path.join(root, item)] = Treenode(item)
            helper[root].add_child(node)
    return helper[dir_path]

if __name__ == "__main__":
    tree = build_tree("C:/Level0")
    tree.print_tree()
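The helper[root] lookup relies on os.walk yielding a directory before its subdirectories, which is what topdown=True (the default) does, so a parent is always registered before its children are visited. The sketch below shows the same path-keyed bookkeeping with plain nested dictionaries instead of Treenode; build_nested, the sample layout and the choice to map files to None are assumptions made for illustration.

```python
import os
import tempfile

def build_nested(dir_path):
    """Build {name: subtree} dicts from one os.walk pass; files map to None."""
    root_children = {}
    helper = {dir_path: root_children}  # full path -> that directory's children dict
    for root, dirs, files in os.walk(dir_path, topdown=True):
        for d in dirs:
            # Register the subdirectory now; os.walk descends into it later.
            helper[os.path.join(root, d)] = helper[root][d] = {}
        for f in files:
            helper[root][f] = None
    return root_children

with tempfile.TemporaryDirectory() as base:
    # Made-up layout: one file in each of base, Level1 and Level1/Level2.
    os.makedirs(os.path.join(base, "Level1", "Level2"))
    open(os.path.join(base, "0file.txt"), "w").close()
    open(os.path.join(base, "Level1", "1file.txt"), "w").close()
    open(os.path.join(base, "Level1", "Level2", "2file.txt"), "w").close()
    nested = build_nested(base)

print(nested)
```

With topdown=False the children would be visited before their parent had an entry in helper, so the lookup would raise a KeyError.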