How to reuse result of looping through and filtering file names instead of filtering again each time

Time:07-24

This script generates a table of contents .md file for users of the VS Code extension "Notes" (Dion Munk), and it works as intended.

However, get_body() is highly inefficient, in that it re-scans all files for every key in the cats (categories) dictionary.

How can I loop through the files only once and still achieve the desired result?

_toc.py

"""
Python script to generate a table of contents .md file for VSCode "Notes" users
Run script in Notes.notesLocation to generate, then open _toc.md in preview mode
User must prefix note file names with corresponding values in cats (categories)
    e.g. dj_admin_model.md, py_polymorphism.md, st_ascii.md
User could put a top link in every note to quickly return to table of contents
    e.g. [< content](_toc.md)
"""
import os

dbug = True
path = '.'
ftyp = '.md'
file = '_toc.md'
cats = {
   'Config'    : '_',
   'Django'    : 'dj_',
   'Markdown'  : 'md_',
   'Python'    : 'py_',
   'Standard'  : 'st_',
   'VSCode'    : 'vs_',
}

def get_files():
    for _, _, files in os.walk(path):
        return (f for f in files if f.lower().endswith(ftyp.lower()))

def get_body():
   body = \
      f'["{file}" generated by running "{__file__}".]: #\n\n# Content\n\n'

   for key, val in cats.items():
      body += f'### {key}\n'
      # relooping files for each key in cats inefficient - TODO
      for f in get_files():
         if f.startswith(val):
            body += \
               f"- [{f.replace('.md', '').split('_', 1)[1]}]({f})\n"
      body += '\n'
   return body

def write_toc():
   with open(file, mode='wt') as f:
      f.write(get_body())

def print_toc():
   with open(file) as f:
      print(f.read())

def main():
   write_toc()
   if dbug:
      print('_'*60)
      print_toc()

if __name__ == '__main__':
   main()

_toc.md (example desired output)

["_toc.md" generated by running "_toc.py".]: #

# Content

### Config
- [toc](_toc.md)

### Django
- [admin_model](dj_admin_model.md)

### Markdown
- [syntax](md_syntax.md)

### Python
- [file_stream](py_file_stream.md)
- [operators](py_operators.md)
- [pip](py_pip.md)
- [polymorphism](py_polymorphism.md)
- [venv](py_venv.md)

### Standard
- [ascii](st_ascii.md)

### VSCode
- [keyboard](vs_keyboard.md)

_toc.md (mock preview)

Content

Django

Markdown

Python

Standard

VSCode

CodePudding user response:

f_list solution by Jarvis: store the result of get_files() in a list once and reuse that list for every category, so the directory is now scanned only once.

Revised get_body():

def get_body():
   f_list = list(get_files())
   body = \
      f'["{file}" generated by running "{__file__}".]: #\n\n# Content\n\n'

   for key, val in cats.items():
      body += f'### {key}\n'
      for f in f_list:
         if f.startswith(val):
            body += \
               f"- [{f.replace('.md', '').split('_', 1)[1]}]({f})\n"
      body += '\n'
   return body
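
The f_list version scans the directory once, but it still walks the cached list once per category. If that matters, one possible variation is to bucket the file names by prefix in a single pass and then build the body from the buckets. The sketch below is not part of the original answer: group_by_category() and build_body() are hypothetical helper names, the header is passed in as a plain string, and the file names are sample data rather than a real directory listing.

```python
import os

# Same prefix table as in the question.
cats = {
    'Config': '_',
    'Django': 'dj_',
    'Markdown': 'md_',
    'Python': 'py_',
    'Standard': 'st_',
    'VSCode': 'vs_',
}


def group_by_category(filenames, cats):
    """Bucket file names by category prefix in one pass over filenames.

    Hypothetical helper. A name is added to every category whose prefix
    it starts with, which mirrors the startswith() check in get_body().
    """
    groups = {key: [] for key in cats}
    for name in filenames:
        for key, prefix in cats.items():
            if name.startswith(prefix):
                groups[key].append(name)
    return groups


def build_body(filenames, cats, header=''):
    """Build the table-of-contents text from the grouped file names."""
    groups = group_by_category(filenames, cats)
    parts = [header]
    for key in cats:  # dicts keep insertion order, so headings stay in order
        parts.append(f'### {key}\n')
        for name in groups[key]:
            # Drop the extension, then the category prefix up to the first '_'.
            title = os.path.splitext(name)[0].split('_', 1)[1]
            parts.append(f'- [{title}]({name})\n')
        parts.append('\n')
    return ''.join(parts)


# Sample data standing in for list(get_files()).
sample = ['py_pip.md', '_toc.md', 'dj_admin_model.md', 'py_venv.md']
groups = group_by_category(sample, cats)
body = build_body(sample, cats, header='# Content\n\n')
print(body)
```

In the original script this would be called as build_body(list(get_files()), cats, header=...). With only a handful of categories the inner loop over cats is cheap, so the gain over f_list is small; the main benefit is that grouping and formatting become separate steps.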