Get directories only with glob pattern using pathlib

Time:09-10

I want to use Path.glob() from pathlib to find directories matching a specific name pattern (*data) in the current working directory. I don't want to check each match explicitly via .is_dir() or something similar.

Input data

This is the relevant listing, with three folders as the expected result and one file that matches the same pattern but should not be part of the result.

ls -ld *data
drwxr-xr-x 2 user user 4,0K  9. Sep 10:22 2021-02-11_68923_data/
drwxr-xr-x 2 user user 4,0K  9. Sep 10:22 2021-04-03_38923_data/
drwxr-xr-x 2 user user 4,0K  9. Sep 10:22 2022-01-03_38923_data/
-rw-r--r-- 1 user user    0  9. Sep 10:24 2011-12-43_3423_data

Expected result

[
    '2021-02-11_68923_data/', 
    '2021-04-03_38923_data/',
    '2022-01-03_38923_data/'
]

Minimal working example

from pathlib import Path
cwd = Path.cwd()

result = cwd.glob('*_data/')
result = list(result)

That gives me the three folders, but also the file.

I also tried the variant cwd.glob('**/*_data/'), with the same result.

CodePudding user response:

glob is insufficient here. From the filesystem's perspective, the directory's name really is "2021-02-11_68923_data", not "2021-02-11_68923_data/". Since glob only looks at names, it cannot differentiate between regular files and directories, so you'd have to add an additional check, such as the is_dir() call you mentioned.
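A minimal sketch of the additional check described above, using the names from the question. The helper name find_dirs and the temporary directory (standing in for the current working directory) are illustrative assumptions, not part of the original post.

```python
import tempfile
from pathlib import Path


def find_dirs(base, pattern):
    """Return the sorted names of directories under base that match pattern."""
    # glob() matches on names only, so filter the matches with is_dir().
    return sorted(p.name for p in base.glob(pattern) if p.is_dir())


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    # Recreate part of the listing from the question: two folders, one file.
    (base / "2021-02-11_68923_data").mkdir()
    (base / "2021-04-03_38923_data").mkdir()
    (base / "2011-12-43_3423_data").touch()

    found = find_dirs(base, "*_data")
    print(found)
```

The filter costs one extra filesystem check per match, but it behaves the same on every Python version.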

CodePudding user response:

The trailing path separator certainly should be respected in pathlib glob patterns. This is the expected behaviour in shells on all platforms, and is also how the glob module works:

If the pattern is followed by an os.sep or os.altsep then files will not match.

So, as a work-around, you can use the glob module to get the behaviour you want:

>>> import glob
>>> glob.glob('*')
['html', 'images', 'test.py']
>>> glob.glob('*/')
['html/', 'images/']
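The same work-around can be tried against the names from the question. This is only a sketch: the temporary directory stands in for the current working directory, and the helper name glob_dirs is hypothetical.

```python
import glob
import os
import tempfile


def glob_dirs(base, pattern):
    """Return sorted directory names under base matching pattern (glob module)."""
    # A trailing os.sep on the pattern makes glob.glob() skip regular files.
    # glob.escape() guards against special characters in the base path itself.
    matches = glob.glob(os.path.join(glob.escape(base), pattern) + os.sep)
    # Each match ends with the separator; strip it to get the bare name.
    return sorted(os.path.basename(m.rstrip(os.sep)) for m in matches)


with tempfile.TemporaryDirectory() as tmp:
    # Three folders and one file, as in the question.
    for name in ("2021-02-11_68923_data", "2021-04-03_38923_data", "2022-01-03_38923_data"):
        os.mkdir(os.path.join(tmp, name))
    open(os.path.join(tmp, "2011-12-43_3423_data"), "w").close()

    dirs_only = glob_dirs(tmp, "*_data")
    print(dirs_only)
```

Note that glob.glob() returns strings rather than Path objects, so wrap the results in Path(...) if the rest of the code expects pathlib.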

The issue with pathlib was fixed in bpo-22276, and the fix was merged in Python 3.11.0rc1 (see what's new: pathlib). So if you want to stick with pathlib, please test it out and report any issues.
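For code that has to run both before and after that fix, one possible approach is to branch on the interpreter version. The sketch below assumes the trailing separator is honoured from Python 3.11 onward and falls back to an explicit is_dir() filter otherwise; the helper name data_dirs is made up for illustration.

```python
import sys
import tempfile
from pathlib import Path


def data_dirs(base):
    """Return sorted names of '*_data' directories directly under base."""
    if sys.version_info >= (3, 11):
        # Assumption: from 3.11 on, a trailing separator restricts matches
        # to directories.
        matches = base.glob("*_data/")
    else:
        # Older versions ignore the trailing separator, so filter explicitly.
        matches = (p for p in base.glob("*_data") if p.is_dir())
    return sorted(p.name for p in matches)


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "2021-02-11_68923_data").mkdir()
    (base / "2022-01-03_38923_data").mkdir()
    (base / "2011-12-43_3423_data").touch()

    result = data_dirs(base)
    print(result)
```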
