I have the following setup in a directory named apartments:
apartments:
|_Blue
   |__apartmentBlue1.xml
      apartmentBlue2.xml
      apartmentBlue3.xml
      nonsense.txt
|_Red
   |_apartmentRed1.xml
     apartmentRed2.xml
     apartmentRed3.xml
|_nonsense
I'm trying to get the file path of every file, in every directory, whose name ends with .xml.
This is my code:
import os

source = r'c:\data\desktop\buildingX\appartments'
for root, dirs, files in os.walk(source):
    for file in files:
        for diro in dirs:
            if file.endswith('.xml'):
                file_path = os.path.join(source, diro, file)
                print(file_path)
This gives me the desired output, but I'm worried that my for loop is too nested. I would like to do more things with those paths, but I feel that the further I nest it, the more problems I will get. Is there a more compact way to get the file paths?
CodePudding user response:
You can do it like this:
import glob
mydir = r'/home/'
# swap '*.py' for '*.xml' to match the XML files from the question
for filename in glob.iglob(mydir + '**/*.py', recursive=True):
    print(filename)
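If you would rather stay with os.walk, the nesting can also be reduced: root is already the path of the directory currently being visited, so there is no need to loop over dirs at all. A minimal sketch, assuming a helper named find_xml_files (a made-up name, not something from the question):

```python
import os


def find_xml_files(source):
    """Return the paths of all .xml files below source (hypothetical helper)."""
    paths = []
    for root, _dirs, files in os.walk(source):
        for name in files:
            if name.endswith('.xml'):
                # root is the directory that contains name,
                # so joining the two gives the full path
                paths.append(os.path.join(root, name))
    return paths
```

Collecting the paths into a list keeps any further processing outside the loops, which should help with the nesting concern.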
CodePudding user response:
You can use the built-in glob module.
I have this directory structure:
tree input_folder/
input_folder
└── year=2020
    ├── month=08
    │   ├── day=02
    │   │   ├── hour=03
    │   │   │   └── input.txt
    │   │   ├── hour=04
    │   │   │   └── input.txt
    │   │   ├── hour=05
    │   │   │   └── input.txt
    │   │   └── hour=06
    │   │       └── input.txt
    │   └── day=03
    │       ├── hour=03
    │       │   └── input.txt
    │       ├── hour=04
    │       │   └── input.txt
    │       ├── hour=05
    │       │   └── input.txt
    │       └── hour=06
    │           └── input.txt
    └── month=09
        └── day=02
            ├── hour=03
            │   └── input.txt
            └── hour=04
                └── input.txt
You can get all of these files like this:
In [1]: from glob import glob
In [2]: glob("input_folder/**/*.txt", recursive=True)
Out[2]:
['input_folder/year=2020/month=09/day=02/hour=03/input.txt',
'input_folder/year=2020/month=09/day=02/hour=04/input.txt',
'input_folder/year=2020/month=08/day=03/hour=05/input.txt',
'input_folder/year=2020/month=08/day=03/hour=03/input.txt',
'input_folder/year=2020/month=08/day=03/hour=04/input.txt',
'input_folder/year=2020/month=08/day=03/hour=06/input.txt',
'input_folder/year=2020/month=08/day=02/hour=05/input.txt',
'input_folder/year=2020/month=08/day=02/hour=03/input.txt',
'input_folder/year=2020/month=08/day=02/hour=04/input.txt',
'input_folder/year=2020/month=08/day=02/hour=06/input.txt']
In your case it should be:
glob(os.path.join(source,"**", "*.xml"), recursive=True)
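Put together as something you can run, that might look like the sketch below. The function name xml_paths is only an illustration, and the path in the usage part is the one from your question:

```python
import glob
import os


def xml_paths(source):
    """Return a sorted list of every .xml file below source (hypothetical helper)."""
    pattern = os.path.join(source, "**", "*.xml")
    # recursive=True is what lets "**" descend into nested directories
    return sorted(glob.glob(pattern, recursive=True))


if __name__ == "__main__":
    for path in xml_paths(r"c:\data\desktop\buildingX\appartments"):
        print(path)
```

Since the result is a plain list, whatever else you want to do with the paths can happen in a single flat loop. The sorted() call is optional; glob does not promise any particular order, as the output above shows.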