I have the following code:
import pandas as pd
from pathlib import Path

folder_names = []
spreadsheet_contents = []
all_data = pd.DataFrame()
current_directory = Path.cwd()
for folder in current_directory.iterdir():
    folder_names.append(folder.name)
    file_n = '*.csv'
    spreadsheet_path = folder / file_n
    spreadsheet_contents.append(pd.read_excel(spreadsheet_path, skiprows=1, header=None, usecols=[5]))
The problem is that the .csv files in each folder are named differently. The '*.csv' method does not work. Does anyone have an idea how to open the .csv file for each subfolder even though they are all named differently?
CodePudding user response:
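For the sake of simplicity, I'm not writing the complete code.
import glob
Replace file_n = '*.csv'
with file_n = glob.glob(str(folder / '*.csv'))
(the pattern has to include the folder, otherwise glob only searches the current working directory)
and loop over the resulting list of file names.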
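Since the question already uses pathlib, the same idea can be sketched with Path.glob, which expands the wildcard relative to each subfolder. This is only a minimal sketch: the function name csv_files_by_folder and its base_dir parameter are made up for illustration and do not appear in the question.

```python
from pathlib import Path


def csv_files_by_folder(base_dir):
    """Map each subfolder name to the sorted list of .csv files inside it.

    Hypothetical helper for illustration; not part of the original code.
    """
    found = {}
    for folder in sorted(Path(base_dir).iterdir()):
        if not folder.is_dir():
            continue  # skip loose files that sit next to the subfolders
        # Path.glob expands the wildcard relative to this folder
        found[folder.name] = sorted(folder.glob("*.csv"))
    return found
```

Each returned path could then be read with something like pd.read_csv(path, skiprows=1, header=None, usecols=[5]), assuming the files really are CSV and not Excel workbooks.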
CodePudding user response:
You can use glob:
import glob, os
os.chdir("/mydir")
for file in glob.glob("*.txt"):
    print(file)
or simply os.listdir:
import os
for file in os.listdir("/mydir"):
    if file.endswith(".txt"):
        print(os.path.join("/mydir", file))
or, if you want to traverse the whole directory tree, use os.walk:
import os
for root, dirs, files in os.walk("/mydir"):
    for file in files:
        if file.endswith(".txt"):
            print(os.path.join(root, file))
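The traversal can also be done with glob itself by passing recursive=True. Here is a rough sketch adapted to the .csv files from the question; the function name find_csv_recursive and the base_dir parameter are assumptions made for the example.

```python
import glob
import os


def find_csv_recursive(base_dir):
    """Return every .csv path under base_dir, at any depth, in sorted order.

    Hypothetical helper for illustration only.
    """
    # "**" matches zero or more directory levels, but only when recursive=True
    pattern = os.path.join(base_dir, "**", "*.csv")
    return sorted(glob.glob(pattern, recursive=True))
```

This returns files sitting directly in base_dir as well as those in nested subfolders, so filter on the parent folder if only one level is wanted.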
Source: Stack Overflow
Also check this out: Working with CSV file for Data Science
We can also use regex!
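One way the regex idea might look is to filter the names returned by os.listdir with a compiled pattern. This is a sketch under assumptions: the pattern, the constant CSV_NAME and the function list_csv_names are all invented for the example, and the case-insensitive match is a design choice rather than something the thread asks for.

```python
import os
import re

# Hypothetical pattern: at least one character followed by ".csv", any letter case
CSV_NAME = re.compile(r".+\.csv", re.IGNORECASE)


def list_csv_names(directory):
    """Return the entry names in directory whose whole name matches CSV_NAME."""
    return sorted(
        name for name in os.listdir(directory) if CSV_NAME.fullmatch(name)
    )
```

Unlike a plain endswith(".csv") check, the pattern also accepts names such as DATA.CSV, and fullmatch rejects names like report.csv.bak.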