I need to read in a CSV file daily, but certain numbers in the file name change each day. The file name, with directory included, is C:\siglocal\pairoffs\\logs_20220804_084056_9500_capped_delta_for_singlestockdelta.csv
I have tried the code below, where I put an asterisk after the _08 in the file path. The 9 characters after this part of the file name (8 digits and an underscore) change daily, and the last part of the file name (_capped_delta_for_singlestockdelta.csv) stays the same.
Any ideas what I need to do here?
df = pd.read_csv(r'C:\siglocal\pairoffs\\logs_20220804_08*' '_capped_delta_for_singlestockdelta.csv')
CodePudding user response:
I do not see how this is a pandas problem. If I understand correctly, you are looking for a way to build a string from variables. For that you can use the str.format() method:
r'C:\siglocal\pairoffs\\logs_20220804_08{0}_capped_delta_for_singlestockdelta.csv'.format(day)
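If the changing digits are not known in advance, the wildcard has to be expanded into a real path before it reaches pd.read_csv, which treats its argument as a literal path. A minimal sketch of that idea using the standard-library glob module follows. The function name find_daily_file and its parameters are hypothetical, and picking the lexicographically largest name when several files match is only an assumption (it presumes the digits after the date are a zero-padded time stamp).

```python
import glob
import os


def find_daily_file(folder, date_str):
    """Return the path of the file for `date_str` (e.g. "20220804") in `folder`.

    Hypothetical helper for illustration; not part of pandas or the original post.
    """
    # "*" stands in for the part of the name that changes every day
    pattern = os.path.join(
        folder, "logs_{}_*_capped_delta_for_singlestockdelta.csv".format(date_str)
    )
    matches = glob.glob(pattern)
    if not matches:
        raise FileNotFoundError("No file matches {}".format(pattern))
    # Assumption: if several files match, the largest name is the latest one
    return max(matches)
```

The returned path can then be passed on as usual, for example df = pd.read_csv(find_daily_file(r'C:\siglocal\pairoffs', '20220804')).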
CodePudding user response:
Perhaps use os.walk(...) and a regular expression to evaluate the files in the folder. Here's one possible implementation:
import os
import re

# define the folder where the files are located
src_folder = r"C:\_temp"

# define the regular expression to filter the files
# (raw strings, with the dot escaped so it only matches a literal ".")
file_regex = (r"logs_20220804_08([0-9][0-9][0-9][0-9]_[0-9][0-9][0-9][0-9])"
              r"_capped_delta_for_singlestockdelta\.csv")

for dir_path, dir_names, file_names in os.walk(src_folder):
    # Each iteration contains:
    # dir_path - current folder for the iteration
    # dir_names - list of folders in the dir_path.
    # file_names - list of files in the dir_path.
    for file_name in file_names:
        print("Evaluating file({}) in folder({})"
              .format(file_name, dir_path))
        match_obj = re.match(file_regex, file_name, re.M | re.I)
        # match_obj will be None if there isn't a match
        if match_obj:
            print("{}File({}) matches our regular expression."
                  .format(" " * 5, file_name))
            print("{}Changing number value is: {}"
                  .format(" " * 5, match_obj.group(1)))
        else:
            print("{}No match for file ({})"
                  .format(" " * 5, file_name))
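The same approach can be turned into a function that returns the matching paths instead of printing them, so the result can be handed to pd.read_csv. The sketch below is one possible variation, not the original answer's code: the name collect_matches is hypothetical, and the digit layout (8-digit date, 6 digits, underscore, 4 digits) is an assumption inferred from the single file name in the question.

```python
import os
import re

# Assumed layout: logs_<8-digit date>_<6 digits>_<4 digits>_capped_delta_for_singlestockdelta.csv
FILE_REGEX = re.compile(
    r"logs_(\d{8})_(\d{6}_\d{4})_capped_delta_for_singlestockdelta\.csv",
    re.IGNORECASE,
)


def collect_matches(src_folder):
    """Return sorted (date, changing_part, full_path) tuples for matching files.

    Hypothetical helper for illustration only.
    """
    found = []
    for dir_path, _dir_names, file_names in os.walk(src_folder):
        for file_name in file_names:
            # fullmatch rejects names with extra text before or after the pattern
            match_obj = FILE_REGEX.fullmatch(file_name)
            if match_obj:
                found.append((match_obj.group(1),
                              match_obj.group(2),
                              os.path.join(dir_path, file_name)))
    return sorted(found)
```

Because the tuples start with the date and the changing digits, the last element of the returned list is the entry with the latest date, and its third item is the path to read.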