I am trying to write a faster way to read in a group of CSV files. The files share a common base path that leads to a set of subfolders, each named after an identifier. Each file name starts with that identifier and ends with a common suffix.
For example, let's say I have group names A, B, and C. The file paths would be:
C:\Users\Name\Documents\A\A-beginninggroup.csv
C:\Users\Name\Documents\A\A-middlegroup.csv
C:\Users\Name\Documents\A\A-endinggroup.csv
C:\Users\Name\Documents\B\B-beginninggroup.csv
C:\Users\Name\Documents\B\B-middlegroup.csv
C:\Users\Name\Documents\B\B-endinggroup.csv
C:\Users\Name\Documents\C\C-beginninggroup.csv
C:\Users\Name\Documents\C\C-middlegroup.csv
C:\Users\Name\Documents\C\C-endinggroup.csv
I am trying to write code where I can change the name of the subgroup in one place, without having to change it in each read_csv line. The following code shows the logic, but I am not sure how to make it work, or whether it is possible.
intro='C:\Users\Name\Documents\'
subgroup='C'
ending1='-endinggroup.csv'
ending2='-middlegroup.csv'
ending3='-beginninggroup.csv'
filename_1=intro subgroup '\' subgroup ending1
filename_2=intro subgroup '\' subgroup ending2
filename_3=intro subgroup '\' subgroup ending3
file1=pd.read_csv(filename_1)
file2=pd.read_csv(filename_2)
file3=pd.read_csv(filename_3)
CodePudding user response:
I am not sure exactly what you are after, but you can use an f-string in this case.
First define your variables (names, in your case):
location = r'somewhere\anywhere'  # raw string, so \a is not read as an escape
group = 'A'
csv = 'A-beginninggroup.csv'
Now combine these variables in an f-string (each literal backslash is written as \\):
file_location = f"{location}\\{group}\\{csv}"
Then pass file_location to pandas' read_csv. You can freely change the group and csv variables.
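To tie this back to the question, here is a minimal sketch of the f-string approach that derives the file name from the subgroup as well, so only one variable needs to change. The variable names (`intro`, `subgroup`, `endings`) follow the question; the list comprehension is just one way to arrange it.

```python
# Raw string: backslashes are kept literally.
# Note that a raw string cannot end in a backslash, so the separator is added below.
intro = r'C:\Users\Name\Documents'
subgroup = 'C'  # change only this line to switch groups

endings = ['-beginninggroup.csv', '-middlegroup.csv', '-endinggroup.csv']

# Each path has the form <intro>\<subgroup>\<subgroup><ending>
paths = [f"{intro}\\{subgroup}\\{subgroup}{ending}" for ending in endings]

for path in paths:
    print(path)
```

Each entry in `paths` can then be passed to `pd.read_csv`.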
CodePudding user response:
"I can just change the name of the subgroup without having to change it in each read_csv line."
You can define a function to handle the logic of joining the path:
import os
import pandas as pd

intro = 'C:\\Users\\Name\\Documents\\'
subgroup = 'C'
ending1 = '-endinggroup.csv'
ending2 = '-middlegroup.csv'
ending3 = '-beginninggroup.csv'

def read_file(subgroup, ending):
    # Build <intro>\<subgroup>\<subgroup><ending>, e.g. ...\A\A-endinggroup.csv
    csv_path = os.path.join(intro, subgroup, subgroup + ending)
    df = pd.read_csv(csv_path)
    return df

file1 = read_file('A', ending1)
file2 = read_file('A', ending2)
file3 = read_file('B', ending1)
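If you want all three files of one subgroup at once, the same idea can be sketched with pathlib and a dictionary. This is only an illustration: `group_paths`, `INTRO`, `ENDINGS`, and the labels are hypothetical names, not part of the code above, and `PureWindowsPath` is used so the path building behaves the same on any operating system.

```python
from pathlib import PureWindowsPath

# Base folder and suffixes taken from the question.
INTRO = PureWindowsPath(r'C:\Users\Name\Documents')
ENDINGS = {
    'beginning': '-beginninggroup.csv',
    'middle': '-middlegroup.csv',
    'ending': '-endinggroup.csv',
}

def group_paths(subgroup):
    """Return {label: path} for every file belonging to one subgroup."""
    return {
        label: INTRO / subgroup / f"{subgroup}{ending}"
        for label, ending in ENDINGS.items()
    }

paths = group_paths('C')
for label, path in paths.items():
    print(label, path)
```

On a Windows machine you would likely use `pathlib.Path` instead of `PureWindowsPath` and pass each path to `pd.read_csv`, for example inside a dict comprehension over `group_paths('C').items()`.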