Pandas Reading CSV With Common Path but Different Names

Time:03-24

I am trying to write a faster way to read in a group of CSV files. The files share a common partial path that leads to a set of subfolders. Each subfolder is named after an identifier, and each file name starts with that identifier and ends with one of a few common suffixes.

For example, let's say I have group names A, B, and C. The file paths would be:

C:\Users\Name\Documents\A\A-beginninggroup.csv

C:\Users\Name\Documents\A\A-middlegroup.csv

C:\Users\Name\Documents\A\A-endinggroup.csv

C:\Users\Name\Documents\B\B-beginninggroup.csv

C:\Users\Name\Documents\B\B-middlegroup.csv

C:\Users\Name\Documents\B\B-endinggroup.csv

C:\Users\Name\Documents\C\C-beginninggroup.csv

C:\Users\Name\Documents\C\C-middlegroup.csv

C:\Users\Name\Documents\C\C-endinggroup.csv

I am trying to write code where I can change the subgroup name in one place, without having to change it in each read_csv line. The following code shows the logic, but I am not sure how to make it work, or whether it is possible.

intro='C:\Users\Name\Documents\'
subgroup='C'
ending1='-endinggroup.csv'
ending2='-middlegroup.csv'
ending3='-beginninggroup.csv'

filename_1=intro subgroup '\' subgroup ending1
filename_2=intro subgroup '\' subgroup ending2
filename_3=intro subgroup '\' subgroup ending3

file1=pd.read_csv(filename_1)
file2=pd.read_csv(filename_2)
file3=pd.read_csv(filename_3)

CodePudding user response:

I am not sure exactly what you are after, but you can use an f-string in this case.

First define your variables (the path parts, in your case):

location = r'somewhere\anywhere'  # raw string, so \a is not treated as an escape
group = 'A'
csv = 'A-beginninggroup.csv'

Now combine these variables in an f-string:

file_location = f"{location}\\{group}\\{csv}"  # \\ is a literal backslash

Then pass file_location to pd.read_csv. You can freely change the group and csv variables.
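To connect this to the three files per subgroup in the question, here is a minimal sketch that builds all of the paths for one group with a single f-string. The helper name build_paths is hypothetical (it is not part of pandas), and the base folder and endings are taken from the question.

```python
# Hypothetical helper for illustration; not a pandas function.
def build_paths(location, group, endings):
    """Return one Windows-style path per ending for the given group."""
    # \\ inside the f-string is a single literal backslash.
    return [f"{location}\\{group}\\{group}{ending}" for ending in endings]


# Raw string so the backslashes in the base folder are kept as-is.
location = r"C:\Users\Name\Documents"
endings = ["-beginninggroup.csv", "-middlegroup.csv", "-endinggroup.csv"]

# Changing the subgroup now means changing one argument.
paths = build_paths(location, "C", endings)
for path in paths:
    print(path)
```

Each entry in paths can then be handed to pd.read_csv in turn.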

CodePudding user response:

"I can just change the name of the subgroup without having to change it in each read_csv line."

You can define a function to handle the logic of joining the path:

import os

import pandas as pd


intro = 'C:\\Users\\Name\\Documents\\'
subgroup = 'C'
ending1 = '-endinggroup.csv'
ending2 = '-middlegroup.csv'
ending3 = '-beginninggroup.csv'


def read_file(subgroup, ending):
    # Build <intro>\<subgroup>\<subgroup><ending>, e.g. ...\A\A-endinggroup.csv
    csv_path = os.path.join(intro, subgroup, subgroup + ending)
    df = pd.read_csv(csv_path)
    return df

file1 = read_file('A', ending1)
file2 = read_file('A', ending2)
file3 = read_file('B', ending1)
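As a variation on the helper above, the sketch below uses pathlib instead of os.path and collects every file for one subgroup in a dictionary keyed by ending. The names group_paths, read_group and ENDINGS are hypothetical. PureWindowsPath is used only so that the paths come out with backslashes on any operating system.

```python
from pathlib import PureWindowsPath

# Endings taken from the question.
ENDINGS = ["-beginninggroup.csv", "-middlegroup.csv", "-endinggroup.csv"]


def group_paths(base, subgroup, endings=ENDINGS):
    """Map each ending to its full path for one subgroup (hypothetical helper)."""
    folder = PureWindowsPath(base) / subgroup
    return {ending: str(folder / f"{subgroup}{ending}") for ending in endings}


def read_group(base, subgroup, endings=ENDINGS):
    """Read every CSV of one subgroup into a dict of DataFrames."""
    import pandas as pd  # imported here so group_paths works without pandas

    return {
        ending: pd.read_csv(path)
        for ending, path in group_paths(base, subgroup, endings).items()
    }


paths = group_paths(r"C:\Users\Name\Documents", "A")
for ending, path in paths.items():
    print(ending, "->", path)
```

With this layout, switching from subgroup A to B is a single argument change, for example read_group(r"C:\Users\Name\Documents", "B").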