How to read a lot of CSV indexed files

Time:03-14

I have 38 .csv files, named Raw_data_unique_1 to Raw_data_unique_38, and all have the same column structure.

I would like to read these files with pd.read_csv and then append them all into one single DataFrame, say, data_unique.

So, I created a list of indexes: lst = list(range(1, 39)) and thought I could just run:

for i in lst:
    data_i = pd.read_csv('C:/.../Raw_data_unique_i.csv', sep=',', header=0)

But there is a misunderstanding on my part: the name Raw_data_unique_i.csv is not recognized ("[Errno 2] No such file or directory"), which means no number is substituted for the index i...

Could you explain to me what I did wrong?

CodePudding user response:

You don't need pandas for this operation:

import shutil

with open('data_unique.csv', 'w') as out:
    for i in range(1, 39):  # files 1 through 38
        with open(f'Raw_data_unique_{i}.csv') as inp:
            shutil.copyfileobj(inp, out)

Here I used an f-string to substitute the value of i inside the string. If you use Python < 3.6, use the str.format method instead:

with open('Raw_data_unique_{i}.csv'.format(i=i)) as inp:
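One caveat worth flagging with plain file concatenation: shutil.copyfileobj copies each file's header row too, so the combined file ends up with 38 header lines. A sketch that keeps only the first header (merge_csvs and its arguments are illustrative names, not from the answer above):

```python
import shutil

def merge_csvs(paths, dest):
    """Concatenate CSV files into dest, keeping only the first header row."""
    with open(dest, 'w') as out:
        for n, path in enumerate(paths):
            with open(path) as inp:
                header = inp.readline()       # read (and skip) the header line
                if n == 0:
                    out.write(header)         # write the header only once
                shutil.copyfileobj(inp, out)  # copy the remaining data rows

# Usage, assuming the 38 files sit in the working directory:
# merge_csvs([f'Raw_data_unique_{i}.csv' for i in range(1, 39)], 'data_unique.csv')
```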

CodePudding user response:

Your "i" isn't replaced by the variable's value when it is written inside a plain string like this.

Maybe a quick fix would be declaring:

folder = 'C:/.../Raw_data_unique_'
format = '.csv'

and then concatenating:

file_name = folder + str(i) + format
data_i = pd.read_csv(file_name, sep=',', header=0)

But remember to do whatever work you need with each data_i inside the loop iteration, because the variable is overwritten every time. If you don't want that, look at the structure you have and find a better way to store it, for example a list of DataFrames.

PS: if you only want to limit your for loop to the 38 files:

for i in range(1, 39):
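Following up on the storage point above, a common pattern is to collect the per-file DataFrames in a list and concatenate once at the end (a sketch; load_all is a hypothetical helper name, not from the question):

```python
import pandas as pd

def load_all(paths):
    # Read every CSV into its own DataFrame instead of
    # overwriting a single variable on each iteration.
    frames = [pd.read_csv(p, sep=',', header=0) for p in paths]
    # One concat at the end is faster than appending row by row;
    # ignore_index gives the combined frame a fresh 0..N-1 index.
    return pd.concat(frames, ignore_index=True)

# Usage, assuming the 38 files sit in the working directory:
# data_unique = load_all([f'Raw_data_unique_{i}.csv' for i in range(1, 39)])
```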

CodePudding user response:

Use an f-string so that {i} inside the string takes the actual value of the variable i:

for i in lst:
    data_i = pd.read_csv(f'C:/.../Raw_data_unique_{i}.csv', sep=',', header=0)

Note the f before the string and the {i} inside the string.
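As a convenience beyond the f-string fix, you can also let pathlib's glob find the files instead of building the index list by hand. A sketch, assuming all 38 files sit in one folder and follow the Raw_data_unique_N.csv naming (combine is an illustrative name):

```python
from pathlib import Path

import pandas as pd

def combine(folder):
    # Find the CSVs and sort them by their numeric suffix, so that
    # Raw_data_unique_10.csv does not sort before Raw_data_unique_2.csv.
    paths = sorted(Path(folder).glob('Raw_data_unique_*.csv'),
                   key=lambda p: int(p.stem.rsplit('_', 1)[1]))
    # Concatenate every file into one DataFrame with a fresh index.
    return pd.concat((pd.read_csv(p) for p in paths), ignore_index=True)

# data_unique = combine('C:/...')
```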
