I have 38 .csv files, named Raw_data_unique_1 to Raw_data_unique_38, and all have the same column structure.
I would like to read these files with pd.read_csv, the function I know, and then append all of them into a single file, say, data_unique.
So I created a list of indexes, lst = list(range(1,39)), and thought I could just run:
for i in lst:
    data_i = pd.read_csv('C:/.../Raw_data_unique_i.csv', sep=',', header=0)
But there is a misunderstanding on my part: the name Raw_data_unique_i.csv is not recognized ("[Errno 2] No such file or directory"), which means no number is substituted for the index i.
Could you explain what I did wrong?
CodePudding user response:
You don't need pandas for this operation:
import shutil

with open('data_unique.csv', 'w') as out:
    for i in range(1, 39):  # files 1 through 38
        with open(f'Raw_data_unique_{i}.csv') as inp:
            if i > 1:
                next(inp, None)  # skip the header row repeated in every file
            shutil.copyfileobj(inp, out)
Here I used an f-string to evaluate i inside the string. If you use Python < 3.6, use the str.format method instead:
        with open('Raw_data_unique_{i}.csv'.format(i=i)) as inp:
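To make the difference concrete, here is a minimal sketch comparing the plain string from the question with the two substitution styles mentioned above. The value i = 3 and the variable names are only illustrative.

```python
# Compare ways of writing the file name for one example index.
i = 3  # illustrative value; in the real loop i runs from 1 to 38

# Plain string: the "i" is just a letter, nothing is substituted.
literal = 'Raw_data_unique_i.csv'

# f-string (Python 3.6+): {i} is replaced by the value of the variable i.
with_fstring = f'Raw_data_unique_{i}.csv'

# str.format: same result, also available before Python 3.6.
with_format = 'Raw_data_unique_{i}.csv'.format(i=i)

print(literal)
print(with_fstring)
print(with_format)
```

Only the last two strings name a file that actually exists, which is why the plain string leads to "[Errno 2] No such file or directory".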
CodePudding user response:
Your "i" isn't a variable when it is written inside the name like this; it is just the letter i.
A quick fix would be to declare:
folder = 'C:/.../Raw_data_unique_'
extension = '.csv'
and then concatenate inside the loop:
    file_name = folder + str(i) + extension
    data_i = pd.read_csv(file_name, sep=',', header=0)
But remember that you need to work with each data_i during its own loop iteration, because it is overwritten on every pass. If you don't want that, you should look into the structure you have and find a better way to store it.
P.S. If you only need the list to bound the loop, you can iterate over the range directly:
for i in range(1, 39):
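One way to keep every file instead of overwriting data_i, as cautioned above, is to collect the DataFrames in a list and combine them once with pd.concat. This is a sketch under assumptions: the helper name load_all and its parameters are hypothetical, and it assumes every file has the same header row.

```python
from pathlib import Path

import pandas as pd


def load_all(folder, first=1, last=38):
    """Read Raw_data_unique_<first>..<last>.csv from folder into one DataFrame.

    load_all, folder, first and last are illustrative names, not part of pandas.
    """
    frames = []
    for i in range(first, last + 1):
        path = Path(folder) / f'Raw_data_unique_{i}.csv'
        frames.append(pd.read_csv(path, sep=',', header=0))
    # ignore_index=True renumbers the rows 0..n-1 instead of
    # repeating each file's own 0-based index.
    return pd.concat(frames, ignore_index=True)
```

Calling data_unique = load_all(folder) with the directory that holds the 38 files would give the combined table, and data_unique.to_csv('data_unique.csv', index=False) would write it out with a single header row.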
CodePudding user response:
Use an f-string so that the i inside the string takes the actual value of the variable i:
for i in lst:
    data_i = pd.read_csv(f'C:/.../Raw_data_unique_{i}.csv', sep=',', header=0)
Note the f before the string and the {i} inside the string.
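If "[Errno 2]" still appears after switching to the f-string, it may help to check which of the expected names are actually present before reading anything. The following is a small sketch using only the standard library; the helper name missing_files and its parameters are hypothetical.

```python
from pathlib import Path


def missing_files(folder, first=1, last=38):
    """Return the expected Raw_data_unique_<i>.csv paths that are not in folder.

    missing_files, folder, first and last are illustrative names.
    """
    expected = [
        Path(folder) / f'Raw_data_unique_{i}.csv'
        for i in range(first, last + 1)
    ]
    # Keep only the paths that do not point to an existing regular file.
    return [path for path in expected if not path.is_file()]
```

An empty list means all 38 names resolve; anything else points at a wrong folder or a file whose name differs from the pattern.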