I want to read data from several files in one folder. The files are named "1.csv", "2.csv", "3.csv" ... "96.csv". But instead of being read in numerical order, they are read as "1.csv", "10.csv", "11.csv" ... "2.csv", "21.csv". Does anyone know how to fix this?
Thanks!
import glob
import os

import pandas as pd

def csv_readout_folder(path):
    os.chdir(path)
    files = glob.glob(path + '/' + '*.csv')
    all_data = pd.DataFrame()
    for f in files:
        data = csv_readout(path, f)
        all_data = pd.concat([all_data, data])
    return all_data
CodePudding user response:
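In your code, the loop "for f in files:" reads the files in the order they appear in the list. glob.glob does not sort the names numerically (the order depends on the file system), which is why "10.csv" can come before "2.csv". You can try the sort functions, but it may be easier to build a new list like this: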
# Build the expected file names in numeric order: "1.csv" ... "96.csv"
file_lst = []
for k in range(1, 97):
    file_lst.append(f'{k}.csv')
s1 = pd.Series(file_lst)

def csv_readout_folder(path):
    os.chdir(path)
    files = glob.glob(path + '/' + '*.csv')
    all_data = pd.DataFrame()
    # Iterate over the ordered names in s1 rather than the glob result
    for f in list(s1[s1.isin(file_lst)]):
        data = csv_readout(path, f)
        all_data = pd.concat([all_data, data])
    return all_data
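The "sort functions" route mentioned above can be sketched as follows. This is a minimal example, assuming every file name is a plain integer followed by ".csv"; the helper name sort_numerically is hypothetical and not part of the original code.

```python
from pathlib import Path


def sort_numerically(paths):
    """Return the paths ordered by the integer value of each file stem."""
    # Path("folder/10.csv").stem is "10", so int() turns it into a numeric key.
    return sorted(paths, key=lambda p: int(Path(p).stem))


names = ["1.csv", "10.csv", "11.csv", "2.csv", "21.csv"]
print(sort_numerically(names))
# ['1.csv', '2.csv', '10.csv', '11.csv', '21.csv']
```

Inside csv_readout_folder this would presumably be used as files = sort_numerically(glob.glob(path + '/*.csv')), leaving the rest of the loop unchanged.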
CodePudding user response:
You can do something like
files = [f'{path}/{i}.csv' for i in range(1, 22)]
instead of
files = glob.glob(path + '/' + '*.csv')
UPD (counting the files instead of hard-coding the range):
def csv_readout_folder(path):
    os.chdir(path)
    # Count the CSV files, assuming they are named 1.csv ... n.csv without gaps
    n_files = len([el for el in os.scandir(path) if el.is_file() and el.name.endswith('.csv')])
    files = [f'{path}/{i}.csv' for i in range(1, n_files + 1)]
    all_data = pd.DataFrame()
    for f in files:
        data = csv_readout(path, f)
        all_data = pd.concat([all_data, data])
    return all_data
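Counting the files only works if they are numbered 1 to n without gaps. If some numbers might be missing, or the names carry a text prefix, a natural-sort key applied to whatever glob returns is one possible alternative. The sketch below uses only the standard library; the name natural_key is hypothetical.

```python
import re


def natural_key(name):
    """Split a name into text and digit chunks so digit runs compare as integers."""
    # re.split with a capturing group keeps the digit runs in the result,
    # always at the odd positions: "run10.csv" -> ["run", "10", ".csv"].
    parts = re.split(r"(\d+)", name)
    return [int(part) if i % 2 else part.lower() for i, part in enumerate(parts)]


names = ["10.csv", "1.csv", "7.csv", "3.csv"]
print(sorted(names, key=natural_key))
# ['1.csv', '3.csv', '7.csv', '10.csv']
```

With this key, files = sorted(glob.glob(path + '/*.csv'), key=natural_key) should keep the numeric order without needing to know how many files exist.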