I have ten datasets which I have split into training and test sets:
names = ["df1", "df2", "df3", "df4", "df5", "df6", "df7", "df8", "df9", "df10"]
dataset_list = []
for i in range(len(names)):
    datasets = pd.read_csv(f"{fulldatafolder}/" + names[i] + "_Full_Dataset.csv")
    dataset_list.append(datasets)
training_set_list = list()
test_set_list = list()
for dataset in dataset_list:
    training_sets, test_sets = np.split(dataset, [int(.90*len(dataset))])
    training_set_list.append(training_sets)
    test_set_list.append(test_sets)
However, if I try to save all these datasets to their respective folders as follows:
for names, dataset in enumerate(training_set_list):
    dataset.to_csv(f"{trainingfolder}/{format(names)}_Training_Set.csv", index=False, sep=",")
for names, dataset in enumerate(test_set_list):
    dataset.to_csv(f"{testfolder}/{format(names)}_Test_Set.csv", index=False, sep=",")
I get .csv files with a number (0, ..., 9) in front of "_Training_Set.csv" and "_Test_Set.csv" instead of the names "df1", ..., "df10" specified in the list names. How can I fix this?
CodePudding user response:
enumerate yields pairs of a counter and a value. In your loops, names is therefore rebound to the counter from enumerate, which overwrites the original variable holding the list of names. I guess you expected it to loop through the original list.
If you want to loop through both lists together, you could use zip():
for name, dataset in zip(names, training_set_list):
    dataset.to_csv(f"{trainingfolder}/{name}_Training_Set.csv", index=False, sep=",")
for name, dataset in zip(names, test_set_list):
    dataset.to_csv(f"{testfolder}/{name}_Test_Set.csv", index=False, sep=",")
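To make the difference concrete, here is a small sketch that compares what each loop produces as a file-name prefix. The three-element lists and the string placeholders are made up for illustration; in your code the second list would hold DataFrames.

```python
names = ["df1", "df2", "df3"]
# Hypothetical placeholders standing in for the DataFrames in training_set_list.
training_set_list = ["frame_a", "frame_b", "frame_c"]

# enumerate yields (index, item) pairs, so the first loop variable is a counter.
prefixes_enumerate = [str(counter) for counter, _ in enumerate(training_set_list)]

# zip pairs each name with the item at the same position in the other list.
prefixes_zip = [name for name, _ in zip(names, training_set_list)]

print(prefixes_enumerate)
print(prefixes_zip)
```

Note that zip stops at the shorter of the two lists, so this relies on names and the dataset list having the same length and order.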
In addition to that, I would change your first loop from:
for i in range(len(names)):
    datasets = pd.read_csv(f"{fulldatafolder}/" + names[i] + "_Full_Dataset.csv")
to:
for name in names:
    datasets = pd.read_csv(f"{fulldatafolder}/{name}_Full_Dataset.csv")
As you can see, there is no need to create a range when you can loop through the list directly. Secondly, since you already use an f-string to format the string, it is better to put the variable directly in the string instead of concatenating the pieces with the "+" sign.
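Putting the pieces together, the whole read, split and save flow could look roughly like the sketch below. It is only an illustration under some assumptions: the folders live in a temporary directory, the input files are tiny made-up tables, there are three names instead of ten, and the split uses iloc rather than np.split (a swap on my part that does the same positional 90/10 cut).

```python
import tempfile
from pathlib import Path

import pandas as pd

names = ["df1", "df2", "df3"]

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Hypothetical folder layout; substitute your own paths.
    fulldatafolder = root / "full"
    trainingfolder = root / "training"
    testfolder = root / "test"
    for folder in (fulldatafolder, trainingfolder, testfolder):
        folder.mkdir()

    # Create small stand-in input files (made-up data, for illustration only).
    for name in names:
        pd.DataFrame({"x": range(10), "y": range(10, 20)}).to_csv(
            fulldatafolder / f"{name}_Full_Dataset.csv", index=False
        )

    # Read each file, split it 90/10 by row position, and save both parts
    # under the original name, all inside one loop over names.
    for name in names:
        dataset = pd.read_csv(fulldatafolder / f"{name}_Full_Dataset.csv")
        cut = int(0.90 * len(dataset))
        dataset.iloc[:cut].to_csv(trainingfolder / f"{name}_Training_Set.csv", index=False)
        dataset.iloc[cut:].to_csv(testfolder / f"{name}_Test_Set.csv", index=False)

    # Collect the results before the temporary directory is removed.
    written_training = sorted(p.name for p in trainingfolder.iterdir())
    written_test = sorted(p.name for p in testfolder.iterdir())
    test_rows = len(pd.read_csv(testfolder / "df1_Test_Set.csv"))

print(written_training)
print(written_test)
```

Doing everything in a single loop keeps each name next to its data, so the intermediate lists and the zip are not needed at all if you only want the files on disk.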