I have a for loop that performs some preprocessing, and at the end of each iteration I would like to write the result to a CSV file. The export works, but every iteration overwrites the same file. I want a unique file for each input. Thank you in advance for the help.
for filename in os.scandir(directory):
    df = pd.read_csv(filename, index_col=('Full_Name'))
    df = df[(df.Draft_Year>2003) & (df.Draft_Year<2022)]
    df = df.drop(['Position','College','Draft_Year'], axis=1)
    scaler = MinMaxScaler()
    df = pd.DataFrame(scaler.fit_transform(df), columns = df.columns, index=df.index)
    imputer = KNNImputer(n_neighbors=5)
    df = pd.DataFrame(imputer.fit_transform(df),columns = df.columns, index=df.index)
    df = df.to_csv(r"D:\Model Data\Exports\NFL Draft Model\processed.csv", index=True, header=True)
CodePudding user response:
To use the original file name in the output path:
for filename in os.scandir(directory):
    df = pd.read_csv(filename, index_col=('Full_Name'))
    df = df[(df.Draft_Year>2003) & (df.Draft_Year<2022)]
    df = df.drop(['Position','College','Draft_Year'], axis=1)
    scaler = MinMaxScaler()
    df = pd.DataFrame(scaler.fit_transform(df), columns = df.columns, index=df.index)
    imputer = KNNImputer(n_neighbors=5)
    df = pd.DataFrame(imputer.fit_transform(df),columns = df.columns, index=df.index)
    # filename is an os.DirEntry; filename.name is the file name without the directory part.
    # to_csv returns None when given a path, so its result is not assigned back to df.
    df.to_csv(fr"D:\Model Data\Exports\NFL Draft Model\{filename.name.split('.')[0]}_processed.csv", index=True, header=True)
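The name-building step can also be pulled out into a small helper, which makes it easier to check on its own. The sketch below is only an illustration: `output_path_for` and its parameters are hypothetical names, not part of the original code. It uses `pathlib.Path.stem`, which drops only the last extension, whereas `split('.')[0]` cuts the name at the first dot, so the two differ for names such as `draft.2020.csv`.

```python
from pathlib import Path


def output_path_for(input_path, out_dir, suffix="_processed"):
    """Build an output CSV path from the input file's base name.

    Hypothetical helper: the names and the default suffix are assumptions.
    """
    # .stem removes only the final extension: "draft.2020.csv" -> "draft.2020"
    stem = Path(input_path).stem
    return Path(out_dir) / f"{stem}{suffix}.csv"
```

Inside the loop, this would be called as `df.to_csv(output_path_for(filename.path, r"D:\Model Data\Exports\NFL Draft Model"))`, since `to_csv` accepts path objects.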
CodePudding user response:
Write to a different file name on each iteration. One way is to add a counter to the name:
i = 1
for filename in os.scandir(directory):
    df = pd.read_csv(filename, index_col=('Full_Name'))
    df = df[(df.Draft_Year>2003) & (df.Draft_Year<2022)]
    df = df.drop(['Position','College','Draft_Year'], axis=1)
    scaler = MinMaxScaler()
    df = pd.DataFrame(scaler.fit_transform(df), columns = df.columns, index=df.index)
    imputer = KNNImputer(n_neighbors=5)
    df = pd.DataFrame(imputer.fit_transform(df),columns = df.columns, index=df.index)
    # Raw f-string (fr"..."): without the r prefix, the "\N" in the path is read as an escape and raises a SyntaxError.
    df.to_csv(fr"D:\Model Data\Exports\NFL Draft Model\processed{i}.csv", index=True, header=True)
    # Increment the counter so the next file gets a new name.
    i += 1
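If a counter is preferred, `enumerate` can maintain it so there is no separate variable to forget to increment. The following is a minimal sketch, assuming only `.csv` files should be processed; `numbered_output_paths` and the `prefix` parameter are hypothetical names. Because `os.scandir` yields entries in arbitrary order, the entries are sorted by name first so that the numbering stays the same between runs.

```python
import os
from pathlib import Path


def numbered_output_paths(directory, out_dir, prefix="processed"):
    """Yield (input_path, output_path) pairs with outputs numbered from 1.

    Hypothetical helper: the names and the .csv filter are assumptions.
    """
    # Keep regular .csv files only and sort them for a stable numbering.
    entries = sorted(
        (e for e in os.scandir(directory) if e.is_file() and e.name.endswith(".csv")),
        key=lambda e: e.name,
    )
    for i, entry in enumerate(entries, start=1):
        yield Path(entry.path), Path(out_dir) / f"{prefix}{i}.csv"
```

The loop would then read `for src, dst in numbered_output_paths(directory, out_dir):`, with `pd.read_csv(src, ...)` at the top of the body and `df.to_csv(dst, ...)` at the end.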