How do i make my df.to_excel function write to an output path? After my script runs, I do not see the files in the output_path directory i have defined.
import pandas as pd
from openpyxl import load_workbook
import os
import datetime
output_path = 'C:/Users/g/Desktop/autotranscribe/python/Processed'
path = 'C:/Users/g/Desktop/autotranscribe/python/Matching'
cols_to_drop = ['PSI ID','PSIvet Region','PSIvet region num','Fax','County']
column_name_update_map = {'Account name': 'Company Name','Billing address':'Address','Billing city':'City','Billing State':'State'}
for file in os.listdir("C:/Users/g/Desktop/autotranscribe/python/Matching"):
if file.startswith("PSI") and "(updated headers)" not in file:
dfs = pd.read_excel(file, sheet_name=None,skiprows=5)
output = dict()
for ws, df in dfs.items():
if ws.startswith("Cancelled Members"): df = df.drop('Active date', axis=1)
if any(ws.startswith(x) for x in ["New Members","PVCC"]):
continue
#if ws in ["New Members 03.22","PVCC"]: #sheetstoavoid
temp = df
dt = pd.to_datetime(os.path.getctime(os.path.join(path,file)),unit="s").replace(nanosecond=0)
output[ws] = temp
writer = pd.ExcelWriter(f'{file.replace(".xlsx","")} (updated headers).xlsx')
for ws, df in output.items():
df.to_excel(writer, index=None, sheet_name=ws)
writer.save()
writer.close()
I tried df.to_excel(writer,output_path, index=None, sheet_name=ws)
But i get an error File "", line 36, in df.to_excel(writer,output_path, index=None, sheet_name=ws)
TypeError: to_excel() got multiple values for argument 'sheet_name'.
CodePudding user response:
A few comments:
- The function
os.listdir()
only returns "unqualified" file names, so before usingfile
, we need to prependpath
using something likeinput_file_name = f'{path}/{file}'
. - Similarly,
pd.ExcelWriter()
will need a qualified file name (that is, including the path as well as the "unqualified" file name), which we can get by doing this:output_file_name = f'{output_path}/{file.replace(".xlsx","")} (updated headers).xlsx'
. - There are some elements of the code in your question that may not be getting used, but rather than comment on or change those, I provide a working version with minimal changes below.
I created directories named Matching
and Processed
. I placed a file named PSI 123.xlsx
in Matching
with a tab named Cancelled Members
containing the following:
will skip
will skip
will skip
will skip
will skip
Col1 Col2 Col3 Active date
xx NY 110 789
I then ran the following modification to your code (note the changes to output_path
and path
for testing purposes in my environment):
import pandas as pd
from openpyxl import load_workbook
import os
import datetime
#output_path = 'C:/Users/g/Desktop/autotranscribe/python/Processed'
#path = 'C:/Users/g/Desktop/autotranscribe/python/Matching'
output_path = './Processed'
path = './Matching'
cols_to_drop = ['PSI ID','PSIvet Region','PSIvet region num','Fax','County']
column_name_update_map = {'Account name': 'Company Name','Billing address':'Address','Billing city':'City','Billing State':'State'}
for file in os.listdir(path):
if file.startswith("PSI") and "(updated headers)" not in file:
input_file_name = f'{path}/{file}'
dfs = pd.read_excel(input_file_name, sheet_name=None,skiprows=5)
output = dict()
for ws, df in dfs.items():
if ws.startswith("Cancelled Members") and 'Active date' in df.columns: df = df.drop('Active date', axis=1)
if any(ws.startswith(x) for x in ["New Members","PVCC"]):
continue
#if ws in ["New Members 03.22","PVCC"]: #sheetstoavoid
temp = df
dt = pd.to_datetime(os.path.getctime(os.path.join(path,file)),unit="s").replace(nanosecond=0)
output[ws] = temp
output_file_name = f'{output_path}/{file.replace(".xlsx","")} (updated headers).xlsx'
writer = pd.ExcelWriter(output_file_name)
for ws, df in output.items():
df.to_excel(writer, index=None, sheet_name=ws)
writer.save()
writer.close()
After running, the code had created a new file in Processed
named PSI 123 (updated headers).xlsx
with sheets named as in the input. The sheet Cancelled Members
contained the following:
Address State Zip Status Status.1 Date Partner
Col1 Col2 Col3
xx NY 110