Home > Mobile >  KeyError: "Not all names specified in 'columns' are found"
KeyError: "Not all names specified in 'columns' are found"

Time:03-03

I have a dataframe like as shown below

Date,cust,region,Abr,Number,,,dept
12/01/2010,Company_Name,Somecity,Chi,36,136,NaN,sales
12/02/2010,Company_Name,Someothercity,Nyc,156,NaN,41,mfg

tf = pd.read_clipboard(sep=',')

I am trying to do some manipulation to the excel file

writer = pd.ExcelWriter('duck_data.xlsx',engine='xlsxwriter')
for (cust,reg), v in df.groupby(['cust','region']):
    v.to_excel(writer, sheet_name=f"DATA_{cust}_{reg}",index=False, columns = modified_col_list)
writer.save()

but the problem is when the file is written, it uses unnamed:5, unnamed:6 to represent empty column names. So, I created a modified_col_list as shown below and passed it as input to to_excel function

ordiginal_col_list = ['Date','cust','region','Abr','Number',nan,nan,'dept']

modified_col_list = ['Date','cust','region','Abr','Number',' ',' ','dept']

But my objective is to have empty column names as is during excel write itself. But this resulted in below error

KeyError: "Not all names specified in 'columns' are found"

I expect my output to be like as shown below (you can see the column names are empty)

enter image description here

CodePudding user response:

You can use:

writer = pd.ExcelWriter('duck_data.xlsx',engine='xlsxwriter')
for (cust,reg), v in df.groupby(['cust','region']):
    #if columns name has `Unnamed` replace by empty string
    #v.columns = ['' if 'Unnamed' in x else x for x in v.columns]
    #if columnshas missing values replace them to empty string
    v.columns = v.columns.to_series().fillna('')
    #removed columns parameter
    v.to_excel(writer, sheet_name=f"DATA_{cust}_{reg}",index=False)
writer.save()

Another idea is change columns names before groupby:

#if columns name has `Unnamed` replace by empty string
#df.columns = ['' if 'Unnamed' in x else x for x in df.columns]
#if columnshas missing values replace them to empty string
df.columns = df.columns.to_series().fillna('')

writer = pd.ExcelWriter('duck_data.xlsx',engine='xlsxwriter')
for (cust,reg), v in df.groupby(['cust','region']):
    #removed columns parameter
    v.to_excel(writer, sheet_name=f"DATA_{cust}_{reg}",index=False)
writer.save()
  • Related