Append new data into an existing Excel file with pandas (Python)

Hello everyone. I'm trying to add new data (columns and values) to an existing Excel spreadsheet. I have Order_Sheet.xlsx saved with the following data:

Item:             Quantity:    Price:
disposable cups   7000         $0.04

I want to add a set of new data that I pulled from my website using this code, which works perfectly:

from bs4 import BeautifulSoup
import pandas as pd

soup = BeautifulSoup(html_doc, "html.parser")

def correct_tag(tag):
    return tag.name == "span" and tag.get_text(strip=True) in {
        "Order Number",
        "Location",
        "Date",
    }

order_list = []
order_info = {}

for t in soup.find_all(correct_tag):
    order_info[t.text] = t.find_next_sibling(text=True).strip()

order_list.append(order_info)
order_df = pd.DataFrame(order_list)
order_df.head()

# ExcelWriter.save() was removed in pandas 2.0; the context manager saves on exit.
with pd.ExcelWriter('Order_List.xlsx') as datatoexcel:
    order_df.to_excel(datatoexcel)
print('DataFrame is Written to Excel Successfully.')

Output :

   Order Number    Location    Date
0  A-21-897274     Ohio        07/01/2022

But I want to take the printed info (Order Number, Location, Date, and their values) and add it to the existing Excel sheet Order_Sheet.xlsx instead of creating a new Excel file, so that it would look like:

Item:             Quantity:    Price:    Order Number:    Location:    Date:
disposable cups   7000         $0.04     A-21-897274      Ohio         07/01/2022

Is there an easy way to append new data to an existing Excel file, or possibly to combine two Excel files?

CodePudding user response：

I found this in the documentation:

  {storage_options}
        .. versionadded:: 1.2.0
    if_sheet_exists : {{'error', 'new', 'replace', 'overlay'}}, default 'error'
        How to behave when trying to write to a sheet that already
        exists (append mode only).
        * error: raise a ValueError.
        * new: Create a new sheet, with a name determined by the engine.
        * replace: Delete the contents of the sheet before writing to it.
        * overlay: Write contents to the existing sheet without removing the old
          contents.
        .. versionadded:: 1.3.0

Here is the full link: https://github.com/pandas-dev/pandas/blob/main/pandas/io/excel/_base.py#L800-L814
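
To make the options quoted above concrete, here is a minimal sketch, assuming pandas 1.4 or later with openpyxl installed. The file name demo.xlsx and the two small frames are made up for illustration and are not from the question. The engine is passed explicitly because append mode is not supported by the xlsxwriter engine, which pandas may otherwise pick as the default writer.

```python
import os
import tempfile

import pandas as pd

# Hypothetical scratch file so the sketch does not touch real data.
path = os.path.join(tempfile.mkdtemp(), "demo.xlsx")

first = pd.DataFrame({"a": [1, 2]})
second = pd.DataFrame({"a": [10]})

# Create the workbook; to_excel writes to a sheet named "Sheet1" by default.
first.to_excel(path, index=False)

# Default if_sheet_exists="error": writing to an existing sheet name is refused.
try:
    with pd.ExcelWriter(path, engine="openpyxl", mode="a") as writer:
        second.to_excel(writer, index=False)
    raised = False
except ValueError:
    raised = True

# if_sheet_exists="replace": the old sheet contents are dropped first.
with pd.ExcelWriter(
    path, engine="openpyxl", mode="a", if_sheet_exists="replace"
) as writer:
    second.to_excel(writer, index=False)

replaced = pd.read_excel(path)
print(raised)
print(replaced)
```

For the question as asked, neither of these two modes is what you want: "error" refuses the write and "replace" discards the existing Item/Quantity/Price data. "overlay" is the mode that keeps the old cells in place.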

CodePudding user response：

This works only with pandas 1.4 or later, because the "overlay" option was added in that version. The following code assumes that the order of the rows is the same between the first and the second write. It also assumes that you know the exact number of existing columns.

import pandas as pd


df2 = pd.DataFrame({"c": [3, 5], "d": [8, 9]})
df3 = pd.DataFrame({"c": [9, 10], "d": [-1, -9]})
df4 = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

with pd.ExcelWriter('./Order_List.xlsx', mode='w') as writer:
    df2.to_excel(writer, index=False)

with pd.ExcelWriter('./Order_List.xlsx', mode="a", if_sheet_exists="overlay") as writer:
    df3.to_excel(writer, startrow=3, header=False, index=False)
    df4.to_excel(writer, startrow=0, startcol=2, header=True, index=False)

Link to the 1.4 documentation.
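
If the number of existing columns is not known in advance, one possible approach is to read the sheet first and use its column count as startcol. The sketch below assumes pandas 1.4 or later with openpyxl installed, and it builds stand-ins for Order_Sheet.xlsx and order_df in a temporary folder so it can run on its own. It still assumes that the rows line up between the existing sheet and the new data.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-ins for the asker's Order_Sheet.xlsx and scraped order_df.
path = os.path.join(tempfile.mkdtemp(), "Order_Sheet.xlsx")
pd.DataFrame(
    {"Item": ["disposable cups"], "Quantity": [7000], "Price": ["$0.04"]}
).to_excel(path, index=False)

order_df = pd.DataFrame(
    {"Order Number": ["A-21-897274"], "Location": ["Ohio"], "Date": ["07/01/2022"]}
)

# Read the current sheet only to learn how many columns it already uses.
existing = pd.read_excel(path)
next_col = len(existing.columns)

# Overlay the new columns to the right of the existing ones.
# engine="openpyxl" is explicit because xlsxwriter does not support append mode.
with pd.ExcelWriter(
    path, engine="openpyxl", mode="a", if_sheet_exists="overlay"
) as writer:
    order_df.to_excel(writer, startcol=next_col, index=False)

combined = pd.read_excel(path)
print(combined)
```

An alternative that avoids append mode altogether would be to join the two frames with pd.concat([existing, order_df], axis=1) and write the result back in one go, at the cost of rewriting the whole sheet.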