Hello everyone, I'm trying to add new data (columns and values) to an existing Excel spreadsheet. I have Order_Sheet.xlsx saved with data like this:
Item:            Quantity:  Price:
disposable cups  7000       $0.04
I want to add a set of new data that I pulled from my website using this code, which works perfectly:
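import pandas as pd
from bs4 import BeautifulSoup

# html_doc is the page's HTML source, fetched elsewhere.
soup = BeautifulSoup(html_doc, "html.parser")

def correct_tag(tag):
    # Match only the <span> labels for the fields we want.
    return tag.name == "span" and tag.get_text(strip=True) in {
        "Order Number",
        "Location",
        "Date",
    }

order_list = []
order_info = {}
for t in soup.find_all(correct_tag):
    # The value is the text node that follows each label span.
    order_info[t.text] = t.find_next_sibling(string=True).strip()
order_list.append(order_info)

order_df = pd.DataFrame(order_list)
order_df.head()

# The context manager saves the file on exit; ExcelWriter.save() was removed in pandas 2.0.
with pd.ExcelWriter('Order_List.xlsx') as datatoexcel:
    order_df.to_excel(datatoexcel)
print('DataFrame is Written to Excel Successfully.')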
Output:
   Order Number  Location        Date
0   A-21-897274      Ohio  07/01/2022
But I want to take the printed info (Order Number, Location, Date, and their values) and add it to the existing Excel sheet Order_Sheet.xlsx instead of creating a new Excel file, so that it would look like:
Item:            Quantity:  Price:  Order Number:  Location:  Date:
disposable cups  7000       $0.04   A-21-897274    Ohio       07/01/2022
Is there an easy way to append new data to an existing Excel file, or possibly to combine two Excel files?
CodePudding user response:
I found this in the pandas source, in the docstring for ExcelWriter:
{storage_options}

    .. versionadded:: 1.2.0
if_sheet_exists : {{'error', 'new', 'replace', 'overlay'}}, default 'error'
    How to behave when trying to write to a sheet that already
    exists (append mode only).

    * error: raise a ValueError.
    * new: Create a new sheet, with a name determined by the engine.
    * replace: Delete the contents of the sheet before writing to it.
    * overlay: Write contents to the existing sheet without removing the old
      contents.

    .. versionadded:: 1.3.0
Here is the full link: https://github.com/pandas-dev/pandas/blob/main/pandas/io/excel/_base.py#L800-L814
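To make the quoted options concrete, here is a minimal sketch of how three of them behave in append mode. The file name demo.xlsx and the sample data are hypothetical, and the sketch assumes pandas 1.4 or later with openpyxl installed (append mode is passed engine="openpyxl" explicitly here).

```python
import os
import tempfile

import pandas as pd

# Hypothetical file name and data, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "demo.xlsx")
old = pd.DataFrame({"Item": ["disposable cups"], "Quantity": [7000]})
new = pd.DataFrame({"Location": ["Ohio"]})

# Create the workbook with a single sheet named "Sheet1".
old.to_excel(path, index=False)

# Default ('error'): writing to a sheet that already exists raises ValueError.
try:
    with pd.ExcelWriter(path, mode="a", engine="openpyxl") as writer:
        new.to_excel(writer, sheet_name="Sheet1", index=False)
    raised = False
except ValueError:
    raised = True

# 'overlay': write into the same sheet and keep the cells already there.
with pd.ExcelWriter(
    path, mode="a", engine="openpyxl", if_sheet_exists="overlay"
) as writer:
    new.to_excel(writer, sheet_name="Sheet1", startcol=2, index=False)

# 'new': leave Sheet1 alone and let the engine choose a new sheet name.
with pd.ExcelWriter(
    path, mode="a", engine="openpyxl", if_sheet_exists="new"
) as writer:
    new.to_excel(writer, sheet_name="Sheet1", index=False)

result = pd.read_excel(path, sheet_name="Sheet1")
with pd.ExcelFile(path) as xls:
    sheet_names = xls.sheet_names

print(raised)
print(list(result.columns))
print(sheet_names)
```

For the layout asked about in the question, 'overlay' combined with startcol is the relevant option, since it places new columns beside the existing ones without clearing the sheet.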
CodePudding user response:
This works only with pandas 1.4 or later, since that is the version that added the overlay option. The following code assumes that the rows are in the same order in the first and second writes. It also assumes that you know exactly how many columns already exist.
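import pandas as pd

df2 = pd.DataFrame({"c": [3, 5], "d": [8, 9]})
df3 = pd.DataFrame({"c": [9, 10], "d": [-1, -9]})
df4 = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# First write: create the file with columns c and d.
with pd.ExcelWriter('./Order_List.xlsx', mode='w') as writer:
    df2.to_excel(writer, index=False)

# Second write: append mode needs the openpyxl engine; "overlay" keeps the existing cells.
with pd.ExcelWriter('./Order_List.xlsx', mode="a", engine="openpyxl", if_sheet_exists="overlay") as writer:
    # New rows go below the existing data (header row + 2 data rows = startrow 3).
    df3.to_excel(writer, startrow=3, header=False, index=False)
    # New columns go to the right of the 2 existing columns.
    df4.to_excel(writer, startrow=0, startcol=2, header=True, index=False)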
Link to the 1.4 documentation.
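If the column count is not known in advance, or the pandas version is older than 1.4, another possible route is to read the existing sheet into a DataFrame, join the new columns onto it, and write the result back. The sketch below is only an illustration of that idea: it builds a stand-in Order_Sheet.xlsx in a temporary directory using the values from the question, and order_df is a hand-made copy of the scraped data rather than the output of the scraping code.

```python
import os
import tempfile

import pandas as pd

# Stand-ins for the question's file and scraped data (values copied from the
# question; a temporary directory avoids touching any real file).
path = os.path.join(tempfile.mkdtemp(), "Order_Sheet.xlsx")
pd.DataFrame(
    {"Item:": ["disposable cups"], "Quantity:": [7000], "Price:": ["$0.04"]}
).to_excel(path, index=False)
order_df = pd.DataFrame(
    [{"Order Number": "A-21-897274", "Location": "Ohio", "Date": "07/01/2022"}]
)

# Read the existing sheet, attach the new columns side by side, then
# overwrite the file. Rows are matched by index position, so both frames
# must list their rows in the same order.
existing = pd.read_excel(path)
combined = pd.concat([existing, order_df], axis=1)
combined.to_excel(path, index=False)

print(pd.read_excel(path))
```

Note that this rewrites the whole workbook, so any cell formatting, formulas, or additional sheets in the original file are lost. The overlay approach above is the better fit when those need to be preserved.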