Hello everyone, I have my web scraper almost done and I'm trying to figure out the last step: performing the full sequence (web scrape, save to a DataFrame, and finally save to Excel) over a list of search terms.
For example, here's the code:
driver.get("website")
wait = WebDriverWait(driver, 20)
wait.until(EC.visibility_of_element_located((By.ID, "Search"))).send_keys("BC-9700021-1")
driver.switch_to.default_content()
wait.until(EC.visibility_of_element_located((By.ID, "Submit"))).click()

order_list = []
order_info = {}
soup = BeautifulSoup(driver.page_source, 'html.parser')

def correct_tag(tag):
    return tag.name == "span" and tag.get_text(strip=True) in {
        "Order Amount",
        "Item Name",
        "Date",
        "Warehouse Number",
    }

for t in soup.find_all(correct_tag):
    order_info[t.text] = t.find_next_sibling(string=True).strip()
order_list.append(order_info)

order_df1 = pd.DataFrame(order_list)
with pd.ExcelWriter('Order_sheet.xlsx') as datatoexcel:
    order_df1.to_excel(datatoexcel)
Output:
Order Amount: 7000
Item Name: Plastic Cup
Date: 7/1/2022
Warehouse Number: 000718
But at the very top, where I type "BC-9700021-1" into the search box, I want to pull each search term from a list saved in Excel. So the Excel sheet would have a list like this:
BC-9700021-1
BC-9700024-1
BC-9700121-2
ETC.
ETC.
How could I get my program to perform the same steps as the first search for the rest of the values, without having to manually change the send_keys value every time?
Any help would be greatly appreciated.
CodePudding user response:
Are you not familiar with for loops? Just iterate through each of those search items.
Also, Selenium works here, but there's a good chance you can get the data via an API instead. We won't know unless you share the URL/site.
a_list = ['BC-9700021-1', 'BC-9700024-1', 'BC-9700121-2']
order_list = []

def correct_tag(tag):
    return tag.name == "span" and tag.get_text(strip=True) in {
        "Order Amount",
        "Item Name",
        "Date",
        "Warehouse Number",
    }

for eachId in a_list:
    driver.get("website")
    wait = WebDriverWait(driver, 20)
    wait.until(EC.visibility_of_element_located((By.ID, "Search"))).send_keys(eachId)
    driver.switch_to.default_content()
    wait.until(EC.visibility_of_element_located((By.ID, "Submit"))).click()

    soup = BeautifulSoup(driver.page_source, 'html.parser')
    order_info = {}  # fresh dict each pass, otherwise every row ends up the same
    for t in soup.find_all(correct_tag):
        order_info[t.text] = t.find_next_sibling(string=True).strip()
    order_list.append(order_info)

order_df1 = pd.DataFrame(order_list)
with pd.ExcelWriter('Order_sheet.xlsx') as datatoexcel:
    order_df1.to_excel(datatoexcel)
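And since you said the IDs live in an Excel sheet, you don't have to hardcode a_list at all: read the column with pandas. A minimal sketch, assuming a file named search_list.xlsx with the IDs in the first column and no header row (the filename and column position are guesses, so adjust them to match your sheet; the first few lines just create a demo file so the snippet runs on its own):

```python
import pandas as pd

# Demo only: write a small sheet so the snippet is self-contained.
pd.DataFrame(['BC-9700021-1', 'BC-9700024-1', 'BC-9700121-2']).to_excel(
    'search_list.xlsx', header=False, index=False)

# header=None treats the first row as data rather than column names.
ids = pd.read_excel('search_list.xlsx', header=None)
a_list = ids[0].dropna().astype(str).tolist()
print(a_list)  # the IDs, ready to feed into the for loop above
```

Then the for loop above runs unchanged, and adding a new search is just adding a row to the spreadsheet.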