A scraper that goes through each URL, extracts the price, and appends it to a dataframe. The nested for loop does the same but extracts multiple prices from the same URL (when the try: block succeeds) and concatenates them with df.
The output from the code below is missing the prices from the outer loop.
df = pd.DataFrame(columns=['PRICE'])

# Start scraping
for url in card_link_list:
    wd.get(url)
    try:
        # Check for NoSuchElementException
        pack_sizes_elements = wd.find_elements(By.XPATH, "//div[@class='_1LiCn']/div")
        for element in pack_sizes_elements:
            element.click()
            # Price
            price = element.find_element(By.XPATH, ".//span[@class='_2j_7u']").text.strip()
            # concat with df
            data_dict = {'PRICE': price}
            df_dict = pd.DataFrame([data_dict])
            df = pd.concat([df, df_dict])
    except NoSuchElementException:
        # Price
        price = wd.find_element(By.XPATH, "//td[@data-qa='productPrice']").text.strip()
        # concat with df
        data_dict = {'PRICE': price}
        df_dict = pd.DataFrame([data_dict])
    df = pd.concat([df, df_dict])
Edit: Even after fixing the indentation of the last line, the issue is unsolved.
The output dataframe consists only of rows from the nested loop, and none from the outer loop (where the try: block fails).
What change needs to be made, or is there a better way to append results to a dataframe?
CodePudding user response:
Python is very sensitive to indentation.
Your code:
    except NoSuchElementException:
        # Price
        price = wd.find_element(By.XPATH, "//td[@data-qa='productPrice']").text.strip()
        # concat with df
        data_dict = {'PRICE': price}
        df_dict = pd.DataFrame([data_dict])
    df = pd.concat([df, df_dict])
The correct indentation is below; keep an eye on the very last row:
    except NoSuchElementException:
        # Price
        price = wd.find_element(By.XPATH, "//td[@data-qa='productPrice']").text.strip()
        # concat with df
        data_dict = {'PRICE': price}
        df_dict = pd.DataFrame([data_dict])
        df = pd.concat([df, df_dict])
In your case df = pd.concat([df, df_dict]) falls outside the except scope, so it runs on every loop iteration with whatever df_dict was last assigned.
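As an aside on the "better way to append results" part of the question: a common pattern is to accumulate each row as a dict in a plain Python list and build the DataFrame once after the loops finish. That removes the repeated pd.concat calls (which are quadratic in cost and easy to mis-indent), since there is only one DataFrame-building statement, outside every loop. A minimal sketch, assuming pandas is installed; the scraped_prices list here stands in for the values the Selenium calls would return, and in the real scraper rows.append(...) would sit inside the for/try blocks:

```python
import pandas as pd

# Stand-in for values returned by element.find_element(...).text
scraped_prices = ["$4.99", "$7.49", "$12.00"]

# Accumulate one dict per scraped price instead of concatenating per row
rows = []
for price in scraped_prices:
    rows.append({'PRICE': price.strip()})

# Build the DataFrame exactly once, after all loops are done
df = pd.DataFrame(rows, columns=['PRICE'])
print(df)
```

Because the column name is taken from the dict keys, this also avoids the mismatch between a dataframe pre-created with columns=['Price'] and rows appended under 'PRICE'.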