I'm trying to get the location of a store from Google Maps, but my code gets it for some stores and not for others. Here is the link to the Google Colab notebook:
https://colab.research.google.com/drive/1ncrffQMGyeudUkMiGSrCfssifVScfYa-?usp=sharing
You can see at the end that it gets the location for "blaze" but not for "apple" or "ferrari".
Why does this happen, and how can I fix it?
NOTE: it is NOT about the page having to load. I made it wait up to 20 seconds and it still does not work.
I expect to get the location for every link I give it.
CodePudding user response:
You are using XPath to find your element, so the result can change depending on the structure of the page. I ran some tests on your data using the BeautifulSoup library together with Selenium.
I think it is more reliable to find the address with a CSS selector. This documentation may help: https://saucelabs.com/resources/articles/selenium-tips-css-selectors
Try this:
from selenium import webdriver
from selenium.webdriver.common.by import By
from bs4 import BeautifulSoup
import time
blaze = 'https://www.google.com/maps/place/Blaze Pizza/@24.5014283,54.3896917,17z/data=!3m1!4b1!4m5!3m4!1s0x3e5e676982d20b17:0xe2c5b69e67e4c85d!8m2!3d24.5014283!4d54.3896917'
apple = 'https://www.google.com/maps/place/Apple Yas Mall/@24.4881123,54.6064438,17z/data=!3m1!4b1!4m5!3m4!1s0x3e5e457d92f94e27:0x5c1646b499917d03!8m2!3d24.4881123!4d54.6086325?authuser=0&hl=en'
ansam='https://www.google.com/maps/place/Ansam Building 3/@24.4833165,54.6020795,17z/data=!4m5!3m4!1s0x3e5e45db58e6a423:0x23953eb0c87dfd3c!8m2!3d24.4834477!4d54.5999224?authuser=0&hl=en'
ferrari='https://www.google.com/maps/place/Ferrari World Abu Dhabi/@24.4836388,54.6059205,17z/data=!4m5!3m4!1s0x3e5e457e2d394a05:0x6076df4876c470a9!8m2!3d24.4837634!4d54.6070066?authuser=0&hl=en'
yas='https://www.google.com/maps/place/Yass winter carnival/@24.4886382,54.6183841,17z/data=!4m5!3m4!1s0x3e5e4f9134f9bac3:0x68162aeae1d91d21!8m2!3d24.4898629!4d54.6217851?authuser=0&hl=en'
yas1='https://www.google.com/maps/place/Yas Links Abu Dhabi/@24.4756507,54.6019735,14.83z/data=!4m5!3m4!1s0x3e5e4582ecaaecab:0xb3e0f29a13cc00d5!8m2!3d24.4783288!4d54.5999317?authuser=0&hl=en'
links = [blaze, apple, ansam, ferrari, yas, yas1]
options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_argument("--incognito")
options.add_argument('--start-maximized')
options.add_argument('--start-fullscreen')
options.add_argument("--disable-blink-features=AutomationControlled")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
driver = webdriver.Chrome(options = options)
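def get_location(links):
    address_list = []
    for link in links:
        driver.get(link)
        # Give the page time to render before reading its HTML.
        time.sleep(5)
        page_html = driver.page_source
        soup = BeautifulSoup(page_html, 'lxml')
        # select_one returns None when nothing matches, so guard the lookup.
        element = soup.select_one('div.rogA2c div.fontBodyMedium')
        address_list.append(element.get_text(strip=True) if element else None)
    return address_list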
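A fixed time.sleep is either longer than needed or too short. One alternative, sketched below in plain Python, is to retry the lookup until it returns a value or a deadline passes. The names poll_until and fetch are made up for this illustration and are not part of Selenium or BeautifulSoup; the iterator just simulates a page whose address appears on the third look.

```python
import time


def poll_until(fetch, timeout=20.0, interval=0.5):
    """Call fetch() until it returns a non-None value or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        value = fetch()
        if value is not None:
            return value
        if time.monotonic() >= deadline:
            # Give up and report the miss instead of raising.
            return None
        time.sleep(interval)


# Stand-in for a page whose address only shows up on the third look.
attempts = iter([None, None, "Example Street 1"])
result = poll_until(lambda: next(attempts), timeout=5.0, interval=0.01)
print(result)
```

In the scraper, fetch would re-read driver.page_source and return the selected element's text, or None while select_one finds nothing.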
Best regards,
Benjamin
CodePudding user response:
An absolute XPath is always fragile; use a relative XPath instead.
Instead of this:
location = driver.find_element('xpath','//*[@id="QA0Szd"]/div/div/div[1]/div[2]/div/div[1]/div/div/div[11]/div[3]/button/div[1]/div[2]/div[1]').text
try this:
location = driver.find_element('xpath','(//div[@]//div[contains(@class,"fontBodyMedium")])[1]').text
CodePudding user response:
Each page has a different structure, so you need to locate the element with a relative XPath. Replace this line:
location = driver.find_element('xpath','//*[@id="QA0Szd"]/div/div/div[1]/div[2]/div/div[1]/div/div/div[11]/div[3]/button/div[1]/div[2]/div[1]').text
with this one:
location = driver.find_element('xpath','//button[@data-item-id="address"]').text
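To see why the positional path works for one place and not another, here is a small sketch using only the standard library. The two fragments are made-up, well-formed stand-ins for two place pages (real Google Maps HTML is far larger and not valid XML), and find_text is a hypothetical helper. The second fragment has one extra div, which is enough to shift every positional index.

```python
import xml.etree.ElementTree as ET

# Made-up fragments standing in for two place pages.
PAGE_A = """
<div id="QA0Szd">
  <div><button data-item-id="address"><div>Street A</div></button></div>
</div>
"""
# Same content, but an extra banner <div> comes first.
PAGE_B = """
<div id="QA0Szd">
  <div>Banner</div>
  <div><button data-item-id="address"><div>Street B</div></button></div>
</div>
"""

# Position-based path: depends on the exact nesting and order of the divs.
POSITIONAL = "./div[1]/button/div"
# Attribute-based path: finds the button wherever it sits in the tree.
BY_ATTRIBUTE = ".//button[@data-item-id='address']/div"


def find_text(page, path):
    """Return the text of the first node matching path, or None if there is none."""
    node = ET.fromstring(page).find(path)
    return node.text if node is not None else None


for name, page in (("A", PAGE_A), ("B", PAGE_B)):
    print(name, find_text(page, POSITIONAL), find_text(page, BY_ATTRIBUTE))
```

The positional path finds the address only on the first fragment, while the attribute-based path finds it on both. That matches the behaviour in the question: the address is there, just not at the same position on every page.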