I have been trying to scrape hotel reviews, but when I jump between pages of reviews, the URL of the webpage doesn't change. So I am using webdriver from Selenium to work around this. However, I cannot get it to run in Google Colab in the first place. Any quick help would be really appreciated. Thanks!
Code:
from selenium import webdriver
import requests
from bs4 import BeautifulSoup
import pandas as pd
# install chromium, its driver, and selenium
!apt-get update
!apt install chromium-chromedriver
!cp /usr/lib/chromium-browser/chromedriver /usr/bin
!pip install selenium
# set options to be headless, ..
from selenium import webdriver
options = webdriver.ChromeOptions()
options.add_argument('--headless')
options.add_argument('--no-sandbox')
options.add_argument('--disable-dev-shm-usage')
# open it, go to a website, and get results
wd = webdriver.Chrome('chromedriver',options=options)
driver = webdriver.chrome()
driver.get("https://www.goibibo.com/hotels/highland-park-hotel-in-trivandrum-1383427384655815037/?hquery={"ci":"20211209","co":"20211210","r":"1-2-0","ibp":"v15"}&hmd=766931490eb7863d2f38f56c6185a1308de782c89dfeeea59d262b827ca15441bf50472cbfdc1ee84aeed8af756809a2e89cfd6eaea0fa308c1ca839e8c313d016ac0f5948658353cf30f1cd83050fd8e6adb2e55f2a5470cadeb0c28b7becc92ac44d81966b82408effde826d40fbff47525e09b5f145e321fe6d104e12933c066323798e33a911e0cbed7312fc1634f8f92fe502c8602556c9a02f34c047d04ff1400c995799156776c1a04e218d6486493edad5b0f7e51a5ea25f5f1cb4f5ed497ee9368137f6ec73b3b1166ee7c1a885920b90c98542e0270b4fa9004005cfe87a4d1efeaedc8e33a848f73345f09bec19153e8bf625cc7f9216e692a1bcc313e7f13a7fc091328b1fb43598bd236994fdc988ab35e70cf3a5d1856c0b0fa9794b23a1a958a5937ac6d258d121a75b7ce9fc70b9a820af43a8e9a3f279be65b5c6fbfff2ba20bfb0f3e3ee425f0b930bf671c50878a540c6a9003b197622b6ab22ae39e07b5174cb12bebbcd2a132bb8570e01b9e253c1bd83cb292de97a&cc=IN&reviewType=gi&vcid=3877384277955108166&srpFilters={"type":["Hotel"]}")
Error:
TypeError: 'module' object is not callable
CodePudding user response:
When you issue the command:
!pip install selenium
by default it installs the latest release of Selenium, which was 4.1.0 at the time of writing.
Initially this line of code:
wd = webdriver.Chrome('chromedriver',options=options)
starts a ChromeDriver session that launches and controls a headless Chrome browser (a new browsing context).
But the following line of code:
driver = webdriver.chrome()
fails because webdriver.chrome (lowercase) is a module, not a class, so it cannot be called. It is the same package that appears in import paths such as:
from selenium.webdriver.chrome.options import Options
Hence you see the error:
TypeError: 'module' object is not callable
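The same error can be reproduced without Selenium. As a small illustration (the standard-library json module is only a stand-in for webdriver.chrome here, which is my substitution and not part of the question), calling any module fails the same way:

```python
import json  # stand-in for the selenium.webdriver.chrome module

try:
    json()  # a module is neither a class nor a function, so calling it fails
except TypeError as exc:
    message = str(exc)

print(message)
```

The lowercase name refers to the package, while the capitalised webdriver.Chrome is the class you actually want to call.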
Solution
The first line does start the ChromeDriver / Chrome combination, but Selenium 4 can emit two DeprecationWarnings along the way:
- DeprecationWarning: executable_path has been deprecated, please pass in a Service object
- DeprecationWarning: find_element_by_* commands are deprecated in Selenium 4 (raised only once you call one of those methods)
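If you would rather avoid the first warning than ignore it, a minimal sketch using the Service object might look like this. It assumes the chromedriver binary ends up at /usr/bin/chromedriver, as the cp command in the question intends. HEADLESS_FLAGS and make_driver are illustrative names of my own, not part of Selenium.

```python
# Flags copied from the question; the helper names are illustrative only.
HEADLESS_FLAGS = ["--headless", "--no-sandbox", "--disable-dev-shm-usage"]


def make_driver(driver_path="/usr/bin/chromedriver"):
    """Start headless Chrome via a Service object (Selenium 4 style)."""
    # Imported inside the function so this cell can be defined
    # before `!pip install selenium` has been run.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    options = webdriver.ChromeOptions()
    for flag in HEADLESS_FLAGS:
        options.add_argument(flag)

    # Passing service=... replaces the deprecated positional driver path.
    return webdriver.Chrome(service=Service(driver_path), options=options)


# Typical use in a notebook cell (not executed here):
# wd = make_driver()
# wd.get("https://www.goibibo.com/")
# wd.quit()
```

The second warning is avoided in a similar spirit by calling wd.find_element(By.CSS_SELECTOR, ...) with By imported from selenium.webdriver.common.by, instead of the find_element_by_* helpers.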
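For the time being you can ignore the DeprecationWarnings, but you need to remove the line of code:
driver = webdriver.chrome()
and then call get() on the existing wd object (wd.get(...) rather than driver.get(...)), since driver is no longer defined once that line is gone.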
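If the warnings clutter the notebook output, one option is to silence only DeprecationWarning with the standard warnings module. This is a sketch under my own assumptions: run_quietly and legacy_call are hypothetical helpers, and legacy_call merely imitates a Selenium call that emits the warning.

```python
import warnings


def run_quietly(func, *args, **kwargs):
    """Call func while suppressing DeprecationWarning only."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category=DeprecationWarning)
        return func(*args, **kwargs)


def legacy_call():
    # Stand-in for a Selenium call that emits a DeprecationWarning.
    warnings.warn("executable_path has been deprecated", DeprecationWarning)
    return "ok"


result = run_quietly(legacy_call)
```

Because the filter lives inside catch_warnings(), other warnings and later cells are unaffected.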