How to scrape reviews from chrome web store for a given extension?

Time:11-17

I am trying to use this Python code to scrape the Chrome Web Store:

from lxml import html
import requests
url = 'https://chrome.google.com/webstore/detail/cookie-editor/hlkenndednhfkekhgcdicdfddnkalmdm'
values = {'username': '[email protected]',
          'password': 'mypassword'}
page = requests.get(url, data=values)
print(page)
tree = html.fromstring(page.content)
review = tree.xpath('//div[@]/text()')[0]
print(review)

However, I am getting a 400 Bad Request response. Is it even possible to scrape the Chrome Web Store?

Answer:

The page's content is loaded by JavaScript, so a plain HTTP request does not return the reviews. You need a browser automation tool such as Selenium to get the rendered data.

Example:

from selenium import webdriver
import time
from bs4 import BeautifulSoup
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

options = webdriver.ChromeOptions()
options.add_experimental_option("detach", True)  # keep the browser open after the script finishes
webdriver_service = Service("./chromedriver")  # your chromedriver path
driver = webdriver.Chrome(service=webdriver_service, options=options)

driver.get('https://chrome.google.com/webstore/detail/cookie-editor/hlkenndednhfkekhgcdicdfddnkalmdm')
driver.maximize_window()
time.sleep(3)  # crude wait for the JavaScript-rendered page to load

driver.find_element(By.XPATH,'//*[@ and contains(text(),"Review")]').click()
time.sleep(1)

# Parse the rendered page, which now includes the JavaScript-loaded reviews
soup = BeautifulSoup(driver.page_source, "html.parser")

data = []
reviews = soup.select('div.ba-bc-Xb')
for review in reviews:
    name = review.select_one('span[]').get_text(strip=True)
    comment = review.select_one('div[]').get_text(strip=True)

    data.append({
        'name': name,
        'comment': comment
    })

print(data)

      

Output:

[{'name': 'PingPing But', 'comment': 'Love it..... so simple and easy to use !'}, {'name': 'Zhou Jeffrey', 'comment': "doesn't work anymore"}, {'name': 'eunice miralles', 'comment': 'same im trying to find a fix and in github they said it has a problem with permission but still not fixed'}, {'name': 'Jade Martinito', 'comment': 'me too'}, {'name': 'Bonafide Champ', 'comment': 'It works fine but it does this weird thing when I import cookies in incognito mode, 
the cookies still get imported in the main browser windows.'}, {'name': 'Arman Nawaz World', 'comment': 'Easy to use this extension. it is very user friendly and simple interface, while other looks little complicated\nReview by ArmanxNawaz'}, {'name': 'Bagong Pook Elementary School', 'comment': 'Easy to use! Very helpful'}, {'name': 'Whitelisted', 'comment': 'Works great for development and resetting website cookies without digging through your settings'}, {'name': 'Rehxn Ali', 'comment': 'Best!! Saved Alot of Money With This Extention'}, {'name': 'biniyam demeke', 'comment': 'Oh, Very Helpful'}, {'name': 'Pingu VFX', 'comment': 'Easy to use while scamming kids on their roblox accountes'}, {'name': 'Abstractedjuice09 Z', 'comment': 'how?'}, {'name': 'jd', 'comment': 'lol same'}, {'name': 'Arnells Designs', 'comment': 'good'}, {'name': 'David Galbraith', 'comment': 'How is this called a cookie "editor"?? Not working at all. When I open it, the extension shows  cookies for the page that I\'m currently on. It should be able to show cookies from every site I\'ve visited. And if I type ANYTHING in the search, nothing comes up. Not google, not Facebook, not steam, not one site that I have visited or logged into show up in the search bar. There is something very, very wrong. yeah, I can delete ALL cookies, but CCleaner does that just fine.'}, {'name': 'df fes', 'comment': 'Maybe you dont know how to use it?'}, {'name': 'Galih Kamulyan', 'comment': 'LEGENDARY'}, {'name': 'Aniket Chaudhary', 
'comment': 'Liked it. But after using it for sometime, it shows an "unknown error".'}, {'name': 'Anonymous', 'comment': "mine doesn't work for first time too , it always show unknown error"}, {'name': 'Ehsan Abtahee', 'comment': 'did u find a fix?'}, {'name': 'kashba', 'comment': 'if you find a fix.. do tell me'}, {'name': 'Nischay2004 Muller', 'comment': 'The best easy cookie editor for all , strongly recommended'}, {'name': 'ultra noob', 'comment': 'Super simple and easy to use.'}, {'name': 'विकास कालीरामना', 'comment': 'Loved it!'}, {'name': 'Zachary Bolt', 'comment': 'Clean, easy to use and actively updated. 5 Stars well earned.'}, {'name': 'TALHA JUBAYER', 'comment': "Love it .it's 
working"}, {'name': 'amrozain 2007', 'comment': 'good for hackers'}, {'name': 'Kazuko Masao', 'comment': 'Very good 
.. Very good .. Very good.'}, {'name': 'chase Brigette', 'comment': 'This extention seems to be the culprit that makes bing my default browser!!! The extension was good before I realized this -_-"'}, {'name': 'Digital Audio Directions', 'comment': 'This is a joke right?  Only seems to list cookies of the site you are on and all in a chopped up list format.  NO search function for existing stored cookies?  Search by keyword, date, etc,  Does not seem available.'}, {'name': 'Phantom V', 'comment': 'This seems outdated.'}, {'name': 'Anonymous ZN49', 'comment': 'Easy to use this extension. it is very user friendly and simple interface, while other looks little complicated.'}, {'name': 'YongYi Wu', 'comment': "Who don't love cookies?"}, {'name': 'hush', 'comment': 'was working fine, now im getting an import error'}]
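
One fragile spot in the loop above: `select_one` returns `None` when a selector matches nothing, so calling `.get_text()` on the result raises `AttributeError` as soon as one review block lacks a name or comment element, or the store's generated class names change. A minimal sketch of a more tolerant version follows. The helper names `safe_text` and `extract_reviews` are made up for illustration, and the two selector strings are passed in as parameters because they are not shown in full in the code above.

```python
def safe_text(node, default=""):
    """Return the stripped text of a parsed node, or `default` if the node is None."""
    if node is None:
        return default
    return node.get_text(strip=True)


def extract_reviews(review_nodes, name_selector, comment_selector):
    """Build a list of {'name', 'comment'} dicts from review nodes.

    `review_nodes` is any iterable of objects with a `select_one` method,
    for example the result of `soup.select(...)` in BeautifulSoup.
    Nodes where neither selector matches are skipped.
    """
    rows = []
    for review in review_nodes:
        name = safe_text(review.select_one(name_selector))
        comment = safe_text(review.select_one(comment_selector))
        if name or comment:
            rows.append({"name": name, "comment": comment})
    return rows
```

With the `soup` object from the example, this could be called as `data = extract_reviews(soup.select('div.ba-bc-Xb'), name_selector, comment_selector)`, where the two selector variables hold whatever CSS selectors currently match the reviewer name and the comment text on the page.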