How to crawl question and answer of Google People Also Ask with Selenium and Python for a quantity t

Time:05-16

I found a good solution, but it only handles the questions and answers that Google shows by default, and I need more than that.

I am a novice Python developer. How do I get more questions and answers? Do I first have to simulate clicks to expand the required number of questions and then parse the page?

CodePudding user response:

The following code parses the questions appearing on screen, then asks whether you want to parse more. If you enter y, it clicks on the last question so that more are loaded into the page. The questions are stored in the list questions and the answers in the list answers.

import time
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.service import Service

your_path = '...'
driver = webdriver.Chrome(service=Service(your_path))

driver.get('https://www.google.com/search?q=How to make bakery?&source=hp&ei=j0aZYYjRAvja2roPrcWcyAU&iflsig=ALs-wAMAAAAAYZlUn4NMUPjfIpQmrXSmjIDnaWjJXWIJ&ved=0ahUKEwjI1JDn0Kf0AhV4rVYBHa0iB1kQ4dUDCAc&uact=5&oq=How to make bakery?&gs_lcp=Cgdnd3Mtd2l6EAMyBAgAEBMyBAgAEBMyBAgAEBMyBAgAEBMyBAgAEBMyBAgAEBMyBAgAEBMyBAgAEBMyBAgAEBMyBAgAEBNQAFgAYJMDaABwAHgAgAF-iAF-kgEDMC4xmAEAoAECoAEB&sclient=gws-wiz')

questions, answers = [], []
while True:
    # Re-read the question blocks on every pass; clicking one makes Google append more
    blocks = driver.find_elements(By.CSS_SELECTOR, "div[id*='RELATED_QUESTION']")
    for idx, question in enumerate(blocks):
        if idx >= len(questions):  # skip already parsed questions
            questions.append(question.text)
            txt = ''
            for answer in question.find_elements(By.CSS_SELECTOR, "div[id*='WEB_ANSWERS_RESULT']"):
                txt += answer.get_attribute('innerText')  # join all answer fragments
            answers.append(txt)
    inp = input(f'{len(questions)} questions parsed, continue? (y/n)')
    if inp == 'y' and blocks:
        blocks[-1].click()  # clicking the last question loads more of them
        time.sleep(2)  # give the new questions time to appear
    else:
        break
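
If you already know how many questions you need, the y/n prompt can be replaced by a target count. The following is only a sketch of that idea, not a tested scraper: the names crawl_paa, target, pause and max_rounds are made up for illustration, the two CSS selectors are the ones used above, and Google's markup may change at any time. It keeps clicking the last question until enough questions are parsed or a click loads nothing new.

```python
import time

# Selenium's By.CSS_SELECTOR is the plain string "css selector"; using the
# literal keeps this helper importable even where Selenium is not installed.
CSS = "css selector"


def crawl_paa(driver, target, pause=2.0, max_rounds=50):
    """Collect up to `target` question/answer pairs (hypothetical helper).

    `driver` is a Selenium WebDriver that has already opened the search page.
    """
    questions, answers = [], []
    seen = -1  # number of question blocks found in the previous round
    for _ in range(max_rounds):
        blocks = driver.find_elements(CSS, "div[id*='RELATED_QUESTION']")
        if len(blocks) == seen:
            break  # the last click loaded nothing new, so stop
        seen = len(blocks)
        for block in blocks[len(questions):]:  # only blocks not parsed yet
            questions.append(block.text)
            parts = block.find_elements(CSS, "div[id*='WEB_ANSWERS_RESULT']")
            answers.append(''.join(p.get_attribute('innerText') or '' for p in parts))
        if len(questions) >= target or not blocks:
            break
        blocks[-1].click()  # clicking the last question loads more of them
        time.sleep(pause)   # crude wait for the new questions to appear
    return questions[:target], answers[:target]
```

With the driver from the code above, you would call it as questions, answers = crawl_paa(driver, target=20). The max_rounds limit is there only so the loop cannot run forever if the page behaves unexpectedly.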