scrape data using keyword Python

Time:10-22

I have this code using the Python Requests library:

import requests 

test_URL = "https://www.gasbuddy.com/station/194205"

def get_data(link):
    hdr = {'user-agent': 'Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.107 Mobile Safari/537.36'}
    req = requests.get(link, headers=hdr)
    content = req.text  # .text (and .content) are attributes, not methods
    print(content)

get_data(test_URL)

On the page https://www.gasbuddy.com/station/194205 there is a section labelled Regular that shows the regular gas price. I want to grab that value, but I have never done this before and am not sure how to go about it. Would I pass a keyword query in the GET request? Any pointers would be appreciated.

Answer:

The website has a few mechanisms in place to prevent web scraping (or at least to make it harder).

You can use bs4 (BeautifulSoup) to parse the response you get from requests. Install it with pip install beautifulsoup4 (https://pypi.org/project/beautifulsoup4/).

import requests
from bs4 import BeautifulSoup

url = "https://www.gasbuddy.com/station/194205"
hdr = {
    'user-agent': 'Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.107 Mobile Safari/537.36'}
resp = requests.get(url, headers=hdr)

After getting the response, you can use soup.select like this to extract the regular and premium prices:

soup = BeautifulSoup(resp.text, "html.parser")
regular, premium = (item.text for item in soup.select('span[class*="FuelTypePriceDisplay-module__price___"]'))

At the time of writing you get:

('152.9¢', '162.9¢')
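
Note that unpacking into regular, premium only works while the selector matches exactly two elements; with any other count Python raises a ValueError. If you would rather get numbers than strings, and not have the script crash when the page changes, a sketch along these lines might help. It assumes the matched texts look like '152.9¢' and arrive in the order regular, premium, as in the output above. parse_price and label_prices are hypothetical helper names used for illustration; they are not part of requests or bs4.

```python
import re

# Hypothetical helpers; the names are illustrative, not from any library.


def parse_price(text):
    """Return the first number found in text as a float, or None if there is none."""
    match = re.search(r"\d+(?:\.\d+)?", text)
    return float(match.group()) if match else None


def label_prices(texts, labels=("regular", "premium")):
    """Pair each label with a parsed price.

    Assumes the texts arrive in the same order as the labels. zip() stops at
    the shorter input, so missing or extra matches do not raise an error.
    """
    return {label: parse_price(text) for label, text in zip(labels, texts)}


# With the code above, texts would come from the page:
#   texts = [item.text for item in soup.select(
#       'span[class*="FuelTypePriceDisplay-module__price___"]')]
# Here the values shown in the answer are used as sample input.
texts = ["152.9¢", "162.9¢"]
prices = label_prices(texts)
print(prices)  # {'regular': 152.9, 'premium': 162.9}
```

The values stay in whatever unit the page displays. If a label is missing from the resulting dict, or maps to None, the page probably changed or the request was blocked, so it is worth checking for that before using the numbers.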