API - Web Scrape


How do I get access to this API? Here is what I have so far:

import requests

url = 'https://b2c-api-premiumlabel-production.azurewebsites.net/api/b2c/page/menu?id_loja=2691'
print(requests.get(url))

I'm trying to retrieve data from this site via its API. I found the URL above and I can see its data, but I can't seem to get it right because I keep running into a 403 error. This is the website URL: https://www.nagumo.com.br/osasco-lj46-osasco-ayrosa-rua-avestruz/departamentos

Note: please be gentle, it's my first post here =]

CodePudding user response:

Not sure exactly what your issue is here. But if you want to see the content of the response and not just the 200/400 status codes, you need to add '.content' to your print.

Eg.

import requests

# Create a session (keeps cookies between requests)
s = requests.Session()

# Example connection variables, probably not required for your use case.
setCookieUrl = 'https://www...'
headersJson = {'Accept-Language': 'en-us'}
bodyJson = {"__type": "xxx", "applicationName": "xxx", "userID": "User01", "password": "password2021"}

# GET request
p = s.get(setCookieUrl, json=bodyJson, headers=headersJson)
print(p)             # prints the response status, e.g. <Response [200]>
# print(p.headers)   # the response headers
# print(p.content)   # the content (body) of the response
# print(s.cookies)   # cookies stored on the session
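
As for the 403 itself: this is only a guess, but a lot of sites reject requests that don't look like they come from a browser. Adding a browser-like User-Agent header sometimes gets past that (the header values below are just an example, not anything specific to this site):

import requests

url = 'https://b2c-api-premiumlabel-production.azurewebsites.net/api/b2c/page/menu?id_loja=2691'

# Guess: send browser-like headers so the server doesn't reject the request outright.
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)',
    'Accept': 'application/json',
}

r = requests.get(url, headers=headers)
print(r.status_code)  # hopefully 200 instead of 403
print(r.content)      # the body of the response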

CodePudding user response:

I'm also new here haha, but besides the requests library, you'll also need another one like Beautiful Soup for what you're trying to do.

bs4 installation: https://www.crummy.com/software/BeautifulSoup/bs4/doc/#installing-beautiful-soup
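
In short, it's a single command to install:

pip install beautifulsoup4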

Once you install and import it, it's just a matter of continuing what you were doing to get your data:

response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")  # parse the HTML into a searchable tree

This gets the entire HTML content of the page, so you can pull your data out of it with CSS selectors, like this:

site_data = soup.select('selector')

site_data is a list of every element matching that 'selector', so a simple for loop and a list to collect your items would suffice (say, gathering the link for each book on a bookstore site).

For example, if I were trying to get links from a site:

import requests
from bs4 import BeautifulSoup

sites = []
URL = 'https://b2c-api-premiumlabel-production.azurewebsites.net/api/b2c/page/menu?id_loja=2691'
response = requests.get(URL)
soup = BeautifulSoup(response.text, "html.parser")

links = soup.select("a")  # list of all <a> elements on the page

for link in links:
    sites.append(link.get("href"))  # collect each anchor's href, i.e. the actual link

Also, a helpful tip: when you inspect the page (right click and press 'Inspect' at the bottom of the menu), you can see the code for the page. Go through the HTML, find the data you want, right click it and choose Copy -> Copy selector. This makes it really easy to grab the data you want on that site.
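
For example, if the selector you copied were something like 'div.product-card > a' (a made-up placeholder; yours will depend on the page), you could paste it straight into select():

import requests
from bs4 import BeautifulSoup

response = requests.get('https://example.com/')  # whatever page you're scraping
soup = BeautifulSoup(response.text, "html.parser")

# Paste the copied selector here; 'div.product-card > a' is only a placeholder.
items = soup.select('div.product-card > a')

for item in items:
    print(item.get_text(strip=True), item.get('href'))  # e.g. each product's name and link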

Helpful sites: https://oxylabs.io/blog/python-web-scraping and https://realpython.com/beautiful-soup-web-scraper-python/
