I am new to web scraping and am trying to scrape some information from a website with Python, but when I check the HTTP response status (i.e. 200) I am not getting anything back on the terminal. My code is below. I'd appreciate any help! Edit: I have fixed my rookie mistake in the print statement below xD Thank you, guys, for the correction!
import requests
url = "https://www.sephora.ae/en/shop/makeup-c302/"
page = requests.get(url)
print(page.status_code)
CodePudding user response:
For one thing, you don't print to the console in Python with the syntax Print = (page). That code assigns page to a new variable called Print and outputs nothing. Note that the built-in function is print, in lowercase; it is not a keyword in Python 3, but reassigning that name would shadow the function, which is not a good idea. In order to output to the console, change your code to:
print(page)
Second, printing page just prints the response object you receive after making your GET request, which shows up as something like <Response [200]> and is not very helpful on its own. The response object has a number of attributes you can access, which you can read about in the documentation for the requests Python library.
To get the status code of your response, try:
print(page.status_code)
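Beyond status_code, a few other attributes are usually worth a look. Here is a minimal sketch, not a definitive recipe: the helper names summarize and fetch_summary are made up for illustration, and the 10-second timeout is an arbitrary choice.

```python
import requests


def summarize(response):
    """Return a few commonly inspected attributes of a requests.Response."""
    return {
        "status_code": response.status_code,  # integer such as 200 or 403
        "ok": response.ok,                    # False for 4xx/5xx status codes
        "reason": response.reason,            # text phrase such as "OK"
        "content_type": response.headers.get("Content-Type"),
    }


def fetch_summary(url, timeout=10):
    """Hypothetical helper: GET the URL and summarize the response."""
    # requests has no default timeout, so pass one to avoid waiting forever.
    page = requests.get(url, timeout=timeout)
    return summarize(page)
```

Calling print(fetch_summary(url)) with the url from the question would print that dictionary. The HTML itself is available as page.text.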
CodePudding user response:
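The problem is that the page you are trying to scrape protects against scraping by ignoring requests from unusual user agents. By default, requests identifies itself as python-requests/<version>, which is easy for a site to single out.
Set the User-Agent header to a well-known browser string, as below: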
import requests
url = "https://www.sephora.ae/en/shop/makeup-c302/"
headers = {
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.5005.63 Safari/537.36'
}
response = requests.get(url, headers=headers)
print(response.status_code)
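If you want to check which User-Agent would actually be sent, you can prepare the request without sending it. This is only a sketch: sent_user_agent is a hypothetical helper name, and it relies on Session.prepare_request merging the session's default headers with the ones you pass in.

```python
import requests


def sent_user_agent(url, headers=None):
    """Return the User-Agent requests would send, without contacting the server."""
    session = requests.Session()
    request = requests.Request("GET", url, headers=headers or {})
    # prepare_request merges the session defaults with the per-request headers.
    prepared = session.prepare_request(request)
    return prepared.headers["User-Agent"]


url = "https://www.sephora.ae/en/shop/makeup-c302/"
browser_headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/102.0.5005.63 Safari/537.36"
}

print(sent_user_agent(url))                   # default, starts with "python-requests/"
print(sent_user_agent(url, browser_headers))  # the browser string set above
```

Nothing is sent over the network here, so it is a safe way to confirm the header override took effect before making the real request.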