I'm working on an assignment to get the inventory information of a specific product from a website, using Python.
The product url is: https://www.homedepot.com.mx/banos/accesorios-para-bano/juegos-de-accesorios/accesorios-bao-adelyn-4-piezas-cromo-130389
I'm quite new to making POST/GET requests, but from studying the network tab of the browser, I found that I can get the information I need by clicking the "Revisa disponibilidad en tiendas aquí" link a couple of lines below the add to cart button. In the network tab I can see that this link calls this request:
https://www.homedepot.com.mx/GetStorePopUpInventoryStatusByIDView
If I concatenate the url with the parameters it uses, I get this:
https://www.homedepot.com.mx/GetStorePopUpInventoryStatusByIDView?storeId=10351&catalogId=10101&langId=-5&physicalStoreListIds=12511%2C12524%2C12539%2C12542%2C12552%2C12583%2C12591%2C12592%2C12598%2C12605%2C12613%2C12614%2C12616%2C14003&productId=378219&fulfilment_type=Store&productavailable=true&type=ItemBean&catalogEntryIdToUse=378219&sibstore=12511%2C12524%2C12539%2C12542%2C12552%2C12583%2C12591%2C12592%2C12598%2C12605%2C12613%2C12614%2C12616%2C14003&requesttype=ajax&authToken=-1002%2Chqj050NRqj8jCOeOf8xNj4dGjQaR1rxBdxDNL2QdATA%3D
And if I run it directly in my browser, I get the information I need:
/* {"InventoryAvailability": [ {"physicalStoreName": "8860", "availableQuantity": "0"}, {"physicalStoreName": "8762", "availableQuantity": "7"}, {"physicalStoreName": "8661", "availableQuantity": "17"}, {"physicalStoreName": "8798", "availableQuantity": "2"}, {"physicalStoreName": "8744", "availableQuantity": "0"}, {"physicalStoreName": "8763", "availableQuantity": "0"}, {"physicalStoreName": "1165", "availableQuantity": "13"}, {"physicalStoreName": "8691", "availableQuantity": "18"}, {"physicalStoreName": "8692", "availableQuantity": "15"}, {"physicalStoreName": "8648", "availableQuantity": "13"}, {"physicalStoreName": "8747", "availableQuantity": "0"}, {"physicalStoreName": "8748", "availableQuantity": "0"}, {"physicalStoreName": "8702", "availableQuantity": "14"}, {"physicalStoreName": "8746", "availableQuantity": "0"}] }*/
Now, I've tried building a python script to replicate this, but when I run it I get this error response:
{"errorCode": "2540",
"errorMessage": "CMN3101E El sistema no está disponible debido a \"ErrorCode=2540\n\".",
"errorMessageKey": "_ERR_GENERIC",
"errorMessageParam": [{"ErrorCode": "2540"}],
"correctiveActionMessage": "",
"correlationIdentifier": "3bff941d:1814799a223:-4d74",
"exceptionData": {"ErrorCode": "2540"},
"exceptionType": "1",
"originatingCommand": "",
"systemMessage": "El error siguiente se ha producido durante el proceso: \"ErrorCode=2540\n\"."}*/
As I said, this is the first time I've tried to use requests in a Python script, so maybe I'm doing something wrong. I think it might have something to do with the authToken parameter, but I'm not sure how to deal with it. Is there a way to pass it from the browser to the script? This is my code. Any suggestions?
import requests
url = 'https://www.homedepot.com.mx/GetStorePopUpInventoryStatusByIDView'
payload = {
"storeId":"10351",
"catalogId":"10101",
"langId":"-5",
"physicalStoreListIds":"12511%2C12524%2C12539%2C12542%2C12552%2C12583%2C12591%2C12592%2C12598%2C12605%2C12613%2C12614%2C12616%2C14003",
"productId":"378219",
"fulfilment_type":"Store",
"productavailable":"true",
"type":"ItemBean",
"sibstore":"12511%2C12524%2C12539%2C12542%2C12552%2C12583%2C12591%2C12592%2C12598%2C12605%2C12613%2C12614%2C12616%2C14003",
"requesttype":"ajax",
"authToken":"-1002%2Chqj050NRqj8jCOeOf8xNj4dGjQaR1rxBdxDNL2QdATA%3D"
}
x = requests.post(url, data=payload)
print(x.text)
EDIT: It seems my question is not clear enough. Sorry about that. So here goes a summary with more detail:
I need to build a Python script to somehow get the inventory of the product in this URL: https://www.homedepot.com.mx/banos/accesorios-para-bano/juegos-de-accesorios/accesorios-bao-adelyn-4-piezas-cromo-130389
In the product page, if I click on the "Revisa disponibilidad en tiendas aquí" link almost under the add to cart button, a popup appears with the inventory of the product for all the physical stores.
From the network tab in chrome's developer tools, I can see this popup is filled with the output of this POST request: https://www.homedepot.com.mx/GetStorePopUpInventoryStatusByIDView
If I run the same request from my browser by concatenating all the parameters in the payload, I get a response with the inventory information I need. This only works in my browser, and only as long as I don't close it. If I try it in another browser, in incognito mode, or in a Python script, I get the error described in the original question. This is a sample payload:
storeId=10351&catalogId=10101&langId=-5&physicalStoreListIds=12505%2C12521%2C12554%2C12555%2C12565%2C12567%2C12578%2C12585%2C12609%2C14503&productId=208282&fulfilment_type=Store&productavailable=true&type=ItemBean&catalogEntryIdToUse=208282&sibstore=12505%2C12521%2C12554%2C12555%2C12565%2C12567%2C12578%2C12585%2C12609%2C14503&displayPopupSiblingstores=&requesttype=ajax&authToken=-1002%2Chqj050NRqj8jCOeOf8xNj4dGjQaR1rxBdxDNL2QdATA%3D
I made a test script (the code above) that runs the request with the parameters of the payload, but I get the same error.
As I said, I'm quite new to running requests (GET or POST) from scripts. My theory is that this error is due to the authToken parameter in the payload: I need to somehow get a valid authToken (from a browser or a script) and use it to execute my request. Is that correct? If yes, how can I do it? If not, what else can I try?
EDIT 2:
Test code I used to check mechanize, still got the same error when checking resp2.read():
import time
import mechanize
from bs4 import BeautifulSoup
br = mechanize.Browser()
resp = br.open("https://www.homedepot.com.mx/banos/accesorios-para-bano/juegos-de-accesorios/accesorios-bao-adelyn-4-piezas-cromo-130389")
html_string = (resp.read()).decode("utf-8")
# f = open("resp.html", "a")
# f.write(html_string)
# f.close()
time.sleep(5)
soup = BeautifulSoup(html_string, "html.parser")
tmp = soup.find("div", {"id": "physicalSelectedStoreList"})
refresh_url = tmp["refreshurl"]
print("refresh_url: {}".format(refresh_url))
print("")
authtoken_full = refresh_url.split("authToken=")[1]
print("authtoken_full: {}".format(authtoken_full))
authtoken = authtoken_full.split("&storeId")[0]
print("authtoken: {}".format(authtoken))
req_url = "https://www.homedepot.com.mx/GetStorePopUpInventoryStatusByIDView?storeId=10351&catalogId=10101&langId=-5&physicalStoreListIds=12505%2C12521%2C12554%2C12555%2C12565%2C12567%2C12578%2C12585%2C12609%2C14503&productId=208282&fulfilment_type=Store&productavailable=true&type=ItemBean&catalogEntryIdToUse=208282&sibstore=12505%2C12521%2C12554%2C12555%2C12565%2C12567%2C12578%2C12585%2C12609%2C14503&displayPopupSiblingstores=&requesttype=ajax&authToken="
req_url = req_url + authtoken
print("")
print("Full request: {}".format(req_url))
time.sleep(5)
resp2 = br.open(req_url)
time.sleep(5)
print("")
print(">>>>>>>>>>>>>>> info")
print(resp2.info())
print(">>>>>>>>>>>>>>> read")
print(resp2.read())
CodePudding user response:
If the URL works as expected in one of your browsers but behaves differently from Python, one thing to try is mimicking a browser request with your original string by providing a User-Agent header (https://en.wikipedia.org/wiki/User_agent):
import requests
url = 'https://www.homedepot.com.mx/GetStorePopUpInventoryStatusByIDView?storeId=10351&catalogId=10101&langId=-5&physicalStoreListIds=12511%2C12524%2C12539%2C12542%2C12552%2C12583%2C12591%2C12592%2C12598%2C12605%2C12613%2C12614%2C12616%2C14003&productId=378219&fulfilment_type=Store&productavailable=true&type=ItemBean&catalogEntryIdToUse=378219&sibstore=12511%2C12524%2C12539%2C12542%2C12552%2C12583%2C12591%2C12592%2C12598%2C12605%2C12613%2C12614%2C12616%2C14003&requesttype=ajax&authToken=-1002%2Chqj050NRqj8jCOeOf8xNj4dGjQaR1rxBdxDNL2QdATA%3D'
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}
response = requests.get(url, headers=headers)
print(response.content)
See also: "How to use Python requests to fake a browser visit a.k.a and generate User Agent?"
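One more difference from the script in the question may matter: the URL string above is already percent-encoded, while the values in the question's payload dict contain %2C and %3D and are then encoded a second time. As far as I know, requests form-encodes a dict the same way the standard library does, so the effect can be sketched with urllib alone. This is only an illustration with a shortened store list, not a confirmed cause of error 2540:

```python
from urllib.parse import urlencode, unquote

# Value copied from the browser URL: the comma is already encoded as %2C.
# (Shortened store list, for illustration only.)
pre_encoded = {"physicalStoreListIds": "12511%2C12524"}

# Encoding it again turns each % into %25, so the server receives a literal "%2C".
double = urlencode(pre_encoded)
print(double)

# Decoding first restores the plain comma; encoding once matches the browser URL.
decoded = {key: unquote(value) for key, value in pre_encoded.items()}
single = urlencode(decoded)
print(single)
```

If this applies, putting the decoded values (plain commas, and a trailing = in the token) in the dict would let the library encode them once.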
CodePudding user response:
I can give you a hacky way of doing this: install Postman on your local machine and hit the endpoint you want to get the data from. Once you confirm that you get the right response back, Postman has an option to convert the call to Python, Node, curl, etc. It's the easiest way to make sure your call works, and you can then switch to any language.
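If the call in Postman only succeeds with a fresh authToken, the script will have to read the token from the product page at run time. Below is a minimal sketch of the parsing step, assuming the refreshurl attribute read in the question's EDIT 2 carries authToken as an ordinary query parameter. The helper name extract_auth_token and the sample URL (including the SomeView path) are hypothetical:

```python
from urllib.parse import urlsplit, parse_qs, urlencode

ENDPOINT = "https://www.homedepot.com.mx/GetStorePopUpInventoryStatusByIDView"

def extract_auth_token(refresh_url):
    """Return the decoded authToken query parameter, or None if it is absent."""
    query = urlsplit(refresh_url).query
    values = parse_qs(query).get("authToken")
    return values[0] if values else None

# Hypothetical refresh URL, shaped like the one scraped in the question.
sample = ("https://www.homedepot.com.mx/SomeView?"
          "authToken=-1002%2Chqj050NRqj8jCOeOf8xNj4dGjQaR1rxBdxDNL2QdATA%3D"
          "&storeId=10351")
token = extract_auth_token(sample)
print(token)

# Build the query from decoded values so each one is encoded exactly once.
# Only a few of the parameters from the question are shown here.
params = {"storeId": "10351", "productId": "208282", "authToken": token}
request_url = ENDPOINT + "?" + urlencode(params)
print(request_url)
```

Parsing the query string this way avoids depending on the parameter order, which the split("&storeId") approach does. Whether the server also requires the cookies from the page load is something the Postman test should reveal; if it does, both requests would need to go through the same session object.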