How to recover a hidden ID from a query string from an XHR GET request?

Time:02-14

I'm trying to use Airbnb's hidden API. I need to reverse engineer where the ID in the query string of a GET request comes from. For example, take this listing:

https://www.airbnb.ca/rooms/47452643

The "public" ID is shown to be 47452643. However, another ID is needed to use the API.

If you look at the XHR requests in Chrome's DevTools, you'll see a request starting with "StaysPdpSections?operationName". This is the request I want to replicate. If I copy the request into Insomnia or Postman, I see a variable in the query string starting with:

"variables":"{"id":"U3RheUxpc3Rpbmc6NDc0NTI2NDM="

The hidden ID "U3RheUxpc3Rpbmc6NDc0NTI2NDM=" is what I need: it must be inserted into the query string for the request to return data. How can I recover this hidden ID for each listing dynamically?

CodePudding user response:

That target ID is buried really deep in the HTML:

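import json

import requests
from bs4 import BeautifulSoup as bs

url = 'https://www.airbnb.ca/rooms/47452643'
req = requests.get(url)
req.raise_for_status()  # stop early if the page could not be fetched

# The listing page embeds its initial state as JSON in <script id="data-state">
soup = bs(req.content, 'html.parser')
script = soup.select_one('script[type="application/json"][id="data-state"]')
data = json.loads(script.text)

# The [2][1] path is specific to this page's layout and may change without notice
target = data['niobeMinimalClientData'][2][1]['variables']
print(target['id'])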

Output:

U3RheUxpc3Rpbmc6NDc0NTI2NDM=
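
As a side note, the ID printed above looks like plain Base64: decoding it gives "StayListing:47452643", i.e. a type name followed by the public ID. The sketch below shows that round trip. The helper names `decode_global_id` and `encode_listing_id` are made up for illustration, and the assumption that every listing follows the same "StayListing:<public id>" pattern is based only on this one example, so verify it against the scraped value before relying on it.

```python
import base64


def decode_global_id(encoded: str) -> str:
    """Decode a Base64 ID into its plain-text form."""
    return base64.b64decode(encoded).decode("utf-8")


def encode_listing_id(listing_id, type_name: str = "StayListing") -> str:
    """Build a Base64 ID from a public listing ID.

    Assumption: the hidden ID is always Base64("<type_name>:<listing_id>"),
    as observed for listing 47452643 only.
    """
    raw = f"{type_name}:{listing_id}"
    return base64.b64encode(raw.encode("utf-8")).decode("ascii")


print(decode_global_id("U3RheUxpc3Rpbmc6NDc0NTI2NDM="))  # StayListing:47452643
print(encode_listing_id(47452643))  # U3RheUxpc3Rpbmc6NDc0NTI2NDM=
```

If the pattern holds, the hidden ID can be computed from the public ID in the URL without fetching the page at all. Scraping the data-state script as shown above remains the safer option if the format turns out to differ between listings.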