Endpoint: https://quizlet.com/webapi/3.2/images/search?query=hello&perPage=2
You can try opening this page in an incognito window; it works on my side, so I think it should be possible to fetch data from that site.
I copied the request and ran it in JavaScript and Python, but it doesn't work: I get a 403 error.
I also tried Burp Suite, but I can't access the site through Burp's browser either.
Moreover, since it works in incognito, I don't think cookies are the cause.
Code sample (JS):
import fetch from "node-fetch";

const response = await fetch(
  "https://quizlet.com/webapi/3.2/images/search?query=hello&perPage=2",
  {
    headers: {
      accept:
        "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
      "accept-language": "en",
      "cache-control": "no-cache",
      pragma: "no-cache",
      "sec-ch-ua":
        '"Google Chrome";v="93", " Not;A Brand";v="99", "Chromium";v="93"',
      "sec-ch-ua-mobile": "?0",
      "sec-ch-ua-platform": '"Linux"',
      "sec-fetch-dest": "document",
      "sec-fetch-mode": "navigate",
      "sec-fetch-site": "none",
      "sec-fetch-user": "?1",
      "upgrade-insecure-requests": "1",
    },
    referrerPolicy: "strict-origin-when-cross-origin",
    body: null,
    method: "GET",
    mode: "cors",
    credentials: "include",
  }
);
// response.status is a plain number, so no await is needed here
const data = response.status;
console.log(data);
Code sample (Python):
import requests

headers = {
    'authority': 'quizlet.com',
    'pragma': 'no-cache',
    'cache-control': 'no-cache',
    'sec-ch-ua': '"Google Chrome";v="93", " Not;A Brand";v="99", "Chromium";v="93"',
    'sec-ch-ua-mobile': '?0',
    'sec-ch-ua-platform': '"Linux"',
    'upgrade-insecure-requests': '1',
    'user-agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.82 Safari/537.36',
    'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
    'sec-fetch-site': 'none',
    'sec-fetch-mode': 'navigate',
    'sec-fetch-user': '?1',
    'sec-fetch-dest': 'document',
    'accept-language': 'en',
    'cookie': 'qi5=i2x3g7y1z9a6:t3vMoQQig2yLcpN.HKWn; qtkn=7gT4DE7pN9URJ2AFDYeaVe; fs=qzkse0; app_session_id=9781a407-4f37-4c09-8e97-8156f182bb45; search_session={"search_session_id":"-2379864199063990974614477b859794","query":"overrated","version":"1.1.1","platform":"WEB","depth":null,"target_object_type":"QImage"}; __cf_bm=cB7hRf6JbcOFZ2kvQ3W12V4bxXiIgn_kF3n87RcI0h0-1631877048-0-Ac Hi0pATLgW5N3JjqYa7uc5W4ZfDLOumvmCQixWJIKdcVj7stciFh8cYFVTOpr q5pM2Q7LrXC/LsffOB6Mh2E=; __cfruid=81f16a673e6117331dd4270b3f4f29111590d7d8-1631877048',
}
params = (
    ('query', 'hello'),
    ('perPage', '2'),
)
response = requests.get(
    'https://quizlet.com/webapi/3.2/images/search', headers=headers, params=params)
# NB. Original query string below. It seems impossible to parse and
# reproduce query strings 100% accurately so the one below is given
# in case the reproduced version is not "correct".
# response = requests.get('https://quizlet.com/webapi/3.2/images/search?query=hello&perPage=2', headers=headers)
print(response.status_code)
Please help me out. I don't understand how this can happen (the browser works, while the code doesn't). Thanks anyway.
CodePudding user response:
From the Python side: I had a look out of interest, as I'm currently developing a REST API and was curious how they were securing it.
Using Wireshark, it appears that the "requests" module in Python does not handle HTTP requests in the same manner as Chrome/Firefox, which I suspect they are using as a tell to serve a captcha.
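Before going down to packet captures, it can help to rule out plain header differences first. The sketch below is only an illustration using the standard library: `diff_headers` is a hypothetical helper, and the two header sets are shortened, made-up examples rather than a capture of what either client really sends. It will not reveal lower-level differences (header order, the TLS handshake, the HTTP version), which is where a capture tool is still needed.

```python
def diff_headers(script_headers, browser_headers):
    """Compare two header collections case-insensitively.

    Both arguments may be a dict or a list of (name, value) pairs.
    Returns (missing, extra, changed):
      missing: names the browser sends but the script does not
      extra:   names the script sends but the browser does not
      changed: names present in both with different values
    """
    a = {k.lower(): v for k, v in dict(script_headers).items()}
    b = {k.lower(): v for k, v in dict(browser_headers).items()}
    missing = sorted(b.keys() - a.keys())
    extra = sorted(a.keys() - b.keys())
    changed = sorted(k for k in a.keys() & b.keys() if a[k] != b[k])
    return missing, extra, changed


# Illustrative values only, not real captures.
script = {"accept-language": "en", "pragma": "no-cache"}
browser = [("Accept-Language", "en-GB,en;q=0.5"), ("TE", "trailers")]

missing, extra, changed = diff_headers(script, browser)
print("missing:", missing)
print("extra:", extra)
print("changed:", changed)
```

If the header sets already match and the server still answers with a 403, the difference is probably below the header level.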
Anyway, switching requests for the httpx module:
pip install httpx
And changing the headers to replicate Firefox in full:
import httpx

headers = [
    ('Accept', 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8'),
    ('Accept-Encoding', 'gzip, deflate, br'),
    ('Accept-Language', 'en-GB,en;q=0.5'),
    ('Cache-Control', 'max-age=0'),
    ('Connection', 'keep-alive'),
    ('Host', 'quizlet.com'),
    ('Sec-Fetch-Dest', 'document'),
    ('Sec-Fetch-Mode', 'navigate'),
    ('Sec-Fetch-Site', 'none'),
    ('Sec-Fetch-User', '?1'),
    ('TE', 'trailers'),
    ('Upgrade-Insecure-Requests', '1'),
    ('User-Agent', 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Firefox/91.0'),
]
params = (
    ('query', 'hello'),
    ('perPage', '2'),
)
response = httpx.get('https://quizlet.com/webapi/3.2/images/search', headers=headers, params=params)
print(response.content)
For me, this gives the following, as opposed to the captcha page:
{
  "responses": [{
    "models": {
      "image": [{
        "id": 18957872,
        "personId": 16641862,
        "timestamp": 1416579222,
        "lastModified": 1416579222,
        "code": "Gfg5XS88MRmYq8RS",
        "license": 1,
        "width": 480,
        "height": 360,
        "flickrId": null,
        "flickrOwner": null,
        "_legacyUrl": "http://o.quizlet.com/cZDE.6rHW7IrGptXSGm8FA.gif",
        "_legacyUrlSquare": "http://o.quizlet.com/cZDE.6rHW7IrGptXSGm8FA_s.gif",
        "_legacyUrlSmall": "http://o.quizlet.com/cZDE.6rHW7IrGptXSGm8FA_m.gif",
        "_secureLegacyUrl": "https://o.quizlet.com/cZDE.6rHW7IrGptXSGm8FA.gif",
        "_secureLegacyUrlLarge": "https://o.quizlet.com/cZDE.6rHW7IrGptXSGm8FA_b.gif",
        "_secureLegacyUrlSquare": "https://o.quizlet.com/cZDE.6rHW7IrGptXSGm8FA_s.gif",
        "_secureLegacyUrlSmall": "https://o.quizlet.com/cZDE.6rHW7IrGptXSGm8FA_m.gif"
      }, {
        "id": 9228314,
        "personId": 513525,
        "timestamp": 1406222781,
        "lastModified": 1406222781,
        "code": "bPHbzaV7KsGWfuXJ",
        "license": 1,
        "width": 298,
        "height": 232,
        "flickrId": null,
        "flickrOwner": null,
        "_legacyUrl": "http://o.quizlet.com/ptqCa7LsKjiVSBVPI3OfTA.jpg",
        "_legacyUrlSquare": "http://o.quizlet.com/ptqCa7LsKjiVSBVPI3OfTA_s.jpg",
        "_legacyUrlSmall": "http://o.quizlet.com/ptqCa7LsKjiVSBVPI3OfTA_m.jpg",
        "_secureLegacyUrl": "https://o.quizlet.com/ptqCa7LsKjiVSBVPI3OfTA.jpg",
        "_secureLegacyUrlLarge": "https://o.quizlet.com/ptqCa7LsKjiVSBVPI3OfTA_b.jpg",
        "_secureLegacyUrlSquare": "https://o.quizlet.com/ptqCa7LsKjiVSBVPI3OfTA_s.jpg",
        "_secureLegacyUrlSmall": "https://o.quizlet.com/ptqCa7LsKjiVSBVPI3OfTA_m.jpg"
      }]
    },
    "paging": {
      "total": 50,
      "page": 1,
      "perPage": 2,
      "token": "UuKKKAkmxv.r4YtwFDuRevZVGAHr"
    }
  }]
}
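As a follow-up, the image URLs can be pulled out of that payload with a few lines. This is only a sketch based on the response shape shown above (responses, then models, then image, then `_secureLegacyUrl`); `extract_image_urls` is a hypothetical helper name, and the field layout is assumed from this one sample rather than from any documented API.

```python
import json


def extract_image_urls(payload, key="_secureLegacyUrl"):
    """Collect one URL per image from a payload shaped like the sample above."""
    urls = []
    for resp in payload.get("responses", []):
        images = resp.get("models", {}).get("image", [])
        for image in images:
            url = image.get(key)
            if url:  # skip entries that lack the requested field
                urls.append(url)
    return urls


# Trimmed copy of the sample response, keeping only the fields used here.
sample = json.loads("""
{"responses": [{"models": {"image": [
  {"id": 18957872, "_secureLegacyUrl": "https://o.quizlet.com/cZDE.6rHW7IrGptXSGm8FA.gif"},
  {"id": 9228314, "_secureLegacyUrl": "https://o.quizlet.com/ptqCa7LsKjiVSBVPI3OfTA.jpg"}
]}, "paging": {"total": 50, "page": 1, "perPage": 2}}]}
""")

for url in extract_image_urls(sample):
    print(url)
```

With the httpx call above, `extract_image_urls(response.json())` should give the same kind of list, assuming the request was answered with JSON and not with the captcha page.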