I'm trying to scrape the item names and prices from https://pricempire.com/search using the requests package. When the site loads, it sends a POST request called items, and the request payload contains an attribute called captchaToken. I want to capture this attribute and reuse it to build requests for the other pages. I've been using Selenium to scrape the names and prices, but it is very slow: each page takes about 20 seconds to load. So my goal is to capture the captchaToken in the request payload, which looks like this:
json_data = {
    'page': 1,
    'priceMin': 0,
    'orderBy': 'price_desc',
    'captchaToken': 'aaabbbcccxxxzzzyyy',
    'priceMax': 200000,
    'collections': [],
    'weaponIds': [],
    'wears': [],
    'priceProvider': 'buff163',
}
so that I can speed up scraping the site. However, I don't know whether this is possible, and my research hasn't turned up an answer.
CodePudding user response:
In a POST request, you have to send the data the server expects as the request payload, in JSON form, and captchaToken is part of that payload. Here is an example of how to pull the data (name, price) from the JSON response of the API call, using the POST method.
import requests
import json
URL = "https://public-api.pricempire.com/api/search/items"
body= {"page":1,"priceMin":0,"orderBy":"price_desc","captchaToken":"03AGdBq271Msp7k_yCTzgNsheZ1yRqLWykDZL17tIK9_YAVo2uZGc3cLH0sNhuZOFsnymBSAbuzRRo2w_Cy6kEEMxaRxgkuZUlXFcDzRPWgYs-Hy-fV5SpxLjU8rACYW3KwZ8y-js1Dye8weAdMfZSPeEBgQ9YP3zdbaPrUOJAHHmjkpqTxH7vPW-Cd2PXHtZf5NlgVkxCBUKIESAyMJ6FyKdNz_WxYdIJvK4uQa6nBdHxMlmQZx6rUgus65NxZkwTaY3BO36ju68WNerv-fQBqFdIz_6jUPfav41DYFiApv9O-MbdASQqpS-ma1TG76mQ82OQdzkqqvpZtAksBGa836HzsxfaOecgbZ2YbswAHr1dXxl919DbRnZum4Wr-UUZMQ66j8Iy5UA_g4B3Ir7IxTf50KhTOrNHtqIIYuBR4Vfz6scc5c7XqATeqMoMvL-06wbBWVATSI44","priceMax":200000,"collections":[],"weaponIds":[],"wears":[],"priceProvider":"buff163"}
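headers = {
    'content-type': 'application/json',
    'User-Agent': 'Mozilla/5.0',
}
# Send the payload as JSON and decode the JSON response
jsonData = requests.post(URL, headers=headers, data=json.dumps(body)).json()
# Each entry in 'items' holds the item name and a nested price object
for item in jsonData['items']:
    # Strip the star and pipe decorations from the item name
    name = item['name'].replace('★', '').replace('|', '').strip()
    price = item['price']['price']
    print(name)
    print(price)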
Output:
Souvenir AWP Dragon Lore (Minimal Wear)
100000000
Sticker iBUYPOWER (Holo) Katowice 2014
49999900
Sticker Titan (Holo) Katowice 2014
49999800
Souvenir AWP Dragon Lore (Field-Tested)
36388800
Sticker Reason Gaming (Holo) Katowice 2014
30000000
Souvenir AWP Dragon Lore (Battle-Scarred)
23618000
Sport Gloves Pandora's Box (Factory New)
22000000
Sticker Team LDLC.com (Holo) Katowice 2014
16750000
StatTrak™ Ursus Knife Crimson Web (Factory New)
15000000
StatTrak™ Talon Knife Crimson Web (Factory New)
15000000
StatTrak™ Nomad Knife Safari Mesh (Battle-Scarred)
15000000
Survival Knife Crimson Web (Factory New)
14999999
StatTrak™ Stiletto Knife Slaughter (Field-Tested)
14888800
Sport Gloves Vice (Factory New)
13400000
Sticker Vox Eminor (Holo) Katowice 2014
13000000
Sticker Team Dignitas (Holo) Katowice 2014
11886000
StatTrak™ M9 Bayonet Case Hardened (Factory New)
10999900
StatTrak™ Paracord Knife Crimson Web (Factory New)
10000000
StatTrak™ Ursus Knife Fade (Minimal Wear)
10000000
Sport Gloves Slingshot (Factory New)
10000000
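Since the question asks about the other pages, here is a minimal sketch of how the same request could be repeated with a different page value. It assumes the endpoint and response shape shown above ('items', 'name', 'price' -> 'price'). The helper names (build_body, clean_name, parse_items, scrape_pages) are made up for illustration and are not part of any API. Whether the server accepts the same captchaToken for more than one request is not confirmed here; captcha tokens are often short-lived, so you may need to supply a fresh one.

```python
URL = "https://public-api.pricempire.com/api/search/items"


def build_body(page, captcha_token):
    """Build the POST payload for one results page (fields copied from the question)."""
    return {
        "page": page,
        "priceMin": 0,
        "orderBy": "price_desc",
        "captchaToken": captcha_token,
        "priceMax": 200000,
        "collections": [],
        "weaponIds": [],
        "wears": [],
        "priceProvider": "buff163",
    }


def clean_name(raw):
    """Drop the star and pipe decorations and collapse repeated whitespace."""
    return " ".join(raw.replace("★", "").replace("|", "").split())


def parse_items(payload):
    """Turn one decoded JSON response into a list of (name, price) tuples."""
    return [
        (clean_name(item["name"]), item["price"]["price"])
        for item in payload.get("items", [])
    ]


def scrape_pages(captcha_token, last_page, post=None):
    """Request pages 1..last_page and stop early at the first empty page.

    `post` is a hypothetical hook for swapping in a stub during testing;
    by default it is requests.post.
    """
    if post is None:
        import requests  # imported lazily so the helpers above work without it
        post = requests.post

    headers = {"content-type": "application/json", "User-Agent": "Mozilla/5.0"}
    results = []
    for page in range(1, last_page + 1):
        response = post(
            URL,
            headers=headers,
            json=build_body(page, captcha_token),
            timeout=30,
        )
        # Fail loudly if the server rejects the request (e.g. an expired token)
        response.raise_for_status()
        rows = parse_items(response.json())
        if not rows:
            break
        results.extend(rows)
    return results
```

Calling scrape_pages(token, last_page=5) with a token copied from the browser's network tab would return a list of (name, price) tuples. If the server starts rejecting requests partway through, the token has most likely expired or been used up.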