Scrape website using httpx and requests returns a timeout

Time:01-12

I am trying to scrape this page: https://www.blibli.com/p/facial-tissue-tisu-wajah-250-s-paseo/is--LO1-70001-00049-00003?seller_id=LO1-70001&sku_id=LO1-70001-00049-00001&sclid=7zuGEaS4hh5SowAA6tnfd5i2wKjR6e3p&sid=c5746ccfbb298d3b&pid=LO1-70001-00049-00001&pickupPointCode=PP-3227395

I found that the page above calls this API to return the product details: https://www.blibli.com/backend/product-detail/products/is--LO1-70001-00049-00003/_summary?pickupPointCode=PP-3227395

The response headers are:

akamai-grn: 0.06d62c17.1673499257.a978ef0
content-encoding: gzip
content-length: 1838
content-security-policy: frame-ancestors 'self' https://ext.blibli.com/ https://mcdomo.id/
content-type: application/json
date: Thu, 12 Jan 2023 04:54:17 GMT
link: <https://www.blibli.com/xm1-6K/C5vh/MHG4/Ij6z/3uMhJV/7iSaJfNzYY/XHgJa1FGaAI/VW8QI1/hIGwU>; rel=preload; as=script
set-cookie: bm_sv=22A8F907C6313015C5C3083E1983894A~YAAQBtYsF2Cgj6GFAQAA2ClUpBIai8qi26yqmrx2D8ZyG4Bo/8dZETALmKhcPL3NIXVn4Ev4KGzGiabEq1nOQn9LMDu8wj7qcbuc2aCvCvWdeo zdQsat vNhpsYvp3bn28pS9zdU9SMQmwMlwj14P7xCnQke8 FD0XS92OT87sybuT63iEfivGZyo7PfRmRgfLqcSNa9sNbiGUi3N8aPBa863LkCxJ0pcZqF0n22Gv4phcuQIwxhgZdm6QJj2mjiQ==~1; Domain=.blibli.com; Path=/; Expires=Thu, 12 Jan 2023 06:53:34 GMT; Max-Age=7157; Secure
strict-transport-security: max-age=15724800; includeSubDomains
vary: Accept-Encoding
x-blibli-canary-mode: 0

The payload is pickupPointCode=PP-3227395.

I can't work out from the response headers above what needs to be passed in the request headers, so I tried passing just the user agent.

I tried requests first and then httpx; neither works.

My code is as follows

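from fake_useragent import UserAgent
import httpx

# Pick a random User-Agent string for the request
ua = UserAgent()
USER_AGENT = ua.random

headers = {'user-agent': USER_AGENT}
# No leading or trailing spaces inside the URL string
url = "https://www.blibli.com/backend/product-detail/products/is--LO1-70001-00049-00003/_summary?pickupPointCode=PP-3227395"
# httpx is a module, so the request goes through httpx.get()
response = httpx.get(url, headers=headers)

print(response.json())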

This returns a timeout, and the requests version just keeps running. How can I scrape this page?

Answer:

So doing something like this worked:

import requests

# Send a fixed browser User-Agent string
headers = {
    'User-Agent': 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)',
}
link = 'https://www.blibli.com/backend/product-detail/products/is--LO1-70001-00049-00003/_summary?pickupPointCode=PP-3227395'
# timeout=10 raises an exception after 10 seconds instead of waiting indefinitely
response = requests.get(link, headers=headers, timeout=10, verify=True).text
print(response)

This returns:

{"code":200,"status":"OK","data":{"url":"https://www.blibli.com/p/facial-tissue-tisu-wajah-250-s-paseo/ps--LO1-70001-00049?pickupPointCode=PP-3227395","itemSku":"LO1-70001-00049-00003","name":"FACIAL TISSUE / TISU WAJAH 250'S PASEO","productSku":"LO1-70001-00049","productCode":"MTA-26204644","itemCode":"MTA-26204644-00005","pickupPointCode":"PP-3227395","ean":"","urlFriendlyName":"facial-tissue-tisu-wajah-250-s-paseo","stock":250,"stockLimitedThreshold":6,"uniqueSellingPoint":"• *Dengan kelembutan, kualitas premium dan harga yang sangat terjangkau\n<br>•*Membuat kepercayaan tisu Paseo sudah tidak diragukan lagi\n<br>•*Tisu yang terbuat dari serat alami 100%, higienis, lembut, berdaya serap tinggi dan kuat\n<br>•*Kemasan smart yang pas digunakan dalam semua aktivitas Anda serta cocok untuk keluarga","ampUrl":"https://www.blibli.com/amp/p/facial-tissue-tisu-wajah-250-s-paseo/is--LO1-70001-00049-00003","brand":{"name":"Paseo","code":"paseo","official":false,"anchor":""},"images":[{"full":"https://www.static-src.com/wcsstore/Indraprastha/images/catalog/full//92/MTA-26204644/paseo_facial_tissue_-_tisu_wajah_250-s_paseo_full03_grwfg6pr.jpg","thumbnail":"https://www.static-src.com/wcsstore/Indraprastha/images/catalog/thumbnail//92/MTA-26204644/paseo_facial_tissue_-_tisu_wajah_250-s_paseo_full03_grwfg6pr.jpg"}],"tags":["COMING_SOON","RETURNABLE"],"warranty":{},"expiration":{},"type":"REGULAR","attributes":[{"name":"Warna","type":"IMAGE","values":[{"image":"https://www.static-src.com/wcsstore/Indraprastha/images/catalog/thumbnail//92/MTA-26204644/paseo_facial_tissue_-_tisu_wajah_250-s_paseo_full05_cwf9260g.jpg","value":"T CLEAN PLUS 180"},{"image":"https://www.static-src.com/wcsstore/Indraprastha/images/catalog/thumbnail//92/MTA-26204644/paseo_facial_tissue_-_tisu_wajah_250-s_paseo_full08_byip7se1.jpg","value":"T GREEN SOFT 
200"},{"image":"https://www.static-src.com/wcsstore/Indraprastha/images/catalog/thumbnail//92/MTA-26204644/paseo_facial_tissue_-_tisu_wajah_250-s_paseo_full04_ch6izw8k.jpg","value":"T JOLLY 250"},{"image":"https://www.static-src.com/wcsstore/Indraprastha/images/catalog/thumbnail//92/MTA-26204644/paseo_facial_tissue_-_tisu_wajah_250-s_paseo_full06_nwsucpg9.jpg","value":"T MONTISS 250"},{"image":"https://www.static-src.com/wcsstore/Indraprastha/images/catalog/thumbnail//92/MTA-26204644/paseo_facial_tissue_-_tisu_wajah_250-s_paseo_full03_grwfg6pr.jpg","value":"T NICE 180"},{"image":"https://www.static-src.com/wcsstore/Indraprastha/images/catalog/thumbnail//92/MTA-26204644/paseo_facial_tissue_-_tisu_wajah_250-s_paseo_full01_fdv8zpwb.jpg","value":"T PASEO SMART 250"},{"image":"https://www.static-src.com/wcsstore/Indraprastha/images/catalog/thumbnail//92/MTA-26204644/paseo_facial_tissue_-_tisu_wajah_250-s_paseo_full07_scwjth8w.jpg","value":"T SEE-U 200"},{"image":"https://www.static-src.com/wcsstore/Indraprastha/images/catalog/thumbnail//92/MTA-26204644/paseo_facial_tissue_-_tisu_wajah_250-s_paseo_full02_uq23xfpn.jpg","value":"T SEE-U 250"}]}],"options":[{"id":"is--LO1-70001-00049-00008","selected":false,"available":true,"attributes":[{"name":"Warna","value":"T GREEN SOFT 200"}],"pickupPointCode":"PP-3227395"},{"id":"is--LO1-70001-00049-00007","selected":false,"available":false,"attributes":[{"name":"Warna","value":"T SEE-U 200"}],"pickupPointCode":"PP-3227395"},{"id":"is--LO1-70001-00049-00006","selected":false,"available":true,"attributes":[{"name":"Warna","value":"T MONTISS 250"}],"pickupPointCode":"PP-3227395"},{"id":"is--LO1-70001-00049-00005","selected":false,"available":true,"attributes":[{"name":"Warna","value":"T CLEAN PLUS 180"}],"pickupPointCode":"PP-3227395"},{"id":"is--LO1-70001-00049-00004","selected":false,"available":true,"attributes":[{"name":"Warna","value":"T JOLLY 
250"}],"pickupPointCode":"PP-3227395"},{"id":"is--LO1-70001-00049-00003","selected":true,"available":false,"attributes":[{"name":"Warna","value":"T NICE 180"}],"pickupPointCode":"PP-3227395"},{"id":"is--LO1-70001-00049-00002","selected":false,"available":true,"attributes":[{"name":"Warna","value":"T SEE-U 250"}],"pickupPointCode":"PP-3227395"},{"id":"is--LO1-70001-00049-00001","selected":false,"available":true,"attributes":[{"name":"Warna","value":"T PASEO SMART 250"}],"pickupPointCode":"PP-3227395"}],"categories":[{"level":1,"id":"53400","name":"Bliblimart","url":"/c/1/bliblimart/53400/53400"},{"level":2,"id":"PE-1000640","name":"Perawatan Rumah Tangga","url":"/c/2/perawatan-rumah-tangga/PE-1000640/53400"},{"level":3,"id":"TI-1000091","name":"Tisu","url":"/c/3/tisu/TI-1000091/53400"},{"level":4,"id":"TI-1000096","name":"Tisu Wajah","url":"/c/4/tisu-wajah/TI-1000096/53400"}],"masterCategories":[{"level":1,"id":"BL-1000030","name":"Bliblimart","url":"/c/1/bliblimart/BL-1000030/BL-1000030"},{"level":2,"id":"PE-1000587","name":"Perawatan Rumah Tangga","url":"/c/2/perawatan-rumah-tangga/PE-1000587/BL-1000030"},{"level":3,"id":"TI-1000085","name":"Tisu","url":"/c/3/tisu/TI-1000085/BL-1000030"},{"level":4,"id":"TI-1000088","name":"Tisu Wajah","url":"/c/4/tisu-wajah/TI-1000088/BL-1000030"}],"merchant":{"name":"LOOKUP 1689","code":"LO1-70001","rating":{"value":96,"badgeUrl":"https://www.static-src.com/siva/asset///07_2020/icon-top-rated-diamond.png","badge":"DIAMOND","positiveReview":100,"onTimeFulfillment":85,"activeResponse":98,"new":true},"international":false,"official":false,"location":"Kota Jakarta Barat, DKI 
Jakarta","warehouses":[],"warehouseLocations":[],"url":"/merchant/lookup-1689/LO1-70001","commissionType":"CM","logo":"https://www.static-src.com/wcsstore/Indraprastha/images/catalog/mlogo/LO1-70001-3c922926-6159-4dbd-8bbc-ce716e40ca50.jpg","businessHours":[{"day":"MONDAY","openingTime":32400,"closingTime":61200,"open":true},{"day":"TUESDAY","openingTime":32400,"closingTime":61200,"open":true},{"day":"WEDNESDAY","openingTime":32400,"closingTime":61200,"open":true},{"day":"THURSDAY","openingTime":32400,"closingTime":61200,"open":true},{"day":"FRIDAY","openingTime":32400,"closingTime":61200,"open":true},{"day":"SATURDAY","openingTime":32400,"closingTime":61200,"open":true},{"day":"SUNDAY","openingTime":28800,"closingTime":61200,"open":false}],"businessOperational":{"day":"THURSDAY","status":"OPEN"}},"review":{"rating":4,"count":37,"decimalRating":4.7},"shippingAddress":{"provinceName":"DKI Jakarta","subdistrictName":"Slipi","postalCode":"11410","districtName":"Palmerah","cityName":"Kota Jakarta Barat"},"statistics":{"sold":1206,"seen":3609},"documents":[],"preOrder":{"isPreOrder":false},"freshnessInDays":0,"fulfillmentTypes":{"delivery":"AVAILABLE","pickup":"UNAVAILABLE","selected":"DELIVERY"}}}
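
Once the JSON text comes back, the product details can be pulled out with the standard json module. The following is only a sketch: summarize_product is a hypothetical helper name, the sample string is a trimmed excerpt of the response above, and the field names (name, stock, brand, options) are assumed to stay as they appear in that response.

```python
import json

# Trimmed excerpt of the response shown above. In practice, pass the full
# string returned by requests.get(...).text instead of this sample.
sample = """
{"code": 200, "status": "OK", "data": {
  "name": "FACIAL TISSUE / TISU WAJAH 250'S PASEO",
  "stock": 250,
  "brand": {"name": "Paseo", "code": "paseo"},
  "options": [
    {"id": "is--LO1-70001-00049-00008", "available": true},
    {"id": "is--LO1-70001-00049-00007", "available": false}
  ]
}}
"""


def summarize_product(text):
    """Decode the JSON body and pick out a few product fields.

    Hypothetical helper; uses .get() so a missing key yields None
    or an empty list instead of raising KeyError.
    """
    data = json.loads(text).get("data", {})
    return {
        "name": data.get("name"),
        "stock": data.get("stock"),
        "brand": data.get("brand", {}).get("name"),
        # Keep only the variant ids that are marked as available
        "available_options": [
            option["id"]
            for option in data.get("options", [])
            if option.get("available")
        ],
    }


result = summarize_product(sample)
print(result)
```

With the real response, calling summarize_product(response) on the text printed above should work the same way, as long as the site keeps this JSON layout.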
