Extract first few lines of URL's text data without 'get'ting entire page data?

Time:07-06

Use case: I need to check whether JSON data from a URL has been updated by looking at its created_date field, which lies in the first few lines. The page's JSON data is huge, and I don't want to retrieve the entire page just to check the first few lines.

Currently, with both

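import feedparser
import requests

x = feedparser.parse(url)   # downloads and parses the whole feed
y = requests.get(url).text  # downloads the whole response body

# y.split("\n") etc.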

the entire URL data is retrieved and then parsed.

I want to do something like next(url), or read only the first 10 lines (chunks), so that no request is sent for the entire page's data, i.e. just scan for the 'created_date' field and exit.

What can be used to solve this? Thanks for your knowledge, and apologies for the noob question.

Example of URL -> https://www.w3schools.com/xml/plant_catalog.xml

I want to stop reading the URL data if the first PLANT object's LIGHT tag hasn't changed from 'Mostly Shady' (without needing to read/get the data below it).

CodePudding user response:

The original poster stated that the solution below worked:

Instead of a GET request, one can try a HEAD request:

"The GET method requests a representation of the specified resource. Requests using GET should only retrieve data. The HEAD method asks for a response identical to a GET request, but without the response body."
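A minimal sketch of what such a HEAD-based check could look like. It uses the standard library's urllib rather than requests (requests.head(url) would be the equivalent call). The helper names fetch_validators and has_changed are made up for illustration, and the sketch assumes the server sends an ETag or Last-Modified header, which not every server does.

```python
import urllib.request


def fetch_validators(url, timeout=10):
    """Send a HEAD request and return the change-detection headers, if any.

    Hypothetical helper: only headers come back, never the body.
    """
    req = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return {
            "etag": resp.headers.get("ETag"),
            "last_modified": resp.headers.get("Last-Modified"),
        }


def has_changed(previous, current):
    """Compare two validator dicts from fetch_validators.

    The first header present on both sides decides the result. If no
    header is available on both sides, assume the resource changed.
    """
    for key in ("etag", "last_modified"):
        if previous.get(key) and current.get(key):
            return previous[key] != current[key]
    return True


# Usage (needs network access, so it is left commented out):
# old = fetch_validators("https://www.w3schools.com/xml/plant_catalog.xml")
# ... later ...
# new = fetch_validators("https://www.w3schools.com/xml/plant_catalog.xml")
# if has_changed(old, new):
#     print("resource changed, do the full GET now")
```

If the server sends neither header, has_changed always returns True and a full GET is still needed.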

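This way, you don't need to request the entire JSON, which speeds up the server-side part and is friendlier to the hosting server. Note that a HEAD response has no body, so the created_date field itself is not returned; the update check has to rely on response headers such as Last-Modified or ETag, where the server sends them.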
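If the value really has to come from the body (as with the LIGHT tag in the question), a different technique from HEAD is to open the response as a stream and stop after the first few lines. This is only a sketch under assumptions: first_lines and fetch_first_lines are hypothetical names, the text is assumed to be UTF-8, and the server or network stack may already have sent somewhat more than the lines actually read before the connection is closed.

```python
import io
import itertools
import urllib.request


def first_lines(stream, n=10, encoding="utf-8"):
    """Return up to n decoded lines from a binary file-like object."""
    return [
        raw.decode(encoding, errors="replace").rstrip("\r\n")
        for raw in itertools.islice(stream, n)
    ]


def fetch_first_lines(url, n=10, timeout=10):
    """Open the URL, read only the first n lines, then close the connection."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        # The response object is iterable line by line; leaving the
        # with-block closes it without reading the rest of the body.
        return first_lines(resp, n)


# Offline demonstration with an in-memory stream instead of a real URL.
sample = io.BytesIO(
    b"<CATALOG>\n<PLANT>\n<LIGHT>Mostly Shady</LIGHT>\n<ZONE>4</ZONE>\n"
)
head = first_lines(sample, 3)
print(any("Mostly Shady" in line for line in head))
```

With requests, a similar effect should be achievable via requests.get(url, stream=True) together with iter_lines(), closing the response once the field has been found.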
