I want to extract content from a website where the link is as follows:
"www.example.com/getpublicreport?date=2021-10-01"
Using Requests, what code should I write to extract data from multiple pages, where I navigate by changing the date in the URL?
For example, if I want to extract data from 2019-01-01 until the current date, how do I write the code using the requests library?
CodePudding user response:
www.example.com/getpublicreport?date=2021-10-01
This is an example of a URL with query parameters. requests.get accepts a params argument,
to which you pass a dict of key-value pairs. You can use it as follows:
import requests
url = "http://www.example.com/getpublicreport"
parameters = {"date": "2021-10-01"}
r = requests.get(url, params=parameters)
print(r.url) # http://www.example.com/getpublicreport?date=2021-10-01
If you want to know more about URLs, read RFC 1738.
CodePudding user response:
Hi, you can use the datetime package :)
For example:
import datetime
import requests

def extract_data(start_date, end_date):
    # Request one report per day, from start_date to end_date inclusive
    while start_date <= end_date:
        yield requests.get('http://www.example.com/getpublicreport?date=%s' % start_date.isoformat())
        start_date += datetime.timedelta(days=1)

if __name__ == '__main__':
    for r in extract_data(datetime.date(2019, 1, 1), datetime.date.today()):
        print(r.content)
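The date loop can also be combined with the params argument, so the query string is built by requests instead of by string formatting. This is only a rough sketch: the names iter_date_params and fetch_reports are made up for illustration, the timeout value is an arbitrary choice, and it assumes the site really accepts the date in YYYY-MM-DD form as in the question.

```python
import datetime


def iter_date_params(start, end):
    """Yield one params dict per day from start to end, inclusive."""
    current = start
    while current <= end:
        yield {"date": current.isoformat()}
        current += datetime.timedelta(days=1)


def fetch_reports(start, end, url="http://www.example.com/getpublicreport"):
    """Yield (date string, response) pairs; this performs real HTTP requests."""
    import requests  # imported here so the date helper works without it

    # A Session reuses the underlying connection across the daily requests
    with requests.Session() as session:
        for params in iter_date_params(start, end):
            response = session.get(url, params=params, timeout=10)
            response.raise_for_status()  # raise on 4xx/5xx responses
            yield params["date"], response
```

Calling fetch_reports(datetime.date(2019, 1, 1), datetime.date.today()) would then request one page per day. Adding a short time.sleep between requests is probably a good idea if the site is sensitive to load.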