Unable to send a good querystring with python requests

Time:10-28

I'm trying to make a correct GET request to this URL with Python and the requests library, applying some filters:

https://www.efast.dol.gov/5500search/

I only need a few filters to get the right data from the search page: planyear, ein and pn. When I send the request I get the wrong data, because the values in the dict I pass as "q" are dropped.

This is an example:

import requests

args = {'q.parser': 'lucene', 'q': {'ein': '814699012', 'planyear': '2020', 'pn': '001'}}
url = "https://www.efast.dol.gov/services/afs"
response = requests.get(url, params=args)

When I check response.url I get:

https://www.efast.dol.gov/services/afs?q.parser=lucene&q=ein&q=planyear&q=pn

None of the keys has a value.

This is the closest I've been:

args = {"q.parser":"lucene","q":{"ein":"814699012"}, "planyear":"2020","pn":"001"}

But if I do response.url I get:

https://www.efast.dol.gov/services/afs?q.parser=lucene&q=ein&planyear=2020&pn=001

The ein value is gone. It doesn't matter whether I put planyear or pn inside the q dict instead; the result is the same.

What am I doing wrong?

The correct result would be the data for year 2020 with the given ein and pn numbers. It doesn't matter whether I get several results or just one.

A correct request URL would be this:

https://www.efast.dol.gov/services/afs?q.parser=lucene&size=200&sort=planname asc&q=(((planyear:2020)) AND%20((ein:814699012)) AND%20((pn:001)))&facet.planyear={size:30}&facet.plancode={size:100}&facet.plancode={size:100}&facet.assetseoy={buckets:[%22{,100000]%22,%22[100001,500000]%22,%22[500001,1000000]%22,%22[1000001,10000000]%22,%22[10000001,}%22]}&facet.plantype={size:20}&facet.businesscodecat={size:30}&facet.businesscode={size:30}&facet.state={size:100}&facet.countrycode={buckets:["CA%22,"GB%22,"BM%22,"KY%22]}&facet.formyear={size:30}

CodePudding user response:

It seems that you are confusing requests and responses. In your case, you should use the long URL as your request and then parse the JSON data in the response. The following code should work for you; you still need to parse the response yourself:

import requests

# The URL itself contains double quotes, so the string literal uses single quotes.
url = 'https://www.efast.dol.gov/services/afs?q.parser=lucene&size=200&sort=planname asc&q=(((planyear:2020)) AND ((ein:814699012)) AND ((pn:001)))&facet.planyear={size:30}&facet.plancode={size:100}&facet.plancode={size:100}&facet.assetseoy={buckets:["{,100000]","[100001,500000]","[500001,1000000]","[1000001,10000000]","[10000001,}"]}&facet.plantype={size:20}&facet.businesscodecat={size:30}&facet.businesscode={size:30}&facet.state={size:100}&facet.countrycode={buckets:["CA","GB","BM","KY"]}&facet.formyear={size:30}'

payload={}
headers = {}

response = requests.request("GET", url, headers=headers, data=payload)

print(response.text)

CodePudding user response:

Python's requests package doesn't support dict-like parameter values. The values in the params dictionary have to be either strings or lists of strings:

Requests allows you to provide these arguments as a dictionary of strings, using the params keyword argument.

https://docs.python-requests.org/en/latest/user/quickstart/#passing-parameters-in-urls
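
The effect described in the question can be reproduced without any network access. The sketch below does not call requests; it uses the standard library's urlencode with doseq=True, on the assumption that requests handles a non-string iterable value in the same way (this matches the response.url shown in the question). Iterating over a dict yields only its keys, so the values are lost:

```python
from urllib.parse import urlencode

# Nested dict as in the question: the inner dict is iterated, which yields only its keys.
args = {"q.parser": "lucene", "q": {"ein": "814699012", "planyear": "2020", "pn": "001"}}
nested = urlencode(args, doseq=True)
print(nested)  # q.parser=lucene&q=ein&q=planyear&q=pn

# A plain string value survives intact (only percent-encoded).
flat = urlencode({"q.parser": "lucene", "q": "ein:814699012"}, doseq=True)
print(flat)
```

So the value of "q" has to be built as a single string before it is passed as a parameter.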

Your website uses a non-standard encoding to turn the dictionary into a URL-valid string.

If you take a look at your example:

(((planyear:2020)) AND%20((ein:814699012)) AND%20((pn:001)))

From this we can deduce the format:

(((KEY:VALUE)) AND ((KEY:VALUE))) <...>

So the whole expression is wrapped in (), every key:value pair is wrapped in (()), the pairs are joined with AND, and spaces are URL-quoted to %20.

We can replicate this encoding ourselves in our code:

>>> params = {"planyear": "2020", "ein": 814699012, "pn": "001"}
>>> encoded = ' AND '.join(f"(({k}:{v}))" for k, v in params.items())
>>> f"({encoded})"
'(((planyear:2020)) AND ((ein:814699012)) AND ((pn:001)))'

Then just pass this as your q parameter:
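
A minimal sketch of that last step follows. The helper names build_query and fetch_plans are hypothetical, introduced only for illustration. It assumes the endpoint accepts a request carrying just q.parser and q; the working URL in the question also carries size, sort and facet parameters, and whether the server requires them is not known here.

```python
def build_query(filters):
    """Join key:value pairs into the '(((k:v)) AND ((k:v)))' form deduced above."""
    encoded = " AND ".join(f"(({k}:{v}))" for k, v in filters.items())
    return f"({encoded})"


def fetch_plans(filters, url="https://www.efast.dol.gov/services/afs"):
    """Hypothetical helper: send the GET request with q as one plain string."""
    import requests  # imported lazily so build_query works without requests installed

    args = {"q.parser": "lucene", "q": build_query(filters)}
    # requests percent-encodes the string; the value is no longer a nested dict.
    return requests.get(url, params=args)


q = build_query({"planyear": "2020", "ein": "814699012", "pn": "001"})
print(q)  # (((planyear:2020)) AND ((ein:814699012)) AND ((pn:001)))
```

Calling fetch_plans({"planyear": "2020", "ein": "814699012", "pn": "001"}) and then inspecting response.url should show all three values inside the q parameter.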
