Python requests library call works on Windows but not Linux... WHY?

Time:12-14

Long-time user of Python requests here. I'm trying to make a simple call to this endpoint: https://www.overstock.com/api/product.json?prod_id=10897789

My current code:

import requests

headers = { 'User-Agent': 'Mozilla/5.0', 'Accept': 'application/json' }
url = 'https://www.overstock.com/api/product.json?prod_id=10897789'
r = requests.get( url, headers=headers )
result = r.json()
print( result )

Expected outcome (shortened):

{'categoryId': 244, 'subCategoryId': 31446, 'altSubCategoryId': 0, 'taxonomy': {'store': {'id': 1, 'name': 'Rugs', 'apiUrl': 'https://www.overstock.com/api/search.json?taxonomy=sto1', 'htmlUrl': 'https://www.overstock.com/Home-Garden/1/store.html'}, 'department': {'id': 3, 'name': 'Casual Rugs'...

Unfortunately, running that same script on Linux does not give the same result, and so far I am stumped as to why.

Here is the ugly Linux error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/root/.local/share/virtualenvs/online-project-7j1lNF7P/lib/python3.6/site-packages/requests/models.py", line 900, in json
    return complexjson.loads(self.text, **kwargs)
  File "/usr/lib/python3.6/json/__init__.py", line 354, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.6/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.6/json/decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 2 column 1 (char 1)

What could possibly be the issue? Here are the suggestions I have worked through so far:

  1. Suggestion: Linux is not running Python 3.6 but is using 2.7.x to execute requests.
  2. Suggestion: adding 'Accept': 'application/json' to the headers will solve this.
  3. Decode the data first with data = response.decode() (link to SO post). Fail: "AttributeError: 'Response' object has no attribute 'decode'"
  4. Use requests.Response.json (link to SO post). Fail: gives the same error as above.
  5. Upgrade to Python 3.9.9. Nope, this still fails for me.
  6. Check the firewall. Nope, ufw reports Status: inactive.
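
Items 1 and 5 both hinge on which interpreter and which requests install actually run the script. A minimal sketch for checking that directly, assuming you launch it with the same command you use for the failing script (the function name runtime_report is made up for this example):

```python
import platform
import sys


def runtime_report():
    """Collect the interpreter and requests details relevant to this problem."""
    report = {
        "python": platform.python_version(),
        "executable": sys.executable,
    }
    try:
        import requests  # imported lazily so the report still works without it
        report["requests"] = requests.__version__
    except ImportError:
        report["requests"] = None
    return report


if __name__ == "__main__":
    for key, value in runtime_report().items():
        print("{}: {}".format(key, value))
```

If the versions match a machine where the call succeeds, the interpreter is probably not the culprit and the difference lies elsewhere (headers, network, IP).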

Error for #5 (on a new Linux machine, with Python upgraded to 3.9.9):

$ python3 test.py
Traceback (most recent call last):
  File "/home/user/test.py", line 13, in <module>
    print(r.json())
  File "/usr/lib/python3/dist-packages/requests/models.py", line 892, in json
    return complexjson.loads(self.text, **kwargs)
  File "/usr/lib/python3.9/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.9/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.9/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None

@balmy - Here is the output I'm getting after confirming requests version 2.26.0 AND python 3.9...

$ python3 test3.py
Traceback (most recent call last):
  File "/home/user/test_scripts/test3.py", line 13, in <module>
    print(r.json())
  File "/home/eric/.local/lib/python3.9/site-packages/requests/models.py", line 910, in json
    return complexjson.loads(self.text, **kwargs)
  File "/usr/lib/python3.9/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.9/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.9/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 2 column 1 (char 1)

@JCaesar - here is the response text (shortened to the part I think is relevant; it looks like bot detection may be in play):

        <div id="bd">
            <div >
                There was an error processing your request.
            </div>
            <span ></span>
        </div>
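
An HTML body like this explains the JSONDecodeError: the server answered with an error page, not JSON, so r.json() has nothing to parse. A minimal sketch of a helper that reports what actually came back instead of failing deep inside the json module. The name parse_json_or_explain is hypothetical, and it accepts any object with .status_code, .headers and .text, such as a requests.Response:

```python
import json


def parse_json_or_explain(response):
    """Return parsed JSON, or raise ValueError describing what came back instead.

    `response` is any object exposing .status_code, .headers and .text.
    """
    content_type = response.headers.get("Content-Type", "")
    body = response.text
    try:
        return json.loads(body)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        preview = body.strip()[:200]
        raise ValueError(
            "Expected JSON but got status {} with Content-Type {!r}; "
            "body starts with: {!r}".format(
                response.status_code, content_type, preview
            )
        ) from None
```

Calling parse_json_or_explain(r) in place of r.json() would surface the status code and the start of the error page in a single message.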

@Philippe - here is the result in response to your comment 'Can you change the print statement to print(r.text) and run python3 test3.py | jq .'...

$ sudo python3 test3.py | jq .
Traceback (most recent call last):
  File "/usr/lib/command-not-found", line 28, in <module>
    from CommandNotFound import CommandNotFound
  File "/usr/lib/python3/dist-packages/CommandNotFound/CommandNotFound.py", line 19, in <module>
    from CommandNotFound.db.db import SqliteDatabase
  File "/usr/lib/python3/dist-packages/CommandNotFound/db/db.py", line 5, in <module>
    import apt_pkg
ModuleNotFoundError: No module named 'apt_pkg'
Traceback (most recent call last):
  File "/home/eric/test_scripts/test3.py", line 13, in <module>
    print(r.text)
BrokenPipeError: [Errno 32] Broken pipe

@Philippe - answer to your next comment:

$ sudo python3 test3.py | jq .
parse error: Invalid numeric literal at line 2, column 10
Exception ignored in: <_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>
BrokenPipeError: [Errno 32] Broken pipe

Please let me know if you have a solution. Thank you!

CodePudding user response:

With requests 2.26.0 on macOS 12.0.1 and Python 3.9.9, I found that the website requires Accept-Encoding in the headers. This works as expected for me:

import requests

headers = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 11_5) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1.2 Safari/605.1.15',
    'Accept': 'application/json',
    'Connection': 'keep-alive',
    'Accept-Encoding': 'gzip, deflate, br'
}


with requests.Session() as session:
    (r := session.get('https://www.overstock.com/api/product.json?prod_id=10897789', headers=headers)).raise_for_status()
    print(r.json())
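
One caveat about 'br' in Accept-Encoding: it invites a Brotli-compressed reply, and urllib3 (which requests uses underneath) only decodes Brotli when the optional brotli or brotlicffi package is installed. A minimal sketch that advertises only what the current environment can decode (the function name supported_accept_encoding is made up for this example):

```python
import importlib.util


def supported_accept_encoding():
    """Build an Accept-Encoding value limited to what this environment can decode."""
    encodings = ["gzip", "deflate"]  # both are handled via the standard library's zlib
    # Brotli decoding depends on an optional third-party package being importable.
    if any(importlib.util.find_spec(name) for name in ("brotli", "brotlicffi")):
        encodings.append("br")
    return ", ".join(encodings)
```

The result can replace the hard-coded string, e.g. 'Accept-Encoding': supported_accept_encoding() in the headers dict.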

CodePudding user response:

It was all due to being IP blocked.

Here is ultimately the script that saved the day...

import requests

url = "https://www.overstock.com/api/product.json?prod_id=10897789"

headers = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 11_5) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1.2 Safari/605.1.15',
    'Accept': 'application/json',
    'Connection': 'keep-alive',
    'Accept-Encoding': 'gzip, deflate, br'
}

http_proxy  = "http://ip:port"
https_proxy = "http://ip:port"

proxyDict = {
              "http"  : http_proxy,
              "https" : https_proxy
            }

r = requests.get(url, headers=headers, proxies=proxyDict)
result = r.json()
print(result)
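
The "http://ip:port" strings above are placeholders. A minimal sketch of building the same proxies mapping from separate parts, assuming an HTTP proxy with optional username/password authentication (build_proxies and the host proxy.example are hypothetical, not part of requests):

```python
from urllib.parse import quote


def build_proxies(host, port, username=None, password=None):
    """Return a proxies mapping in the shape requests.get(proxies=...) expects."""
    credentials = ""
    if username and password:
        # Percent-encode credentials so characters like '@' or ':' do not break the URL.
        credentials = "{}:{}@".format(quote(username, safe=""), quote(password, safe=""))
    proxy_url = "http://{}{}:{}".format(credentials, host, port)
    # Both schemes are routed through the same HTTP proxy, as in the script above.
    return {"http": proxy_url, "https": proxy_url}
```

Usage would look like requests.get(url, headers=headers, proxies=build_proxies("proxy.example", 8080), timeout=10). Passing a timeout is worth considering with proxies, since requests has no default timeout and a dead proxy can otherwise hang the call.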

Thank you all for the group effort!

After seeing that this worked for @JCaesar, @diggusbickus, @balmy, and @Philippe, I realized the only stone left unturned was the IP address. Once I routed the request through rotating residential proxy IPs, I got the data immediately.

Thanks to @JCaesar for pointing out 'Accept-Encoding'; without it, the request would not work at all. Thanks also to @diggusbickus for the comment about the walrus operator (:=), which only parses on Python 3.8 or newer; without that, I would have kept assuming the Python version was the problem and kept upgrading it.
