Requests: Using a `json.JSONDecoder` fails when simplejson is present

Created on 22 Oct 2018  Â·  7Comments  Â·  Source: psf/requests

It appears that when simplejson is in the environment requests preferentially imports it. However, the arguments accepted by JSONDecoder are inconsistent between the standard library json and simplejson leading to errors when using the standard json.JSONDecoder when simplejson is in the environment.

Your documentation for response.json()says:

Parameters: | **kwargs – Optional arguments that json.loads takes.

The documentation for json.loads says:

json.loads(s, *, encoding=None, cls=None, object_hook=None, parse_float=None, parse_int=None, parse_constant=None, object_pairs_hook=None, **kw)

and

To use a custom JSONDecoder subclass, specify it with the cls kwarg; otherwise JSONDecoder is used. Additional keyword arguments will be passed to the constructor of the class.

I expected to be able to use a custom json.JSONDecoder without issue and indeed this works on Python 2.7, but in Python 3.7 it fails.

I can see that the issue of preferentially importing simplejson has been raised a few times e.g.
https://github.com/requests/requests/pull/2516
https://github.com/requests/requests/issues/3052

and that it is slated for removal in Requests 3, which I guess will resolve this issue.

If it is not possible to resolve this issue some other way in Requests 2.x then it would be nice if the documentation around response.json() was updated to make it clear that the arguments to json.loads could be either simplejson.loads or the standard lib json.loads depending on the environment since the preferential import is effectively undocumented and non-obvious.

Expected Result

A successful JSON load when using r.json(cls=json.JSONDecoder).

Actual Result

    print(r.json(cls=json.JSONDecoder))
  File ".../python3.7/site-packages/requests/models.py", line 897, in json
    return complexjson.loads(self.text, **kwargs)
  File ".../python3.7/site-packages/simplejson/__init__.py", line 535, in loads
    return cls(encoding=encoding, **kw).decode(s)
TypeError: __init__() got an unexpected keyword argument 'encoding'

Reproduction Steps

pip install requests
pip install simplejson
import json
import requests

r = requests.get('http://localhost:5984')
print(r.json(cls=json.JSONDecoder))

System Information

Note this issue does not occur in Python 2.7.x.

$ python -m requests.help
{
  "chardet": {
    "version": "3.0.4"
  },
  "cryptography": {
    "version": ""
  },
  "idna": {
    "version": "2.7"
  },
  "implementation": {
    "name": "CPython",
    "version": "3.7.0"
  },
  "platform": {
    "release": "18.0.0",
    "system": "Darwin"
  },
  "pyOpenSSL": {
    "openssl_version": "",
    "version": null
  },
  "requests": {
    "version": "2.20.0"
  },
  "system_ssl": {
    "version": "1000210f"
  },
  "urllib3": {
    "version": "1.24"
  },
  "using_pyopenssl": false
}

Most helpful comment

I ran into this problem when we added a third-party library that had a transitive dependency on simplejson. That consequently broke all of our error handling, because requests now started throwing simplejson.errors.JSONDecodeError instead of json.JSONDecodeError that we had catches for. Judging from the referenced issues above, more users are affected.

All 7 comments

Hi @ricellis, thanks for revisiting this. I don't believe we'll be seeing this removed in the near future until 3.0.0 is ready. If you'd like to make a small addendum to the documentation about this, we'd be happy to review it.

Thanks @nateprewitt I can update the docs, but would you be willing to accept a patch that protects people from the error if they are expecting to be using json, but simplejson happens to be present in the env (probably unbeknownst to them).

For example adding a check to Response.json() that uses the built-in json package if cls extends the built-in json.decoder.JSONDecoder. Something like this https://gist.github.com/ricellis/0690ebfc7f1a56df64df87c7a5251e77

Your proposed change breaks the deterministic behaviour of that function which is not ideal in the slightest. I don't think we can accept that.

breaks the deterministic behaviour of that function

It's annoying that the json() method could end up using a different module to what is used elsewhere, but that seems better than an error to me. It is deterministic though, the outputs are governed by the inputs and I think it only impacts the case that is currently broken:

| simplejson | cls | current | proposal |
| --- | --- | --- | --- |
yes | extends json.JSONDecoder | Error | Decoded by json using cls |
yes | None | Decoded by simplejson | Decoded by simplejson |
no | extends json.JSONDecoder | Decoded by json using cls | Decoded by json using cls |
no | None | Decoded by json | Decoded by json|

not ideal in the slightest

It was just a first stab at something that might work and I was trying to contain the change to just the specific area of the problem.

I agree it is not the cleanest thing to do, but neither is swapping the JSON module with an API incompatible one. That's where this issue is different from the others that have been reported previously around the preferential import - this has identified a place where simplejson and json are not API compatible yet requests exposes that API to the end user through the Response.json() kwargs.

That in itself might not be a problem when the requests caller is also in control of the module environment. As you can see from our linked issue though the problem for us comes when a user of our library includes another module that itself has a dependency on simplejson. We are using the requests API in good faith, but it is broken by a transitive dependency over which we have no control. I think the discussion around simplejson that has preceded this issue didn't have to account for that.

Perhaps someone can think of a better option than my proposal or maybe an option to disable the simplejson preferential import could be revisited in light of this particular scenario?

Or if you're not interested in fixing 2.x at all then just say so and I'll submit a docs patch and spend my time implementing a workaround in our library instead.

@nateprewitt @sigmavirus24 any thoughts on this ^?

Simplejson is slower than stdlib json implementation in python 3, why don't you want to allow to override the default behavior at least?

I ran into this problem when we added a third-party library that had a transitive dependency on simplejson. That consequently broke all of our error handling, because requests now started throwing simplejson.errors.JSONDecodeError instead of json.JSONDecodeError that we had catches for. Judging from the referenced issues above, more users are affected.

Was this page helpful?
0 / 5 - 0 ratings