It looks like flask.json does not allow the user to specify an alternative json library (such as rapidjson). Instead it just does this:
try:
from itsdangerous import simplejson as _json
except ImportError:
from itsdangerous import json as _json
This currently un-overridable behavior also means that, due to differences in when default(..) is called between simplejson and json, bugs can mysteriously appear when simplejson is installed that didn't appear without it.
Would you consider giving the user some control over which json library is used?
In the meantime, it might be possible to work around this by doing something like:
import flask.json
from rapidjson import dumps
flask.json.dumps = dumps
during app initialization / setup.
Except unfortunately flask.json.jsonify assumes that dumps accepts a separators argument and doesn't give the caller the ability to suppress passing that:
separators = (',', ':')
if current_app.config['JSONIFY_PRETTYPRINT_REGULAR'] \
and not request.is_xhr:
indent = 2
separators = (', ', ': ')
# Note that we add '\n' to end of response
# (see https://github.com/mitsuhiko/flask/pull/1262)
rv = current_app.response_class(
(dumps(dict(*args, **kwargs), indent=indent, separators=separators),
and unfortunately rapidjson.dumps does not currently accept a separators kwarg (see https://github.com/kenrobbins/python-rapidjson#incompatibility).
Would you accept a patch to make this more flexible?
This currently un-overridable behavior also means that, due to differences in when default(..) is called between simplejson and json, bugs can mysteriously appear when simplejson is installed that didn't appear without it.
Wouldn't you have similar issues when you insert something like rapidjson? How can you be sure Flask or any of its extensions don't malfunction when using _arbitrary_ JSON libraries (instead of potentially two)?
Thanks for taking a look at this @untitaker. The idea is the user wouldn't have to worry about malfunction caused by arbitrary JSON libraries, because with the proposed changes, they could actually control which library is being used, and not worry about subtle side effects caused by transitive dependencies changing behavior unexpectedly.
With the current situation, it's possible to have your app working fine one minute, and then you install some package which installs simplejson as a transitive dependency, and then all of a sudden your app breaks in some unrelated place. (This happened to me.:)
When this code was originally written, was the json library landscape different, and now that there are more good options, Flask should allow the user to override its default?
To be clear, the current try/except ImportError code leading to unexpected results is a separate (but related) issue. But by removing the silent fallback import, just having one good, deterministic default, and allowing the user to explicitly opt in if they want something different, we can address both issues at once in an elegant way.
Does that make any sense?
I worry about different breakage, concretely Flask extensions not being prepared for using any JSON lib other than json/simplejson.
I wonder if we should simply decide on exactly one json lib.
When this code was originally written, was the json library landscape different
It just occurred to me that that code could have been written that way since at the time it was very important to support Python 2.5, which predates the inclusion of json in the stdlib. If Flask's user base is now primarily on 2.6+, it sounds like a consistency win to at least change the order of the imports in the try/except.
But if we're talking about changes for 1.0, I think that could be a reasonable time to converge on stdlib json as the default Flask uses, and if any other json library is needed (either because you're in the minority still stuck on 2.5 or you need the speed of rapidjson or whatever else), Flask offers a setting for it.
Would also allow removing crap like this I think?
The problem is that we have to prefer simplejson over json because json has various problems in different versions of Python which we so far got away with by using simplejson when available. I can see that this is an issue. Do you have some examples of what's different with default? Maybe we can fix this somehow in Flask.
any strong reasons for not requiring simplejson? (besides having to add a new dependency)
json has various problems in different versions of Python which we so far got away with by using simplejson when available
Ah, thanks for explaining Armin. From the slash escape thing it looks like different versions of simplejson can cause annoying differences too?
any strong reasons for not requiring simplejson?
I guess this would at least allow Flask to peg to a specific version range to have better consistency guarantees, yeah?
Do you have some examples of what's different with default?
Before I had simplejson installed, I was providing Flask a custom JSON encoder class that overrode default to work around rethinkdb/rethinkdb#3301 and the workaround worked. With simplejson installed, the r.binary values that were causing the issue were getting intercepted by this code (since isinstance(r.binary, bytes)), so they no longer were getting passed to my overridden default method, causing encoding errors.
Does this example demonstrate a need for better consistency guarantees in the json handling? Thanks so much again for taking a look.
What's super annoying about all of it is that there are 4 modes:
All are different and the whole point of flask.json was to create a consistent experience on top. I would love to peg against simplejson but I would not be surprised there are some new discoveries by doing that we were not aware before. Maybe we should add some tests to figure out the behavior we want and then check which gets the closest.
Ugh, that is super annoying. Sounds like it would be nice if Flask could just depend on simplejson (or some other library) to provide the consistent experience across environments.
Given how hard it's been to get consistent behavior across all 4 modes, is 1.0 an appropriate time to change strategies: Instead of trying to fix everything for everyone, settle for whatever is the single best (i.e. least bad) default, and give users a way to override?
Not sure if it is, but if it's possible to put more control in users' hands and lighten Flask's maintenance burden at the same time, no one should fault you for it.
I'm happy to make a decision on picking one of those. Unfortunately it has never been quite clear which version is good to decide on. On 3.x for a long time json was the better version until simplejson was ported. On 2.x simplejson always looked like the better plan.
Here is the tricky part though: json in the stdlib is based on unicode strings whereas flask.json and simplejson is based on bytes. So there will always be some trickery needed. I need to see what I can do there.
Ah, sorry to hear it Armin. Thanks for giving this your attention and I'm sure you'll come up with a good plan (and if there's any way we can help let us know).
I tentatively added this to the 1.0 milestone because it might be a good time to introduce any breaking changes around the json handling. I added it to serve as a reminder to see if there's any obvious changes, not with the intent of holding up the 1.0 release until this is solved.
json in the stdlib is based on unicode strings whereas flask.json and simplejson is based on bytes. So there will always be some trickery needed.
So if we required simplejson, we wouldn't need some of this trickery?
After reading this thread - it seems to me that
1) simplejson should be a direct dependency, and the default json library used in Flask
2) Being able to swap out simplejson for rapidjson/ujson, etc is desired by some, but the behavior would be undefined.
3) jsonify() should accept all args that can be passed to the encoder, so custom encoder classes can be used.
What's the best way to make progress on this for PyCon sprinting? Should a patch that includes simplejson as the default be the starting point?
Frankly I don't think this issue is appropriate for a sprint, we have too little information on which to act on.
If we depended on simplejson we might be able to unify behavior of Python 2 vs 3 more easily, at the expense of
@untitaker Dependencies are cheap. I also doubt that simplejson is less used than the stdlib one. Internally to my company, it's the default implementation, handling billions of Kafka messages a day being processed by Python.
What code would break that has been working fine with Python 2 already? I didn't see that above.
If this bug isn't going to get any traction, I think an executive decision to close it should be made, or push it off to a non-1.0 milestone.
@dsully I missed the word potentially -- in principle swapping out an entire library will cause a few edgecase behavior changes
I'm not really against adding simplejson as hard dependency though.
@mitsuhiko What do you think?
According to the tests (#1869), everything passes with simplejson on all supported implementations. What is simplejson gaining us over the built-in json? @mitsuhiko mentioned bytes vs. unicode above, but we don't seem to be doing anything specific for either library in that regard.
@davidism we support both bytes and unicode and we unify the interface no matter what's the underlying implementation doing.
I'd love to continue and resolve this discussion. Have there been any more developments since the last comment? The simplejson vs json issue seems prevalent enough that there should be established ways to support both.
I would like to work on this issue. This has not yet been resolved, right?
It hasn't been resolved, although I'm not sure what a solution would look like for this. If you have an idea about how to allow other libraries, I'd be interested.
I think @dsully's https://github.com/pallets/flask/issues/1602#issuecomment-223639685 summarizes a rather good plan.
- simplejson should be a direct dependency, and the default json library used in Flask
- Being able to swap out simplejson for rapidjson/ujson, etc is desired by some, but the behavior would be undefined.
- jsonify() should accept all args that can be passed to the encoder, so custom encoder classes can be used.
I would just default to stdlib's json instead. It's stable and it works. Yes, there's difference in performance between python version but so what...
Switching to alternative implementation (ujson, simplejson and alike) should be programmatically specified (=no auto-detection). Then it's super clear for users what implementation they are actually using.
You can mention/define that stdlib's json is what Flask supports 100%. 3rd party libs not so much (afaik they will work for your standard use cases, but maybe not for exotic ones).
Or was there a technical reason why this approach wouldn't be suitable? Or was there already a decision? I am not sure what I'm reading...
Any update on this issue? Facing the same problem: TypeError: Object of type 'Decimal' is not JSON serializable
Has anyone got any temporary solutions that worked?
Willing to work on this!
@shivam6294 that you can solve by setting a custom json_encoder.
The only interface I can think of for this, since it needs to work at import time, is to use an entry point. If a package defines a [flask.json] provider entry point, we'd try to import that as the JSON library first. It would still need to conform to the built-in json API, but that could be accomplished by pointing at a module that wraps the other library.
I'd be willing to put this in as long as it comes with a huge "this is completely unsupported" warning.
I've noticed that the PyPy simplejson Travis jobs are much slower than the cpython json jobs. Probably better to leave that optional. It's documented in the installation docs now, so it's at least more visible.
I was interested in checking out where the new installation docs talk about
json libraries as you mentioned, so I checked
http://flask.pocoo.org/docs/dev/installation/ and
https://github.com/pallets/flask/commits, but couldn't immediately find it.
Do you have a link handy?
On Tue, May 30, 2017 at 18:17 David Lord notifications@github.com wrote:
I've noticed that the PyPy simplejson Travis jobs are much slower than the
cpython json jobs. Probably better to leave that optional. It's documented
in the installation docs now, so it's at least more visible.—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
https://github.com/pallets/flask/issues/1602#issuecomment-305024695, or mute
the thread
https://github.com/notifications/unsubscribe-auth/AAD94Jh3TpPhUKDsgk0NRWBFvIGujdpBks5r_JV3gaJpZM4GZobz
.
I started implementing the entry point idea I described above, but I really didn't like it. The only alternative library mentioned here was ujson, and it was already causing problems since it didn't have a JSONEncoder class. If someone has a better idea for supporting this, please make a new issue or patch with details. Closing this because it hasn't seen any development.
~~~python
_json = None
_override = next(
pkg_resources.iter_entry_points('flask', 'json_provider'), None
)
if _override is not None:
try:
_json = _override.load()
except ImportError:
pass
if _json is None:
from itsdangerous import json as _json
~~~
Thanks for documenting the current behavior, @davidism.
I don't know, json encoding/decoding is not needed during import-time, so couldn't we just pass the choice of json library as option to Flask app? jsonify(), for example, already checks app property (current_app.config['JSONIFY_PRETTYPRINT_REGULAR']). @davidism Could you clarify the need for entry point?
The flask.json functions can be used at any time, not just in an app context. So the import needs to happen independent of the app. Also:
The only alternative library mentioned here was ujson, and it was already causing problems since it didn't have a JSONEncoder class.
Like I said, if anyone has a more concrete idea, please submit a patch or detailed issue.
Yeah, but do you _need_ to use flask.json outside app context? What's the use case?
It's not about ujson or any other 3rd implementation - even with only json and simplejson, the latter gets auto-picked now if it's available, which may not be what the user wants - thus the suggestion to allow manually choosing what implementation to use.
Yeah, but do you need to use flask.json outside app context? What's the use case?
It's regularly used outside the app context. In particular because people already use it to load config data to configure the application itself.
I also argue in favour of not auto-picking libs. I derive and set a custom encoder from stdlib's json.JSONEncoder which throws after pip installing simplejson in a conda environment:
...
File "/home/moonshot/miniconda3/envs/fat-ml/lib/python3.6/site-packages/simplejson/__init__.py", line 397, in dumps
**kw).encode(obj)
TypeError: __init__() got an unexpected keyword argument 'encoding'
I have the same problem as @ahirner. Is there a way to get jsonify to work when simplejson is installed?
It does work when simplejson is installed. All tests pass with and without simplejson.
This issue is resolved. If you've encountered a new bug with Flask's code, please open a new issue with a minimal reproducible example demonstrating the issue.
Most helpful comment
I think @dsully's https://github.com/pallets/flask/issues/1602#issuecomment-223639685 summarizes a rather good plan.
I would just default to stdlib's
jsoninstead. It's stable and it works. Yes, there's difference in performance between python version but so what...Switching to alternative implementation (
ujson,simplejsonand alike) should be programmatically specified (=no auto-detection). Then it's super clear for users what implementation they are actually using.You can mention/define that stdlib's
jsonis what Flask supports 100%. 3rd party libs not so much (afaik they will work for your standard use cases, but maybe not for exotic ones).Or was there a technical reason why this approach wouldn't be suitable? Or was there already a decision? I am not sure what I'm reading...