In my own implementation of the benchmark using attrs+cattrs (#543 referenced above), I was getting ~3x faster execution for attrs+cattrs relative to pydantic.
Trying to get to the bottom of why the attrs+cattrs benchmark was faster (and whether anything could easily be changed to catch up), I did some line-by-line profiling. Nothing stood out as an obvious bottleneck, but I did notice that you pay a price for some validations even when they are effectively disabled: when validating a string, for example, anystr_length_validator still gets called even if there are no length restrictions on the field.
@samuelcolvin Is there a way to modify the set of validators that gets called for a given model? That would match my understanding of how attrs+cattrs performs validation, and I would expect it to speed things up if, for instance, you only want type validation. It looked like each field iterates over its own list of validators, but when I touched that list (even just removing items from it), everything surprisingly ran much more slowly.
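To make the cost concrete, here is a toy sketch (not pydantic's internals, just an illustration of the idea) comparing a per-field validator list that always includes a length check against one built only from the constraints actually configured:

```python
# illustrative sketch (not pydantic internals): compare a validator list that
# always includes a length check against one built only from active constraints
import timeit


def anystr_length_check(value, min_length=None, max_length=None):
    # no-op when no constraints are configured, but the call itself still costs
    if min_length is not None and len(value) < min_length:
        raise ValueError('too short')
    if max_length is not None and len(value) > max_length:
        raise ValueError('too long')
    return value


always_called = [str, anystr_length_check]  # length check runs even with no constraints
pruned = [str]                              # length check dropped at "model definition" time


def run(validators, value='hello'):
    for validator in validators:
        value = validator(value)
    return value


print('always called:', timeit.timeit(lambda: run(always_called), number=100_000))
print('pruned list  :', timeit.timeit(lambda: run(pruned), number=100_000))
```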
@sebastianmika for what it's worth, if you are trying to resolve a bottleneck, I was able to get a ~30% speed improvement in the pydantic benchmark by cythonizing just fields.py, main.py and validators.py with zero changes to the underlying source (except that I had to rename uses of include in main.py, as include is reserved in Cython).
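If anyone wants to reproduce the experiment, a build script along these lines should work. It is a hypothetical standalone script, not pydantic's actual setup.py, and assumes a source checkout of pydantic with Cython installed:

```python
# hypothetical standalone build script (not pydantic's own setup.py); assumes a
# source checkout of pydantic and that Cython is installed
from setuptools import setup
from Cython.Build import cythonize

setup(
    name='pydantic-cython-experiment',
    ext_modules=cythonize(
        ['pydantic/fields.py', 'pydantic/main.py', 'pydantic/validators.py'],
        compiler_directives={'language_level': 3},
    ),
)
```

Running `python setup.py build_ext --inplace` drops the compiled extension modules next to the .py files, and they take precedence on import.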
_Originally posted by @dmontagu in https://github.com/samuelcolvin/pydantic/pull/513#issuecomment-495798076_
@dmontagu this is really interesting; I've copied it to a new issue to separate it from the discussion around attrs benchmarks.
Is there an easy way for you to share your profiling results? This is something I've been intending to do for a year or so but haven't had the time to sit down and concentrate on. If we can benefit from your work without having to start from scratch, that would be great.
Regarding cythonizing, I'd be very keen to add this to the pypi release, my initial thoughts/concerns:
* will we need both .py and .pyx? If so, we should minimise the amount of duplicated code
* how many of the system=null installs are actually windows? Either way, windows binaries are too much hassle for me, at least initially
* @dmontagu are you sure you got main.py to compile to cython?
I've been playing around and while fields.py and validators.py worked fine, cythonizing main.py always gives me no validator found for <class 'cython_function_or_method'>
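My guess (just a guess, not verified against pydantic's validator discovery) is that Cython-compiled module-level functions no longer look like plain Python functions, which would break any type-based dispatch:

```python
import types


def plain(value):
    return value


# Under CPython this prints True; the same function compiled by Cython becomes a
# 'cython_function_or_method', so an isinstance check against FunctionType stops
# matching -- which could explain a "no validator found" style error if validator
# discovery relies on a check like this (an assumption, not verified).
print(isinstance(plain, types.FunctionType))
print(callable(plain))  # a looser check that holds in both cases
```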
see #548, feedback much appreciated.
@samuelcolvin Ah yes, I tried to cythonize main.py and I think I ran into precisely the issue you are describing; I can't remember whether I was able to work around it.
One thing I remember noticing from the line profiler was that this check in particular was eating a couple % of execution time, despite being basically irrelevant after model definition. But it looks like that was removed in the cython branch.
I had to hack the codebase to get the line profiler working due in part to conflicts with built-in names (in particular, I think the presence of a file called profile.py next to the benchmark was causing some conflicts). Sadly I nuked my profiler setup but it shouldn't be hard to put it back together. I'll get it working and then share.
No worries, I got line profiling working fine (oddly, I didn't run into the profile.py problem you mentioned).
> But it looks like that was removed in the cython branch.
No, just moved; I'll look at that too.
You were right about extra validators being called; removing them saved a significant amount of time.
@samuelcolvin Here is the output of the line-profiler running the benchmark on my computer: https://gist.github.com/dmontagu/0a8b823c2c5dec3954405a921028f7dd.
* profile_all.log has more functions being line-profiled (as set up in #551)
* profile_last_layer.log has only the "last layer" of functions being line-profiled (basically just commenting out where I added the other functions to the line profiler), so it has less "noise" in the output

If you look at the code in #551 it should be clear how to modify it to run on different functions if you want to profile others. (But it sounds like you might now have your own line-profiling setup.)
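For anyone else who wants to try this, a minimal setup along these lines works (pip install line_profiler); the model and the profiled function below are illustrative, not the exact setup from #551, and per-line output only appears for a pure-Python (non-compiled) pydantic install:

```python
# minimal line_profiler sketch; register the functions whose per-line timings
# you want, run the workload, then print the stats
from line_profiler import LineProfiler

from pydantic import BaseModel


class User(BaseModel):
    id: int
    name: str = 'Jane Doe'


def build_many(n):
    for i in range(n):
        User(id=i, name='x' * 10)


lp = LineProfiler()
lp.add_function(BaseModel.__init__)  # example: profile model construction line by line
lp.runcall(build_many, 10_000)
lp.print_stats()
```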
Next I'll try to run it on the cython branch and add that output.
https://gist.github.com/dmontagu/0a8b823c2c5dec3954405a921028f7dd now contains additional files with the results of line-profiling the cythonized code, analogous to the files described above. Note: line-profiling the cythonized code required some tweaks to setup.py to enable line tracing, but it wasn't too bad (see the comment on #552 for more info).
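The tweaks were roughly the standard Cython recipe for line tracing (not the exact diff from #552): enable the linetrace directive and define the CYTHON_TRACE_NOGIL macro so the generated C code emits line events.

```python
# sketch of a setup.py tweak to make Cython-compiled modules line-traceable
from setuptools import setup
from Cython.Build import cythonize

ext_modules = cythonize(
    ['pydantic/fields.py', 'pydantic/validators.py'],
    compiler_directives={'language_level': 3, 'linetrace': True},
)
for ext in ext_modules:
    # required so the linetrace directive actually produces trace events
    ext.define_macros.append(('CYTHON_TRACE_NOGIL', '1'))

setup(name='pydantic-cython-experiment', ext_modules=ext_modules)
```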
Just to chime in about Windows:
All of FastAPI's users are Pydantic users. I don't expect them all to be power-users.
And I expect many developers would use Windows, only/mainly for development, not for production (I imagine/hope). I personally use Linux, but FastAPI is compatible with all platforms.
I would like to suggest keeping a Windows-compatible version available.
Now, I wouldn't really care much about an optimized version for Windows, just any "vanilla" version (without Cython) that developers can use when developing in Windows.
I expect anyone that wants maximum performance in production would use Linux (and would take advantage of any Cython optimized code). But I would like to suggest keeping a simple Windows-compatible version too.
That aside, I think these efforts to optimize performance to the next level are amazing :tada:
thanks @tiangolo.
From the download numbers it looks like very few people use windows (although maybe the platform=null people are windows, not at all clear, maybe they're macos????). These numbers will of course include fastapi users.
The version in #548 has close to identical behaviour with and without compilation with cython - tests are run with and without cython for every major python version, no tests are skipped in either case.
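(For anyone checking their own install: pydantic v1 exposes a compiled flag, so you can see which build you're actually testing; the getattr guard below is there because I'm assuming the attribute may not exist in every version.)

```python
# quick check of which build is installed; `compiled` exists on pydantic v1,
# the getattr default covers versions where the attribute may be absent
import pydantic

print(pydantic.VERSION, 'compiled:', getattr(pydantic, 'compiled', 'unknown'))
```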
Tests are now also run (without compiling to cython) on windows using Azure Pipelines for all builds.
Regarding building precompiled binaries for windows and macos, there's #555 - maybe no one cares. It's just there to link to from the docs.
Awesome, thanks @samuelcolvin .
As I reported in the PR, I was able to install it in Windows and it works without problems (with the non-Cythonized version). :heavy_check_mark:
So, my use case (FastAPI) is covered. :tada:
Just for completeness, I think there's something weird with the usage report.
Here's a snapshot of the visitors of FastAPI's docs site by OS in the last month:
macOS: 1,751
Windows: 1,648
Linux: 1,008
Others: ...

I know website visitors are not the same as installs, but I would at least expect the two statistics to be somewhat correlated; the share of Windows users in the download numbers shouldn't be that low.
But again, this is just for completeness.
I don't think there's anything else that needs to be done given that Pydantic keeps installing without problems on Windows.
And I expect Windows users would still deploy to Linux, all the more so if they want maximum performance.
Interesting. Downloads from PyPI will be heavily skewed by CI (e.g. Travis will often download packages multiple times for every build), which normally runs on linux.
But windows installs still shouldn't be that low. Either way, I agree we're fine on that front.
@samuelcolvin I ran the benchmarks for these branches using PyPy as well, in case you were interested in the performance.
https://gist.github.com/dmontagu/4b31ce876c0d41a6735b641c65fb7285
That gist has benchmark timings from my local machine for the branches mentioned above.
For the PyPy runs I did 10 repetitions of the benchmark to give the JIT some time to warm up; I'm not sure whether that's enough to see its full potential. PyPy seems to help the other libraries more, relatively speaking, though it does offer some improvement over fully non-cythonized pydantic. The cythonized pydantic is still near the front, though.
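A sketch of the kind of warm-up loop I mean, with a placeholder workload standing in for the real benchmark:

```python
# rough warm-up pattern for a JIT runtime like PyPy: run the same workload
# repeatedly and compare the first (cold) repetition with the later (warm) ones;
# run_benchmark is a placeholder, not the actual benchmark workload
import timeit


def run_benchmark():
    return [str(i) for i in range(10_000)]


timings = timeit.repeat(run_benchmark, repeat=10, number=1)
print('first run :', timings[0])
print('best later:', min(timings[1:]))
```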