How can I allow msgpack or JSON encoding in requests/responses without the middleware approach based on Content-Type and Accept headers. See the middleware approach here: https://github.com/tiangolo/fastapi/issues/521
We use msgpack because it's smaller as mentioned in the previous issue, but also because numerical data can be encoded/decoded much faster (1,000x+) with msgpack, (e.g., NumPy, Pandas, PyTorch, ...). In addition, general access to raw byte returns is beneficial without having have to encode/decode Python strings. Basing the encoding off Content-Type and Accept headers is beneficial for clients who do not want to bother with msgpack.
My current solution is as follows.
Prevent FastAPI from automatically serializing NumPy arrays and other binary objects in its jsonable_encoder function:
import numpy as np
from pydantic.json import ENCODERS_BY_TYPE
# Warnings and exclamation marks as this is a bit dangerous and Pydantic global...
ENCODERS_BY_TYPE[np.ndarray] = lambda x: x
...
Custom msgpack encoders/decoders:
def custom_serializer(data: Any) ->bytes:
def array_serialize(obj: Any) -> Any:
# We can provide real msgpack here, but close enough for demonstration
if isinstance(obj, np.ndarray):
return np.ascontiguousarray(obj).tobytes()
return obj
return msgpack.loads(blob, object_hook=msgpack_decode, raw=False)
def custom_deserialize():
...
Allowing msgpack requests without running through JSON can be worked as in #521 with the following:
class MsgpackRequest(Request):
async def json(self) -> Any:
if "application/x-msgpack" in self.headers.getlist("Content-Type"):
self._json = custom_deserialize(self._body)
return await super().json()
Responses are the trickier part without running through JSON as in #521. You can return a custom response class for every route, but that doesn't allow you to parametrize this based off the Accept header as the information never exists within the Response classes [1]. My current solution is to make a new APIRoute class which replaces this function which works fine, but I worry is a bit brittle.
Questions:
1) Is there a simpler way to accomplish the above?
2) Is [1] patchable where we can pass through the headers to the Response class which can then determine its own content serialization (solves step 4)?
3) Is there room for strategic PRs which make other serialization formats beyond JSON easier?
[1] This may not be fully true. Looking at it, I think Starlette allows this, but FastAPI simply does not pass this through.
Thanks for filing this @dgasmith. Like my comment on #521 said, we are in the same boat as you. Working with medium to large sized numerical data APIs. Would love for one of the maintainers to chime in either there or here.
I second that! I am currently working on micro services for neutrino oscillation probability calculations and passing around numpy arrays with hundreds of millions of entries is only doable with workarounds like above.
Most helpful comment
Thanks for filing this @dgasmith. Like my comment on #521 said, we are in the same boat as you. Working with medium to large sized numerical data APIs. Would love for one of the maintainers to chime in either there or here.