[x] I used the GitHub search to find a similar issue and didn't find it.
https://github.com/tiangolo/fastapi/issues/701#issuecomment-552312286
[x] I searched the FastAPI documentation, with the integrated search.
I am dealing with JSON data on the order of 10 MB (up to a hundred MB), which I get directly from a Postgres instance.
This data is stored as JSONB in the database. To fetch this large amount of data without parsing it into a dictionary, I do the following:
items = db_session.query(models.Table.id, cast(models.Table.data, String)).filter_by(id=id).all()
Since I know this data has been properly validated when it was inserted I just use the _construct_ factory from pydantic:
class Item(BaseModel):
    id: int
    data: Union[A, B, C, D]
built_items = [schemas.Item.construct(id=x[0], data=x[1]) for x in items]
Then, on the endpoint I directly return a response using:
starlette.responses.JSONResponse(content=jsonable_encoder(built_items))
But, I still describe the response_model as List[Item] as I need the documentation for this endpoint.
With this strategy I get really good response times, but the original JSON data now comes back encoded as a string rather than as an object.
So the clients of the API have to decode the response many times:
1) The request itself
2) Once for each JSON object retrieved
Is there any good practice on how to tackle this problem?
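To make the double-decoding problem concrete, here is a minimal sketch using only the standard library. The row values are made up for illustration; they stand in for what `cast(models.Table.data, String)` would return.

```python
import json

# Hypothetical row: the JSONB column arrives as text because of the cast to String.
row_id, row_data = 1, '{"kind": "A", "value": 42}'

# Encoding the text as-is embeds it as a JSON *string*, not a JSON object.
body = json.dumps([{"id": row_id, "data": row_data}])

# Client side: first decode the response body...
items = json.loads(body)
first_pass = items[0]["data"]  # still a str

# ...then decode every "data" field a second time.
second_pass = json.loads(first_pass)  # now a dict
```

This is why each client ends up calling its JSON decoder once for the response and once more per item.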
When fetching data that's been pre-encoded in JSON, I generally create a custom response class to skip JSON encoding and validation completely. FastAPI will acknowledge response_model in your route for any subclass of JSONResponse, so all you need to do is this:
from starlette.responses import JSONResponse

class RawJSONResponse(JSONResponse):
    def render(self, content: bytes) -> bytes:
        return content

@get("/foo", response_model=MyModel, response_class=RawJSONResponse)
def foo():
    return """{"raw": "json"}"""
The problem I'm seeing here is that you don't get "raw" JSON from the DB, you get individual JSON elements that still need to be assembled together. What you could always do is use Postgres' json_agg() function to aggregate a column of JSON elements into a JSON array. Since id and data are also stored separately, you'll probably also need something like json_build_object('id', id, 'data', data), and then wrap that in json_agg.
SELECT json_agg(json_build_object('id', id, 'data', data)) FROM mytable WHERE id=5
I don't know how hard that would be to achieve in ORM mode, though.
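If doing the aggregation in SQL is awkward from the ORM, the same assembly can be sketched on the Python side by splicing the already-encoded `data` text into the output without parsing it. This is only an illustration under the assumption that every `data` value is valid JSON text; `assemble_items` and the sample rows are hypothetical names, not part of FastAPI or SQLAlchemy.

```python
from typing import Iterable, Tuple


def assemble_items(rows: Iterable[Tuple[int, str]]) -> bytes:
    """Build a JSON array of {"id": ..., "data": ...} objects without parsing `data`."""
    parts = []
    for item_id, data_json in rows:
        # `data_json` is assumed to already be valid JSON text, so it is spliced in verbatim.
        parts.append('{"id": %d, "data": %s}' % (item_id, data_json))
    return ("[" + ",".join(parts) + "]").encode("utf-8")


# Hypothetical (id, data-as-text) rows, shaped like the query result in the question.
rows = [(1, '{"kind": "A"}'), (2, '{"kind": "B", "n": [1, 2]}')]
body = assemble_items(rows)
```

The resulting bytes can be returned as a response body as-is, so clients decode the payload only once.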
The fastest I achieved was with orjson, having PostgreSQL turn the query result into JSON. You can do that with something like:
class ORJSONResponse(JSONResponse):
    media_type = "application/json"

    def render(self, content: typing.Any) -> bytes:
        return orjson.dumps(content)

queryjson = """select coalesce(array_to_json(array_agg(row_to_json(response))),'[]')::character varying AS BODYFROM from (
YOURQUERYHERE
) response"""
response = await db.fetchrow(queryjson)
return ORJSONResponse(json.loads(response.get("bodyfrom")))
@euri10: I can't imagine why the json.loads call before re-encoding with orjson would be necessary, unless you're doing that to not skip the validation step, but then I'm not really seeing any benefit to making Postgres handle the JSON encoding instead of just returning your data as rows. Is there anything I'm missing?
I think you're right and that you could directly dump into json
https://magicstack.github.io/asyncpg/current/usage.html?highlight=json#example-automatic-json-conversion
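As a rough illustration of why the parse-then-re-encode step is redundant when the database already returns JSON text, the sketch below compares passing the text straight through with a full round trip. `render_raw` is a hypothetical helper playing the role of a response class's `render` method, and the sample string stands in for the `bodyfrom` column.

```python
import json
from typing import Union


def render_raw(content: Union[str, bytes]) -> bytes:
    """Return pre-encoded JSON as response bytes without parsing it."""
    if isinstance(content, str):
        return content.encode("utf-8")
    return content


# Hypothetical value of the "bodyfrom" column returned by the query.
body_from_db = '[{"id": 1, "data": {"kind": "A"}}]'

direct = render_raw(body_from_db)
round_trip = json.dumps(json.loads(body_from_db)).encode("utf-8")
```

Both byte strings decode to the same document; the direct path simply skips the parsing and re-serialization work.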
Thanks, I'll change my code to what you showed me 👍
I was able to fetch a complete json
I used this strategy and it is now really fast 🎊 But now I can't use response_model=List[Item] using Item.construct as the response is already built as a JSON string. Is there any other way other than using pydantic Model.construct, but still get the schema documentation on swagger docs?
But now I can't use response_model=List[Item] using Item.construct as the response is already built as a JSON string.
Are you sure? Setting response_model=List[MyModel] works just fine for me.
Apparently I am doing something wrong because fastapi is not skipping validation with your solution. When I run your code exactly like this:
from starlette.responses import JSONResponse
import pydantic

class CustomJson(JSONResponse):
    def render(self, content: bytes) -> bytes:
        return content

class Item(pydantic.BaseModel):
    id: int

@app.get("/foo", response_model=Item, response_class=CustomJson)
def foo():
    return """{"raw": "json"}"""
I get this error:
File "...\lib\site-packages\fastapi\routing.py", line 72, in serialize_response
raise ValidationError(errors, field.type_)
pydantic.error_wrappers.ValidationError: 1 validation error for Item
response
value is not a valid dict (type=type_error.dict)
Apparently, response_class is not actually doing anything so I have to manually create the CustomJson instance. Is this a bug?
```python
from starlette.responses import JSONResponse
import pydantic

class CustomJson(JSONResponse):
    def render(self, content: bytes) -> bytes:
        return content

class Item(pydantic.BaseModel):
    id: int

@app.get("/foo", response_model=Item)
def foo():
    return CustomJson("""{"raw": "json"}""")
```
@littlebrat This isn't a bug -- FastAPI first takes your return value, then converts it into the specified response_model, then feeds that to the response_class.
The idea is that the custom response class should expect to receive a model, rather than raw bytes.
You can see the logic for this here: https://github.com/tiangolo/fastapi/blob/7a445402d4960d6173d76dac43393ad6c5040521/fastapi/routing.py#L126-L146
Currently, you can get around this by manually returning the response class, which short circuits any processing. (You could put this into a small decorator if you want easier reusability.)
If you think there is a gap about this in the documentation, a PR would be welcome!
Also, if you think the behavior should be different, feel free to open a feature request issue.
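The "small decorator" idea mentioned above could look roughly like the sketch below. It is an illustration only: `returns_raw` is a hypothetical name, and `StandInResponse` is a placeholder used so the snippet runs without FastAPI installed. In a real app you would presumably pass your own Response subclass instead and stack the decorator under the route decorator. The wrapper only handles sync endpoints; an async endpoint would need an async wrapper.

```python
import functools
from typing import Any, Callable


class StandInResponse:
    """Hypothetical placeholder for a Starlette Response subclass; not a real API."""

    def __init__(self, content: Any, media_type: str = "application/json") -> None:
        self.body = content if isinstance(content, bytes) else str(content).encode("utf-8")
        self.media_type = media_type


def returns_raw(response_class: Callable[..., Any]) -> Callable:
    """Wrap a sync endpoint so its return value is handed to `response_class`."""

    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(func)  # keeps the name and signature visible to introspection
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            return response_class(func(*args, **kwargs))

        return wrapper

    return decorator


@returns_raw(StandInResponse)
def foo() -> str:
    return '{"raw": "json"}'


response = foo()
```

Because the endpoint now returns a response object rather than a plain value, the conversion to the response_model described above is skipped, which is the point of the workaround.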
Apparently, response_class is not actually doing anything so I have to manually create the CustomJson instance. Is this a bug?
No, you're right, I forgot about that part, my bad. You actually need to return a pre-instantiated Response object.
Thanks everyone for the help here! :rocket: :cake:
Does that solve it for you @littlebrat ?
Sorry for the delay, yes it did!
Thanks for reporting back @littlebrat ! :rocket: