LIMIT is supported. Is it possible to add OFFSET as well? I understand I can do that manually but having native support would be nice. The downside of manual paging is more data transmitted over the wire from server to client.
We can add this, but performing pagination using this method is extremely inefficient, especially for an analytic database like Presto that often has to perform a full table or partition scan. Additionally, the results will not necessarily be consistent between queries, so you can have duplicate or missing results when navigating between pages. It is better for the application to cache the result set and perform pagination itself using the cached data.
Here is a good article on the subject: http://use-the-index-luke.com/no-offset
Thanks. Definitely agree it's a better solution and have dealt with similar solutions on other platforms. We're working behind a legacy interface that allows arbitrary pagination, so that gives us some challenges using the stated approach. We've got some workarounds planned for now until we can rework the interface. Cheers.
Hi @jcowanpdx ,
would you please share your workaround approach,
thanks very much:)
we're facing such problem like you;
For the record, the standard syntax is:
<query expression> ::=
[ <with clause> ]
<query expression body>
[ <order by clause> ]
[ <result offset clause> ]
[ <fetch first clause> ]
<result offset clause> ::= OFFSET <offset row count> { ROW | ROWS }
<offset row count> ::= <simple value specification>
A
<query expression>may contain both a<result offset clause>and a<fetch first clause>, in which case the<result offset clause>is applied first, followed by the<fetch first clause>.i) If
<result offset clause>is not specified, then let RORC be 0 (zero).b) If RORC is less than 0 (zero), then an exception condition is raised: data exception — invalid row count in result offset clause.
I really I'd like to hear what's the best recommendation around that.
I'm using AWS Athena which I suppose is backed by PrestoDB under the hood, and I'm needing a mechanism to do this sort of pagination, the tricky thing is that my CSV format by itself doesn't include a sequential field like an id or a created_at that I could filter for it manually using WHERE and then applying the LIMIT clause accordingly.
Including the id is possible on the CSV, but I really didn't want to do that mandatorily.
This issue has been automatically marked as stale because it has not had any activity in the last 2 years. If you feel that this issue is important, just comment and the stale tag will be removed; otherwise it will be closed in 7 days. This is an attempt to ensure that our open issues remain valuable and relevant so that we can keep track of what needs to be done and prioritize the right things.
Most helpful comment
We can add this, but performing pagination using this method is extremely inefficient, especially for an analytic database like Presto that often has to perform a full table or partition scan. Additionally, the results will not necessarily be consistent between queries, so you can have duplicate or missing results when navigating between pages. It is better for the application to cache the result set and perform pagination itself using the cached data.
Here is a good article on the subject: http://use-the-index-luke.com/no-offset