Postgraphile: Pagination via cursor on custom query functions does not work (4.4.x)

Created on 29 Oct 2019  ·  6Comments  ·  Source: graphile/postgraphile

I'm submitting a ...

  • [x] bug report
  • [ ] feature request
  • [ ] question

PostGraphile version: 4.4.4

Minimal SQL file that can be loaded into a clean database:

CREATE OR REPLACE FUNCTION filter_articles(
    published_before timestamp with time zone DEFAULT NULL,
    published_after timestamp with time zone DEFAULT NULL,
    published_last text DEFAULT NULL,
    search_plain text DEFAULT ''::text,
    search_tsquery text DEFAULT ''::text,
    category_filter text DEFAULT ''::text,
    tags_filter text[] DEFAULT ARRAY[]::text[],
    tag_ids_filter text[] DEFAULT ARRAY[]::text[],
    districts_filter district[] DEFAULT ARRAY[]::district[],
    capability_filter capability[] DEFAULT ARRAY[]::capability[],
    reference_point point DEFAULT POINT(-1, -1),
    distance double precision DEFAULT 10,
    publication_priority_filter priority DEFAULT NULL::priority,
    revision_by text DEFAULT ''::text,
    status_filter status DEFAULT NULL,
    push_notification_filter push_notification[] DEFAULT ARRAY[]::push_notification[],
    only_templates boolean DEFAULT false,
    sort_by filter_articles_sort DEFAULT NULL)
 RETURNS SETOF articles
 LANGUAGE plpgsql
 STABLE
  …

Steps to reproduce:

Current behavior:
every GraphQL query trying to paginate on that function returns the same value for pageInfo.endCursor. Downgrading postgraphile to 4.3.3 solved the problem for us.

Expected behavior:
pageInfo.endCursor should return a different value, for every value the of after

🐛 bug

Most helpful comment

Fixed in https://github.com/graphile/graphile-engine/pull/559

That was... unpleasant. Somebody needs to buy me a beer :rofl:

All 6 comments

Reproducing using this function from the kitchen-sink-schema:

create function c.table_set_query() returns setof c.person as $$ select * from c.person $$ language sql stable;
{
  tableSetQuery(first: 3){
    edges {
      cursor
      node {
        id
      }
    }
    pageInfo {
      startCursor
      endCursor
    }
  }
}

gives

{
  "data": {
    "tableSetQuery": {
      "edges": [
        {
          "cursor": "WyJuYXR1cmFsIiwxXQ==",
          "node": {
            "id": 1
          }
        },
        {
          "cursor": "WyJuYXR1cmFsIiwyXQ==",
          "node": {
            "id": 2
          }
        },
        {
          "cursor": "WyJuYXR1cmFsIiwzXQ==",
          "node": {
            "id": 3
          }
        }
      ],
      "pageInfo": {
        "startCursor": "WyJuYXR1cmFsIiwxXQ==",
        "endCursor": "WyJuYXR1cmFsIiwzXQ=="
      }
    }
  }
}

So far so good.

{
  tableSetQuery(first: 3, after:"WyJuYXR1cmFsIiwzXQ=="){
    edges {
      cursor
      node {
        id
      }
    }
    pageInfo {
      startCursor
      endCursor
    }
  }
}

returns

{
  "data": {
    "tableSetQuery": {
      "edges": [
        {
          "cursor": "WyJuYXR1cmFsIiw0XQ==",
          "node": {
            "id": 4
          }
        },
        {
          "cursor": "WyJuYXR1cmFsIiw1XQ==",
          "node": {
            "id": 5
          }
        },
        {
          "cursor": "WyJuYXR1cmFsIiw2XQ==",
          "node": {
            "id": 6
          }
        }
      ],
      "pageInfo": {
        "startCursor": "WyJuYXR1cmFsIiw0XQ==",
        "endCursor": "WyJuYXR1cmFsIiw2XQ=="
      }
    }
  }
}

This is working as expected.

Please can you file a minimal SQL file that reproduces the issue? Your current example is incomplete.

I'm seeing differences between plpgsql functions and sql functions. Taking your kitchen sink example function and creating an equivalent plpgsql function reproduced the issue that I believe OP is seeing.

Function:

create or replace function c.table_set_query_1171() returns setof c.person as
  $$
  begin
    return query select * from c.person;
    return;
  end
  $$ language plpgsql stable;

Running this with a 2 row limit (data from master only had 5 rows).

query Test3{
  tableSetQuery1171(first: 2){
    edges {
      cursor
      node {
        id
      }
    }
    pageInfo {
      startCursor
      endCursor
    }
  }
}

gives

{
  "data": {
    "tableSetQuery1171": {
      "edges": [
        {
          "cursor": "WyJuYXR1cmFsIiwxXQ==",
          "node": {
            "id": 1
          }
        },
        {
          "cursor": "WyJuYXR1cmFsIiwyXQ==",
          "node": {
            "id": 2
          }
        }
      ],
      "pageInfo": {
        "startCursor": "WyJuYXR1cmFsIiwxXQ==",
        "endCursor": "WyJuYXR1cmFsIiwyXQ=="
      }
    }
  }
}

So far so good here as well.

query Test4{
  tableSetQuery1171(first: 2, after:"WyJuYXR1cmFsIiwzXQ=="){
    edges {
      cursor
      node {
        id
      }
    }
    pageInfo {
      startCursor
      endCursor
    }
  }
}

returns

{
  "data": {
    "tableSetQuery1171": {
      "edges": [
        {
          "cursor": "WyJuYXR1cmFsIiwxXQ==",
          "node": {
            "id": 3
          }
        },
        {
          "cursor": "WyJuYXR1cmFsIiwyXQ==",
          "node": {
            "id": 4
          }
        }
      ],
      "pageInfo": {
        "startCursor": "WyJuYXR1cmFsIiwxXQ==",
        "endCursor": "WyJuYXR1cmFsIiwyXQ=="
      }
    }
  }
}

Note the repeated cursor values compared to the first page of results. The test data I loaded had 5 rows so this problem makes retrieving page 3 impossible via the cursor.

Fascinating.

Can confirm; have reproduced with this function:

create function c.table_set_query_plpgsql() returns setof c.person as $$ begin return query select * from c.person; end $$ language plpgsql stable;

Screenshot_20191106_193637

The SQL is subtly different.

SQL:

with __local_0__ as (
  select to_json(
    (
      json_build_object(
        '__identifiers'::text,
        json_build_array(__local_1__."id"),
        '@node'::text,
        (
          json_build_object(
            'id'::text,
            (__local_1__."id"),
            'name'::text,
            (__local_1__."person_full_name")
          )
        )
      )
    )
  ) as "@edges",
  to_json(
    json_build_array(
      'natural',
      (
        row_number( ) over (partition by 1)
      )
    )
  ) as "__cursor"
  from (
    select __local_1__.*
    from "c"."table_set_query_plpgsql"( ) as __local_1__
    where (TRUE) and (TRUE)
    limit 2 offset 2
  ) __local_1__
),
__local_2__ as (
  select json_agg(
    to_json(__local_0__)
  ) as data
  from __local_0__
)
select coalesce(
  (
    select __local_2__.data
    from __local_2__
  ),
  '[]'::json
) as "data"

plpgsql:

with __local_0__ as (
  select to_json(
    (
      json_build_object(
        '__identifiers'::text,
        json_build_array(__local_1__."id"),
        '@node'::text,
        (
          json_build_object(
            'id'::text,
            (__local_1__."id"),
            'name'::text,
            (__local_1__."person_full_name")
          )
        )
      )
    )
  ) as "@edges",
  to_json(
    json_build_array(
      'natural',
      (
        row_number( ) over (partition by 1)
      )
    )
  ) as "__cursor"
  from "c"."table_set_query"( ) as __local_1__
  where (TRUE) and (TRUE)
  limit 2 offset 2
),
__local_2__ as (
  select json_agg(
    to_json(__local_0__)
  ) as data
  from __local_0__
)
select coalesce(
  (
    select __local_2__.data
    from __local_2__
  ),
  '[]'::json
) as "data"

diff:

--- a.sql   2019-11-06 19:48:43.511473647 +0000
+++ b.sql   2019-11-06 19:46:44.711593022 +0000
@@ -24,12 +24,9 @@
       )
     )
   ) as "__cursor"
-  from (
-    select __local_1__.*
-    from "c"."table_set_query_plpgsql"( ) as __local_1__
-    where (TRUE) and (TRUE)
-    limit 2 offset 2
-  ) __local_1__
+  from "c"."table_set_query"( ) as __local_1__
+  where (TRUE) and (TRUE)
+  limit 2 offset 2
 ),
 __local_2__ as (
   select json_agg(

We have to run the PL/pgSQL function inside a subquery, and interestingly we add the limit/offset THERE rather than in the parent. This is almost certainly a performance optimisation because we don't have to evaluate the rows at all until we've already done the limiting. However it messes up the row_number() logic that we're doing above because it no longer factors in the offset.

Fixed in https://github.com/graphile/graphile-engine/pull/559

That was... unpleasant. Somebody needs to buy me a beer :rofl:

Was this page helpful?
0 / 5 - 0 ratings

Related issues

outsidenote picture outsidenote  ·  4Comments

mrbarletta picture mrbarletta  ·  5Comments

giacomorebonato picture giacomorebonato  ·  3Comments

tonyhschu picture tonyhschu  ·  3Comments

calebmer picture calebmer  ·  3Comments