Postgraphile: Improving RLS support...

Created on 8 May 2018 · 11Comments · Source: graphile/postgraphile

First of all, Postgraphile IS SUPER AWESOME. You are doing an amazing job, I cannot thank you enough!

I'm submitting an improvement request regarding RLS support.

[x] improvement request

PostGraphile version: 4.0.0-beta.5

-- NOTE:
-- Table `public.flow_step` is behind a RLS policy controlled by the user rights.

-- NOTE:
-- Table `public.flow` is also behind a RLS policy. It is controled depending on the `public.flow_step` access.
-- Meaning that if the user has access to the specific `public.flow_step`, he has access to referenced `public.flow` too;
-- Respectively, no access to `public.flow_step` = no access to `public.flow`.

-- NOTE:
-- The `private.set_next_flow_step(uuid)` function allows the user to set the next `public.flow_step` without him
-- knowing what the next step actually is. The function itself is defined as `SECURITY DEFINER` meaning that the
-- function will have access to ALL steps which will allow it to calculate the next one. (My Postgraphile config
-- does not allow introspection of the `private` schema so the function is safe).
CREATE FUNCTION private.set_next_flow_step(
    flow_id uuid
) RETURNS public.flow AS
$$
    WITH next_flow_step AS (
        SELECT fs.id FROM public.flow_step AS fs
        WHERE (
            -- Logic for calculating the next step goes here... --
        )
        LIMIT 1
    )
    UPDATE public.flow
        SET step_id=(SELECT id FROM next_workflow_step)
    WHERE (id = set_next_flow_step.flow_id)
    RETURNING *
$$
LANGUAGE plpgsql VOLATILE SECURITY DEFINER;

-- NOTE:
-- The `public.set_next_flow_step(uuid)` function is the introspected/public version of the above mentioned function.
-- It PERFORMs the `private.set_next_flow_step(uuid)`, re-SELECTs the `public.flow` in focus and RETURNs it. (The re-SELECT
-- is very important for finding out if the user has access to the next step, because the `public.set_next_flow_step(uuid)`
-- function is run with `current_user` rights)
-- IMPORTANT: Since the user might not have access to the next flow step, the result can be either `NULL` or `public.flow`.
CREATE FUNCTION public.set_next_flow_step(
    flow_id uuid
) RETURNS public.flow AS
$$
DECLARE
    next_flow RECORD;
BEGIN
    PERFORM private.set_next_flow_step(set_next_flow_step.flow_id);

    SELECT * INTO next_flow FROM public.flow WHERE (id = set_next_flow_step.flow_id)
    RETURN next_flow;
END;
$$
LANGUAGE plpgsql VOLATILE;

----
-- The CLIENT runs a GraphQL mutation joining the `public.flow` in the response...
----

-- NOTE:
-- Postgraphile will run the above functions in 2 steps.

-- 1. step
-- PARAMETER: $1 = <some-unique-uuid>
with __local_0__ as (
    select __local_1__.* from "public"."set_next_flow_step"($1) __local_1__
)
select (__local_0__)::text from __local_0__

-- 2. step
-- PARAMETER: $1 = <result-of-step-1>
with __local_0__ as (
    select (str::"public"."flow").*
    from unnest(($1)::text[]) str
)
-- ... rest of the query ... --

-- FIXME: 
-- Now the fun part, if the 1. step produced a NULL value (because the user does not have access to the next step),
-- the parmeter will behave like so: $1 = '{"(,,,,,,,,,,)"}', instead of $1 = NULL!

-- This poses HUGE problems like: if `public.flow` has domain columns which are defined with `NOT NULL`,
-- the function will ALWAYS fail throwing the following exception: `domain X does not allow null values`

Hope I provided sufficient information! If not, please ask.

Kind regards,
Denis

Source

enisdenjo

All 11 comments

I... Have no idea what you're asking. Please explain in words what you're after and then I can read the code with a little context.

benjie on 8 May 2018

I realised now, that this does not apply only forthe above RLS implementation... I'll elaborate.

Lets say we have the following domains:

CREATE DOMAIN created_time timestamp WITHOUT TIME ZONE NOT NULL DEFAULT (now() AT TIME ZONE 'UTC');
COMMENT ON DOMAIN created_time IS 'Timestamp of the creation of data entries';

CREATE DOMAIN updated_time timestamp WITHOUT TIME ZONE NOT NULL DEFAULT (now() AT TIME ZONE 'UTC');
COMMENT ON DOMAIN updated_time IS 'Timestamp of the performed update on data entries';

and a table:

CREATE TABLE public.document (
    id    uuid PRIMARY KEY,
    title text,

    created_at created_time,
    updated_at updated_time
);

Obviously the public.document cannot exist without having created_at and updated_at columns set!

Now lets say we have a PL/pgSQL function defined like so:

CREATE FUNCTION public.update_or_delete_document(
    document_id uuid
) RETURNS public.document AS
$$
DECLARE
    next_document RECORD;
BEGIN
    --                                                                                 --
    -- Here goes the logic which decides if the document should be updated or deleted. --
    --                                                                                 --

    SELECT * INTO next_document FROM public.document WHERE id = update_or_delete_document.document_id;
    RETURN next_document; -- next_document is either `public.document` or NULL
END;
$$
LANGUAGE plpgsql VOLATILE;

_I know that an update_or_delete function does not make sense (the RLS implementation above does), but bear with me..._

If a user runs a GraphQL query against Postgraphile like so:

mutation UpdateOrDeleteDocument {
    updateOrDeleteDocument(input:{
        clientMutationId: ""
        documentId: "1c8d6384-993a-4af4-9904-dcb14c5880d4"
    }) {
        clientMutationId
        document {
            id
            title
        }
    }
}

If the function decides to delete the document and return NULL, It will actually raise an exception saying: domain updated_time does not allow null values!

Why?

Simple because Postraphile first runs the following query:

with __local_0__ as (
    select __local_1__.* from "public"."update_or_delete_document"($1) __local_1__
)
select (__local_0__)::text from __local_0__;

which will have (__local_0__)::text set to: (,,,,) if the public.update_or_delete_document(uuid) function returns NULL!

Then runs this one:

with __local_0__ as (
    select (str::"public"."document").*
    from unnest(($1)::text[]) str
)
-- ...rest

_Where $1 is the result of first Postgraphile query_

This means that the last query might raise the above mentioned exception as you cannot unnest (,,,,) to public.document becuase of the defined DOMAINs with NOT NULL constraints!

enisdenjo on 8 May 2018

Ah; so you're saying

with __local_0__ as (
    select __local_1__.* from "public"."update_or_delete_document"($1) __local_1__
)
select (__local_0__)::text from __local_0__;

Should actually be:

with __local_0__ as (
    select __local_1__.* from "public"."update_or_delete_document"($1) __local_1__
)
select (__local_0__)::text from __local_0__ where not (__local_0__ is null);

(Note: foo is not null and not (foo is null) are different.)

I have this check in many places, I can't think of any reason off-hand that we can't have this here for single mutations; I'll have to evaluate this and it could well be that we add it. Setof mutations, on the other hand, are a little more complex.

Actually it should probably be in the select:

with __local_0__ as (
    select __local_1__.* from "public"."update_or_delete_document"($1) __local_1__
)
select (
  case
  when __local_0__ is null then null::text
  else __local_0__::text
  end
) from __local_0__;

benjie on 8 May 2018

👍1

Yes, you got it! 😄

If you take a look at the first comment (usage with RLS), it sure does make sense having this check inside a mutation query!
TLDR: If the RLS restricts seeing the row after the update, the function will return NULL!

Looking at the proposed fixed query:

with __local_0__ as (
    select __local_1__.* from "public"."update_or_delete_document"($1) __local_1__
)
select (
  case
  when __local_0__ is null then null::text
  else __local_0__::text
  end
) from __local_0__

it will work for SETOF out-of-the-box too! Since every __local_0__ is a new row in the result. Even if the result of the CTE is:
(,,,,)
(05bca130-0150-465c-b304-35de5526daf5,,"2018-05-08 15:21:36.512504+02","2018-05-08 15:21:36.512504+02")
(,,,,)

I would really appreciate it if you could put fix this in the next version, as it will solve a lot of issues I am currently facing (and surely for others too).

enisdenjo on 8 May 2018

Yeah; that's why I rewrote it to do that so it works for setof 👍

benjie on 8 May 2018

❤1

So turns out I hit an issue similar to this a few days ago: https://github.com/graphile/graphile-build/issues/217

What's more, it turns out that PostgreSQL 9.4 vs PostgreSQL 10 handle these nulls differently; PG10 behaves a lot better, PG9.4 throws some tantrums in certain edge cases.

Not sure if that affects this issue explicitly yet, but just mentioning it for context.

benjie on 8 May 2018

Minimal SQL to reproduce issue:

create schema c;
create domain c.not_null_timestamp timestamptz not null default now();
create table c.issue756 (
  id serial primary key,
  ts c.not_null_timestamp
);
create function c.issue756_mutation() returns c.issue756 as $$
begin
  return null;
end;
$$ language plpgsql volatile;

Minimal GraphQL to demonstrate issue:

mutation {
  issue756Mutation(input:{}) { issue756 { id ts } }
}

Expected result: success with null payload

Actual result:

[GraphQLError: domain c.not_null_timestamp does not allow null values]

benjie on 8 May 2018

👍1

Just one of these days I'd like to receive an issue that I can fix in like 2 minutes...

https://github.com/graphile/graphile-build/pull/220

benjie on 8 May 2018

👍1

No I feel I have to extend https://github.com/graphile/graphile-build/issues/217 to cover more situations for consistency.

benjie on 8 May 2018

I know the feeling... Does it go really deep? Anything I can help with?

enisdenjo on 9 May 2018

We really need more tests; coming up with what tests to write (given we already have a load) is hard. Help there would be welcome.

benjie on 9 May 2018

Was this page helpful?

0 / 5 - 0 ratings

Related issues

Is @foreignKey available for 2 or more columns in 1 view?

k-ogawa-1988 · 3Comments

can't install with yarn berry

kilianc · 4Comments

Print postgraphql response logs with Express

safaiyeh · 3Comments

Network fetch error apollo client postgraphile

CarlFMateus · 4Comments

[SOLVED] relay errors when parsing postgraphql generated schema

WestleyArgentum · 3Comments