Postgrest: Restart required for detecting table relations on schema change

Created on 28 Jan 2016 · 30Comments · Source: PostgREST/postgrest

I have an event_group table with fields group_id and event_id referencing events.id and groups.idrespectively.

The weird thing is that even if hydrating the event IDs works perfectly:

GET /event_group?select=events{*}
[
  {
    "events": {
      "doc": {
     ...
       }
  }
]

When I try to do the same with groups , I receive this error:

GET  /event_group?select=groups{*}
 {
  "hint": null,
  "details": null,
  "code": "42703",
  "message": "column groups.group_id does not exist"
}

why does it expect groups.group_id and not groups.id? I didn't need a event.event_id for events!

Here's the SQL:

CREATE TABLE "1".event_group
(
  group_id character varying(32) NOT NULL,
  event_id character varying(36),
  CONSTRAINT fk_eid FOREIGN KEY (event_id) REFERENCES "1".events (id) MATCH SIMPLE ON UPDATE CASCADE ON DELETE NO ACTION,
  CONSTRAINT fk_groups FOREIGN KEY (group_id) REFERENCES "1".groups (id) MATCH SIMPLE ON UPDATE NO ACTION ON DELETE NO ACTION,
  CONSTRAINT uq_gid_eid UNIQUE (event_id, group_id)
)


CREATE TABLE "1".events
(
  doc jsonb,
  id character varying(36) NOT NULL DEFAULT uuid_generate_v4(),
  CONSTRAINT pkey_eid PRIMARY KEY (id)
)

CREATE TABLE "1".groups
(
  id character varying(32) NOT NULL DEFAULT uuid_generate_v4(),
  name character varying(128) NOT NULL,
  description character varying(800),
  CONSTRAINT group_pkey PRIMARY KEY (id)
)

enhancement

Source

sscarduzio

Most helpful comment

OK how about this: postgrest listens for both SIGHUP and a specially named SQL event and reloads the schema in either situation. In production you would send the server a signal, and in development you create an event trigger on ddl_command_end and have it notify postgrest through the sql event.

In development there would be a snippet of sql that the developer would have to run once in their database which would enable postgrest auto schema change detection.

begriffs on 3 Apr 2016

👍3

All 30 comments

This was fixed by restarting your postgrest server, correct? If so I'll rename this issue to discuss ways for PostgREST to detect schema changes.

begriffs on 29 Jan 2016

Restarting PostgREST has fixed my issue indeed. But honestly, before I consider this resolved, I'd love to see some notes on the documentation that makes this (probably temporary) side effect evident to the users.
Do you want to take care of this or shall I?

sscarduzio on 29 Jan 2016

I'd love your help with the docs if you could.

begriffs on 29 Jan 2016

Here are the options I can see for handling schema change.

Don't cache knowledge of the schema
- When a request comes in that needs to know the db structure, query for the relevant parts of the schema then and there
- :+1: keeps postgrest stateless
- :+1: fast propagation of schema changes to the API
- :-1: may need to break up the consolidated schema detection code into little pieces for each type of check
- :-1: slow and wasteful when the schema doesn't change much (which is probably the common case)
Refresh schema structure cache on a sysadmin signal
- Typically in unix it is SIGHUP, but we could also listen for a database NOTIFY as well so that a stored proc could cause the schema reload
- :+1: Only makes the sql calls when it has to
- :+1: Probably not a whole lot to change about our code
- :-1: You have to remember to SIGHUP just like you currently have to remember to restart, so still potentially confusing
- :-1: Painstaking to do on a cluster of load-balanced postgrests via a unix signal, but maybe OK if they all listen for a db notify
Poll for changes
- :+1: nobody has to remember to restart or send a signal
- :+1: can be implemented _in addition_ to the refresh signals
- :-1: most APIs probably run for a long time without schema change so polling is wasteful
Refresh on ddl_command_end event trigger
- :+1: happens for all and only the kind of schema-altering events we care about
- :+1: available in postgres 9.3+
- :+1: no manual signaling required, and schema changes propagate quickly to the API
- :-1: requires the db admin to create the trigger which translates ddl_command_end into a pg_notify for which postgrest can listen. Maybe there is a way that postgrest can create such a trigger for the duration of its run?
- :+1: supports manual refresh too by issuing a sql notify statement that mimics the one generated by our ddl event trigger

This last option is really nice. Ideally we can have postgrest set it up without any admin intervention. Any options or pros/cons I'm missing?

begriffs on 29 Jan 2016

If you number the options:

No change, keep current behavior: schema knowledge is loaded when postgrest starts
Don't cache knowledge of the schema
Refresh schema structure cache on a sysadmin signal
Poll for changes
Refresh on ddl_command_end event trigger

The options 2 and 4 seem worse than current behavior, 3 and 5 look OK and could probably both be implemented.

If we go for option 5, just like for authentication, it would make sense to keep this feature out of postgrest and implement it as a separate postgrest-supervisor or something.

eric-brechemier on 29 Jan 2016

How about this, on startup have postgrest run this sql:

CREATE OR REPLACE FUNCTION public.notify_ddl_postgrest()
  RETURNS event_trigger
 LANGUAGE plpgsql
  AS $$
BEGIN
  NOTIFY ddl_command_end;
END;
$$;

DROP EVENT TRIGGER IF EXISTS ddl_postgrest;
CREATE EVENT TRIGGER ddl_postgrest ON ddl_command_end
   EXECUTE PROCEDURE public.notify_ddl_postgrest();

Then any schema change will notify on channel ddl_command_end. Also an admin can call public.notify_ddl_postgrest to manually tell postgrest to reload the schema. If the function is available with the appropriate permissions then you could access it over http at /rpc/notify_ddl_postgrest.

The missing piece is that postgrest needs to be able to listen for changes. @nikita-volkov would it be feasible to add support for listen in hasql? Something similar to this?

begriffs on 30 Jan 2016

@begriffs There actually is a PR from Leon Smith with that: https://github.com/nikita-volkov/hasql/pull/43/files . I just need to adapt it a little better.

nikita-volkov on 30 Jan 2016

I think PostgREST produces enough value to be a tad more "invasive" in terms of integration.

I mean, when one wants to use PostgREST, he/she may be fine with the fact this component will require an "installation phase" in which it (automatically) creates triggers, tables and whatever else it needs to provide all these non trivial features.

I imagine the UX would be something along the lines of:

$ postgrest postgres://[email protected]/somedb -a postgres -s "1"
WARNING: PostgREST is not installed in "somedb". Some features might not work as expected.
SOLUTION: restart using the --install option

This way, we can let people opt out the custom triggers and tables if they don't use the advanced features (user management, schema auto-refresh, etc.). But we'd offer a turn-key setup process for the rest of the users (as opposed to "if you want user management, follow this howto..").

sscarduzio on 31 Jan 2016

@sscarduzio I agree, in fact for this feature we just need to create a proc and a trigger so maybe postgrest should just do it without asking. The trigger shouldn't add much of a burden on anyone's db because it triggers only on DDL changes.

begriffs on 31 Jan 2016

I dont agree :) in order for this to work, the authenticator needs to be granted high privileges, and for what? Compromise security for a small convinience?

ruslantalpa on 1 Feb 2016

Does it take high privileges to create a function and a trigger?

begriffs on 1 Feb 2016

I would consider the ability to create procedures and triggers "high privileges".
Why not simply implement (3) and let some other script do the refresh in whatever way it chooses to.
In production there is no "random" schema change, usually it's done using CI so why not allow the ability to issue a HUP at the end of the deployment script?

ruslantalpa on 1 Feb 2016

👍1

In production there is no "random" schema change, usually it's done using CI so why not allow the ability to issue a HUP at the end of the deployment script?

Exactly my thought after I posted my previous comment. For example, I am using git to deploy new versions on a server, and the repository includes also the nginx configuration of the virtual host and the sqitch folder with SQL scripts.

I currently run nginx -s reload after the git push to reload nginx configuration, I would simply add service reload postgrest or something similar to reload postgrest configuration after deploying database changes with sqitch deploy.

eric-brechemier on 1 Feb 2016

Well, plot twist: there's not only "production" that counts.
I'm creating tables as I progress with the user stories: postgres - the stateful layer of my architecture -keeps the pace with me (no need to restart), why should my development cycle include restarting a supposedly "stateless" layer? It's really awkward, looks like we're back in Tomcat times.

sscarduzio on 1 Feb 2016

Well, plot twist: there's not only "production" that counts.

Also agreed, but development uses extra tools, not installed on the server.

I am still suggesting to have a separate, officially sanctioned executable to refresh postgrest config at regular intervals or using elevated privileges to detect schema changes during development.

eric-brechemier on 1 Feb 2016

so you suggest I have a while true SIGHUP sleep 1 in my dev env? Is it what you'd propose for tomcat as well?

sscarduzio on 1 Feb 2016

In your dev env you create those triggers then listen to notify events with a 10 line script and restart postgrest

ruslantalpa on 1 Feb 2016

My memories of Tomcat days are slowly fading away, but afaik using a _separate_ process to monitor (e.g. file) changes to reload the server is still pretty much current state of the art:

node-supervisor for NodeJS
https://github.com/petruisfan/node-supervisor

or more generally

Rerun
https://github.com/alexch/rerun

eric-brechemier on 1 Feb 2016

I still don't understand why it's ok for developers if the developer experience is crap. We have the solution in hand to make this a pleasant thing to use, why pulling back?

If you don't want postgrest to retain CREATE permissions, just don't give them. Postgrest will print a warning (as I showed) that schema won't be refreshed.

sscarduzio on 1 Feb 2016

New to this - so not sure if this is a silly option- but why not make the notify/advanced features a separate extension for postgres itself?

It could be installed separately in its own schema-
1) the authenticator user wouldn't need any special permissions as there could still just be listen/notify sql code that resides in the postgrest-extension schema
2) Said code could then notify with the name of schemas that have been refreshed and the postgrest instance would simply have to know what schema it was running and only update if its name was called
3) If the postgrest extension isn't installed everything would still work, you just won't get the advanced features and dropping the postgrest extension would drop the advanced features etc. ]

Anyway, there are probably cons here that I'm not considering, but just wondering why it's not a part of the discussion.

davidkuep on 1 Feb 2016

In development there would be a snippet of sql that the developer would have to run once in their database which would enable postgrest auto schema change detection.

begriffs on 3 Apr 2016

👍3

Look what's now available! https://github.com/cocreature/hasql-notifications

begriffs on 31 Aug 2016

Has this been resolved? Seems like if your DB changes, you'll end up with some failed API requests until postgrest has been restarted. Are there any plans to implement the auto-refresh via notifications?

matthewmueller on 12 Nov 2016

It hasn't been resolved in full generality, but postgrest does listen for SIGHUP nowadays which will cause it to reload the schema. It's less intrusive than having to restart the server.

begriffs on 12 Nov 2016

Ah gotcha, so the idea would be that you SSH into your container / dyno and trigger a SIGHUP on that process?

matthewmueller on 12 Nov 2016

For now yeah. Having postgrest LISTEN for a specially named event seems feasible for v0.4.1. Once that is implemented we can add a section in the docs about hooking ddl_command_end up to NOTIFY that event for immediate postgrest schema reloads on DDL changes.

begriffs on 12 Nov 2016

Got it, thanks for your help!

matthewmueller on 12 Nov 2016

I followed up on the hasql-notifications library which seems to have run into a race condition bug last year. We'll see if that gets resolved, and then it'll be pretty easy to finally get this issue closed.

begriffs on 13 Jul 2017

@begriffs: https://github.com/JarvusInnovations/lapidus/issues/19

It seems as though it'd be trivial to create a tiny sidecar application that sends SIGHUP to PostgREST if the Haskell library isn't ready yet and just suggest its use in the documentation.

Go:
http://coussej.github.io/2015/09/15/Listening-to-generic-JSON-notifications-from-PostgreSQL-in-Go/

Rust:
https://github.com/sfackler/rust-postgres/commit/bb837bd8722ef16f085ea76e473f0ae3fbae4d9f

Elixr/Erlang:
https://github.com/elixir-ecto/postgrex/blob/v0.13.3/lib/postgrex/notifications.ex#L78

I think it may be possible to do this with a shell script assuming the user is providing credentials using something libpq uses. I'm not seeing any examples though.

jmealo on 31 May 2018

what is about the part " a specially named SQL event and reloads the schema"? I saw #570 implement the SIGNUP part.

for my use case, postgrest run in docker. I run test in another docker. send a signal from one container to another is possible I guess, but not interesting.