Postgrest: Feature Request Configure to Exclude Schemas

Created on 12 May 2020 · 14Comments · Source: PostgREST/postgrest

Environment

PostgreSQL version: 10+
PostgREST version: 7
Operating system: Any supported

Description of issue

We are running postgrest in an environment with a quickly increasing number of tenants that get access to a PostGIS database. It seems difficult at this stage using the configuration option db-schema as we will have to change db-schema each time a tenant is added. We would, therefore, like to propose that it be possible to configure schemas to be excluded from end point creation. Is that possible?

EDIT: It would also be great if the end point generation could be schema strict (may be via a configuration option). I have the impression that all configured schemas are added to the SEARCH_PATH and the all tables in the SEARCH_PATH get their end point accordingly. That causes issues with tables of the same name in different schemas (please refer to #1519).

EDIT: I should add that this is a great tool. Very fast.

enhancement

Source

dzelge

👍1

Most helpful comment

@dzelge I see what you mean, db-schema must be more dynamic for schema based multi-tenancy.

Not sure about the config option yet, but maybe we could reuse the pg LIKE pattern. The config could be like db-schema=%tenant%.

Would that work for matching your tenants names?

Btw, even with a config like this, postgrest schema cache would still need to be reloaded. A SIGUSR1 would be needed. If https://github.com/PostgREST/postgrest/issues/1512#issuecomment-623585929 gets done it could also work with a NOTIFY.

It would also be great if the end point generation could be schema strict (may be via a configuration option). I have the impression that all configured schemas are added to the SEARCH_PATH and the all tables in the SEARCH_PATH get their end point accordingly.

Yes, that's the original design. Only one of the exposed schemas should get added to the search_path. #1519 is likely a bug, might be related to the fact that the public schema is used.

steve-chavez on 13 May 2020

👍2

All 14 comments

@dzelge I see what you mean, db-schema must be more dynamic for schema based multi-tenancy.

Not sure about the config option yet, but maybe we could reuse the pg LIKE pattern. The config could be like db-schema=%tenant%.

Would that work for matching your tenants names?

It would also be great if the end point generation could be schema strict (may be via a configuration option). I have the impression that all configured schemas are added to the SEARCH_PATH and the all tables in the SEARCH_PATH get their end point accordingly.

Yes, that's the original design. Only one of the exposed schemas should get added to the search_path. #1519 is likely a bug, might be related to the fact that the public schema is used.

steve-chavez on 13 May 2020

👍2

@steve-chavez

Yes, a LIKE pattern would suffice if that made things easier for you. I'd be very happy with that.
I don't mind having to reload the cache. That's just the way it is. Hard to change.

Cheers

dzelge on 13 May 2020

@steve-chavez What your plan on this? Can I contribute? Not very proficient with Haskell tho.

dzelge on 18 May 2020

@dzelge I'm going to release a patch version with some bugfixes we have accumulated. After that I'll see about implementing new enhancements.

Would you like to prioritize this feature? If so, you could help me by supporting PostgREST development!! We could have the feature implemented early(this week) and then I could provide you an executable.

If you'd like to contact me privately, you can shoot me an email at [email protected].

steve-chavez on 18 May 2020

Also would love to have dynamic db_schema updating!

Adding/removing by name would be my preference, if needed I can even write them to the postgrest.conf, but would still need a way to re-read config without restarting postgrest, I don't know if that's possible?

I recently implemented the SIGUSR1 method of updating the schema cache on table updates so would happily use the same method on addition/removal of schemas, as it seems to work great

awalias on 21 May 2020

@awalias Ah, yes, that would need re-reading the config on SIGUSR1. Could you open another issue for that feature request? This one is more about not having to edit the config file for new schemas.

steve-chavez on 26 May 2020

Instead of the LIKE pattern I think we should use SIMILAR TO, that way we can match schemas that don't follow a pattern. So the config would be like this:

db-schema-pattern="%tenant%|oregon|portland"

I'm not using the db-schema config because it's overloaded now. Changing it to accept | instead of , would imply a breaking change.

steve-chavez on 15 Jun 2020

As per the original issue, some setups would actually require excluding certain schemas(not all tenants names are known in advance), so additionally we could support a db-schema-pattern-negate(false by default) config, this would make the matching operator a NOT SIMILAR TO. Could work like this:

db-schema-pattern = "public|extensions|private"
db-schema-pattern-negate = true;

Basically, this says all schemas except schemas were extensions are located should be exposed.

(Note that we always exclude the pg_catalog schema internally).

steve-chavez on 16 Jun 2020

👍1

This feature caters to a very specific situation (tenant per schema, or the more general, tenant per database) which in 99% of the cases is a bad idea https://news.ycombinator.com/item?id=23305111. Features like this should not make it to postgrest core (and complicate the config/code for everybody) but rather people in need for this should extend/fork postgrest (with some guidance/help/consulting)

ruslantalpa on 16 Jun 2020

@ruslantalpa This is basically extending the schema-based multitenancy use case that List of schemas already enables(also mentioned in docs). It only makes it more flexible for cases where tenants names don't follow a pattern and avoids the need of manually editing the config file.

It's debatable if the schema-based multitenancy approach is bad(and if instead users should be instructed to do RLS, also see MS Multi-Tenant Data Architecture), but we're already enabling such a use case.

But yeah, adding more configs is not ideal. Another alternative would be to keep overloading db-schema.

steve-chavez on 16 Jun 2020

(Going to have to think more about this design, restricting exposed schemas might not be as easy as I initially thought. Just been brainstorming some possibilities until now.)

steve-chavez on 16 Jun 2020

I've noted that it's possible to restrict the schemas as usual if we query them first by using the db-schema-pattern. The db-schema-pattern-negate is looking a bit much for now, so I'll work on a PR that just focuses on db-schema-pattern. Will also see if it's possible to overload the db-schema confing and avoid having an extra one.

steve-chavez on 17 Jun 2020

Even if we come up with a simple config like db-schema = NO public,private,extensions, the negation stuff is looking too risky.

This is because extensions like timescaledb and pipelinedb create their own schemas:

_timescaledb_cache
_timescaledb_catalog
_timescaledb_config
_timescaledb_internal
timescaledb_information
pipelinedb

We would abandon our schema isolation security if we allowed this. Blacklisting schemas would not be good for us.

So the best option for a dynamic db-schema right now is SIGUSR2(added in https://github.com/PostgREST/postgrest/pull/1544).

steve-chavez on 20 Jul 2020

👍1

To improve handling of many schemas, it could be possible to implement a custom solution with an override of pgrst.set (postgrest pg extension) suggested in #1562. The schemas to fill the pgrst.db_schema variable could be held in a custom table and the variable could be filled with a query from this table.