Postgrest: Feature Request: Updating Columns to Default Values

Created on 5 Aug 2020 · 10Comments · Source: PostgREST/postgrest

Currently, there is no way to update tables, setting columns to their default values. I have found myself wanting to allow "resetting" of columns back to how they were when the row was created, but to no avail since such a feature is not supported.

I suggest adding a defaults or revert query parameter for PATCH requests that accepts a comma-separated list of column names (like the columns param for POST) to be set to their default values using column = DEFAULT.

Thank you for this amazing piece of software.

enhancement

Source

Exahilosys

Most helpful comment

@wolfgangwalther POST has an issue since it's not possible to do INSERT INTO ... SELECT DEFAULT, col1... Check this question.

That being said, I think your idea should work for PATCH because of the UPDATE .. SET = DEFAULT form. To do this we'd have to pass all the table/view columns to QueryBuilder.hs and make a diff with ?columns for the defaults.

steve-chavez on 5 Aug 2020

👍2

All 10 comments

I think we don't even need another parameter for this - we could just use the existing columns, like this:

every column in the ?columns=... list gets added to the UPDATE SET part
when a column is not defined in the json body, the column is set to DEFAULT instead
all keys in the json that are not in the list are ignored (as is currently the case)

This can also be extended to POST - and would improve batch inserts as well, because some inserted records could have a value set, while others would use the default value for the same key.

Currently both PATCH and POST insert NULL when a column in ?columns= doesn't have a key in the json body. But since null is a valid json value as well and can therefore be set explicitely, this wouldn't need to be like this. While I couldn't find anything in the docs or in the tests about this, I guess it would still have to be considered a breaking change.

wolfgangwalther on 5 Aug 2020

@wolfgangwalther POST has an issue since it's not possible to do INSERT INTO ... SELECT DEFAULT, col1... Check this question.

steve-chavez on 5 Aug 2020

👍2

Down the road it would be good to be able to support this for POST as well - would be good if PATCH and POST behave the same here.

I looked a bit deeper into this - even if INSERT INTO ... SELECT would support DEFAULT, we'd still insert NULLs only. This is because json_to_recordset(), which we are using to expand the data to the insert columns, returns NULL for every missing key already. There is however an approach, that should work: We can get the default expression for every column from pg_attrdef. If we then build up a json object with all the default values for this table, we can merge the actual request body on top of this - the resulting json would have the default for missing keys (part of this is described here: https://rudism.com/inserting-json-data-with-default-values-in-postgresql/). We probably need to split up json_to_recordset into json_array_elements and json_to_record, because the default object would need to be applied between those steps.

Those expressions from pg_attrdef could be taken from the schema cache. Once we have a pg_postgrest extension in place, we could also put this part into an SQL function to simplify the query.

wolfgangwalther on 7 Aug 2020

👍1

Pardon me if I'm missing any details here. Wouldn't trying to add this functionality for columns on POST be redundant? If the user wants to leverage the DEFAULT functionality, they can just withhold the respective columns from the json payload being sent.

In fact, I would argue that passing NULL (like it's done now) is not all that intuitive and may lead to issues with NOT NULL columns. Simply ignoring columns that are in the columns parameter but not the payload (and vice versa) for POST would make more sense. I feel like this feature is meant to filter which columns are "looked into" rather than guarantee that they will have a value by the end of the request.

Exahilosys on 8 Aug 2020

For POST, this would become interesting when doing Bulk-Inserts, so inserting multiple rows at once. In this case you can't make some of those have default values for a column and others have values in the json for this same column, right now.

wolfgangwalther on 10 Aug 2020

Not including the column on some bulk-POST payloads would signal PostgreSQL to use their default values, isn't that right? For example, having:

CREATE TABLE api.pets (name TEXT, type TEXT, fed BOOL DEFAULT FALSE, PRIMARY KEY(name, type));

And then doing POST ON /pets with:

[
    {"name": "Sparky", "type": "Dog", "fed": true}, 
    {"name": "Whiskers", "type": "Cat"},
    {"name": "Bubbles", "type": "Fish"}
]

Would result in Sparky's fed being TRUE, while Whiskers' and Bubbles' being FALSE. The default value was used simply by redacting the column from the respective json row payload. I don't see how columns could further enhance this behaviour.

Exahilosys on 10 Aug 2020

Not including the column on some bulk-POST payloads would signal PostgreSQL to use their default values, isn't that right? For example, having:
CREATE TABLE api.pets (name TEXT, type TEXT, fed BOOL DEFAULT FALSE, PRIMARY KEY(name, type));
And then doing POST ON /pets with:
[
    {"name": "Sparky", "type": "Dog", "fed": true}, 
    {"name": "Whiskers", "type": "Cat"},
    {"name": "Bubbles", "type": "Fish"}
]
Would result in Sparky's fed being TRUE, while Whiskers' and Bubbles' being FALSE.

Yes, that would be the goal.

Currently Sparky's fed would TRUE, but Whiskers' and Bubbles' would be NULL.

The default value was used simply by redacting the column from the respective json row payload. I don't see how columns could further enhance this behaviour.

You're right as far as passing columns to the request shouldn't be needed - in fact we need to support both cases, with and without. If I'm not mistaken, not using columns just means that a proper value for columns is generated by looking at the json payload itself. So in regards to the query built, your request to /pets will be the same without columns or with columns=name,type,fed.

wolfgangwalther on 10 Aug 2020

I seem to have overlooked that bulk inserts require payload keys to be uniform, which my example did not abide to. After some quick sanity checking, postgrest replied with a 400 "All object keys must match" for that payload.

Some further testing confirmed my suspicions that specifying a NOT NULL column in columns and excluding the respective key from the POST payload results in a non-null constraint violation (since, like @wolfgangwalther mentioned, NULL is inserted by default in such cases).

It seems like simply not doing anything for keys in columns but not in the payload would be better for everyone. Postgrest would not have to inject a forced value (like NULL) which is prone to subsequent faults through constraints like non-null, and PostgreSQL could organically serve default values and more direct errors.

I am not versed enough in haskell to know the implementation details, but lifting the uniformity-of-keys limitation from bulk inserts would really let the above suggestion shine. However, I am unaware if that is possible.

Since the intricacies of POST and columns are not meant to be part of this issue, I would be happy opening a new one if required. Please advise further.

Exahilosys on 10 Aug 2020

I think we can solve the default issue by specifying it like this:

select * from json_populate_recordset(row(null, null, 1)::projects, $$[
  {"id": 666, "name": "project-666"}, 
  {"id": 667, "name": "project-667"}, 
  {"id": 668, "name": "project-668", "client_id": 3}
]$$);

┌─────┬─────────────┬───────────┐
│ id  │    name     │ client_id │
├─────┼─────────────┼───────────┤
│ 666 │ project-666 │         1 │
│ 667 │ project-667 │         1 │
│ 668 │ project-668 │         3 │
└─────┴─────────────┴───────────┘

As it can be seen, the row(null,null,1) fills the client_id with a default for the rows that have it as null.

For this to work, we would need to get the columns defaults on our schema cache and then inject it on a row(..) for every INSERT query.

For tables it should be straightforward. For VIEWs, I think we need to find the view source columns(might have to revisit allSourceColumns in https://github.com/PostgREST/postgrest/pull/1625).

With this, we should be able to avoid the "All object keys must match" validation and stop looking at the payload for determining the columns.

steve-chavez on 6 Dec 2020

For this to work, we would need to get the columns defaults on our schema cache and then inject it on a row(..) for every INSERT query.

There's a problem with filling those default expressions in the json_populate_recordset argument: Those will only be evaluated once per query. So when you do a bulk-insert or bulk-patch and have a column default that is computed (e.g. nextval(...)), they will all end up with the same value.

This might even lead to breaking-changes regarding the insert of auto-generated PKs etc., without ?columns= set.

However, what about a query like this instead?

select
  ..,
  case
    when el ? 'client_id' then rec.client_id
    else <default expression here>
  end as client_id
from jsonb_array_elements($$[
    {"id": 666, "name": "project-666"}, 
    {"id": 667, "name": "project-667"}, 
    {"id": 668, "name": "project-668", "client_id": 3}
  ]$$) as el,
  jsonb_populate_record(null::projects, el) as rec
;

Note that, for the ? operator to work, this has to be jsonb instead of json.

If you worry about performance of the case, you could take this one step further and try to detect constant default values to set those via jsonb_populate_record(...) - and only the variable expressions via case.

For tables it should be straightforward. For VIEWs, I think we need to find the view source columns(might have to revisit allSourceColumns in #1625).

Views have default values for columns as well. They can be set with ALTER VIEW x ALTER COLUMN y SET DEFAULT z. So views can and should be handled just the same as tables.

wolfgangwalther on 6 Dec 2020

Was this page helpful?

0 / 5 - 0 ratings