Cockroach: sql: default value (stored in `pg_attrdef.adbin`) loses type information after ALTER TABLE

Created on 9 Apr 2020  路  13Comments  路  Source: cockroachdb/cockroach

ActiveRecord runs a query against PostgreSQL to get column information. The results come back slightly differently when it's run against CockroachDB for date columns. Here it is against PostgreSQL

activerecord_unittest=# SELECT a.attname, format_type(a.atttypid, a.atttypmod), pg_get_expr(d.adbin, d.adrelid), a.attnotnull, a.atttypid, a.atttypmod, c.collname, col_description(a.attrelid, a.attnum) AS comment FROM pg_attribute a LEFT JOIN pg_attrdef d ON a.attrelid = d.adrelid AND a.attnum = d.adnum LEFT JOIN pg_type t ON a.atttypid = t.oid LEFT JOIN pg_collation c ON a.attcollation = c.oid AND a.attcollation <> t.typcollation WHERE a.attrelid = 'defaults'::regclass AND a.attnum > 0 AND NOT a.attisdropped AND attname = 'fixed_date';
  attname   | format_type |    pg_get_expr     | attnotnull | atttypid | atttypmod | collname | comment
------------+-------------+--------------------+------------+----------+-----------+----------+---------
 fixed_date | date        | '2004-01-01'::date | f          |     1082 |        -1 |          |

and against CockroachDB

root@localhost:26258/activerecord_unittest> SELECT a.attname, format_type(a.atttypid, a.atttypmod), pg_get_expr(d.adbin, d.adrelid), a.attnotnull, a.atttypid, a.atttypmod, c.collname, col_description(a.attrelid, a.attnum) AS comment FROM pg_attribute a LEFT JOIN pg_attrdef d ON a.attrelid = d.adrelid AND a.attnum = d.adnum LEFT JOIN pg_type t ON a.atttypid = t.oid LEFT JOIN pg_collation c ON a.attcollation = c.oid AND a.attcollation <> t.typcollation WHERE a.attrelid = 'defaults'::regclass AND a.attnum > 0 AND NOT a.attisdropped AND attname = 'fixed_date';
   attname   | format_type | pg_get_expr  | attnotnull | atttypid | atttypmod | collname | comment
-------------+-------------+--------------+------------+----------+-----------+----------+----------
  fixed_date | date        | '2004-01-01' |   false    |     1082 |        -1 | NULL     | NULL
(1 row)

We should have pg_get_expr to return '2004-01-01'::date instead of '2004-01-01'. @otan mentioned that we just return the exact same value back but the correct way to do it is to parse that string as the oid given and return it with the cast if necessary as a string.

A-sql-pgcompat C-bug S-3-ux-surprise good first issue

Most helpful comment

Making pg_attrdef look like psql sounds like a fine idea.

How can we know the type, as in the case of the default column value, we pass,

Ah i misread it a little bit. Your solution seems better.

All 13 comments

since it's not urgent, i'm going to let the world tackle this one! (let me know if it should be done quicker though)

First time here, and I'd be happy to tackle this.

I have a question: I'm looking at "pg_get_expr" in pg_builtins, and wondering how to implement

the correct way to do it is to parse that string as the oid given and return it . . .

I see that the oid and the string in question are passed in as arguments but need more context on how to add parsing.

Excited that you're looking to take this on

I think we'll probably have to do something like this:

play around with both postgres and crdb and make sure the inputs look alike.

I would ignore the typmod argument for now; if you're curious, we can tackle that in a follow up PR.

I'm seeing something similar when changing string column defaults. For example, here's a companies table. type doesn't have a default.

root@localhost:26258/activerecord_unittest> SELECT a.attname, format_type(a.atttypid, a.atttypmod), pg_get_expr(d.adbin, d.adrelid), a.attnotnull, a.atttypid, a.atttypmod, c.collname, col_description(a.attrelid,
a.attnum) AS comment FROM pg_attribute a LEFT JOIN pg_attrdef d ON a.attrelid = d.adrelid AND a.attnum = d.adnum LEFT JOIN pg_type t ON a.atttypid = t.oid LEFT JOIN pg_collation c ON a.attcollation = c.oid AND a.attcollation <> t.typcollation WHERE a.attrelid = 'companies'::regclass AND a.attnum > 0 AND NOT a.attisdropped ORDER BY a.attnum;
    attname   |    format_type    |               pg_get_expr               | attnotnull | atttypid | atttypmod | collname | comment
--------------+-------------------+-----------------------------------------+------------+----------+-----------+----------+----------
  id          | bigint            | nextval('companies_nonstd_seq'::STRING) |    true    |       20 |        -1 | NULL     | NULL
  type        | character varying | NULL                                    |   false    |     1043 |        -1 | NULL     | NULL
  firm_id     | bigint            | NULL                                    |   false    |       20 |        -1 | NULL     | NULL
  firm_name   | character varying | NULL                                    |   false    |     1043 |        -1 | NULL     | NULL
  name        | character varying | NULL                                    |   false    |     1043 |        -1 | NULL     | NULL
  client_of   | bigint            | NULL                                    |   false    |       20 |        -1 | NULL     | NULL
  rating      | bigint            | 1                                       |   false    |       20 |        -1 | NULL     | NULL
  account_id  | bigint            | NULL                                    |   false    |       20 |        -1 | NULL     | NULL
  description | character varying | ''::STRING                              |   false    |     1043 |        -1 | NULL     | NULL
(9 rows)

After setting a default on type (ALTER TABLE "companies" ALTER COLUMN "type" SET DEFAULT 'Firm'), the default will be 'Firm' but I'd expect it to be'Firm'::STRING. description had a default on creation, and it comes back as ''::STRING.

root@localhost:26258/activerecord_unittest> SELECT a.attname, format_type(a.atttypid, a.atttypmod), pg_get_expr(d.adbin, d.adrelid), a.attnotnull, a.atttypid, a.atttypmod, c.collname, col_description(a.attrelid,
a.attnum) AS comment FROM pg_attribute a LEFT JOIN pg_attrdef d ON a.attrelid = d.adrelid AND a.attnum = d.adnum LEFT JOIN pg_type t ON a.atttypid = t.oid LEFT JOIN pg_collation c ON a.attcollation = c.oid AND a.attcollation <> t.typcollation WHERE a.attrelid = 'companies'::regclass AND a.attnum > 0 AND NOT a.attisdropped ORDER BY a.attnum;
    attname   |    format_type    |               pg_get_expr               | attnotnull | atttypid | atttypmod | collname | comment
--------------+-------------------+-----------------------------------------+------------+----------+-----------+----------+----------
  id          | bigint            | nextval('companies_nonstd_seq'::STRING) |    true    |       20 |        -1 | NULL     | NULL
  type        | character varying | 'Firm'                                  |   false    |     1043 |        -1 | NULL     | NULL
  firm_id     | bigint            | NULL                                    |   false    |       20 |        -1 | NULL     | NULL
  firm_name   | character varying | NULL                                    |   false    |     1043 |        -1 | NULL     | NULL
  name        | character varying | NULL                                    |   false    |     1043 |        -1 | NULL     | NULL
  client_of   | bigint            | NULL                                    |   false    |       20 |        -1 | NULL     | NULL
  rating      | bigint            | 1                                       |   false    |       20 |        -1 | NULL     | NULL
  account_id  | bigint            | NULL                                    |   false    |       20 |        -1 | NULL     | NULL
  description | character varying | ''::STRING                              |   false    |     1043 |        -1 | NULL     | NULL
(9 rows)

I'd be happy to tackle this issue :blush:,
currently, I am facing problem in understanding,

the string is a Datum, you just need to map an OID to a Type. This map should help:

How can we know the type, as in the case of the default column value, we pass,

pg_get_expr(adbin, adrelid)

which are string and relation's oid.

and, I am not sure how difficult is to implement this,
but is there any problem in modifying pg_attrdef to look more alike psql.
as in crdb we have,

    oid     | adrelid | adnum |     adbin     |     adsrc
-------------+---------+-------+---------------+----------------
  1926519145 |      52 |     1 | '2001-01-01'  | '2001-01-01'
  3271596152 |      53 |     1 | 'sss'::STRING | 'sss'::STRING

in psql


adrelid | adnum |                                                                             adbin                                                                              |          adsrc           
---------+-------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------------------
   19671 |     1 | {CONST :consttype 1082 :consttypmod -1 :constcollid 0 :constlen 4 :constbyval true :constisnull false :location 47 :constvalue 4 [ 28 27 0 0 0 0 0 0 ]}        | '2019-01-01'::date
   19682 |     1 | {CONST :consttype 1043 :consttypmod -1 :constcollid 100 :constlen -1 :constbyval false :constisnull false :location 45 :constvalue 7 [ 28 0 0 0 115 115 115 ]} | 'sss'::character varying

Thank you.

Ah @abhishek20123g I got stuck on the issue you're describing last weekend -- I saw that adrelid wasn't present in OidToType. If it's acceptable, your idea of updating pg_attrdef to be closer to psqls might be pretty simple. I see that pg_attrdef in pg_catalog.go could use the Type field on ColumnDescriptor when constructing adbin.
@otan what is the best way to proceed?

Making pg_attrdef look like psql sounds like a fine idea.

How can we know the type, as in the case of the default column value, we pass,

Ah i misread it a little bit. Your solution seems better.

Sweet! Thanks so much @otan . I'm writing tests now. The functions tested in https://github.com/cockroachdb/cockroach/blob/78a8ce08af647843789caa3c57117aa824edd614/pkg/sql/sem/tree/testdata/eval/builtins can be tested be tested by passing in primitive values but I _think_ to test pg_get_expr I will need to setup some tables. Additionally, where do tests for pg_attrdef live?

if you need tables they can go in here: https://github.com/cockroachdb/cockroach/blob/25bcf0b9d6c69abb6053f53e9f9c3513ece14166/pkg/sql/logictest/testdata/logic_test/builtin_function

make testbaselogic FILES='builtin_function' will help, optionally use make testbaselogic FILES='builtin_function' TESTFLAGS='-rewrite' if you're too lazy to generate the output yourself manually. Looks like tests for pg_attrdef are splattered around the logic_test directory, so put it where it makes sense for you.

Heads up I'm going to be trying to clock off on weekends, so sorry if there's slow replies over then. If you need faster assistance, feel free to HMU on the community slack!

@otan I've found that for some data types, such as varchar, the datatype is already append to the column.DefaultExpr after being run through FormatNode in pg_attrdef. I tried to trace through FormatNode to understand where the decision to append datatype or not is made, but am having a bit of trouble understanding the flags.
Are any of the FormatNode flags set by default? Or, in the case of pg_attrdef is the explicitly set FmtPGAttrdefAdbin flag the only set flag?
Additionally, I confirmed the case @alimi mentioned above: if a default is added to a string field after table creation, the ::STRING suffix is not present.

Are any of the FormatNode flags set by default? Or, in the case of pg_attrdef is the explicitly set FmtPGAttrdefAdbin flag the only set flag?

Doesn't appear that way from what I checked, so I assumed so.

if a default is added to a string field after table creation

yep seems to be the case. i think the actual defaultexpr stored is different.

[email protected]:52539/movr> create table a (a text, b text default 'bob');
CREATE TABLE

Time: 2.937ms

[email protected]:52539/movr> select * from pg_attrdef;
     oid    | adrelid | adnum |     adbin      |     adsrc
------------+---------+-------+----------------+-----------------
  581442133 |      59 |     2 | 'bob'::STRING  | 'bob'::STRING
  581442132 |      59 |     3 | unique_rowid() | unique_rowid()
(2 rows)

Time: 1.559ms

[email protected]:52539/movr> alter table a alter column a set default 'alice';
ALTER TABLE

Time: 22.563ms

[email protected]:52539/movr> select * from pg_attrdef;
     oid    | adrelid | adnum |     adbin      |     adsrc
------------+---------+-------+----------------+-----------------
  581442134 |      59 |     1 | 'alice'        | 'alice'
  581442133 |      59 |     2 | 'bob'::STRING  | 'bob'::STRING
  581442132 |      59 |     3 | unique_rowid() | unique_rowid()
(3 rows)

it actually just looks like we sanitize default expressions differently when we run ALTER TABLE (https://github.com/cockroachdb/cockroach/blob/91db3e274bf984ce40d4b659ab4b0e6f8814e0d6/pkg/sql/alter_table.go#L962) vs CREATE TABLE (https://github.com/cockroachdb/cockroach/blob/91db3e274bf984ce40d4b659ab4b0e6f8814e0d6/pkg/sql/sqlbase/table.go#L170).

i'd check to see see if they're equal at creation time - if so something else is fishy, if not, we'll have to fix that as a bug and make pg_attrdef implicitly cast it if it detects it to not have a cast around it.

I'm renaming the issue, because the pg_get_expr expression only differs after doing alter table .. set default. For example:

root@:26257/defaultdb> create table dev (a int primary key, b string default 'foo');
CREATE TABLE

root@:26257/defaultdb> SELECT a.attname, format_type(a.atttypid, a.atttypmod), pg_get_expr(d.adbin, d.adrelid), a.attnotnull, a.atttypid, a.atttypmod, c.collname, col_description(a.attrelid, a.attnum) AS comment FROM pg_attribute a LEFT JOIN pg_attrdef d ON a.attrelid = d.adrelid AND a.attnum = d.adnum LEFT JOIN pg_type t ON a.atttypid = t.oid LEFT JOIN pg_collation c ON a.attcollation = c.oid AND a.attcollation <> t.typcollation WHERE a.attrelid = 'dev'::regclass AND a.attnum > 0 AND NOT a.attisdropped AND attname = 'b';
  attname | format_type |  pg_get_expr  | attnotnull | atttypid | atttypmod | collname | comment
----------+-------------+---------------+------------+----------+-----------+----------+----------
  b       | text        | 'foo'::STRING |   false    |       25 |        -1 | NULL     | NULL

root@:26257/defaultdb> alter table dev alter column b set default 'bar';
ALTER TABLE

root@:26257/defaultdb> SELECT a.attname, format_type(a.atttypid, a.atttypmod), pg_get_expr(d.adbin, d.adrelid), a.attnotnull, a.atttypid, a.atttypmod, c.collname, col_description(a.attrelid, a.attnum) AS comment FROM pg_attribute a LEFT JOIN pg_attrdef d ON a.attrelid = d.adrelid AND a.attnum = d.adnum LEFT JOIN pg_type t ON a.atttypid = t.oid LEFT JOIN pg_collation c ON a.attcollation = c.oid AND a.attcollation <> t.typcollation WHERE a.attrelid = 'dev'::regclass AND a.attnum > 0 AND NOT a.attisdropped AND attname = 'b';
  attname | format_type | pg_get_expr | attnotnull | atttypid | atttypmod | collname | comment
----------+-------------+-------------+------------+----------+-----------+----------+----------
  b       | text        | 'bar'       |   false    |       25 |        -1 | NULL     | NULL

@otan it looks like the default expressions in applyColumnMutation and MakeColumnDefDescs are the same until they go through tree.Serialize . The difference happens in FormatNode in sql/sem/tree/format.go at the line https://github.com/cockroachdb/cockroach/blob/master/pkg/sql/sem/tree/format.go#L363.
The argument passed to tree.Serialize in applyColumnMutation, t.Default, is not a Datum while the argument passed to tree.Serialize in MakeColumnDefDescs is a Datum.
I tried to fix the discrepancy between applyColumnMutation and MakeColumnDefDescs by turning t.Default.String() into a Datum using tree.NewDString which _almost_ worked. The resulting value is surrounded by some extra characters e.g. e'\'a_default\''::STRING.
With regards to the originally raised issue, date values not being suffixed with ::date regardless of if the column had a default value upon creation, the argument passed to tree.Serialize IS a Datum, but for some reason, even though the line https://github.com/cockroachdb/cockroach/blob/master/pkg/sql/sem/tree/format.go#L373 is executed, the final value doesn't have the type appended.
Where should these inconsistencies be resolved? At the applyColumnMutation and MakeColumnDefDescs level, the pg_attr_def level, or both?
Also, how do you make the nice, clean source code references?

Was this page helpful?
0 / 5 - 0 ratings