Yugabyte-db: [YSQL] Columns with default values may be initialized to `null` instead

Created on 13 Aug 2019  路  8Comments  路  Source: yugabyte/yugabyte-db

In YB 1.3.1.0, on Debian, it looks like columns with a default value of NOW() may, occasionally, be initialized to NULL instead. With a schema like

CREATE TABLE foo (
  k TIMESTAMP DEFAULT NOW(),
  v INT
)  IF NOT EXISTS;

And a workload comprised of inserts (INSERT INTO foo (v) VALUES (?)), and reads (SELECT k, v FROM foo ORDER BY k), it appears that sometimes, k may wind up NULL, rather than being a timestamp. This log line shows a series of [k, v] pairs from a read; notice that it contains one row (13) with a timestamp, and one row (9) with null. 20190812T154729.000-0400.zip

2019-08-12 15:50:01,257{GMT}    INFO    [jepsen worker 5] yugabyte.ysql.append-table: table append11 has ([#inst "2019-08-12T19:49:57.609143000-00:00" 13] [nil 9])

Other tables may contain no null entries, or many nulls. These null values appear stable over multiple seconds. To reproduce, try Jepsen e324b33, and run

lein run test --os debian --version 1.3.1.0 -w ysql/append-table --concurrency 2n --time-limit 300

The checker doesn't verify this property, so the test failing or succeeding isn't meaningful. Instead, grep the logs for a line with a table with nil entries.

grep -e 'table append.\+ has.\+nil' store/latest/jepsen.log 
areysql kinbug

Most helpful comment

Heyyyy congats!!! :)

All 8 comments

Notes:

I've changed PG NOW function code (timestamp.c) to log its state:

Datum
now(PG_FUNCTION_ARGS)
{
    elog(LOG, " === Calling NOW() ===");
    TimestampTz txnStartTs = GetCurrentTransactionStartTimestamp();
    Datum result = TimestampTzGetDatum(txnStartTs);
    elog(LOG, " === Start TS: %ld, result: %ld ===", txnStartTs, result);
    return result;
}

When using ysqlsh, it triggers both when using original schema/query, or even simple SELECT NOW().

When tested with Jepsen, let's take a look at this log excerpt


(click to expand)

2019-08-14 15:37:32,297{GMT}    INFO    [jepsen worker 5] jepsen.util: 5    :invoke :txn    [[:append 1 3] [:r 1 nil]]
2019-08-14 15:37:32,297{GMT}    INFO    [jepsen worker 0] jepsen.util: 0    :invoke :txn    [[:append 0 2]]
2019-08-14 15:37:32,297{GMT}    INFO    [jepsen worker 4] jepsen.util: 4    :invoke :txn    [[:r 1 nil] [:r 1 nil] [:r 1 nil]]
2019-08-14 15:37:32,299{GMT}    INFO    [jepsen worker 6] jepsen.util: 6    :invoke :txn    [[:append 1 5] [:r 2 nil]]
2019-08-14 15:37:32,299{GMT}    INFO    [jepsen worker 1] jepsen.util: 1    :invoke :txn    [[:append 1 1] [:append 1 2] [:append 0 1] [:append 2 1]]
2019-08-14 15:37:32,299{GMT}    INFO    [jepsen worker 9] jepsen.util: 9    :invoke :txn    [[:r 2 nil] [:append 0 3]]
2019-08-14 15:37:32,299{GMT}    INFO    [jepsen worker 3] jepsen.util: 3    :invoke :txn    [[:r 0 nil] [:append 1 4] [:append 2 2] [:append 2 3]]
2019-08-14 15:37:32,300{GMT}    INFO    [jepsen worker 8] jepsen.util: 8    :invoke :txn    [[:append 1 6] [:r 1 nil] [:r 2 nil]]
2019-08-14 15:37:32,300{GMT}    INFO    [jepsen worker 2] jepsen.util: 2    :invoke :txn    [[:append 2 4] [:append 1 7] [:r 0 nil] [:r 2 nil]]
2019-08-14 15:37:32,300{GMT}    INFO    [jepsen worker 7] jepsen.util: 7    :invoke :txn    [[:r 1 nil] [:r 1 nil]]
2019-08-14 15:37:32,345{GMT}    INFO    [jepsen worker 4] yugabyte.ysql.append-table: Creating table append1 and retrying
2019-08-14 15:37:32,345{GMT}    INFO    [jepsen worker 3] yugabyte.ysql.append-table: Creating table append0 and retrying
2019-08-14 15:37:32,346{GMT}    INFO    [jepsen worker 9] yugabyte.ysql.append-table: Creating table append2 and retrying
2019-08-14 15:37:32,346{GMT}    INFO    [jepsen worker 7] yugabyte.ysql.append-table: Creating table append1 and retrying
2019-08-14 15:37:32,347{GMT}    INFO    [jepsen worker 1] yugabyte.ysql.append-table: Creating table append1 and retrying
2019-08-14 15:37:32,350{GMT}    INFO    [jepsen worker 6] yugabyte.ysql.append-table: Creating table append1 and retrying
2019-08-14 15:37:32,354{GMT}    INFO    [jepsen worker 8] yugabyte.ysql.append-table: Creating table append1 and retrying
2019-08-14 15:37:32,360{GMT}    INFO    [jepsen worker 5] yugabyte.ysql.append-table: Creating table append1 and retrying
2019-08-14 15:37:32,360{GMT}    INFO    [jepsen worker 2] yugabyte.ysql.append-table: Creating table append2 and retrying
2019-08-14 15:37:32,362{GMT}    INFO    [jepsen worker 0] yugabyte.ysql.append-table: Creating table append0 and retrying
2019-08-14 15:37:40,485{GMT}    WARN    [jepsen worker 7] yugabyte.ysql.append-table: Encountered error with conn "n3"; reopening
2019-08-14 15:37:40,499{GMT}    WARN    [jepsen worker 8] yugabyte.ysql.append-table: Encountered error with conn "n4"; reopening
2019-08-14 15:37:40,500{GMT}    WARN    [jepsen worker 2] yugabyte.ysql.append-table: Encountered error with conn "n3"; reopening
2019-08-14 15:37:40,527{GMT}    WARN    [jepsen worker 0] yugabyte.ysql.append-table: Encountered error with conn "n1"; reopening
2019-08-14 15:37:40,531{GMT}    WARN    [jepsen worker 1] yugabyte.ysql.append-table: Encountered error with conn "n2"; reopening
2019-08-14 15:37:40,546{GMT}    WARN    [jepsen worker 5] yugabyte.ysql.append-table: Encountered error with conn "n1"; reopening
2019-08-14 15:37:40,672{GMT}    INFO    [jepsen worker 8] jepsen.util: 8    :info   :txn    [[:append 1 6] [:r 1 nil] [:r 2 nil]]   [:psql-exception "ERROR: duplicate key value violates unique constraint \"pg_type_typname_nsp_index\""]
2019-08-14 15:37:40,680{GMT}    INFO    [jepsen worker 2] jepsen.util: 2    :info   :txn    [[:append 2 4] [:append 1 7] [:r 0 nil] [:r 2 nil]] [:psql-exception "ERROR: duplicate key value violates unique constraint \"pg_type_typname_nsp_index\""]
2019-08-14 15:37:40,685{GMT}    INFO    [jepsen worker 8] jepsen.util: 18   :invoke :txn    [[:r 1 nil] [:r 1 nil]]
2019-08-14 15:37:40,685{GMT}    INFO    [jepsen worker 7] jepsen.util: 7    :info   :txn    [[:r 1 nil] [:r 1 nil]] [:psql-exception "ERROR: duplicate key value violates unique constraint \"pg_type_typname_nsp_index\""]
2019-08-14 15:37:40,690{GMT}    INFO    [jepsen worker 7] jepsen.util: 17   :invoke :txn    [[:r 2 nil]]
2019-08-14 15:37:40,690{GMT}    INFO    [jepsen worker 1] jepsen.util: 1    :info   :txn    [[:append 1 1] [:append 1 2] [:append 0 1] [:append 2 1]]   [:psql-exception "ERROR: duplicate key value violates unique constraint \"pg_type_typname_nsp_index\""]
2019-08-14 15:37:40,697{GMT}    INFO    [jepsen worker 2] jepsen.util: 12   :invoke :txn    [[:append 2 5] [:append 2 6]]
2019-08-14 15:37:40,697{GMT}    INFO    [jepsen worker 1] jepsen.util: 11   :invoke :txn    [[:append 0 4] [:append 2 7]]
2019-08-14 15:37:40,730{GMT}    INFO    [jepsen worker 0] jepsen.util: 0    :info   :txn    [[:append 0 2]] [:psql-exception "ERROR: duplicate key value violates unique constraint \"pg_type_typname_nsp_index\""]
2019-08-14 15:37:40,734{GMT}    INFO    [jepsen worker 0] jepsen.util: 10   :invoke :txn    [[:append 0 5]]
2019-08-14 15:37:40,757{GMT}    INFO    [jepsen worker 5] jepsen.util: 5    :info   :txn    [[:append 1 3] [:r 1 nil]]  [:psql-exception "ERROR: duplicate key value violates unique constraint \"pg_type_typname_nsp_index\""]
2019-08-14 15:37:40,764{GMT}    INFO    [jepsen worker 5] jepsen.util: 15   :invoke :txn    [[:append 1 8] [:append 0 6] [:r 0 nil] [:r 1 nil]]
2019-08-14 15:37:40,964{GMT}    INFO    [jepsen worker 2] jepsen.util: 12   :ok :txn    [[:append 2 5] [:append 2 6]]
2019-08-14 15:37:40,964{GMT}    INFO    [jepsen worker 2] jepsen.util: 12   :invoke :txn    [[:r 1 nil]]
2019-08-14 15:37:41,010{GMT}    INFO    [jepsen worker 0] jepsen.util: 10   :ok :txn    [[:append 0 5]]
2019-08-14 15:37:41,010{GMT}    INFO    [jepsen worker 0] jepsen.util: 10   :invoke :txn    [[:r 1 nil] [:r 2 nil]]
2019-08-14 15:37:41,090{GMT}    INFO    [jepsen worker 1] jepsen.util: 11   :ok :txn    [[:append 0 4] [:append 2 7]]
2019-08-14 15:37:41,091{GMT}    INFO    [jepsen worker 1] jepsen.util: 11   :invoke :txn    [[:append 2 8] [:r 2 nil] [:r 0 nil] [:append 0 7]]
2019-08-14 15:37:41,484{GMT}    INFO    [jepsen worker 7] yugabyte.ysql.append-table: table append2 has ([nil 5] [nil 6] [nil 7])
2019-08-14 15:37:41,484{GMT}    INFO    [jepsen worker 7] jepsen.util: 17   :ok :txn    [[:r 2 [5 6 7]]]
2019-08-14 15:37:41,485{GMT}    INFO    [jepsen worker 7] jepsen.util: 17   :invoke :txn    [[:append 1 9] [:r 1 nil] [:append 2 9] [:r 1 nil]]
2019-08-14 15:37:41,556{GMT}    INFO    [jepsen worker 5] yugabyte.ysql.append-table: table append0 has ([nil 6] [nil 4] [nil 5])
2019-08-14 15:37:43,273{GMT}    WARN    [jepsen worker 0] yugabyte.ysql.append-table: Encountered error with conn "n1"; reopening
2019-08-14 15:37:43,889{GMT}    WARN    [jepsen worker 3] yugabyte.ysql.append-table: Encountered error with conn "n4"; reopening
2019-08-14 15:37:43,985{GMT}    WARN    [jepsen worker 8] yugabyte.ysql.append-table: Encountered error with conn "n4"; reopening
2019-08-14 15:37:44,389{GMT}    INFO    [jepsen worker 9] yugabyte.ysql.append-table: table append2 has ([nil 5] [nil 6] [nil 7])

We can search for the first NOW() invocation in the logs:

alex@alex-debian-jessie:~/code/jepsen/yugabyte$ grep -iR '=== Start TS' store/latest/ | cut -d' ' -f2 | sort | head -n 1
15:37:43.851

Note that first append transaction returned :ok at 15:37:40,964 - meaning that NOW() wasn't invoked at all!
Logs: logs-jepsen_2019-08-14_20-40-08__append-table_c2n_t60_logs2.tar.gz

Interesting thing: we're actually writing tombstone instead of NOW

I0815 11:58:14.998617  3349 docdb.cc:1409] === Apply: SubDocKey(DocKey(0x04cf, ["B\xe8Z2mUO\xba\xb1R\xeaO\xb0\x18v\xff"], []), [SystemColumnId(0)]) = null (type = kNullLow) (int64 = 4067611415479844864), time: { physical: 1565870294982314 }, txn: 7af7b080-2dd3-4dbb-9a3b-f371dd089576, raw: 4704CF5342E85A326D554FBAB152EA4FB01876FF000021214A800D08238001C41760777966804A
I0815 11:58:15.000723  3349 docdb.cc:1409] === Apply: SubDocKey(DocKey(0x04cf, ["B\xe8Z2mUO\xba\xb1R\xeaO\xb0\x18v\xff"], []), [ColumnId(1)]) = DEL (type = kTombstone) (int64 = 232), time: { physical: 1565870294982314 }, txn: 7af7b080-2dd3-4dbb-9a3b-f371dd089576, raw: 4704CF5342E85A326D554FBAB152EA4FB01876FF000021214B810D08238001C41760777966803FAB
I0815 11:58:15.001297  3349 docdb.cc:1409] === Apply: SubDocKey(DocKey(0x04cf, ["B\xe8Z2mUO\xba\xb1R\xeaO\xb0\x18v\xff"], []), [ColumnId(2)]) = 9 (type = kInt32) (int64 = 38654705824), time: { physical: 1565870294982314 }, txn: 7af7b080-2dd3-4dbb-9a3b-f371dd089576, raw: 4704CF5342E85A326D554FBAB152EA4FB01876FF000021214B820D08238001C41760777966803F8B

@m-iancu made an assumption about the cause:

What can happen is that:
a. conn 1 creates the table (this does multiple ops internally):

  1. create YB table
  2. add pg_class/pg_attribute entries
  3. modify entries to set default (done as a separate step even when part of create table query itself)

b. conn 2 does an insert between step 2 and 3 then it could cause this issue.

This appears to be correct, the following Java test reliably reproduces this:
https://gist.github.com/frozenspider/9c5719968f5acf21a89c82fdf22f37a3

Should be fixed as part of #1404

I've expanded the test which uncovered this issue into a workload which specifically demonstrates the issue and reports its presence. Note that this problem affects any default value, not just timestamps or NOW(). With Jepsen d9e6a09, try

lein run test --os debian --url https://s3-us-west-2.amazonaws.com/downloads.yugabyte.com/new/yugabyte-1.3.1.2-b1-release-centos-x86_64.tar.gz -w ysql/default-value --time-limit 300

This has been resolved and is no longer reproducible. This was confirmed by verifying the following:

  • These issues still reproduce when test is run against the original version v1.3.1.0
  • These issues no longer reproduce when test is run against v2.1.3 (as of commit 9f4448d)

Note that in order to be safe, the testing was done using 100 iterations of the Jepsen test

lein run test --os debian ... -w ysql/append-table --concurrency 2n --time-limit 300

(using Jepsen commit e324b33 and corresponding --version or --url)

Heyyyy congats!!! :)

Was this page helpful?
0 / 5 - 0 ratings

Related issues

rkarthik007 picture rkarthik007  路  5Comments

fabiocmazzo picture fabiocmazzo  路  4Comments

ndeodhar picture ndeodhar  路  5Comments

frafra picture frafra  路  3Comments

joeblew99 picture joeblew99  路  5Comments