Dbt: issues with concurrent transaction

Created on 10 Feb 2018  路  13Comments  路  Source: fishtown-analytics/dbt

The proper way to atomically replace a view/table in Postgres/Redshift is:

  1. begin transaction
  2. rename model to "model_old"
  3. rename new model to "model'"
  4. commit
  5. drop table "model_old"

Currently, dbt drops the table inside of the transaction, which can lead to errors like:

ERROR: table 12345 dropped by concurrent transaction

or

could not complete because of conflict with concurrent transaction

Via @teej
See also: this SO post

To fix this, dbt's table and view materializations need to be updated to drop the old model _outside_ of the transaction.

This PR will ultimately require < 5 lines of code change, but since it involves the table and view materializations, we should be very cautious here. Is there a good way to write an integration test for this?

bug redshifpg

All 13 comments

You might be able to simulate this behavior by:

  • First, run dbt and build a model.
  • Second, start a second dbt run which builds a model in a thread
  • Finally, in a concurrent thread, attempt to access the results of the model.

However I think the model would have to take long enough to build so as to ensure that the test doesn't sporadically pass.

You can force this condition to trigger by adding a sleep in the model's SQL. You can sleep in Redshift with:

CREATE OR REPLACE FUNCTION f_sleep (x float)
RETURNS bool IMMUTABLE
AS
$$
  from time import sleep
  sleep(x)
  return True
$$ LANGUAGE plpythonu;

oooh nice idea @teej

We only started seeing this issue when updating from 0.8.0 to 0.9.1.

Is there any workaround? And, do we know which version introduced this issue?

@igrayson we overhauled how model materializations work in 0.9.0, but it looks like the core SQL logic was unchanged.

Ultimately, this is going to be a small code change, but it's a little daunting to make because Redshift's transaction management is pretty finicky.

Happy to point someone in the right direction if they're interested in tackling this, and if not, I think we'll be able to hit it pretty soon. @igrayson @teej @abelsonlive are you folks able to help test an alpha version of this fix when one is ready? That would go a long way towards making me feel good about any prospective fix.

@drewbanin can definitely help test anything you put out... have been a bit swamped but might still take a stab at this if my schedule clears up

thanks @abelsonlive -- we're focused on getting 0.9.2 out, we'll tackle this one for 0.9.3

Sounds good - as discussed on the Slack channel, this could potentially also fix an issue I've been having where DBT is forced to wait indefinitely due to Redshift failing to execute some statements (e.g. DROP VIEW) within a transaction block. When I check the stv_recents Redshift table, the query is left in a 'running' state and doesn't complete.

This change is going to apply to Redshift / Postgres / Snowflake

I could reproduce that bug when specifying a pre-hook at the model level like so: docs

image

Running with dbt=0.14.3

@louisguitton that's surprising -- can you tell me which database you're using? We have _definitely_ seen this issue when grants are applied to a _schema_, but i've never seen it happen when grants are applied to a specific model.

Eg, this can fail:

models:
  project-name:
    post-hook: "grant select on all tables in schema {{ this.schema }} to db_reader"

or

models:
  project-name:
    post-hook: "grant usage on schema {{ this.schema }} to db_reader"

What's the exact error message that you're seeing from the database?

Thanks for following up @drewbanin .
I'm using Redshift, and a pre_hook that creates 1 UDF.
I'm running dbt run --models model_1 model_2 with 8 threads and I get

Completed with 1 error and 0 warnings:

Database Error in model stg_localytics__users_match_viewed (models/staging/localytics/stg_localytics__users_match_viewed.sql)
  could not complete because of conflict with concurrent transaction
  compiled SQL at target/compiled/of_models/staging/localytics/stg_localytics__users_match_viewed.sql

Due to the error message, I assume this is to be expected: the UDF is created twice (once for model_1 and once for model_2) on separate threads so they end up being conflicting concurrent transactions.

If I run dbt run --models model_1 && dbt run --models model_2 it works fine.

I guess then I'm approaching creating the UDFs wrong.

Ah! ok - are you able to use something like an operation to create these UDFs? Ideally Redshift would let you create UDFs transactionally, but I suppose there isn't a ton of merit to creating the same UDF once for each model anyway

Was this page helpful?
0 / 5 - 0 ratings