Dbt: Relation doesn't exist for dbt test after `dbt run -m state:modified` if the test was modified but the model wasn't

Created on 15 Oct 2020  路  3Comments  路  Source: fishtown-analytics/dbt

Describe the bug

Our Slim CI job runs

dbt run -m state:modified
dbt test -m state:modified

(there are some additional exclusions to the dbt test arg, but they're not relevant here)

When I added another value to an accepted_values test, dbt identified that the test needed to be run, but it hadn't built the model during dbt run so I got this error

Database Error in test accepted_values_revenue__at_risk_by_day_at_risk_status__Lost__At_Risk__Large_Decrease__Strategic_Loss (models/marts/revenue/revenue.yml)
  relation "dbt_cloud_pr_5833_158.revenue__at_risk_by_day" does not exist
  compiled SQL at target/compiled/educationperfect/models/marts/revenue/revenue.yml/schema_test/accepted_values_revenue__at_risk_by_day_b3322b939ced8c28690422e1448abfc2.sql

Steps To Reproduce

  1. Change the existing test

        - name: at_risk_status
          tests:
          - accepted_values:
              values: ['Lost', 'At Risk', 'Large Decrease']

to


        - name: at_risk_status
          tests:
          - accepted_values:
              values: ['Lost', 'At Risk', 'Large Decrease', 'Strategic Loss']

and do

dbt run -m state:modified
dbt test -m state:modified

Expected behavior

The test to pass, because all values in the column are accepted

Screenshots and log output

dbt run debug.log
dbt run console.log
dbt test console.log
dbt test debug.log

System information

Which database are you using dbt with?

  • [ ] postgres
  • [x] redshift
  • [ ] bigquery
  • [ ] snowflake
  • [ ] other (specify: ____________)

The output of dbt --version:

Running with dbt=0.18.0

The operating system you're using:
dbt Cloud

The output of python --version:

bug

Most helpful comment

Hey @joellabes, you're asking the $64k question here. @huntdm07 and I had a lengthy thread about just this in slack yesterday.

My initial hope had been to resolve this via deferral for tests (#2701). If there's a model you didn't build, no worries, dbt will execute your new/modified test query against the production (deferred) version of that model instead. This turned out to be way trickier to implement than we expected, based on how resource selection works鈥攈ow can dbt determine if a model is in the selection criteria or should be deferred, when in actuality dbt test doesn't _really_ select _models_ at all? Conceptually, test deferral is much stranger than it appears on first glance, and it led us into a number of lose-lose corner cases.

Instead, I've placed my hopes elsewhere: given all the clever node selection syntax available in v0.18, we should be able to find a way to make this work. There's some good discussion in #2704 about what that looks like today vs. how it could look in the future.

Today, we can write selection syntax to ensure that the first-order unmodified parent of a modified test is always built in a CI run. In a dbt Cloud job definition (i.e. without --defer and --state), I think that looks like:

dbt run -m state:modified 1+state:modified,1+test_type:schema 1+state:modified,1+test_type:data
dbt test -m state:modified

"Run all modified models, plus any models that are first-order parents of modified tests." This is a safe-not-sorry approach: it's likely to include a few more models than strictly necessary鈥攊.e. unmodified models that are first-order parents of modified models and also have unmodified tests鈥攂ut it should work.

Longer term, the answer here looks like:

  • Having a way, in test selection, to disambiguate between selecting tests with property X, on the one hand, and tests that select from models with property X, on the other (Jake's proposal in #2704). Because of "last-mile" test selection as it works today, it's very convenient to write dbt test -m my_model and mean "run all the tests on my_model", but it's much harder than it should be to pick out tests with a fine-toothed comb.
  • Having a way, via YAML selectors, to apply an intersection _before_ the first-order parent selection (Dan's proposal in #2704), so as to avoid the collateral baggage.

That's my spiel, and I appreciate you opening this issue because it forced me to write it down in one place. So... whatcha think?

All 3 comments

Hey @joellabes, you're asking the $64k question here. @huntdm07 and I had a lengthy thread about just this in slack yesterday.

My initial hope had been to resolve this via deferral for tests (#2701). If there's a model you didn't build, no worries, dbt will execute your new/modified test query against the production (deferred) version of that model instead. This turned out to be way trickier to implement than we expected, based on how resource selection works鈥攈ow can dbt determine if a model is in the selection criteria or should be deferred, when in actuality dbt test doesn't _really_ select _models_ at all? Conceptually, test deferral is much stranger than it appears on first glance, and it led us into a number of lose-lose corner cases.

Instead, I've placed my hopes elsewhere: given all the clever node selection syntax available in v0.18, we should be able to find a way to make this work. There's some good discussion in #2704 about what that looks like today vs. how it could look in the future.

Today, we can write selection syntax to ensure that the first-order unmodified parent of a modified test is always built in a CI run. In a dbt Cloud job definition (i.e. without --defer and --state), I think that looks like:

dbt run -m state:modified 1+state:modified,1+test_type:schema 1+state:modified,1+test_type:data
dbt test -m state:modified

"Run all modified models, plus any models that are first-order parents of modified tests." This is a safe-not-sorry approach: it's likely to include a few more models than strictly necessary鈥攊.e. unmodified models that are first-order parents of modified models and also have unmodified tests鈥攂ut it should work.

Longer term, the answer here looks like:

  • Having a way, in test selection, to disambiguate between selecting tests with property X, on the one hand, and tests that select from models with property X, on the other (Jake's proposal in #2704). Because of "last-mile" test selection as it works today, it's very convenient to write dbt test -m my_model and mean "run all the tests on my_model", but it's much harder than it should be to pick out tests with a fine-toothed comb.
  • Having a way, via YAML selectors, to apply an intersection _before_ the first-order parent selection (Dan's proposal in #2704), so as to avoid the collateral baggage.

That's my spiel, and I appreciate you opening this issue because it forced me to write it down in one place. So... whatcha think?

My initial hope had been to resolve this via deferral for tests (#2701).

This is how I thought it worked - I must have read that proposal at some point and not caught up with all the gotchas that led to it being put aside. All good!

I don't follow all of the backstory of Jake's proposal in #2704, but it implies that there's some baggage where the dbt test selectors magically expand to do what people _mean_, not what they strictly _typed_. And as with all magic it's hard to override it if you want something else?

So... whatcha think?

Right now, I think I'm happy to sit back and trust y'all to make a good decision! I haven't got enough context to add much of value one way or the other. The (admittedly verbose) selector you've got above acts as a good workaround for now, so thanks for that!

And as with all magic it's hard to override it if you want something else?

That's right!

This is now documented (prerelease) as a known caveat to state:modified. I'm going to close this issue for the time being, but it certainly doesn't mean that we're done thinking about how to better address this in future.

Was this page helpful?
0 / 5 - 0 ratings