Dbt: --verbose flag

Created on 11 Dec 2017 · 5 Comments · Source: fishtown-analytics/dbt

This could be used as a global flag for a number of purposes:

  • details on incremental delete/inserts
  • show pre/post-hooks
  • Show sampled data for dbt seed
  • archival steps
  • Schema test failure SQL (good idea @ryantbrennan1)
  • etc

All 5 comments

👍 to this. Particularly useful for incremental model logging.

This would help alleviate confusion about why "dbt is taking so long to get started". We run a few on-run-start hooks and have a large number of models so it takes a while before it gets to the meat of the dbt run.

I don't think the Jinja code itself would have to be written out, just a description or name of the step it's currently on.

Our dbt_project.yml has

on-run-start:
    - "{{resume_warehouse(var('resume_warehouse', false), var('warehouse_name'))}}"
    - "{{create_udfs()}}"

on-run-end:
    - "{{grant_usage_to_schemas(schema_name, rolename)}}"
    - "{{suspend_warehouse(var('suspend_warehouse', false), var('warehouse_name'))}}"

models:
    pre-hook: "{{ logging.log_model_start_event() }}"
    post-hook: "{{ logging.log_model_end_event() }}"

On a dbt run it'd be nice to see something like

dbt run --models date_details
Running with dbt=0.13.1

Parsing Macros, Models, Tests
Running on-run-start hook Resume Warehouse
Running on-run-start hook Create UDFs
Running model pre-hook Logging Start

Found 221 models, 888 tests, 4 archives, 0 analyses, 253 macros, 8 operations, 5 seed files, 114 sources

...

I'm not sure where that name / description would get set (maybe in a config somewhere, or in the Jinja itself?).
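One option that avoids setting a name anywhere is to derive it from the macro being called in the hook. The following is only a rough sketch of that idea, not anything dbt does: describe_hook and verbose_hook_lines are hypothetical helpers, and the regex assumes each hook is a single Jinja macro call like the ones in the dbt_project.yml above.

```python
import re


def describe_hook(hook_sql: str) -> str:
    """Derive a readable step name from a hook's Jinja macro call.

    Hypothetical helper: falls back to the raw hook text when no
    macro call of the form {{ name(...) }} is found.
    """
    match = re.search(r"\{\{\s*([\w.]+)\s*\(", hook_sql)
    if match is None:
        return hook_sql.strip()
    # Drop any package prefix, e.g. "logging.log_model_start_event".
    macro_name = match.group(1).split(".")[-1]
    return macro_name.replace("_", " ").title()


def verbose_hook_lines(hook_type: str, hooks: list) -> list:
    """Build one progress line per hook, in the order the hooks are listed."""
    return [f"Running {hook_type} hook {describe_hook(h)}" for h in hooks]


on_run_start = [
    "{{resume_warehouse(var('resume_warehouse', false), var('warehouse_name'))}}",
    "{{create_udfs()}}",
]
for line in verbose_hook_lines("on-run-start", on_run_start):
    print(line)
```

Deriving the name keeps the project file unchanged, at the cost of less control over the wording than an explicit name in config would give.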

Could --verbose include the head of the returned records in failed tests? Something similar to when a pytest assert x in y fails, where a fixed number of characters from each vector is printed. This could be leveraged with custom tests to seriously speed up the development workflow: a test fails and I get a sample of the failing rows, with the select restricted to the subject columns, and can immediately debug without switching to a query tool.

@norton120 I think something like that would be better handled by a dedicated feature like https://github.com/fishtown-analytics/dbt/issues/517 or https://github.com/fishtown-analytics/dbt/issues/903. My current thinking is that we should update the schema tests in dbt to:

  1. return the failing rows, not the count of failing rows, so that test failures can be observed directly
  2. print out the failing rows inline (maybe a configurable # of rows) to help close the loop on the testing workflow

Would love to hear your thoughts in one of these associated issues @norton120 :)
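To make the two points above concrete, here is a minimal sketch of a not_null-style check that returns a sample of the failing rows alongside the count. It is an illustration only and does not reflect how dbt implements schema tests: failing_rows is a hypothetical function, SQLite stands in for the warehouse, and the orders table is made up.

```python
import sqlite3


def failing_rows(conn, table, column, limit=5):
    """Return (failure_count, sample_rows) for a not_null-style test.

    Hypothetical sketch: table and column are interpolated directly,
    so they must come from trusted project config, not user input.
    """
    where = f"{column} IS NULL"
    count = conn.execute(
        f"SELECT COUNT(*) FROM {table} WHERE {where}"
    ).fetchone()[0]
    # Fetch at most `limit` failing rows so the output stays readable.
    sample = conn.execute(
        f"SELECT * FROM {table} WHERE {where} LIMIT ?", (limit,)
    ).fetchall()
    return count, sample


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer_id INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)", [(1, 10), (2, None), (3, None)]
)

count, sample = failing_rows(conn, "orders", "customer_id", limit=1)
print(f"FAIL {count} rows; showing {len(sample)}:")
for row in sample:
    print(row)
```

The limit argument plays the role of the configurable number of rows mentioned in point 2.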

Closing this as unactionable in its current state, but it remains a pretty good idea. The hard part here isn't adding a --verbose flag, it's instrumenting additional logging throughout the entirety of dbt.
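As a rough illustration of that split, the flag itself is a few lines with Python's standard argparse and logging modules, while the real work is the debug calls that would have to be added at every step. This is a generic sketch under that assumption, not dbt's code: the "example" program name, build_logger, and run_step are all hypothetical.

```python
import argparse
import logging


def build_logger(argv):
    """Parse a hypothetical --verbose flag and set the log level from it."""
    parser = argparse.ArgumentParser(prog="example")
    parser.add_argument("--verbose", action="store_true")
    args = parser.parse_args(argv)

    logger = logging.getLogger("example")
    logger.setLevel(logging.DEBUG if args.verbose else logging.INFO)
    return logger


def run_step(logger, name):
    """Every step needs its own debug call; this is the instrumentation cost."""
    logger.debug("Starting step: %s", name)
    # ... the actual work for the step would happen here ...
    logger.info("Finished step: %s", name)


logging.basicConfig(format="%(levelname)s %(message)s")
logger = build_logger(["--verbose"])
run_step(logger, "on-run-start hooks")
```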
