Dbt: Optionally skip `expand_target_column_types` during incremental materialization

Created on 5 Aug 2020  路  3Comments  路  Source: fishtown-analytics/dbt

Describe the feature

Would like to be able to skip expand_target_column_types in an incremental materialization; we'd like the types in our tables to be immutable/fully specified by code.

Describe alternatives you've considered

Considered overriding the materialization and removing

{% do adapter.expand_target_column_types(
           from_relation=tmp_relation,
           to_relation=target_relation) %}

as someone did here https://getdbt.slack.com/archives/C2JRRQDTL/p1594844172456700?thread_ts=1594742649.422500&cid=C2JRRQDTL, but concerned about future code divergence/our team is very new to dbt so getting this out of the box would be great.

Additional context

@drewbanin suggested making an issue here https://getdbt.slack.com/archives/C2JRRQDTL/p1594820882440000?thread_ts=1594742649.422500&cid=C2JRRQDTL but I didn't see one, so figured I'd make my first contribution! Seems like a straightforward override flag in the config, like expand_target_column_types=false would suffice?

Who will this benefit?

Anyone who would like the types in their tables to be immutable/fully specified by code. For example, we have some values that we expect to be no more than 16 characters long; an unintended expansion of those fields to the max varchar length means we'd lose desired validation of those types.

Are you interested in contributing this feature?

I can't right now, but not for lack of desire; can see myself contributing in the future :)

enhancement good first issue

All 3 comments

Great idea, @smomen! I really like the rationale around preserving full code-based type specification.

I'm imagining a boolean config flag, expand_target_column_types, which is true by default (as it is today) but could be overridden:

{{ config(
    materialized = 'incremental',
    expand_target_column_types = false,
    ...
)}}

This would require a very small code addition to the default and Snowflake implementations of the incremental materialization. I think it would also be a good idea to add some debug logging: "Skipped expanding target column types because..." If there is data type mismatch, the user will see the corresponding database error.

The only other place where we use adapter. expand_target_column_types is for _snapshots_. In that case, I don't feel we want to disable column expansion, because the inputs to snapshots can be less predictable (frequently raw source data) and, as a non-idempotent operation, the stakes of failure are higher.

I'm going to tag this as a "good first issue" for anyone in the community interested in contributing. If you find some extra time lying around, I'm happy to work with you to get this shipshape.

I'm interested in working on this! @jtcohen6

@tashay sweet! Let me know if you'd like any support in getting started :)

Was this page helpful?
0 / 5 - 0 ratings