DBT should have a dry-run mode so that users can see all of the "commands" DBT will run for a given command without having DBT actually execute those commands against a database.
This can be useful for both debugging as well as making sure that a given command is configured properly. This is especially useful when you're configuring a production setup and want to make sure you're not about to drop schemas or tables you didn't intend to.
It can also be useful for testing macros and other DBT configurations that are post-compilation but don't show up in the compiled SQL
So, I took a look at dbt/adapters/default/impl.py to see how hard this might be and I don't think I have enough context on why some things are implemented the way they are to take a stab at this myself without some guidance.
I can see that the add_query method is where the statements get executed (cursor.execute(sql, bindings) but there's a bunch of other stuff happening in here I don't understand (I suspect a lot of it is related to handling multiple connections).
In case it's helpful to see where my head is going, the way I've implemented this in the past is to have two functions for executing SQL. One for commands (no results returned) and one for queries (returning data) -- this way your read and write operations are split.
Then, for the dry-run mode, you can put some logic in the command-handling function that just logs the command instead of passing it to cur.execute(). Something like:
def run_command(sql):
print(sql)
if not DRY_RUN:
self.cur.execute(sql)
def get_data(sql):
print(sql)
self.cur.execute(sql)
result = cur.fetchall()
return result
yes, I vote to have the dry-run feature, which will be super helpful for debug and performance tuning
@cmcarthur What's the reasoning for this to be removed from the milestone?
Hi @thalesmello - this is a pretty involved change to dbt! Connor and I tried to prioritize it for the Wilt Chamberlain release (0.14.0), but we ended up deciding to cull it in favor of Archive-related functionality. This feature is definitely going to be a big improvement to dbt - I'm looking forward to re-prioritizing it!
Thanks for the explanation @drewbanin 馃憤
+1 as well to adding a dry run feature such as a dbt compile --all command
Any movement on this?
Hello everyone!
Any progress on this? I think being able to see the actual sql that will be executed is quite the important feature.
@giovanni-girelli-sdg To see the actual SQL that dbt will execute you can use dbt compile https://docs.getdbt.com/reference/commands/compile/.
Hi @alepuccetti! thanks for the reply, but there you can only see the SELECT statement. What I'd like to see is the actual SQL that will run, such as a MERGE, INSERT or CREATE statement.
Edit at least it's what I can find. Is there a way to see the full materialization code?
Edit2: I had not seen the target/run folder, that's enough for debugging in DEV. I still think it would be great to be able to have a dry run mode where no sql is actually executed, but I can see how that would be much harder to do. Thanks for the great work!
@mikekaminsky you opened this on my birthday!
+1 to this, especially if it allowed the calculation of the cost of the run, and allowed prevention of the run if the cost was too high unless a --force argument was used.
Most helpful comment
+1 as well to adding a dry run feature such as a
dbt compile --allcommand