Metabase: Make default dashboard query timeout of 60 seconds configurable

Created on 19 Jan 2016 · 53Comments · Source: metabase/metabase

I'm using Metabase to query a read replica of our production db, but some queries take longer than 60 seconds. The default hard coded 60 second timeout kills these queries. It would be useful to configure this timeout value to suit our needs... ideally with an environment variable (AWS Environment Property).

I'm certain there are other scenarios where reducing/increasing the timeout might prove helpful.

Most helpful comment

+1 for this feature

veve90 on 25 Apr 2018

👍18 👎1

All 53 comments

For reference the constant hardcoded to 60 seconds in question is DATASET_TIMEOUT

camsaul on 20 Jan 2016

+1 for this feature.

We store a lot of data in Redshift, and it's not a terribly fast DB, but for analytical access, it's perfectly okay to wait a few minutes for a result.

teddziuba on 20 Feb 2016

👍8

@kdoh + @mazameli Does this need any UX treatment aside from a input text box in the admin? I'm inclined to just use our standard pattern under "Settings -> General".

Could see a case being made for this being a per-DB setting, but am inclined to keep it simple.

To recap, this would apply to individual questions, not database cards.

salsakran on 24 Mar 2016

From what's been mentioned I can't currently think of anything that would warrant going beyond the established settings pattern. I'd imagine it would live under general as part of the overall instance settings?

kdoh on 24 Mar 2016

👍2

We should also make it clear the dashboard timeout doesn't change.

How's the below sound?
Title: Query Timeout
Description: The time in seconds before a data warehouse queries times out. (Note that dashboards have a timeout of 60s that is not affected by this setting)

salsakran on 29 Mar 2016

Maybe this is a separate issue, but why do dashboards have their own timeout? Is the current behavior that all queries of all cards on a dashboard execute together as a batch, so that if one of them stalls, the whole dashboard fails? I would think ideally that each card would load or not load independently of the others, so that you'd never get into a state where the whole dashboard fails unless each individual card did.

mazameli on 29 Mar 2016

Think it mainly comes from the idea that a dashboard should be something that comes up reasonably quickly and can be refreshed. I've gone either way on the issue over time. On one hand a dashboard full of cards that take 30 minutes to run is kinda useless as a dashboard, and will tend to have users hit "refresh" which exacerbates the problem. On the other hand, there are situations where a collection of cards is really more of a heavy report or exploration than a real "dashboard".

@agilliland has pretty strong feelings here =)

salsakran on 29 Mar 2016

BTW, now that I look at this, that DATASET_TIMEOUT constant doesn't actually cause the query to be cancelled on the backend, it just rejects the client-side promise after 60 seconds ¯\_(ツ)_/¯. We should either have the backend enforce the timeout (maybe a ?timeout= parameter on the /api/dataset endpoint) or actually abort the request and ensure the backend cancels the query when a request is aborted (if possible)

tlrobinson on 29 Mar 2016

Hey Guys,
Any action here? This is useful not for dashboard per se but for when someone wants to do some advanced/big question that doesn't have pre-aggregated data for example.

Cheers

HelmiRifai on 27 Apr 2016

+1 for this feature as well. This is a showstopper for us, as many of our important queries run over 1 minute. We need this configurable if we were to consider using Metabase seriously for our company. If it is an easy fix, please update!

Thanks!

mattau600 on 28 Apr 2016

@HelmiRifai , @mattau600 , would you mind providing a bit more context around your database setup, roughly how long your queries tend to run, and how you are using Metabase? This information helps us a lot with understanding people's use cases when we are planning things out.

agilliland on 28 Apr 2016

So a couple things we should explicitly tackle (or at least document as a caveat) in this issue:

front end timeout
server thread pool and any timeouts or queuing there
any nginx (or elb timeouts) that occur when using our recommended Elastic Beanstalk
timeouts we're setting on the jdbc connection if any

salsakran on 28 Apr 2016

👍3

For my example I sometimes want to run an adhoc query against user actions
with a table of a few billion rows. The table is on Redshift (some sample
queries would take in the few minute range, usually a max of 5mins) and I
would like to visualize some data from it before I create an extraction
aggregation job that will produce a table suitable for a dashboard question.

So a configurable timeout for questions would be great.

A seperate timeout for dashboards as well would be useful, I don't want my
dashboards to be taking a long while to load in the first place :)

Also I haven't checked in 0.16.x but are timed out questions on the
frontend correctly sending query kill commands to the database? Because on
0.15.x they were not, so the query would actually keep running, but
metabase wouldn't give the result :D so selfish.

Cheers
On Apr 28, 2016 10:20 PM, "Sameer Al-Sakran" [email protected]
wrote:

So a couple things we should explicitly tackle (or at least document as a
caveat) in this issue:

front end timeout

server thread pool and any timeouts or queuing there

any nginx (or elb timeouts) that occur when using our recommended
Elastic Beanstalk

timeouts we're setting on the jdbc connection if any

—
You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub
https://github.com/metabase/metabase/issues/1749#issuecomment-215534150

HelmiRifai on 28 Apr 2016

Post conversations, I think we should also change the timeout message in the following ways:

Create 1-3 stages of "timeouts"
At 10 seconds, display a "waiting" timeout, with the average execution time of the query as well as the max execution time
At 60 seconds, display a "this is taking a while" image with avg/max and elapsed, and the creators name
At 5 minutes, all of the above + the creators email.

The goal here is rather than providing an in-app harsh cutoff, we let social pressures force the conversation of how long is too long.

salsakran on 28 Apr 2016

+1 on this request.

We are using a Redshift backend too. We run adhoc queries which may join several tables for exploratory analysis. These queries can complete within 5 min usually. Another case is when we have occasional big queries aggregating rows from across a rather long time.

In both cases it is okay to be slow (to up 15 minutes I would say), as long as the query eventually complete.

derekchan on 29 Apr 2016

Will the stages be configurable? I think that's the point of this issue.

If Metabase let me decide how long is too long, I won't have trouble
because I know my database and know how long questions should take to load.

Is this possible?

MrMauricioLeite on 29 Apr 2016

Our use case is similar to @derekchan and @HelmiRifai - billions of rows for ad hoc exploratory queries that could take minutes to complete. Ideally, the global timeout would be configurable in the admin panel (or configured to have no timeout), and the user have several chances to kill a question if the query runs too long (along the lines of @salsakran, but not a social pressure thing). These stages don't have to be configurable.

mattau600 on 29 Apr 2016

Going to start a to-do list for this

[ ] Is this only for dashboards, or also query builder? (currently it's only enforced on the frontend on dashboards. database and/or load balancer may have other timeouts that apply even in query builder, but we can't control those)
[ ] Decide if we want to have messages on every card on a dashboard, or if there's a message at the top of the dashboard (e.g., "This dashboard is taking a super long time to load…")
[ ] Put an input box in the Admin Panel (probably under Settings > General). Error conditions/states?

mazameli on 2 May 2016

Also note that currently the frontend only enforces the 60 second timeout on dashboards, not in the query builder. If a question in the query builder is timing out it's because the database or backend is timing out.

tlrobinson on 2 May 2016

@mazameli I think we should try to unpack where the timeout is coming from and presenting that information to the user somehow. Specifically, it should be possible to figure out where it's happening and who can help with that specific piece of the architecture.

For situations where it's a piece we control in our docker config or bean stalk recipe, we should figure out ways to apply a timeout someplace sensible.

salsakran on 2 May 2016

@tlrobinson Hey, just to chime in on your point

Also note that currently the frontend only enforces the 60 second timeout on dashboards, not in the query builder. If a question in the query builder is timing out it's because the database or backend is timing out.

The core issue is that it's being enforced in the question page, not just in dashboards. My backend is not timing out (also Metabase is not terminating the query when its frontend logic times out -- so it actually keeps running on the database) I can see the running query on the DB Console even after Metabase tells me it has taken too long. I think it is reasonable to have 2 configurable timeouts, one for Dashboards and one for Questions; but the issue of query termination on exceeding timeout should be implemented as well.

HelmiRifai on 3 May 2016

@HelmiRifai Hmm, AFAIK there's no code in Metabase that will timeout queries on the /card or /q pages. I tested this by doing SELECT pg_sleep(10000); in a PostgreSQL native query. Are you connecting directly to Metabase or through some other infrastructure that could be timing out HTTP requests? (Heroku, ELB, etc)

I think we've concluded we don't want any timeout at all on question page, but we do want the user to be able to cancel a query that's taking to long. That's covered by #2399.

We may eventually want to split the query endpoint into multiple endpoints for starting queries, checking the status, getting the results, and canceling queries to avoid issues with infrastructure timeouts.

Also note that 1) on dashboards we don't actually cancel the HTTP request to /api/dataset, and 2) even if we did the backend doesn't do anything to actively cancel the query in the database/data warehouse (though it will avoid pulling data from the db). I plan to fix 1 as part of this issue, but 2 is a little more involved (and more @camsaul's domain)

tlrobinson on 3 May 2016

Yes indeed, I just wrote a long running query to test, its been running for a few mins without a hiccup.

I am not using anything between Metabase and my Redshift cluster to connect -- and I have run into issues where Metabase shows the "Your question took too long" (or something along that line) message when doing a question, is there a way to follow up on that? It could be an issue with my flaky internet somedays I guess.

HelmiRifai on 3 May 2016

could be a variety of things, but it's going to be something that's specific to your environment. here's some things to check:

http://docs.aws.amazon.com/redshift/latest/dg/r_statement_timeout.html
http://docs.aws.amazon.com/redshift/latest/mgmt/connecting-firewall-guidance.html
http://stackoverflow.com/questions/21457282/long-query-in-amazon-redshift-never-return

agilliland on 3 May 2016

So I definitely got timed out while running a question. Running the same query directly on the database (aws redshift) via a sql client doesn't time out, and finishes in 7 minutes. We have metabase installed as on MacOS and on a server via aws elastic beanstalk, and has the same problem.

I tried changing the tcp/ip timeout setting on both my mac and also on the ec2 instance, and problem remains.

mattau600 on 4 May 2016

👍5

Hi @camsaul

Is it now possible with Metabase to configure your time outs? I would only like my queries to time out after 3 minutes.

Alexa

AlexandraLouise on 23 Sep 2016

@metabase/core-developers did #2570 resolve this? It sounds like something different

camsaul on 26 Sep 2016

@AlexandraLouise the front end (and I believe backend) shouldn't be timing out.

What are you seeing specifically?

salsakran on 26 Sep 2016

I'm seeing the "your question took too long" response with the latest version installed using the recommended settings on AWS EBS today querying Redshift. My queries take a while to return data, and Metabase killed it.

brianspolarich on 27 Sep 2016

Hi @camsaul I'm seeing the exact same thing as @brianspolarich - this will really change my life if we can somehow increase the timeout on this

FYI my query takes 3 minutes to run usually.

AlexandraLouise on 28 Sep 2016

@brianspolarich @AlexandraLouise Just to confirm you're talking about timeouts on the query builder, not dashboards?

I don't believe we intentionally time out queries anywhere, but certain platforms (Heroku, etc) have load balancers that might timeout a request. What platform are you on?

tlrobinson on 30 Sep 2016

I’m using EBS on AWS per the deployment instructions in the metabase docs. I’m not experienced with EBS, ELB, etc. Maybe the defaults included in the application package are too low for our scenarios?

brianspolarich on 30 Sep 2016

I'm also using EBS on AWS. When I ask a question, it attempts to run the query and then times out.

AlexandraLouise on 30 Sep 2016

@brianspolarich + @AlexandraLouise can you check your ELB timeout? You can get to it by clicking on EC2 from the services list, going down to "Load Balancers" on the left hand pane, clicking on the load balancer beanstalk created, and scrolling down to "Attributes"? If that's 60s, it's likely that the ELB is the root cause of the timeout.

We don't have direct control over that in the beanstalk recipe unfortunately.

salsakran on 1 Oct 2016

Hi @camsaul
I've made that change - but it still is automatically timing out after 60 seconds. @brianspolarich have you had any luck getting this to work?
screen shot 2016-10-03 at 9 52 16 am

AlexandraLouise on 3 Oct 2016

HI @camsaul

Why is Meta base telling us that the query is taking too long if this is an Elastic Beanstalk issue - what database response causes this error?

Alexa

AlexandraLouise on 3 Oct 2016

@camsaul @brianspolarich found the solution!

We were getting a 504 Gateway Timeout error using Nginx as Proxy.

For Nginx as Proxy for a Metabase Docker web server, this is what you have to try to fix the 504 Gateway Timeout error:

Add these variables to nginx.conf file:

proxy_connect_timeout 600;
proxy_send_timeout 600;
proxy_read_timeout 600;
send_timeout 600;

This fixed our error.

Alexa

AlexandraLouise on 4 Oct 2016

👍9

@salsakran my cards now generate without a timeout which is great - only issue is the dashboard which times out after 60 seconds :( Is there anywhere I can change this time out duration?

AlexandraLouise on 4 Oct 2016

@AlexandraLouise as of v0.18.0 we shouldn't be timing out queries on dashboards. Instead you should see a slow query warning. Can you tell me what you're seeing in the UI and network inspector (specifically the /api/dataset requests)

tlrobinson on 4 Oct 2016

+1 to this feature. It's long awaited now. No time-outs on questions can really help our marketing and sales team to pull data with metabase.

riser101 on 4 Jan 2017

I'm seeing this problem as well with latest metabase version and manually configured ELB to 300 second timeout

```Your question took too long
We didn't get an answer back from your database in time, so we had to stop. You can try again in a minute, or if the problem persists, you can email an admin to let them know.

tacchino on 23 Jan 2017

Got it working with the addition of long timeouts in nginx config. Any chance of getting that moved into the beanstalk recipe?

tacchino on 23 Jan 2017

definitely... what changes did you make to your nginx config?

salsakran on 23 Jan 2017

Proxy timeouts as specified by @AlexandraLouise here: https://github.com/metabase/metabase/issues/1749#issuecomment-251315512

tacchino on 23 Jan 2017

opening a PR to include those settings and keep things a little more sensible. #4217

salsakran on 23 Jan 2017

👍5

+1 for this feature

veve90 on 25 Apr 2018

👍18 👎1

Was this pr merged?

saneelb on 7 May 2018

Just FYI, we managed to fix this ("Your question took too long" error) by checking timeouts all along the way.

There was a default 5m timeout on ELB (AWS Elastic Load Balancer), then we also had a default 1m timeout in our Rancher Load Balancer.

JoeSham on 10 Jul 2018

👍1

Hi, is this the property DATASET_TIMEOUT? tried setting it as an env variable in elastic bean stalk, but still same issue..
the database queries run within 60 second duration
the elb idle time out is 300 seconds..
Any help appreciated..

FYI, I checked out the code, but could not find any property related to DATASET_TIMEOUT..

kaskrish on 31 Jan 2019

Are there any updates on managing dataset timeouts for both questions and dashboards? It seems like the behavior of this has changed under the hood but there's not a lot of room for adjusting these settings.

kyleabeauchamp on 29 Mar 2019

I am running metabase dcoker on aws ec2 with 80:3000 port configuration (without nginx and any app server). This is getting timeout after 60 secs. How can I increase the timeout?

siddhant-mohan on 10 Jun 2019

@siddhant-mohan
Can you look in your browser console to check the response of the request giving you the timeout.
Both error code and any text (including headers) are relevant, since it should point you to what exactly is doing the timeout (by giving a name of the service or specific text).

flamber on 10 Jun 2019

Hi, I had the same request timeout issue when using metabase with a apache druid database, But I was managed to fix the issue by increasing the timeout by editing the timeout range mentioned in query_processor.clj file.

Screenshot from 2019-11-14 15-47-30