VictoriaMetrics: Query EOF/Timeout

Created on 12 Mar 2020 · 9 Comments · Source: VictoriaMetrics/VictoriaMetrics

Describe the bug
Frequent EOF / false timeout errors when querying from Grafana

To Reproduce
Deploy VictoriaMetrics and query it from Grafana.

Expected behavior
No errors on a lightly loaded vmselect deployment

Screenshots
[Screenshot: Thu_12_Mar_11:45:42_CET_2020]

Version
v1.31.0

Hi,

I have a victoriametrics-cluster deployment which is working very, very well. However, sometimes when opening a Grafana page I see a bunch of EOF/timeout errors for some or all of the queries made within the page.

I took a look over the concurrency limiter and, without being crass/accusatory, the implementation seems complex for what it is doing? The servers themselves aren't at all stressed or running at high load. Queries almost always return in less than a few hundred milliseconds - there are absolutely no queries which would hog a whole connection to the point where new queries are queuing.

bug

AeroNotix

All 9 comments

Could it possibly be an ingress controller issue, or are you accessing the service directly?

stigok on 12 Mar 2020

Queries are made through Grafana, which accesses the Kubernetes service DNS names directly.

AeroNotix on 12 Mar 2020

@AeroNotix, could you upgrade to v1.31.5 and check for this error? This release has improved error logging, which could help determine the root cause of the issue.

valyala on 12 Mar 2020

Sure I can upgrade to that version and see if the logging/behaviour is any better. Will do this in a few hours and get back to you.

AeroNotix on 12 Mar 2020

👍1

@valyala are there fixes in v1.31.5? I've just upgraded to this version and I no longer see the same errors in the logs or in the Grafana dashboard.

I don't want to mark this as resolved unless a specific bug was fixed in v1.31.5, as I am concerned the issue may simply take time to start appearing.

AeroNotix on 13 Mar 2020

This may be related to #281, which has been fixed in v1.31.5. Let's keep this issue open for some time and see whether it triggers again.

valyala on 13 Mar 2020

So far so good; I have still not seen this reappear with v1.31.5.

AeroNotix on 15 Mar 2020

Still so far so good!

AeroNotix on 17 Mar 2020

👍2

Closing the issue then. Feel free to reopen it if the problem appears again.

valyala on 27 Mar 2020
