Describe the bug
Frequent EOF / false timeout errors when querying from Grafana
To Reproduce
Deploy VictoriaMetrics and query it from Grafana.
Expected behavior
No errors on a lightly loaded vmselect deployment
Screenshots

Version
v1.31.0
Hi,
I have a victoriametrics-cluster deployment which is working very well. However, sometimes when opening a Grafana page I see a bunch of EOF/timeout errors for some or all of the queries made within the page.
I took a look over the concurrency limiter and, without being crass/accusatory, the implementation seems complex for what it is doing? The servers themselves aren't at all stressed or running at high load. Queries almost always return in less than a few hundred milliseconds - there are absolutely no queries which would hog a whole connection to the point where new queries are queuing.
Could it possibly be an ingress controller issue, or are you accessing the service directly?
Queries are made through Grafana, which accesses the Kubernetes service DNS names directly.
@AeroNotix , could you upgrade to v1.31.5 and check for this error? This release has improved error logging, which could help determine the root cause of the issue.
Sure I can upgrade to that version and see if the logging/behaviour is any better. Will do this in a few hours and get back to you.
@valyala are there fixes in v1.31.5? I've just upgraded to this version and I no longer see the same errors in the logs or in the grafana dashboard.
I don't want to mark this as resolved unless there was a specific bug that has been fixed in v1.31.5, as I am concerned that the issue takes time to start appearing.
This can be related to #281 , which has been fixed in v1.31.5. Let's keep this issue open for some time and see whether it triggers again.
So far so good, still not seen this reappear with v1.31.5
Still so far so good!
Then closing the issue. Feel free to re-open it if it appears again.