Describe the bug
vminsert fails to start up - too many open files after starting up debug handlers.
Environment
Pretty standard helm deployment of VictoriaMetrics.
remote_writeProbably connection storm after vminsert restart, making it open >1024 file descriptors for open connections.
Version
vminsert-20190828-210741-tags-v1.27.0-cluster-0-g8f81908b
Additional context
2019-11-08T14:01:56.279+0000 info lib/logger/flag.go:11 build version: vminsert-20190828-210741-tags-v1.27.0-cluster-0-g8f81908b
2019-11-08T14:01:56.279+0000 info lib/logger/flag.go:12 command line flags
2019-11-08T14:01:56.279+0000 info lib/logger/flag.go:20 flag "graphiteListenAddr" = ""
2019-11-08T14:01:56.279+0000 info lib/logger/flag.go:20 flag "http.disableResponseCompression" = "false"
2019-11-08T14:01:56.279+0000 info lib/logger/flag.go:20 flag "httpListenAddr" = ":8480"
2019-11-08T14:01:56.279+0000 info lib/logger/flag.go:20 flag "influxMeasurementFieldSeparator" = "_"
2019-11-08T14:01:56.279+0000 info lib/logger/flag.go:20 flag "influxSkipSingleField" = "false"
2019-11-08T14:01:56.279+0000 info lib/logger/flag.go:20 flag "loggerLevel" = "INFO"
2019-11-08T14:01:56.279+0000 info lib/logger/flag.go:20 flag "maxConcurrentInserts" = "16"
2019-11-08T14:01:56.279+0000 info lib/logger/flag.go:20 flag "maxInsertRequestSize" = "33554432"
2019-11-08T14:01:56.279+0000 info lib/logger/flag.go:20 flag "maxLabelsPerTimeseries" = "30"
2019-11-08T14:01:56.279+0000 info lib/logger/flag.go:20 flag "memory.allowedPercent" = "60"
2019-11-08T14:01:56.279+0000 info lib/logger/flag.go:20 flag "opentsdbHTTPListenAddr" = ""
2019-11-08T14:01:56.279+0000 info lib/logger/flag.go:20 flag "opentsdbListenAddr" = ""
2019-11-08T14:01:56.279+0000 info lib/logger/flag.go:20 flag "rpc.disableCompression" = "false"
2019-11-08T14:01:56.279+0000 info lib/logger/flag.go:20 flag "storageNode" = "victoria-metrics-vmstorage-0.victoria-metrics-vmstorage.default.svc.cluster.local:8400"
2019-11-08T14:01:56.279+0000 info lib/logger/flag.go:20 flag "version" = "false"
2019-11-08T14:01:56.279+0000 info app/vminsert/main.go:41 initializing netstorage for storageNodes [victoria-metrics-vmstorage-0.victoria-metrics-vmstorage.default.svc.cluster.local:8400]...
2019-11-08T14:01:56.280+0000 info lib/memory/memory.go:33 limiting caches to 4797053337 bytes, leaving 3198035559 bytes to the OS according to -memory.allowedPercent=60
2019-11-08T14:01:56.280+0000 info app/vminsert/main.go:47 successfully initialized netstorage in 557.877碌s
2019-11-08T14:01:56.281+0000 info lib/httpserver/httpserver.go:48 starting http server at http://:8480/
2019-11-08T14:01:56.281+0000 info lib/httpserver/httpserver.go:49 pprof handlers are exposed at http://:8480/debug/pprof/
2019/11/08 14:02:32 ERROR: cannot open /proc/self/stat: open /proc/self/stat: too many open files
2019/11/08 14:03:02 ERROR: cannot open /proc/self/stat: open /proc/self/stat: too many open files
2019-11-08T14:03:03.960+0000 error lib/httpserver/httpserver.go:359 error in "/insert/1/prometheus": cannot read prompb.WriteRequest: cannot read compressed request: unexpected EOF
2019-11-08T14:03:06.960+0000 error lib/httpserver/httpserver.go:359 error in "/insert/1/prometheus": cannot read prompb.WriteRequest: cannot read compressed request: unexpected EOF
2019-11-08T14:03:08.466+0000 error lib/httpserver/httpserver.go:359 error in "/insert/1/prometheus": cannot read prompb.WriteRequest: cannot read compressed request: unexpected EOF
2019-11-08T14:03:13.465+0000 error lib/httpserver/httpserver.go:359 error in "/insert/1/prometheus": cannot read prompb.WriteRequest: cannot read compressed request: unexpected EOF
2019-11-08T14:03:20.746+0000 info app/vminsert/main.go:67 service received signal terminated
2019-11-08T14:03:20.746+0000 info app/vminsert/main.go:69 gracefully shutting down the service at ":8480"
Any ideas how to easily bump open file limit on helm deployment of VM?
@tenmozes , could you help with this?
Probably this would help increasing ulimit -n on K8S.
@zytek hi, thank for an issue, this is docker setting you need to increase limits for docker daemon on worker machines.
try to set ulimit in docker config
also you may find useful information here: https://github.com/kubernetes/kubernetes/issues/3595
As this happened to me even though we run VM on a quite small scale I would propose an ops friendly solution that does not require to tweak underlying host (changing system default limits, or default dockerd limits applied to ALL containers running on the node)
We could add support for increasing limits straight from our helm chart:
ulimit -n before starting vminsert from shell wrapper script or straight from binary (https://stackoverflow.com/questions/17817204/how-to-set-ulimit-n-from-a-golang-program)That would be the most ops friendly option. Cassandra went this way: https://github.com/instaclustr/cassandra-operator/pull/269/files
This sounds reasonable!
@tenmozes , could you look into this?
I can work on implementing chart/docker changes but I can't find original Dockerfile in any of the repos, can you point me to one ? _edit: found them, plus the ones on "cluster" branch_
Also, after thinking about this for a while I have some other improvement ideas:
@zytek, thank you
the cluster version is in separate branch https://github.com/VictoriaMetrics/VictoriaMetrics/tree/cluster/
deployment folder, you find Dockerfile in it@valyala what do you think about improvements
Delay starting http / tsdb listeners to after vminsert is fully initialised (debug http server started, metrics exposed, etc), so that open file limit would manifest itself with new connection throttling rather than failure to start the process.
This is hard to achieve, since debug http server expects all the VictoriaMetrics components are already initialized when it starts - it should return valid response with all the metrics from all the components when /metrics page is requested.
Wrap http listeners with netutil.LimitListener so we can always limit max number of client connections. We could either expose max client connections as a flag, or calculate it by reading process limits (syscall.Getrlimit() )
This sounds reasonable. Let's clarify the purpose of this change - it will limit the number of incoming TCP connections to vminsert / vmselect, so spare connections can be always established to debug http server for reading /metrics or /deubg/pprof/* pages. It this correct?
Wrap http listeners with netutil.LimitListener so we can always limit max number of client connections. We could either expose max client connections as a flag, or calculate it by reading process limits (syscall.Getrlimit() )
This sounds reasonable. Let's clarify the purpose of this change - it will limit the number of incoming TCP connections to
vminsert/vmselect, so spare connections can be always established to debug http server for reading/metricsor/deubg/pprof/*pages. It this correct?
After careful inspection of attached logs I believe that the problem cannot be fixed by limiting number of accepted connections by http listener. None of the errors in the logs seem to trigger os.Exit / log.Fatal etc. The app receives termination signal after almost 2 minutes, so it seems that it is due to livenessProbe failing - for the same reasons as open() call failed earlier - no free file handles, so vminsert did not responds to ping from kubernetes and was terminated.
Adding support for increasing nofile limits on container startup should be sufficient here.
If you use OS with systemd, then try increase parameters LimitNOFILE & LimitNPROC.
ex:
cat /etc/systemd/system/victoriametrics.service
[Unit]
Description=VictoriaMetrics
After=network.target
[Service]
Type=simple
Environment="GOMAXPROCS=24"
User=victoriametrics
LimitNOFILE=500000 #!
LimitNPROC=500000 #!
StartLimitBurst=5
StartLimitInterval=0
Restart=on-failure
RestartSec=1
PIDFile=/run/victoriametrics/victoriametrics.pid
ExecStart=/usr/local/bin/victoria-metrics-prod -storageDataPath /vm/dt -maxConcurrentInserts 1024 -memory.allowedPercent 60 -retentionPeriod 24
ExecStop=/bin/kill -s SIGTERM $MAINPID
[Install]
WantedBy=multi-user.target
closed due to staleness, feel free to reopen if relevant
Most helpful comment
As this happened to me even though we run VM on a quite small scale I would propose an ops friendly solution that does not require to tweak underlying host (changing system default limits, or default dockerd limits applied to ALL containers running on the node)
We could add support for increasing limits straight from our helm chart:
ulimit -nbefore starting vminsert from shell wrapper script or straight from binary (https://stackoverflow.com/questions/17817204/how-to-set-ulimit-n-from-a-golang-program)That would be the most ops friendly option. Cassandra went this way: https://github.com/instaclustr/cassandra-operator/pull/269/files