Describe the bug
The Vault process leaves a large number of sockets in the CLOSE_WAIT state. A socket in CLOSE_WAIT is supposed to be released by the owning process via a close() system call. For some reason, Vault does not release these sockets and eventually reaches the max open file limit, leading to a service failure.
To Reproduce
Steps to reproduce the behavior:
systemctl start vault
Expected behavior
The Vault process should release any socket left in the CLOSE_WAIT state.
Environment:
Vault Server Version (retrieve with vault status): 0.11.3
Vault CLI Version (retrieve with vault version):
Vault server configuration file(s):
cluster_name = "dc1"
max_lease_ttl = "768h"
default_lease_ttl = "768h"
listener "tcp" {
address = "foo.bar.foobar.io:8200"
tls_cert_file = "/etc/vault.d/tls/vault.crt"
tls_key_file = "/etc/vault.d/tls/vault.key"
tls_min_version = "tls12"
tls_cipher_suites = "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA,TLS_RSA_WITH_AES_128_GCM_SHA256,TLS_RSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA"
tls_prefer_server_cipher_suites = "false"
tls_disable = "false"
}
api_addr = "http://foo.bar.foobar.io:8200"
storage "dynamodb" {
ha_enabled = "true"
region = "eu-west-1"
endpoint = ""
table = "vault-data"
session_token = "Hashicorp Vault"
max_parallel = "128"
read_capacity = "5"
write_capacity = "5"
}
Additional context
[root@foo ~]# systemctl status vault -l
● vault.service - Vault secret management tool
Loaded: loaded (/usr/lib/systemd/system/vault.service; enabled; vendor preset: disabled)
Active: active (running) since Tue 2018-11-20 15:37:34 UTC; 1 weeks 1 days ago
Main PID: 25708 (vault)
CGroup: /system.slice/vault.service
└─25708 /usr/local/bin/vault server -config=/etc/vault.d/vault_main.hcl -log-level=info
Nov 29 15:15:26 foo.bar.foobar.io vault[25708]: 2018-11-29T15:15:26.175Z [INFO ] http: Accept error: accept tcp 172.18.32.152:8200: accept4: too many open files; retrying in 1s
Nov 29 15:15:27 foo.bar.foobar.io vault[25708]: 2018-11-29T15:15:27.175Z [INFO ] http: Accept error: accept tcp 172.18.32.152:8200: accept4: too many open files; retrying in 1s
Nov 29 15:15:28 foo.bar.foobar.io vault[25708]: 2018-11-29T15:15:28.175Z [INFO ] http: Accept error: accept tcp 172.18.32.152:8200: accept4: too many open files; retrying in 1s
Nov 29 15:15:29 foo.bar.foobar.io vault[25708]: 2018-11-29T15:15:29.176Z [INFO ] http: Accept error: accept tcp 172.18.32.152:8200: accept4: too many open files; retrying in 1s
Nov 29 15:15:30 foo.bar.foobar.io vault[25708]: 2018-11-29T15:15:30.176Z [INFO ] http: Accept error: accept tcp 172.18.32.152:8200: accept4: too many open files; retrying in 1s
Nov 29 15:15:31 foo.bar.foobar.io vault[25708]: 2018-11-29T15:15:31.176Z [INFO ] http: Accept error: accept tcp 172.18.32.152:8200: accept4: too many open files; retrying in 1s
Nov 29 15:15:32 foo.bar.foobar.io vault[25708]: 2018-11-29T15:15:32.177Z [INFO ] http: Accept error: accept tcp 172.18.32.152:8200: accept4: too many open files; retrying in 1s
Nov 29 15:15:33 foo.bar.foobar.io vault[25708]: 2018-11-29T15:15:33.177Z [INFO ] http: Accept error: accept tcp 172.18.32.152:8200: accept4: too many open files; retrying in 1s
Nov 29 15:15:34 foo.bar.foobar.io vault[25708]: 2018-11-29T15:15:34.177Z [INFO ] http: Accept error: accept tcp 172.18.32.152:8200: accept4: too many open files; retrying in 1s
Nov 29 15:15:35 foo.bar.foobar.io vault[25708]: 2018-11-29T15:15:35.177Z [INFO ] http: Accept error: accept tcp 172.18.32.152:8200: accept4: too many open files; retrying in 1s
[root@foo ~]# netstat -antplu | grep 8200
tcp 129 0 172.18.32.152:8200 0.0.0.0:* LISTEN 25708/vault
tcp 0 0 172.18.32.152:8200 172.18.32.161:58322 SYN_RECV -
tcp 0 0 172.18.32.152:8200 172.18.32.161:58334 SYN_RECV -
tcp 0 0 172.18.32.152:8200 172.18.32.89:14386 SYN_RECV -
tcp 0 0 172.18.32.152:8200 172.18.32.161:58344 SYN_RECV -
tcp 0 0 172.18.32.152:8200 172.18.32.152:52634 SYN_RECV -
tcp 0 0 172.18.32.152:8200 172.18.33.40:38230 SYN_RECV -
tcp 0 0 172.18.32.152:8200 172.18.33.40:38218 SYN_RECV -
tcp 0 0 172.18.32.152:8200 172.18.32.161:58324 SYN_RECV -
tcp 0 0 172.18.32.152:8200 172.18.32.89:14366 SYN_RECV -
tcp 0 0 172.18.32.152:8200 172.18.32.89:14384 SYN_RECV -
tcp 0 0 172.18.32.152:8200 172.18.32.152:52638 SYN_RECV -
tcp 0 0 172.18.32.152:8200 172.18.33.40:38238 SYN_RECV -
tcp 0 0 172.18.32.152:8200 172.18.32.161:58356 SYN_RECV -
tcp 0 0 172.18.32.152:8200 172.18.33.40:38208 SYN_RECV -
tcp 0 0 172.18.32.152:8200 172.18.32.89:14398 SYN_RECV -
tcp 0 0 172.18.32.152:8200 172.18.33.40:38254 SYN_RECV -
tcp 0 0 172.18.32.152:8200 172.18.32.89:14408 SYN_RECV -
tcp 0 0 172.18.32.152:8200 172.18.32.161:58372 SYN_RECV -
tcp 0 0 172.18.32.152:8200 172.18.33.40:38244 SYN_RECV -
tcp 0 0 172.18.32.152:8200 172.18.32.89:14368 SYN_RECV -
tcp 0 0 172.18.32.152:8200 172.18.32.161:58352 SYN_RECV -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37580 CLOSE_WAIT -
tcp 0 246 172.18.32.152:52638 172.18.32.152:8200 FIN_WAIT1 -
tcp 149 0 172.18.32.152:8200 172.18.32.161:57878 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37416 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.32.89:13932 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.32.161:57708 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37534 CLOSE_WAIT -
tcp 148 0 172.18.32.152:8200 172.18.32.161:15181 ESTABLISHED -
tcp 149 0 172.18.32.152:8200 172.18.32.89:13994 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37512 CLOSE_WAIT -
tcp 148 0 172.18.32.152:8200 172.18.33.40:29859 ESTABLISHED -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37440 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37336 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.32.89:13838 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.32.89:13984 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37542 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37346 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37392 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37504 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37348 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.32.89:14004 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.32.161:57844 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37434 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37386 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37530 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.32.161:57750 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37540 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37478 CLOSE_WAIT -
tcp 301 0 172.18.32.152:8200 172.18.33.40:37428 CLOSE_WAIT -
tcp 148 0 172.18.32.152:8200 172.18.32.89:39293 ESTABLISHED -
tcp 149 0 172.18.32.152:8200 172.18.32.161:57934 CLOSE_WAIT -
tcp 246 0 172.18.32.152:8200 172.18.32.152:44684 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37522 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.32.89:13964 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.32.89:13850 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37394 CLOSE_WAIT -
tcp 0 246 172.18.32.152:52634 172.18.32.152:8200 FIN_WAIT1 -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37458 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37374 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37474 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.32.161:57716 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37588 CLOSE_WAIT -
tcp 246 0 172.18.32.152:8200 172.18.32.152:44688 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37486 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.32.161:57868 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37418 CLOSE_WAIT -
tcp 246 0 172.18.32.152:8200 172.18.32.152:44690 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37480 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37578 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.32.89:13802 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.32.89:14022 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.32.89:13960 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.32.89:13818 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37372 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37500 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.32.161:57906 CLOSE_WAIT -
tcp 246 0 172.18.32.152:8200 172.18.32.152:44668 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37490 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37568 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37370 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37576 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37412 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.32.89:13922 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37524 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.32.161:57886 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37538 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37586 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.32.89:13828 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37574 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37596 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.32.161:57728 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37496 CLOSE_WAIT -
tcp 246 0 172.18.32.152:8200 172.18.32.152:44694 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37464 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37264 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37368 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37508 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37278 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37572 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37528 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.32.161:57860 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37430 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37454 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37532 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37514 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37248 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.32.161:57922 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37268 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37450 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37436 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37390 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37362 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37502 CLOSE_WAIT -
tcp 0 246 172.18.32.152:52632 172.18.32.152:8200 FIN_WAIT1 -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37582 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37598 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37484 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.32.89:13902 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37452 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.32.161:57830 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37462 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.32.161:57810 CLOSE_WAIT -
tcp 149 0 172.18.32.152:8200 172.18.33.40:37408 CLOSE_WAIT -
Regarding setting the Vault address parameter in the listener "tcp" stanza to an FQDN instead of an IP address: does this make a difference?
Also, can you comment on the IP addresses on the other end? Are they random clients, do they correspond to other Vault servers, etc.?
Can you take a look at Vault server logs/client logs and see if you're seeing errors? e.g. are you seeing any panics?
Sorry, I misled you, my mistake. The problem is not related to a Vault parameter in the configuration file; I just described what I did when I hit the error.
I dug deeper, and the problem looks to be related to the DynamoDB storage backend.
As you can see in the Vault configuration, we are using the DynamoDB storage backend with the following parameters in the DynamoDB stanza:
read_capacity = "5"
write_capacity = "5"
On the AWS side, the DynamoDB table's read/write capacity was configured as follows:
Provisioned read capacity units: 5 (Auto Scaling Disabled)
Provisioned write capacity units: 5 (Auto Scaling Disabled)
Basically, each Vault node's read/write capacity was set equal to the DynamoDB provisioned read/write capacity. But since we have 3 nodes, each configured with a read/write capacity of 5, we actually exceed the DynamoDB table's capacity.
And here is what was visible in the /var/log/messages system log file, since Vault logs there as a systemd unit:
2018-11-26T16:54:55.871050+00:00 foo.bar.foobar.io vault: 2018-11-26T16:54:55.870Z [INFO ] core: security barrier not initialized
2018-11-28T17:05:41.567852+00:00 foo.bar.foobar.io vault: 2018-11-28T17:05:41.564Z [INFO ] core: vault is unsealed
2018-11-28T17:05:41.568139+00:00 foo.bar.foobar.io vault: 2018-11-28T17:05:41.566Z [INFO ] core: entering standby mode
2018-11-28T17:25:57.139296+00:00 foo.bar.foobar.io vault: 2018-11-28T17:25:57.136Z [ERROR] core: barrier init check failed: error="failed to check for initialization: ProvisionedThroughputExceededException: The level of configured provisioned throughput for the table was exceeded. Consider increasing your provisioning level with the UpdateTable API.
2018-11-28T17:25:57.139578+00:00 foo.bar.foobar.io vault: status code: 400, request id: FO1EVFHQQ9TO2NHBVS083F8G03VV4KQNSO5AEMVJF66Q9ASUAAJG"
2018-11-28T17:25:57.139794+00:00 foo.bar.foobar.io vault: 2018-11-28T17:25:57.137Z [ERROR] core: error checking health: error="failed to check for initialization: ProvisionedThroughputExceededException: The level of configured provisioned throughput for the table was exceeded. Consider increasing your provisioning level with the UpdateTable API.
2018-11-28T17:25:57.140005+00:00 foo.bar.foobar.io vault: status code: 400, request id: FO1EVFHQQ9TO2NHBVS083F8G03VV4KQNSO5AEMVJF66Q9ASUAAJG"
2018-11-28T17:26:06.524964+00:00 foo.bar.foobar.io vault: 2018-11-28T17:26:06.524Z [ERROR] core: barrier init check failed: error="failed to check for initialization: ProvisionedThroughputExceededException: The level of configured provisioned throughput for the table was exceeded. Consider increasing your provisioning level with the UpdateTable API.
2018-11-28T17:26:06.525299+00:00 foo.bar.foobar.io vault: status code: 400, request id: TP3TO5DJA3DAUTCOGIGICP4C3RVV4KQNSO5AEMVJF66Q9ASUAAJG"
2018-11-28T17:26:06.525530+00:00 foo.bar.foobar.io vault: 2018-11-28T17:26:06.524Z [ERROR] core: error checking health: error="failed to check for initialization: ProvisionedThroughputExceededException: The level of configured provisioned throughput for the table was exceeded. Consider increasing your provisioning level with the UpdateTable API.
2018-11-28T17:26:06.525756+00:00 foo.bar.foobar.io vault: status code: 400, request id: TP3TO5DJA3DAUTCOGIGICP4C3RVV4KQNSO5AEMVJF66Q9ASUAAJG"
2018-11-28T17:26:16.526527+00:00 foo.bar.foobar.io vault: 2018-11-28T17:26:16.526Z [ERROR] core: barrier init check failed: error="failed to check for initialization: ProvisionedThroughputExceededException: The level of configured provisioned throughput for the table was exceeded. Consider increasing your provisioning level with the UpdateTable API.
2018-11-28T17:26:16.526795+00:00 foo.bar.foobar.io vault: status code: 400, request id: 9SRQ1V6P0U2MA4O796AH9Q4FOFVV4KQNSO5AEMVJF66Q9ASUAAJG"
So to solve the issue I simply tripled the DynamoDB table's read/write capacity to make it consistent with the number of Vault nodes:
Provisioned read capacity units: 15 (Auto Scaling Disabled)
Provisioned write capacity units: 15 (Auto Scaling Disabled)
Basically, the DynamoDB read/write capacity must satisfy the following formulas:
AWS DynamoDB table provisioned read capacity = (Vault read_capacity * number of Vault nodes)
AWS DynamoDB table provisioned write capacity = (Vault write_capacity * number of Vault nodes)
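Worked out with the values from this issue: 3 nodes * read_capacity 5 = 15 provisioned read capacity units, and 3 nodes * write_capacity 5 = 15 provisioned write capacity units, which matches the updated table settings above.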
This solves the issue I faced, but the underlying problem remains: the Vault DynamoDB storage backend should handle this situation differently and correctly release its resources when such an exception occurs. It looks like Vault does not release the sockets because it is not handling this throttling error correctly.
So I would propose to:
- Enhance the DynamoDB storage backend documentation so that the read/write capacity settings take the number of nodes into account
- Update the DynamoDB storage backend code to better handle capacity-limit errors, so that if a Vault node still exceeds the read/write capacity after retries, it releases its resources correctly (a rough sketch follows below)
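To illustrate the second point, here is a rough, hypothetical sketch (not Vault's actual backend code) of how a DynamoDB call could be retried with backoff when AWS reports ProvisionedThroughputExceededException, and how the error would be surfaced after the last attempt instead of leaving the request hanging. The helper name retryOnThrottle and the retry parameters are illustrative only; it assumes the aws-sdk-go v1 awserr and dynamodb packages.

package dynamodbretry

import (
	"fmt"
	"time"

	"github.com/aws/aws-sdk-go/aws/awserr"
	"github.com/aws/aws-sdk-go/service/dynamodb"
)

// retryOnThrottle retries op when DynamoDB reports that provisioned
// throughput was exceeded, doubling the wait between attempts. Any other
// error, or exhaustion of the retries, is returned to the caller so the
// request can finish and its connection can be closed.
func retryOnThrottle(op func() error, maxRetries int) error {
	backoff := 100 * time.Millisecond
	for attempt := 0; ; attempt++ {
		err := op()
		if err == nil {
			return nil
		}
		aerr, ok := err.(awserr.Error)
		throttled := ok && aerr.Code() == dynamodb.ErrCodeProvisionedThroughputExceededException
		if !throttled || attempt >= maxRetries {
			return fmt.Errorf("dynamodb operation failed after %d attempt(s): %v", attempt+1, err)
		}
		time.Sleep(backoff)
		backoff *= 2
	}
}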
The Dynamo code uses the AWS SDK, so this is _probably_ a bug at that layer, though I'm unsure.
I was not able to reproduce this, and I believe that in any event it may be resolved with OS-specific tuning (like below), if it is not addressed in the DynamoDB portion of Vault. Other suspects here that differed from my own testing (which was outside AWS) are the load balancer and the OS (I did not use RHEL).
# /etc/sysctl.conf
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_tw_reuse = 1
sysctl -p
Is this issue still applicable?