Packer: Unset VAULT_CACERT or VAULT_CAPATH freezes packer

Created on 4 Jul 2019  ·  10 Comments  ·  Source: hashicorp/packer

log: https://gist.github.com/andoriyu/8f9b6ca994fe6b3d73862a8f64966380#file-unset-vault_cacert
template: https://gist.github.com/andoriyu/8f9b6ca994fe6b3d73862a8f64966380#file-template-json

I forgot to export the VAULT_CACERT environment variable and it broke the whole thing without explanation.

Looks like packer doesn't check whether any of the variables required for vault are set before trying to connect.

bug integratiovault

All 10 comments

Oof, sorry about that.

This one isn't as easy to solve as the other issue you opened because there may be a valid situation where there is no VAULT_CACERT set. Does this still hang infinitely for you if you set a VAULT_CLIENT_TIMEOUT? I'm wondering if we can always set that from inside packer to make this more likely to at least fail in a reasonable period of time.

@SwampDragons I haven't tried with VAULT_CLIENT_TIMEOUT. What's interesting is that after ~30 minutes it actually does reach out to vault.

Shouldn't this be a quick error path, since packer can't establish a secure connection with vault?

I assumed so too, but on a quick read of the code it looks like the golang vault api bindings are using a pretty robust request retry wrapper, which I bet is the source of this hang. That retry wrapper looks like it'll keep retrying for a long time if the server response code is within the 500 range, unless the timeout has been set.
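To make the suspected behavior concrete, here is a minimal sketch of a retry loop that keeps retrying on a server-side error and only stops when a timeout expires. It is not the Vault client's actual code; `errServer`, `retryUntil`, and `demoTimeout` are hypothetical names used only for illustration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// errServer is a hypothetical stand-in for a 5xx response from the server.
var errServer = errors.New("server error (5xx)")

// retryUntil calls op again and again while it returns errServer.
// It stops when op succeeds, when op returns a different error,
// or when ctx expires. It returns the number of attempts made.
func retryUntil(ctx context.Context, wait time.Duration, op func() error) (int, error) {
	attempts := 0
	for {
		attempts++
		err := op()
		if !errors.Is(err, errServer) {
			return attempts, err
		}
		select {
		case <-ctx.Done():
			return attempts, fmt.Errorf("gave up after %d attempts: %w", attempts, ctx.Err())
		case <-time.After(wait):
			// Back off briefly, then try again.
		}
	}
}

// demoTimeout runs an operation that always fails with errServer,
// bounded by a timeout given in milliseconds.
func demoTimeout(ms int) (int, error) {
	ctx, cancel := context.WithTimeout(context.Background(), time.Duration(ms)*time.Millisecond)
	defer cancel()
	return retryUntil(ctx, 10*time.Millisecond, func() error { return errServer })
}

func main() {
	attempts, err := demoTimeout(50)
	fmt.Println(attempts, err)
}
```

Without the deadline on `ctx`, a loop like this would never return for an operation that always fails, which is the kind of hang a client-side timeout is meant to bound.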

It's a little surprising to me that failing to set the CACERT would be returning a 5xx error.

I'll need to figure out how to set up an environment to directly reproduce this so I can verify that's what is going on here.

Well, it won't be a 500. Won't this be the client failing to validate the server certificate because it doesn't have the private CA certificate?

Basically, my vault uses my own CA for its certificates, and clients are required to have the CA certificate to validate the server's identity, because the server certificate is signed by that CA.

Here is how curl fails in this scenario:

* About to connect() to blahblah port 443 (#0)
*   Trying x.x.x.x...
* Connected to blahblah (x.x.x.x) port 443 (#0)
* Initializing NSS with certpath: sql:/etc/pki/nssdb
*   CAfile: /etc/pki/tls/certs/ca-bundle.crt
  CApath: none
* Server certificate:
*   subject: OU=Vault,O=blahblah,L=Los Angeles,ST=CA,C=US
*   start date: Dec 07 22:01:00 2018 GMT
*   expire date: Dec 04 22:01:00 2028 GMT
*   common name: (nil)
*   issuer: CN=blahblah,L=Los Angeles,ST=CA,C=US
* NSS error -8179 (SEC_ERROR_UNKNOWN_ISSUER)
* Peer's Certificate issuer is not recognized.
* Closing connection 0
curl: (60) Peer's Certificate issuer is not recognized.
More details here: http://curl.haxx.se/docs/sslcerts.html

curl performs SSL certificate verification by default, using a "bundle"
 of Certificate Authority (CA) public keys (CA certs). If the default
 bundle file isn't adequate, you can specify an alternate file
 using the --cacert option.
If this HTTPS server uses a certificate signed by a CA represented in
 the bundle, the certificate verification probably failed due to a
 problem with the certificate (it might be expired, or the name might
 not match the domain name in the URL).
If you'd like to turn off curl's verification of the certificate, use
 the -k (or --insecure) option.

However, if I pass --cacert, it works without issues.
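The same effect can be reproduced in Go with only the standard library. This is a sketch, not Packer or Vault client code: it starts a local HTTPS test server whose certificate is unknown to the system trust store, then calls it once without and once with that certificate in the trusted pool. `fetch` and `demo` are hypothetical helper names.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// fetch performs a GET against url, trusting only the CAs in pool.
// A nil pool means "use the system trust store", which is what a
// client falls back to when no custom CA has been configured.
func fetch(url string, pool *x509.CertPool) error {
	client := &http.Client{
		Transport: &http.Transport{
			TLSClientConfig: &tls.Config{RootCAs: pool},
		},
	}
	defer client.CloseIdleConnections()

	resp, err := client.Get(url)
	if err != nil {
		return err
	}
	return resp.Body.Close()
}

// demo returns the error from a request made without the server's
// certificate trusted, and the error from a request made with it.
func demo() (withoutCA, withCA error) {
	srv := httptest.NewTLSServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	}))
	defer srv.Close()

	// Trust the test server's own certificate, playing the role of a private CA.
	pool := x509.NewCertPool()
	pool.AddCert(srv.Certificate())

	return fetch(srv.URL, nil), fetch(srv.URL, pool)
}

func main() {
	withoutCA, withCA := demo()
	fmt.Println("without CA:", withoutCA) // a certificate verification error
	fmt.Println("with CA:   ", withCA)    // <nil>
}
```

The first call fails immediately during the TLS handshake, so a client in this situation has everything it needs to report the error right away rather than hang.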

Earlier I said that it works after 30 minutes, but I think I was wrong, since the error could have been swallowed by the issue you fixed earlier.

I was finally able to scrabble together an appropriate dev environment to reproduce this, and I can happily confirm that with a build of master after having merged the fix for the other issue you opened, this one is resolved too. Instead of hanging for half an hour and then silently failing, Packer correctly errors with Error initializing core: template: root:1:3: executing "root" at <vault/secret/hellofoo>: error calling vault: Error reading vault secret: Get https://localhost:8200/v1/secret/hello: x509: certificate signed by unknown authority

The mechanism wasn't the one I initially thought it was. It was a retry built into Packer's template interpolation code, which we weren't aborting properly when we got a legit error from a template func.
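As an illustration of that kind of fix (not Packer's actual interpolation code), here is a minimal sketch of a retry loop that only retries on a specific "try again" condition and aborts immediately on any other error. All names here (`errNotReady`, `render`, `failingLookup`, `flakyLookup`) are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotReady is a hypothetical sentinel meaning "try again later".
var errNotReady = errors.New("value not available yet")

// render retries fn while it reports errNotReady, up to maxTries times.
// Any other error is treated as a real failure and returned at once.
// It returns the output, the number of attempts made, and the error.
func render(maxTries int, fn func() (string, error)) (string, int, error) {
	var lastErr error
	for try := 1; try <= maxTries; try++ {
		out, err := fn()
		if err == nil {
			return out, try, nil
		}
		if !errors.Is(err, errNotReady) {
			return "", try, fmt.Errorf("aborting after attempt %d: %w", try, err)
		}
		lastErr = err
	}
	return "", maxTries, lastErr
}

// failingLookup simulates a template func that hits a genuine error.
func failingLookup() (string, error) {
	return "", errors.New("x509: certificate signed by unknown authority")
}

// flakyLookup returns a func that reports errNotReady the given number
// of times and then succeeds.
func flakyLookup(failures int) func() (string, error) {
	return func() (string, error) {
		if failures > 0 {
			failures--
			return "", errNotReady
		}
		return "secret", nil
	}
}

func main() {
	_, tries, err := render(1000, failingLookup)
	fmt.Println(tries, err)

	out, tries, err := render(5, flakyLookup(2))
	fmt.Println(out, tries, err)
}
```

The point of the design is that a genuine error surfaces on the first attempt, while retries are reserved for the one condition that can actually resolve itself.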

I'm going to close since the fix is merged. Thanks for reporting this and bearing with me as I worked through it.

@SwampDragons thank you for fixing this so quickly!

I'm going to lock this issue because it has been closed for _30 days_ ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.
