Cockroach: cli: untangle "client" and "server" cert handling

Created on 8 Apr 2020 · 4 Comments · Source: cockroachdb/cockroach

The cockroach cert create-client command (and perhaps the other cert creation routines) complains if an unrelated invalid cert exists in the --certs-dir. This is a UX surprise. My expectation as a user is that cert create-client would not validate any other cert in the --certs-dir. Reproduction steps:

~ ./cockroach cert create-ca --certs-dir=certs --ca-key=certs/ca.key
~ COCKROACH_CERT_NODE_USER=node.crdb.io ./cockroach cert create-node localhost --certs-dir=certs --ca-key=certs/ca.key localhost "*.local" 127.0.0.1
~ ./cockroach cert create-client root --certs-dir=certs --ca-key=certs/ca.key
ERROR: failed to generate client certificate and key: client/server node certificate has principals ["node.crdb.io" "localhost" "localhost" "*.local"], expected "node"
Failed running "cert create-client"

The workaround is to move the problematic cert (node.crt) out of the certs directory temporarily while the client cert is being created.

Cc @dbist

C-bug S-3-ux-surprise

All 4 comments

I'd like to fix this, but before I do I want to get agreement on the desired semantics. My proposal is that the various non-server cockroach commands (e.g. cert, node status, sql, etc.) only check the validity of the cert they are using, not the validity of all certs in the --certs-dir directory.

@aaron-crl, @knz, @bdarnell Comments? Feel free to give a thumbs-up response if this sounds good.

Yes, all commands (including starting the server) should ignore certs that they aren't going to use. This involves plumbing some sort of mode/purpose flag and a username through CertificateManager.LoadCertificates.

The reason this has become an issue recently is the introduction of the cert principals map, which is configured on the command line instead of in the certs directory itself, so it's no longer true that validating all the certs is zero-configuration. It might have been better (or at least more in keeping with the existing design) if the cert principals map were a file written to the certs directory instead of given to each process separately on the command line.

But instead of changing that, I think it's better to just change things so we don't examine every cert at once. (We already have to swallow permission errors because some certs may exist that are not accessible to our Unix user. We'd rather not swallow those errors for the cert we are actually going to use.)

This issue is now the placeholder for untangling the client cert handling from server cert handling. In particular, we should either deprecate or discourage the use of --certs-dir for client commands such as sql and node status, and we should ensure that --url can be used for these commands without flowing through CertificateManager.
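
As a rough illustration of what "--url without CertificateManager" could look like, here is a sketch that pulls the client cert paths straight out of a connection URL. It assumes the URL carries the standard libpq-style sslrootcert, sslcert, and sslkey parameters; the ClientTLSPaths type and tlsPathsFromURL function are hypothetical names for this example and are not taken from the cockroach code base.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// ClientTLSPaths holds the cert file paths named in a connection URL.
// Hypothetical type, for illustration only.
type ClientTLSPaths struct {
	User   string // SQL user from the URL's userinfo
	RootCA string // value of sslrootcert
	Cert   string // value of sslcert
	Key    string // value of sslkey
}

// tlsPathsFromURL extracts the cert paths from the URL's query string.
// It never scans a certs directory, so unrelated certs cannot affect it.
func tlsPathsFromURL(raw string) (ClientTLSPaths, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return ClientTLSPaths{}, fmt.Errorf("parsing connection URL: %w", err)
	}
	q := u.Query()
	p := ClientTLSPaths{
		User:   u.User.Username(), // safe even when the URL has no userinfo
		RootCA: q.Get("sslrootcert"),
		Cert:   q.Get("sslcert"),
		Key:    q.Get("sslkey"),
	}
	if p.Cert == "" || p.Key == "" {
		return ClientTLSPaths{}, errors.New("URL must set both sslcert and sslkey")
	}
	return p, nil
}

func main() {
	raw := "postgresql://root@localhost:26257/defaultdb" +
		"?sslmode=verify-full&sslrootcert=certs/ca.crt" +
		"&sslcert=certs/client.root.crt&sslkey=certs/client.root.key"
	p, err := tlsPathsFromURL(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", p)
}
```

The paths returned here could then be handed to the TLS configuration for that one connection, leaving the rest of the certs directory untouched.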
