Along with the destination IP address and port, the destination service should indicate whether the service supports TLS using the conduit CA. I propose that this indicator be an enumeration something like this:
enum TLSSupport {
// Assume it doesn't support TLS.
Unknown = 0;
// Assume it supports TLS using the default mesh settings.
MeshDefault = 1;
}
Initially the Destination service should use TLSSupport::MeshDefault for any endpoint that has the conduit proxy injected into it (as determined from the EndPoints API), and TLSSupport::Unknown for destinations that aren't discovered by the EndPoints service.
This would make the Destinations service will be part of the TCB for the mesh. Later when routing is added we will revisit this, hopefully by fully moving the TLS decision making to routing.
Initially the Destination service should use TLSSupport::MeshDefault for any endpoint that has the conduit proxy injected into it (as determined from the EndPoints API)
Note: this should be _any endpoint that has the conduit proxy injected into AND is controlled by the same controller.
Based on discussion yesterday the current plan is to use TLS when the destination's controller namespace is equal to the current pod's controller namespace.
In the immediate term, the plan described yesterday is fine. We should encapsulate this logic in a single place that turns a destination endpoint pair (SocketAddr, labels) into a (SocketAddr, bool) where that boolean flag indicates whether the labels indicate that TLS will work. We should write some tests that take responses from the Destination service and verify that we're setting this flag correctly.
Later we'll need to support pods using different controllers communicating with each other, e.g. to handle the case where the user is trying to upgrade Conduit by creating a new namespace for the new version of the controller and then migrating pods incrementally to the new controller. In that scenerio, we don't want to lose TLS protections when such a migration happens. We will need to change how we handle the distribution of the the Conduit CA bundle and how we determine whether or not to use TLS (and what CA bundle to use, when we do) to account for this. However, we can do that separately, and it should be straightforward to do so if we encapsulate and isolate this logic as described in the preceding paragraph.
in a single place
Note that the flag needs to be available, ultimately, as an argument to Connection::connect().
@briansmith
Note that the flag needs to be available, ultimately, as an argument to Connection::connect().
Thanks, I was about to ask where you'd want this to be wired through to.
Another question:
We should encapsulate this logic in a single place that turns a destination endpoint pair
(SocketAddr, labels)into a(SocketAddr, bool)where that boolean flag indicates whether the labels indicate that TLS will work.
In the issue description, you describe an enum with variants Unknown and MeshDefault, this implies that we might eventually want to add more variants describing additional TLS configurations. Is it sufficient for the TLS flag to be a bool, or should we use an enum now so it will be easier to add more variants in the future?
It's fine for it to be a boolean for now. Definitely we're going to change it later to address the longer-term issues, which will require more than a single enum can express anyway. The important thing for now is to to abstract away the labels and to test that our short-term namespace matching works as we expect.
Based on our conversation today, here is my proposal for the changes we intend to make to how we determine whether to use TLS for a given connection:
tls_verification_strategy (name not final) to the Destination service's WeightedAddrSet protobuf message:message WeightedAddr {
common.TcpAddress addr = 1;
uint32 weight = 3;
map<string, string> metric_labels = 4;
TlsVerificationStrategy tls_verification_strategy = 5;
}
message TlsVerificationStrategy {
KubernetesPod kubernetes_pod = 1;
message KubernetesPod {
string controller_namespace = 1;
string pod_name = 2;
}
}
tls_verification_strategy field. Otherwise, the field will not be set (nil/None).Connect function with a tls_enabled boolean, it will be provided with an Option<TlsVerificatonStrategy>.tls_verification_strategy is Some(TlsVerificationStrategy::KubernetesPod), and the endpoint's controller namespace is equal to the CONDUIT_PROXY_CONTROLLER_NAMESPACE env var, thentls="true" labeltls="false" label.tls label, rather than the meshed label, to determine if a connection was secured.Does this sound correct?
/cc @olix0r, @briansmith, @klingerf
Edit: changed to reflect @olix0r's suggestions
if the endpoint's tls_verification_strategy is Some(TlsVerificationStrategy::NameAndNamespace), then the proxy will set the meshed="true" label, ...
else, the proxy will set the meshed="false" and tls="false" labels
This is surprising. What does the meshed flag have to do with TlsVerificationStrategy? I would expect this decision to be entirely unrelated to this field; and only the tls field should be impacted.
Also, let me suggest some minor changes to your proposed protobuf:
message WeightedAddr {
common.TcpAddress addr = 1;
uint32 weight = 3;
map<string, string> metric_labels = 4;
TlsVerificationStrategy tls_verification_strategy = 5;
}
message TlsVerificationStrategy {
NameAndNamespace name_and_namespace = 1;
message NameAndNamespace {
string controller_namespace = 1;
string pod_name = 2;
}
}
The main benefit of this is that TlsVerificationStrategy messages need not be scoped under WeightedAddr -- there's nothing about the message type that needs to be scoped onto an individual address. Furthermore, oneofs need not be introduced early -- they can be introduced later once there are more variants.
Also, what does pod_name mean in this context? Is it specific to a kubernetes resource? If so, we should change the name to be KubernestesPodValidation or something. If it's not kubernetes-specific, we should avoid using pod_ prefixes
@olix0r
if the endpoint's tls_verification_strategy is Some(TlsVerificationStrategy::NameAndNamespace), then the proxy will set the meshed="true" label, ...
else, the proxy will set the meshed="false" and tls="false" labelsThis is surprising. What does the
meshedflag have to do with TlsVerificationStrategy? I would expect this decision to be entirely unrelated to this field; and only thetlsfield should be impacted.
This is a good point. That field should still be set based on the presence of a controller namespace label (if we still care about _having_ a meshed label in addition to the tls label?).
Also, what does
pod_namemean in this context? Is it specific to a kubernetes resource? If so, we should change the name to be KubernestesPodValidation or something. If it's not kubernetes-specific, we should avoid usingpod_prefixes
Also a good point. This strategy would be kubernetes-specific, as it uses the pod namespace. Will change the name to make that clear. 馃憤
Oh, also @olix0r:
Furthermore,
oneofsneed not be introduced early -- they can be introduced later once there are more variants.
I thought it was worth making this a oneof to begin with so that prost will generate a Rust enum with one variant, in the hopes of minimizing the necessary proxy changes if we want to add more variants later. WDYT?
@hawkw i think it's fine to make it a oneof to start, just noting that protobuf allows you to make a single field a oneof later without breaking wire compat
@briansmith, if possible, I'd like to get your approval on this proposal before going too much farther on this.
@hawkw Thanks for writing this up. I've added issues for the controller work in #1039 and #1040. I also just wanted to verify my understanding of how we're handling metric labels. Specifically:
meshed labeltls label if the connection uses TLS; the value of the label will be tls="true"tls label if the connection does not use TLSIs that consistent with your understanding?
It's my expectation that the _meshed_ label is still useful (and distinct from the _tls_ label), but that we may not have a use case for consuming this label yet. Unless it's costly to maintain it, I have a slight preference for leaving the _meshed_ label unchanged and only introducing a new label.
@olix0r
Unless it's costly to maintain it, I have a slight preference for leaving the _meshed _label unchanged and only introducing a new label.
Currently, there is no code for setting a meshed label in the proxy. By "leaving it unchanged", do you mean that we should not add this label, or that we _should_ add code to add that label?
@klingerf
- The proxy will not set a
meshedlabel- The proxy will only set a
tlslabel if the connection uses TLS; the value of the label will betls="true"- The proxy will not set a
tlslabel if the connection does not use TLS
Modulo any decision regarding whether or not we set a meshed label, this is correct. Regardless of whether or not we eventually add such a label, it should _not_ indicate that a connection used TLS.
By "leaving it unchanged", do you mean that we should not add this label, or that we should add code to add that label?
My mistake. It's fine to omit for now and we can introduce this label if we need it later.
@olix0r Okay, thanks for clearing that up! 馃憤
I've renamed this issue, as the actual changes to the Destination service should now be tracked by #1040.
I believe this was completed a fairly long time ago. :)