Envoy: Leak of private key inline_string when specifying multiple tls certificates

Created on 17 Oct 2018 · 32 comments · Source: envoyproxy/envoy

Issue summary

Specifying more than one auth.TlsCertificate causes Envoy to dump the proto of the affected listener to the logs; if there is an inline private key, it is not redacted prior to logging.

API config source for LDS is REST, not gRPC or file.

Error produced by Envoy

Newline-separated for readability:

[9][warning][config] bazel-out/k8-opt/bin/source/common/config/_virtual_includes/http_subscription_lib/common/config/http_subscription_impl.h:87] 

REST config update rejected: Proto constraint validation failed 
(ListenerValidationError.FilterChains[i]: ["embedded message failed validation"] | 
caused by FilterChainValidationError.TlsContext: ["embedded message failed validation"] | 
caused by DownstreamTlsContextValidationError.CommonTlsContext: ["embedded message failed validation"] | 
caused by CommonTlsContextValidationError.TlsCertificates: ["value must contain no more than " '\x01' " item(s)"]): name: "TestListener"
<insert proto dump here>

Example configuration

{
  "@type": "type.googleapis.com/envoy.api.v2.Listener",
  "name": "TestListener",
  "address": {
    "socket_address": {
      "address": "0.0.0.0",
      "port_value": 443,
      "protocol": "TCP"
    }
  },
  "filter_chains": [
    {
      "filters": [
        {
          "config": {
            "access_log": [
              {
                "config": {
                  "path": "/var/log/envoy/access.log"
                },
                "name": "envoy.file_access_log"
              }
            ],
            "codec_type": "AUTO",
            "http_filters": [
              {
                "name": "envoy.gzip"
              },
              {
                "name": "envoy.router"
              }
            ],
            "rds": {
              "config_source": {
                "api_config_source": {
                  "api_type": "REST",
                  "cluster_names": [
                    "controlplane"
                  ],
                  "refresh_delay": "5s"
                }
              },
              "route_config_name": "service_instances_rds"
            },
            "stat_prefix": "backends"
          },
          "name": "envoy.http_connection_manager"
        }
      ],
      "tls_context": {
        "common_tls_context": {
          "tls_certificates": [
            {
              "certificate_chain": {
                "inline_string": "-----BEGIN CERTIFICATE-----"
              },
              "private_key": {
                "inline_string": "-----BEGIN PRIVATE KEY-----"
              }
            },
            {
              "certificate_chain": {
                "inline_string": "-----BEGIN CERTIFICATE-----"
              }
            },
            {
              "certificate_chain": {
                "inline_string": "-----BEGIN CERTIFICATE-----"
              }
            }
          ]
        }
      }
    }
  ]
}

Labels: bug, help wanted

Most helpful comment

FYI this annotation approach will be very useful for tap scrubbing also once tap supports proto reflection. Big +1 from me.

All 32 comments

Seems like this would happen with gRPC or file as the LDS source as well. And also in CDS for tls_context data in Cluster.

Private keys (and possibly other sensitive information) can be also leaked via /config_dump (I could swear there was a bug for it, but I cannot find it now).

I imagine there should be a way to sanitize sensitive data at the protobuf level. @htuch any ideas?

@PiotrSikora, I found this acceptable only because I thought the intention was to expose this interface only to administrators, since it also has the /quitquitquit endpoint that literally kills the process... Although it would be nice to hide these things. I haven't kept up to date with SDS (Secret Discovery Service); how is it handled there? Still exposed?

@PiotrSikora I don't know if there is a standard way to do this in OSS protobuf today. It could be done as an annotation though, where we store encrypted data in memory (and in the standard DebugString() output), only making it available to C++ via additional accessor methods that a protoc plugin generates.

I just got bitten by this.

I wouldn't expect that raising the log level one step above the default would start exposing private key material. I don't see a reference to this behavior in the docs or in the help text.

   -l <string>,  --log-level <string>
     Log levels:
     [trace][debug][info][warning][error][critical][off]

     Default is [info]

I agree that we should scrub private keys from the config dump and log messages. Turning up logging, though, is still going to be a minefield for PII and other data leakage, and that should be expected; if you switch to trace level, you will start seeing data-plane traffic unencrypted. This is WAI: we want to be able to debug at trace level, and trace level shouldn't be used in production outside of very controlled debug scenarios.

@htuch I understand. I definitely expect things like this from trace. At debug I expected more logging around timings, memory pressure, and the like.

@terinjokes PRs welcome. My comment in https://github.com/envoyproxy/envoy/issues/4757#issuecomment-451942197 is aspirational, I think we can just do some manual redaction of protos in config dump and debug log statements for now.

I'd like to help out here and see if I can send up a PR. Any notes or pointers to help get started? Should I start with just the config dump, then move from there with separate PR's?

@stevesloka I would advocate for having a sanitizeProto method that takes something like a Listener config that we're about to dump and redacts the keys. So, most call sites should have a single line change to wrap the dumped proto with this.

Do you have any other tips of where to integrate @htuch? I'm new to the code base and still digging through.

@stevesloka a good start would be to look at the uses of DebugString() for proto in grpc_mux_impl.cc. The main challenge there is it's dealing with generic protobuf Message; you'll need to downcast to the relevant protos to do scrubbing.

Hey @htuch, I'm not following where DebugString is used for the specific use case that this issue is trying to solve. I can see usages of this coming from util.cc, but I'm just not clear on the higher level approach to how we might fix this.

ag DebugString source/common/config
source/common/config/delta_subscription_impl.cc
101:    ENVOY_LOG(trace, "Sending DiscoveryRequest for {}: {}", type_url_, request.DebugString());

source/common/config/grpc_mux_impl.cc
60:  ENVOY_LOG(trace, "Sending DiscoveryRequest for {}: {}", type_url, request.DebugString());
156:                                         resource.type_url(), type_url, message->DebugString()));

source/common/config/grpc_stream.h
42:    ENVOY_LOG(debug, "Establishing new gRPC bidi stream for {}", service_method_.DebugString());
44:      ENVOY_LOG(warn, "gRPC bidi stream for {} already exists!", service_method_.DebugString());

source/common/config/utility.cc
75:                    error_prefix, local_info.node().DebugString()));
98:                    api_config_source.DebugString()));
105:                                       api_config_source.DebugString()));
110:                                       api_config_source.DebugString()));
117:                      api_config_source.DebugString()));
122:          api_config_source.DebugString()));
251:                                     api_config_source.DebugString()));

source/common/config/utility.h
72:        throw EnvoyException("Unable to unpack " + resource.DebugString());

source/common/config/subscription_factory.h
62:            config.DebugString());

source/common/config/filesystem_subscription_impl.cc
52:    ENVOY_LOG(debug, "Filesystem config update accepted for {}: {}", path_, message.DebugString());

The above is a good start. Basically, anywhere we are dumping not the xDS config itself but xDS resources, there is something potentially suspect.

I suggest approaching this empirically. Create an Envoy configuration with certificates in LDS and CDS, then run with -l debug and see which messages contain them. This will capture the happy-path dumping, which is a good start. There is also the sad path, but let's do that as a next step.
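As a concrete illustration of the downcast-and-scrub idea above, here is a minimal sketch (not actual Envoy code), assuming the v2 API types shown in this issue; the function name scrubListenerForLogging and the exact include paths are hypothetical:

#include "envoy/api/v2/lds.pb.h"

#include "google/protobuf/message.h"

// Hypothetical helper: before logging a generic protobuf Message, try to downcast it to a
// Listener and blank out any inline private key material. Other resource types (e.g.
// Cluster for CDS tls_context data) would need analogous handling.
void scrubListenerForLogging(google::protobuf::Message& message) {
  auto* listener = dynamic_cast<envoy::api::v2::Listener*>(&message);
  if (listener == nullptr) {
    return; // Not a Listener; nothing to scrub here.
  }
  for (auto& filter_chain : *listener->mutable_filter_chains()) {
    auto* tls = filter_chain.mutable_tls_context()->mutable_common_tls_context();
    for (auto& cert : *tls->mutable_tls_certificates()) {
      auto* key = cert.mutable_private_key();
      // DataSource is a oneof; only inline material needs redacting, file paths are fine.
      if (key->specifier_case() == envoy::api::v2::core::DataSource::kInlineString) {
        key->set_inline_string("[redacted]");
      } else if (key->specifier_case() == envoy::api::v2::core::DataSource::kInlineBytes) {
        key->set_inline_bytes("[redacted]");
      }
    }
  }
}

A call site would apply this to a copy of the message just before DebugString(), so the redaction never touches the configuration actually applied to the listener.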

/cc @incfly

This might help https://github.com/envoyproxy/envoy/issues/7111, since we need to strip the private key from config_dump for dynamic secrets fetched via SDS.

I'm starting to think that approaching this problem structurally, by having proto annotations and a protoc plugin capable of generating scrubbing/sanitization code, is the way to go, at least for config_dump purposes.

FYI this annotation approach will be very useful for tap scrubbing also once tap supports proto reflection. Big +1 from me.

@htuch: I think this is describing the same thing we're looking for: https://github.com/protocolbuffers/protobuf/issues/1160, WDYT?

@mergeconflict yep. Although, whether it's core proto's responsibility or something well suited to a protoc plugin is probably worth discussing there.

FWIW I think this is better suited to an annotation that we can then process and utilize in different ways including logging, tap, etc.
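To make the annotation idea concrete, here is a rough sketch (not actual Envoy or PGV code) of what an annotation-driven scrubber could look like: a custom field option, called annotations::sensitive below purely for illustration, marks secret fields, and a reflection-based walk clears any field carrying it. The extension identifier would come from generated code for the annotation's .proto definition.

#include "google/protobuf/descriptor.h"
#include "google/protobuf/message.h"

// Walk a message via protobuf reflection and clear every field marked with the
// (hypothetical) annotations::sensitive field option, recursing into nested and repeated
// message fields.
void redactSensitiveFields(google::protobuf::Message& message) {
  const google::protobuf::Descriptor* descriptor = message.GetDescriptor();
  const google::protobuf::Reflection* reflection = message.GetReflection();
  for (int i = 0; i < descriptor->field_count(); ++i) {
    const google::protobuf::FieldDescriptor* field = descriptor->field(i);
    // annotations::sensitive stands in for a custom FieldOptions extension that a protoc
    // plugin (or generated options code) would provide; it is assumed here.
    if (field->options().GetExtension(annotations::sensitive)) {
      reflection->ClearField(&message, field);
      continue;
    }
    if (field->cpp_type() != google::protobuf::FieldDescriptor::CPPTYPE_MESSAGE) {
      continue;
    }
    if (field->is_repeated()) {
      const int count = reflection->FieldSize(message, field);
      for (int j = 0; j < count; ++j) {
        redactSensitiveFields(*reflection->MutableRepeatedMessage(&message, field, j));
      }
    } else if (reflection->HasField(message, field)) {
      redactSensitiveFields(*reflection->MutableMessage(&message, field));
    }
  }
}

Whether to clear a marked field or overwrite strings with a placeholder like "[redacted]" is a design choice; clearing is the simplest thing to do with pure reflection.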

Hi folks, we're considering contributing a workaround fix that hard-codes traversal of the listener/cluster/secrets until the desired approach (proto annotations and reflection) lands. That way the admin config dump interface won't leak private keys.

Something like a utility function: void RedactSecurityData(Protobuf::Message& message).

WDYT? @htuch @mergeconflict

@incfly Funny timing... I actually just picked this back up. I have an implementation that's probably about halfway done. Let's chat tomorrow?

Glad that I checked. :) Yeah let's chat tomorrow then

@incfly @mergeconflict Awesome. We were just talking about this bug today, since it affects our ability to debug some issues.

FWIW I feel like we have enough code in the codebase now dealing with annotations, parsing, etc. that it shouldn't be that hard to do this "the right way."

I have a PR up: #9315. It still needs some work, and doesn't fix this issue yet (currently it only redacts SSL certs in the admin config_dump) but I intend to apply it to this as well.

Thanks @mergeconflict ! Yeah, #9315 will satisfy our needs. I'll follow up on that PR.

I just took another look at this... I think this might be trickier than just inserting a call to MessageUtil::redact() before logging, because the error message is actually coming from protoc-gen-validate. That looks like:

bool Validate(const ::envoy::WhateverMessage& m, pgv::ValidationMsg* err);

I can't think of an obvious way to have pgv redact messages before serializing and populating the error... Any suggestions @mattklein123 @htuch?

I can't think of an obvious way to have pgv redact messages before serializing and populating the error... Any suggestions @mattklein123 @htuch?

The only way that I can think of is to have PGV become aware of the redaction annotation, which seems like a reasonable thing to work towards?

It could be fixed in PGV. In the interim, we could redact before passing to PGV. Then we continue to use the original message. The main downside there is you then skip PGV of redacted fields.

It could be fixed in PGV. In the interim, we could redact before passing to PGV. Then we continue to use the original message. The main downside there is you then skip PGV of redacted fields.

@htuch I would guess the main downside to that approach is the runtime cost of preemptively redacting everything (making a copy and iterating over it). There could also be cases where a PGV annotation expresses some constraint that is violated by redaction.

@mergeconflict yep, I think the long term solution is fixing in PGV.
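For completeness, a minimal sketch of the interim approach discussed above: validate a redacted copy so that any PGV error text no longer embeds secrets, while the original message remains the one Envoy actually uses. Validate and pgv::ValidationMsg follow the protoc-gen-validate signature quoted earlier; redactSensitiveFields is the hypothetical helper sketched above; validateWithoutLeaking is a name invented here.

// Validate a redacted copy; on failure the error string (which may embed the proto)
// contains placeholders instead of key material. The original message is left untouched
// and remains the one actually applied.
template <class MessageType>
bool validateWithoutLeaking(const MessageType& message, pgv::ValidationMsg* error) {
  MessageType redacted_copy = message;
  redactSensitiveFields(redacted_copy);
  // Known trade-offs from the discussion above: the copy costs CPU and memory, constraints
  // on redacted fields are effectively skipped, and a constraint could even be violated by
  // the redaction itself.
  return Validate(redacted_copy, error);
}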

