Envoy: Add support for sending data in Zipkin v2 format

Created on 24 Oct 2018 · 27 comments · Source: envoyproxy/envoy

Title: Add support for sending data in Zipkin format

Description:
It would be really helpful to support sending data in Zipkin v2 JSON, for reasons of efficiency and reduced tech debt in trace data pipelines. This could go alongside the existing functionality. While proto3 is even more efficient, it could be optional, as most don't use it.


Zipkin's data format has been historically criticized for its heft. Through community effort, in August 2017 we formalized a compact v2 JSON encoding, accepted on all transports including HTTP, Kafka, RabbitMQ, etc. Later, we introduced a proto3 encoding of the same, also accepted on all transports.

Between then and now, this format has become the primary and preferred format, especially for those trying to work with the data. For example, at a recent meeting we found that envoy is a key piece of the network that still emits the v1 format, requiring "rosetta stone" style proxies.

This is fine in pure zipkin installs, as the server reads all historical formats, but it is limiting for those using different pipelines. A switch to v2 would also work with zipkin clones such as Jaeger, which also supports that format.
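For reference, a span in the v2 JSON encoding looks roughly like the sketch below. This is hand-written for illustration; the ids, timestamps, and service name are placeholders, not taken from a real trace.

```python
import json

# A minimal Zipkin v2 JSON span, as accepted by POST /api/v2/spans.
# All values here are made-up placeholders.
span = {
    "traceId": "5af7183fb1d4cf5f",   # 16 or 32 lowercase hex characters
    "id": "352bff9a74ca9ad2",        # 16 lowercase hex characters
    "name": "get /api",
    "kind": "SERVER",
    "timestamp": 1540368777000000,   # epoch microseconds
    "duration": 168000,              # microseconds
    "localEndpoint": {"serviceName": "myservice", "ipv4": "127.0.0.1"},
    "tags": {"http.method": "GET", "http.path": "/api"},
}

# The v2 write endpoint takes a JSON array of spans.
payload = json.dumps([span])
print(payload)
```

Compared with v1, there are no nested binary annotations: tags are a flat string map and timing is plain fields on the span, which is what makes the encoding compact and easy to consume downstream.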

original spec for zipkin and envoy
json v2 encoding
proto3 encoding
trace data pipeline meeting
jaeger's v2 support

Labels: enhancement, help wanted

Most helpful comment

OK, I am happy to help the community.

All 27 comments

@zyfjeff wondering if you have some time to help with this. It would be really appreciated by the community, especially those who write their own proxies.

sorry to nag you @zyfjeff, but you've done the only major work on the zipkin side recently. there are few people with c++ experience in the ecosystem.. would others be able to help you with something you need, in order to clear time for you to help them? For example, I recently helped with Alibaba Dubbo. I think others may be able to help you somehow for mutual benefit. I fear no one will work on this unless someone like you does.

@adriancole
Sorry, I have been doing Envoy's secondary development for the company recently. I will take some time this month to familiarize myself with the Zipkin v2 format. I am willing to help the community to complete this issue, but I can't guarantee that it will be completed soon.

@mattklein123 please assign to me

@zyfjeff thanks for even considering this.. starting is the first step in finishing (or so my span says)

if any questions ask on https://gitter.im/openzipkin/zipkin I've alerted the team to watch for you!

OK, I am happy to help the community.

@zyfjeff If you can help get me started on where I would want to change I might be able to start the ball rolling on this. I've got a working dev environment from my small contribution about a year ago, so I only have to update that.

@devinsba I can help you.

@devinsba Are you still working on this? I'm ready to start solving this problem.

@adriancole
If the v2 format is supported, how is the format of v1 handled? Or is it made configurable?

If you have the bandwidth go for it. I was going to have to squeeze it into downtime during the holidays.

Any progress on this issue? @zyfjeff

If the v2 format is supported, how is the format of v1 handled? Or is it made configurable?

@zyfjeff if you are asking about Zipkin Server: v1 write endpoints are still available on the server, in addition to v2 read/write endpoints. If you mean tracers, they typically include a configuration option for the endpoint and encoding to send spans.
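To make the tracer side concrete: reporting in the v2 format is just a JSON POST to the server's v2 write endpoint. A minimal Python sketch, assuming a Zipkin server on its default port 9411 (the span values are placeholders):

```python
import json
import urllib.request

# One made-up span; see the v2 spec for the full field list.
spans = [{
    "traceId": "5af7183fb1d4cf5f",
    "id": "352bff9a74ca9ad2",
    "name": "get /api",
    "timestamp": 1540368777000000,
    "duration": 168000,
    "localEndpoint": {"serviceName": "myservice"},
}]

# Build the request against the v2 write endpoint. The legacy v1
# endpoint (/api/v1/spans) remains available for older tracers.
req = urllib.request.Request(
    "http://localhost:9411/api/v2/spans",
    data=json.dumps(spans).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(req)  # not executed here; needs a running server
print(req.full_url)
```

A tracer's configuration option for "endpoint and encoding" amounts to choosing this URL and the serialization used for the request body.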

@Dudi119 any chance you can help with this? everyone in the ecosystem, including zipkin clones, is affected by the lack of v2 support, and it seems the project maintainers are still not doing this on their own

ps the next version of zipkin (2.13) will also have a grpc endpoint with exact same proto3 message as http post /api/v2/spans

https://github.com/apache/incubator-zipkin-api/blob/master/zipkin.proto#L224

that said I think most clones likely will want the json endpoint (not that it is mutually exclusive)

Another option may be https://github.com/envoyproxy/envoy/pull/5387 - as I believe there is a zipkin v2 exporter. Looks pretty close to being merged.

If I were a maintainer of envoy, I would not merge that request due to the upcoming merger of census and OT, I would let that settle and then bring in the merged c++ client. But that is just my 2p

@adriancole sure, will be happy to help.

We've found issues with the OpenCensus and OT tracers, as they do not add the B3 propagation headers to the request sent upstream. I think that supporting the Zipkin v2 format with the existing Zipkin tracer should not be that difficult? If someone can point me at the files I would need to change, maybe I can make a PR for it.

my guess is someone could pick up where things last stalled due to people's lives getting too busy. I know @basvanbeek was interested in this, too https://github.com/envoy-zipkin/envoy/pull/3

Yeah. Sorry for this. @cetanu let me prepare some playground for it and let you know. I’ll sync with @basvanbeek as well.

So, this is effectively overcome if you use the opencensus driver, which will export via zipkin v2, like this:

EDIT: "effectively overcome" is an overly-ambitious qualifier. This example is only relevant if you want to implement tracing via opencensus, and at the time of writing, this is the only way to ship off _any_ spans from envoy in zipkin v2 format.

tracing:
  http:
    name: envoy.tracers.opencensus
    typed_config:
      "@type": type.googleapis.com/envoy.config.trace.v2.OpenCensusConfig
      zipkin_exporter_enabled: true
      zipkin_url: http://127.0.0.1:19000/trace
      zipkin_service_name: myservice
      outgoing_trace_context: [ "TRACE_CONTEXT", "GRPC_TRACE_BIN" ]

stdout_exporter_enabled: true is also useful here for debugging.

My configuration is such that I also require adding an API key to the trace server, so I feed it through envoy as a listener (I got this from another gh issue, but thought it'd be handy here too):

static_resources:
  listeners:
  - name: trace
    address:
      socket_address: { address: 0.0.0.0, port_value: 19000 }
    filter_chains:
      filters:
      - name: envoy.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.config.filter.network.http_connection_manager.v2.HttpConnectionManager
          stat_prefix: zipkin_http
          route_config:
            name: local_route
            request_headers_to_add:
            - header: { key: "Api-Key", value: "MYAPIKEY" }
            virtual_hosts:
            - name: local_service
              domains: ["*"]
              routes:
              - match: { path: "/trace" }
                route: { auth_host_rewrite: true, cluster: "zipkin_outbound" }
          http_filters:
          - name: envoy.router
  clusters:
    - name: zipkin_inbound
      connect_timeout: 1s
      type: STATIC
      http_protocol_options: {}
      load_assignment:
        cluster_name: zipkin_inbound
        endpoints:
        - lb_endpoints:
          - endpoint:
              address:
                socket_address:
                  address: 127.0.0.1
                  port_value: 19000
    - name: zipkin_outbound
      connect_timeout: 1s
      type: LOGICAL_DNS
      lb_policy: ROUND_ROBIN
      dns_lookup_family: V4_ONLY
      respect_dns_ttl: true
      http_protocol_options: {}
      tls_context: {}
      load_assignment:
        cluster_name: zipkin_outbound
        endpoints:
        - lb_endpoints:
          - endpoint:
              address:
                socket_address:
                  address: ACTUAL_ZIPKIN_SERVER.com
                  port_value: 443

The added benefit here is that if this is acting as a sidecar, the actual upstream server can also use this exposed listener and not have to be concerned with the api key or other configuration.

This is sort of an advertisement for something else. Notice the example also suggests using TRACE_CONTEXT and not B3, which would definitely break almost everyone's propagation setup.

While it is a shame folks have been unable to muster writing json, it seems a bit overkill to suggest replacing the entire component as a solution.


Uh, I'm not sure what you mean. I'm not "advertising" anything. I have no existing trace setup, but a constraint on reporting to a zipkin v2 endpoint. I thought it might be useful to someone in a similar situation.

I called the "use this thing instead" approach advertising, but I can see how that's kind of a distracting term for the technique. It is good that there are parts of the codebase people are willing to maintain, and it is helpful to know they exist.

The "use this thing instead" approach is quite a huge hammer considering the small amount of effort this needs. If it weren't cpp, I think this would have been done ages ago.

One other problem I had with the OpenCensus tracer is that it doesn't support 64-bit trace ids, whereas I believe the current zipkin tracer does allow this to be set via a boolean field.

I am hopeful that this will remain an option if changes are made to support the Zipkin v2 format via the original, or new, tracer.

Also, I don't want to just be a burden and push for this change. If I can help in any way, I can be set on a small task. I am very inexperienced with cpp.

The above pattern of processing via another listener opens some doors for me which I may use in future, but it wouldn't allow me to modify the trace id, especially since envoy would have already sent propagation headers to the upstream prior to emitting a trace to the tracing cluster.

@adriancole you're right, I should have qualified my example with more specific caveats about what it's appropriate for. I dropped it in this issue mostly because this is the issue I found when I came across the problem.

@cetanu if you want to contribute to the current PR, I'll add you to the https://github.com/envoy-zipkin/envoy repo.
