Envoy: Dump/Tap support

Created on 8 Aug 2017 · 15 Comments · Source: envoyproxy/envoy

Dump/tap is an area we haven't explored yet in Envoy, but it potentially has a huge amount of value. Opening this issue to gather a feature set that would be generally interesting. Some things to think about:

L4 and L7 dumping
Dumping to file
Dumping to network
Controlling dumping via filters (like tcpdump)
Controlling dumping via admin endpoint
Controlling dumping via new xDS API (for system-wide dumping w/ filters)

What else?

enhancement no stalebot

mattklein123

👍1

Most helpful comment

I'd love to see dev-time introspection for client engineers on par with what ngrok used to offer. We currently use Charles for that (which is good for debugging uncooperative endpoints), but it would be nice if it were an integral part of the backend.

ikonst on 13 Aug 2017

❤3

All 15 comments

This is a cool idea. For both file and network dumping/tapping, I'd suggest that an important feature is encryption with credentials obtained via the API.

dnoe on 8 Aug 2017

One useful application of this is trace+replay for performance work. The idea is you record in a format that can later be used to recreate requests, allowing representative real world workloads to be captured and analyzed offline.
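
As an illustration of the record-then-replay idea, here is a minimal sketch. It assumes a line-delimited JSON capture format; the `RecordedRequest` layout, the field names, and the `send` callback are all hypothetical and not anything Envoy defines.

```python
import io
import json
import time
from dataclasses import asdict, dataclass
from typing import Callable, Iterable, Iterator, TextIO


@dataclass
class RecordedRequest:
    # Hypothetical record layout, chosen only for this sketch.
    offset_ms: int  # milliseconds since the start of the capture
    method: str
    path: str
    headers: dict
    body: str


def record(requests: Iterable[RecordedRequest], out: TextIO) -> None:
    """Write one JSON object per line so a capture can be streamed or appended."""
    for req in requests:
        out.write(json.dumps(asdict(req)) + "\n")


def load(src: TextIO) -> Iterator[RecordedRequest]:
    """Parse a capture back into request records, skipping blank lines."""
    for line in src:
        if line.strip():
            yield RecordedRequest(**json.loads(line))


def replay(src: TextIO,
           send: Callable[[RecordedRequest], None],
           sleep: Callable[[float], None] = time.sleep,
           speedup: float = 1.0) -> int:
    """Re-issue requests in recorded order, keeping their relative timing."""
    previous_ms = 0
    count = 0
    for req in load(src):
        delay = (req.offset_ms - previous_ms) / 1000.0 / speedup
        if delay > 0:
            sleep(delay)
        send(req)
        previous_ms = req.offset_ms
        count += 1
    return count
```

Passing `sleep` and `send` as parameters keeps the replay loop testable offline: a benchmark harness could supply a real HTTP client, while a unit test can supply list appenders.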

htuch on 8 Aug 2017

If dumping can be easily segregated by flows (HTTP/2 or TCP), that would be a huge win. This doesn't have to exist inside Envoy's dump logic itself, but tooling around it can make it significantly more usable.

As an example of a de-multiplexer, I regularly use https://github.com/simsong/tcpflow in some tricky situations.
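
A rough sketch of what per-flow segregation could look like in external tooling, assuming each tapped chunk carries its endpoints and, for HTTP/2, a stream id. The `Chunk` type and its fields are hypothetical; this is not how tcpflow or Envoy represent data.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Optional, Tuple


class Chunk(NamedTuple):
    # Hypothetical tap record used only for this sketch.
    src: str                  # "ip:port" of the sender
    dst: str                  # "ip:port" of the receiver
    stream_id: Optional[int]  # HTTP/2 stream id, or None for plain TCP
    data: bytes


FlowKey = Tuple[str, str, Optional[int]]


def flow_key(chunk: Chunk) -> FlowKey:
    """Chunks with the same endpoints and stream id belong to one flow."""
    return (chunk.src, chunk.dst, chunk.stream_id)


def demultiplex(chunks: Iterable[Chunk]) -> Dict[FlowKey, bytes]:
    """Reassemble an interleaved capture into one byte string per flow."""
    flows = defaultdict(bytearray)
    for chunk in chunks:
        flows[flow_key(chunk)] += chunk.data
    return {key: bytes(buf) for key, buf in flows.items()}
```

The sketch assumes chunks arrive in order within each flow; a real tool would also need to handle reordering and each direction of a connection.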

Also, dumping internal telemetry along with the data (e.g. stats recorded, status of circuit breakers, retries, etc.) in a framed format would be an interesting addition.
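
One way to picture a framed format that interleaves data and telemetry is a type-length-value stream. The sketch below assumes a 1-byte frame type and a 4-byte big-endian length; the frame types and the JSON telemetry payload are made up for illustration.

```python
import json
import struct
from typing import Iterator, Tuple

# Hypothetical frame types for this sketch.
DATA = 1
TELEMETRY = 2

# 1-byte type followed by a 4-byte big-endian payload length.
_HEADER = struct.Struct("!BI")


def encode_frame(kind: int, payload: bytes) -> bytes:
    """Prefix the payload with its type and length."""
    return _HEADER.pack(kind, len(payload)) + payload


def decode_frames(buf: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (type, payload) pairs; raise ValueError on a cut-off payload."""
    offset = 0
    while offset < len(buf):
        kind, length = _HEADER.unpack_from(buf, offset)
        offset += _HEADER.size
        payload = buf[offset:offset + length]
        if len(payload) != length:
            raise ValueError("truncated frame")
        yield kind, payload
        offset += length
```

A reader that only cares about traffic can skip `TELEMETRY` frames by length without parsing them, which is the main appeal of framing over a single mixed stream.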

theatrus on 13 Aug 2017

This would be a great feature. While we don't have specific tooling today, it would be useful to have this tap capability in place to selectively route to security monitoring tools. We wouldn't want full contents, but being able to select specific headers would give us great visibility.
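
To make the "specific headers, not full contents" idea concrete, here is a small sketch of an allow-list applied before anything leaves the proxy. The function name and the example header names are assumptions for illustration only.

```python
from typing import Dict, Iterable


def select_headers(headers: Dict[str, str], allowed: Iterable[str]) -> Dict[str, str]:
    """Keep only allow-listed headers; names are compared case-insensitively."""
    wanted = {name.lower() for name in allowed}
    return {name: value for name, value in headers.items() if name.lower() in wanted}
```

For example, `select_headers(request_headers, ["user-agent", "x-forwarded-for"])` would forward those two headers and drop everything else, including cookies and authorization tokens. Bodies are never touched by this function.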

YanceySlide on 14 Aug 2017

Very clear and unsurprising handling of back-pressure is important for features like this. If mirroring of traffic is not possible or violates data-plane QoS, it should be very obvious that Envoy has fallen back to sampling or dropping packets on the floor.
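
A minimal sketch of that behavior, assuming a bounded in-memory buffer between the data path and the tap consumer. The `TapSink` class is hypothetical; the point is that the producer never blocks and every drop is counted where an operator can see it.

```python
import queue
from typing import Any, List


class TapSink:
    """Bounded buffer that drops instead of blocking, and says so."""

    def __init__(self, capacity: int) -> None:
        self._queue: "queue.Queue[Any]" = queue.Queue(maxsize=capacity)
        # Plain counters: fine for one producer, not synchronized for several.
        self.accepted = 0
        self.dropped = 0

    def offer(self, record: Any) -> bool:
        """Enqueue if there is room; otherwise count a drop and return False."""
        try:
            self._queue.put_nowait(record)
        except queue.Full:
            self.dropped += 1
            return False
        self.accepted += 1
        return True

    def drain(self) -> List[Any]:
        """Remove and return everything currently buffered, oldest first."""
        items = []
        while True:
            try:
                items.append(self._queue.get_nowait())
            except queue.Empty:
                return items
```

Exporting `dropped` as a stat (and alerting when it is non-zero) is what would make the fallback to sampling or dropping obvious rather than silent.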

ryancox on 14 Aug 2017

Agree with @ikonst that this could be great for client debugging, and it would be nice to remove intermediary tools. One challenge with the clients is that there are a few hoops to jump through to point them at a specific instance. If we could enable this via headers (and handle collection via something like dump-to-network), that overhead might be avoided. There are some security concerns (e.g. DDoS potential), but perhaps those can be addressed with per-instance rate limiting/circuit breaking.
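
A sketch of header-triggered tapping guarded by a per-instance rate limit, assuming a token bucket. The header name `x-debug-tap` and both helpers are hypothetical; Envoy's own rate limiting and circuit breaking are not modeled here.

```python
import time
from typing import Callable, Dict

TAP_HEADER = "x-debug-tap"  # hypothetical opt-in header, compared in lowercase


class TokenBucket:
    """Allow up to `burst` taps at once, refilled at `rate` tokens per second."""

    def __init__(self, rate: float, burst: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.burst = burst
        self._clock = clock
        self._tokens = float(burst)
        self._last = clock()

    def allow(self) -> bool:
        now = self._clock()
        # Refill for the elapsed time, capped at the burst size.
        self._tokens = min(self.burst, self._tokens + (now - self._last) * self.rate)
        self._last = now
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False


def should_tap(headers: Dict[str, str], bucket: TokenBucket) -> bool:
    """Tap only requests that ask for it, and only while tokens remain."""
    requested = any(name.lower() == TAP_HEADER for name in headers)
    return requested and bucket.allow()
```

Checking the header before consuming a token means ordinary traffic never eats into the tap budget, so a flood of opted-in requests degrades to untapped forwarding instead of overloading the collector.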

goaway on 15 Aug 2017

Are we going to dump to pcap here btw?

htuch on 23 Oct 2017

I haven't thought about output formats yet. I don't think we can dump explicitly to pcap since we don't have full packet data. I need to investigate different formats to see what would be the best option.

mattklein123 on 23 Oct 2017

About a year after opening this, I'm ready to get going. I wrote a short design doc. Please comment! https://docs.google.com/document/d/1fgVAH8BMrq_5dt8m54Rxp0OGXt6WlNUsAW0xGlBOkJs/edit#

mattklein123 on 17 May 2018

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or other activity occurs. Thank you for your contributions.

stale[bot] on 28 Jun 2018

Still planning on working on this.

mattklein123 on 28 Jun 2018

stale[bot] on 28 Jul 2018

Still planning on working on this. Got sidetracked but sometime in the next few months.

mattklein123 on 28 Jul 2018

With https://github.com/envoyproxy/envoy/pull/6105 I'm calling this high-level issue done. I'm going to be opening a series of small issues for work that can be parallelized.

mattklein123 on 27 Feb 2019
