Envoy: DNS Filter: high latency when using upstream resolvers

Created on 23 Oct 2020 · 4 Comments · Source: envoyproxy/envoy

Title: DNS Filter - high latency when using upstream resolvers

Description:
When using the DNS filter's upstream resolvers, lookups take roughly 5 seconds longer than expected.

Lookups with host and dig don't appear to be affected, but when I perform an HTTP request that first does a DNS lookup, I see a consistent additional latency of about 5 seconds. The latency isn't there if I define a static domain for Envoy to return.

A tcpdump shows two upstream DNS requests. Envoy forwards both requests upstream, and both return a successful resolution. For some reason the client does not pick up the first response, so it makes a second request, whose response it does pick up. After the second resolution, the HTTP client continues with the request now that it has an IP to use. The additional latency is the time between the first and second DNS requests.

Repro steps:

1. Configure the DNS filter with an upstream resolver (e.g. 8.8.8.8, 1.1.1.1).
2. Set your client to use Envoy as its DNS server.
3. Run curl, wget, or even a Python requests call against a non-static entry so that the upstream resolver is used.
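
The last step can also be timed from Python. This is a minimal sketch, assuming the client resolves names through the system resolver (`socket.getaddrinfo`), which is the path curl, wget, and requests typically take. The helper name `timed_lookup` and its injectable `resolve` parameter are hypothetical and exist only for illustration.

```python
import socket
import time


def timed_lookup(hostname, resolve=socket.getaddrinfo):
    """Return (elapsed_seconds, addresses) for a single resolver call.

    `resolve` is injectable so the helper can be exercised without a network.
    """
    start = time.monotonic()
    results = resolve(hostname, 80)
    elapsed = time.monotonic() - start
    # Each getaddrinfo entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address as a string.
    addresses = sorted({entry[4][0] for entry in results})
    return elapsed, addresses
```

On a machine whose DNS server is set to Envoy, calling `timed_lookup` with a name that has no static entry in the filter should show the extra delay in the elapsed time if the issue is present.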

Config:
https://gist.github.com/skiptomyliu/0ae0959d5f2d6b6c225b393ed145fb73

Labels: area/dns, bug, help wanted

skiptomyliu

All 4 comments

@abaptiste do you mind taking a look at this?

mattklein123 on 23 Oct 2020

I am able to reproduce the delay. Using dig or any tool (python3-dnspython) to interact with the filter directly does not show the issue.

Let me dig into this a bit and I'll let you know what I find out.

abaptiste on 26 Oct 2020

/assign abaptiste

abaptiste on 26 Oct 2020

It turns out that the client is sending two queries with different query IDs. When the response for the first query is generated, it erroneously uses the ID of the second query. The client keeps waiting until it receives a response whose ID matches the ID of its query.

I am working on tests for this and will create a PR with the fix.

abaptiste on 26 Oct 2020
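
The ID mismatch described in the last comment can be illustrated with a small sketch. It is not Envoy's code: `make_response` is a hypothetical stand-in for a responder, and the A/AAAA query pair and `example.com` are assumptions, since the thread only states that the client sends two queries with different IDs.

```python
import struct


def build_query(query_id, name, qtype):
    """Build a minimal DNS query packet: 12-byte header plus one question."""
    # Header fields: ID, flags (RD set), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii") for label in name.split(".")
    )
    question = labels + b"\x00" + struct.pack("!HH", qtype, 1)  # class IN
    return header + question


def transaction_id(packet):
    """Return the 16-bit ID stored in the first two bytes of a DNS message."""
    return struct.unpack("!H", packet[:2])[0]


def make_response(query, response_id):
    """Hypothetical responder: echo the query body, set QR, use response_id."""
    flags = 0x8180  # QR=1, RD=1, RA=1, RCODE=0
    return struct.pack("!HH", response_id, flags) + query[4:]


def accepted(query, response):
    """A stub resolver accepts a response only if its ID matches the query's."""
    return transaction_id(query) == transaction_id(response)


# Two queries on the wire at the same time, each with its own ID (assumed A and AAAA).
q_a = build_query(0x1111, "example.com", 1)
q_aaaa = build_query(0x2222, "example.com", 28)

# Reported behavior: the response to the first query carries the second query's ID.
buggy = make_response(q_a, transaction_id(q_aaaa))
# Expected behavior: the response carries the ID of the query it answers.
fixed = make_response(q_a, transaction_id(q_a))

print(accepted(q_a, buggy), accepted(q_a, fixed))
```

In this model the client discards `buggy` as an answer to `q_a` and keeps waiting, which is consistent with the retry seen in the tcpdump.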
