It would be good to have individual config knobs to control timeouts for TLS connections that are waiting for the handshake to start and for those that are in the middle of the handshake. AFAIK there is an existing connection idle_timeout config, but it tends to be relatively long because it is meant to cover cases where an idle H1 connection is waiting for the next request.
The handshake timeout should cover the full operation, since it is reasonable to expect clients to complete handshakes relatively quickly (within 5-15 secs) once started. A longer timeout may be appropriate for connections that are waiting for the handshake to start, in order to accommodate connection prefetching.
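To make the two-knob idea concrete, here is a minimal sketch in Python rather than Envoy code. The names `TlsPhase`, `PRE_HANDSHAKE_TIMEOUT_S`, `HANDSHAKE_TIMEOUT_S`, and `deadline_for` are hypothetical, and the 60s value is an assumption for illustration; only the 5-15s handshake range comes from the text above.

```python
from enum import Enum


class TlsPhase(Enum):
    """Hypothetical phases of a downstream TLS connection before it is usable."""
    AWAITING_HANDSHAKE = "awaiting_handshake"  # TCP accepted, no ClientHello yet
    HANDSHAKING = "handshaking"                # handshake started, not finished


# Assumed values for illustration only; neither is an Envoy default.
PRE_HANDSHAKE_TIMEOUT_S = 60.0  # longer, to tolerate connection prefetching
HANDSHAKE_TIMEOUT_S = 15.0      # covers the full handshake once it has started


def deadline_for(phase: TlsPhase, phase_started_at: float) -> float:
    """Return the absolute time at which a connection in `phase` should be closed."""
    if phase is TlsPhase.AWAITING_HANDSHAKE:
        return phase_started_at + PRE_HANDSHAKE_TIMEOUT_S
    return phase_started_at + HANDSHAKE_TIMEOUT_S
```

The point of keeping the two values separate is that the handshake deadline can stay tight without penalizing clients that open a connection early and use it later.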
The effective value of these timeouts could be reduced based on memory pressure, to increase resiliency to memory exhaustion attacks and improve the proxy's ability to accept legitimate connections/requests. Config strawman: min/max timeouts, plus either high/low memory thresholds or high/low connection counts at which those timeouts apply.
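One way to read the strawman is as a linear interpolation between the max and min timeout as memory pressure moves from the low to the high threshold. The sketch below is in Python and is only an illustration of that reading: the function name `effective_timeout`, its parameters, and the choice of linear scaling are all assumptions, not an existing Envoy API.

```python
def effective_timeout(
    max_timeout_s: float,
    min_timeout_s: float,
    memory_pressure: float,
    low_threshold: float,
    high_threshold: float,
) -> float:
    """Scale a timeout down as memory pressure rises (hypothetical helper).

    memory_pressure and both thresholds are fractions of the memory budget,
    e.g. 0.75 means 75% used. At or below low_threshold the max timeout
    applies; at or above high_threshold the min timeout applies.
    """
    if memory_pressure <= low_threshold:
        return max_timeout_s
    if memory_pressure >= high_threshold:
        return min_timeout_s
    # Strictly between the thresholds: interpolate linearly from max to min.
    fraction = (memory_pressure - low_threshold) / (high_threshold - low_threshold)
    return max_timeout_s - fraction * (max_timeout_s - min_timeout_s)
```

For example, with a 15s max, a 5s min, and thresholds at 60% and 90%, a proxy at 75% memory usage would apply a timeout of roughly 10s. The same shape works if connection count is used as the pressure signal instead of memory.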
This seems like something that should be added to the DownstreamTlsContext proto. If that sounds reasonable, I'd be happy to address the first part of this issue (adding the configuration knobs).
CC @envoyproxy/api-shepherds
The DownstreamTlsContext proto seems like the right place, but you should double-check with the api-shepherds.
Makes sense to me, @lizan WDYT?
For upstream TLS, the handshake is included in connect_timeout. Wondering whether we should do this similarly?
A longer timeout before handshake starts is appropriate in some cases to allow for TCP prefetches.
If I have a TLS listener, it seems a connection will sit there permanently if the client just connects and never sends anything. Is that what we expect? (When I read "For TLS connections, the connect timeout includes the TLS handshake." on https://www.envoyproxy.io/docs/envoy/latest/faq/configuration/timeouts#tcp I mistakenly expected it to apply to my inbound connections too, but it doesn't?) Am I missing a listener-side or TLS-filter-side timeout setting?
I think that the idle HTTP connection timeout applies in this case. Unfortunately it has a very long default value of 1 hour. https://github.com/envoyproxy/envoy/blob/706d9761dd1336ce0b650b34dae649312ec85c0b/api/envoy/config/core/v3/protocol.proto#L78
Thanks @antoniovicente, but let's say I set the idle_timeout to 10s. After the TLS handshake completes, I don't want to drop established connections that idle for 10s. I really need a "please complete the layer 7 connection (i.e. post-TLS) within X" timeout, not an idle timeout for the whole duration of the flow (and/or the ability to change the timeout after TLS is done).
I agree with you. I think we need more specific timeouts for various operations that happen prior to a request being active on the connection. Currently those specific timeouts don't exist. See https://github.com/envoyproxy/envoy/issues/11427 for a related feature request.
I see the other issue is closed, but I can't quite tell whether I can now set a TLS establishment timeout or not.
Yep, #13610 added that. #13800 adds the ability to scale the timeout in response to overload, which I think should close this issue out completely.