Is there a recommended way to rate limit WebSocket connections? I went through the docs and some other resources, and it seems there's no straightforward way to rate limit WebSocket connections (maybe use the network rate limit filter?). Basically, my requirement is to capture WebSocket frames and possibly decode them to extract the information I need for rate limiting. Is it possible to achieve this by writing a network filter that captures the frames, extracts the information, and sends it to an external rate limiting service?
cc @alyssawilk I think using the HTTP ratelimit filter in the upgrade filter chain would work here? cc @phlax also for potential doc opportunities.
So your first request was to limit WebSocket connections, which as Matt says can be done with the existing rate limit filter. Your follow-up comment makes it look like you actually want to rate limit not just WebSocket connections but actual WebSocket work, which can't be done with Envoy today as Envoy doesn't frame WebSocket. Which are you hoping for? You could arguably write a filter which does the second, if that's the granularity you're looking for.
@mattklein123 @alyssawilk Thanks a lot for the quick response. Yes, I want to rate limit WebSocket work, not just the number of WebSocket connections. Can I achieve that by writing a network filter to capture the WebSocket frames? I mean, will the WebSocket frames hit the network filters before being proxied to the upstream clusters?
Again I think you want to write an HTTP filter, not a network filter, since you want to take the HTTP payload and parse WebSocket frames from there.
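For readers wondering what "parse WebSocket frames" involves, here is a minimal sketch of decoding a single frame following the RFC 6455 base framing layout. It is written in Python for illustration only; a real Envoy filter would be C++, and the names `Frame` and `parse_frame` are hypothetical, not part of any Envoy API.

```python
import struct
from typing import NamedTuple, Optional


class Frame(NamedTuple):
    fin: bool       # True if this is the final fragment of a message
    opcode: int     # 0x1 = text, 0x2 = binary, 0x8 = close, 0x9 = ping, 0xA = pong
    payload: bytes  # unmasked payload bytes
    consumed: int   # total bytes this frame occupied in the buffer


def parse_frame(buf: bytes) -> Optional[Frame]:
    """Decode one RFC 6455 frame from the start of buf.

    Returns None when buf does not yet hold a complete frame, so the
    caller can keep buffering and retry once more data arrives.
    """
    if len(buf) < 2:
        return None
    fin = bool(buf[0] & 0x80)
    opcode = buf[0] & 0x0F
    masked = bool(buf[1] & 0x80)
    length = buf[1] & 0x7F
    offset = 2
    if length == 126:
        # 16-bit extended payload length, network byte order
        if len(buf) < offset + 2:
            return None
        (length,) = struct.unpack_from("!H", buf, offset)
        offset += 2
    elif length == 127:
        # 64-bit extended payload length, network byte order
        if len(buf) < offset + 8:
            return None
        (length,) = struct.unpack_from("!Q", buf, offset)
        offset += 8
    mask = b""
    if masked:
        # Client-to-server frames carry a 4-byte masking key
        if len(buf) < offset + 4:
            return None
        mask = buf[offset:offset + 4]
        offset += 4
    if len(buf) < offset + length:
        return None
    payload = buf[offset:offset + length]
    if masked:
        payload = bytes(b ^ mask[i % 4] for i, b in enumerate(payload))
    return Frame(fin, opcode, payload, offset + length)
```

Because HTTP body data may arrive in arbitrary chunks, a filter built along these lines would need to buffer across calls and use `consumed` to advance past each complete frame.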
There may also be some work to get RateLimit code to do what you want - I'm less familiar with that part of the stack.
@alyssawilk Okay, but will the WebSocket frames get as far as the HTTP filters? As I read it, only the initial upgrade request goes to the HTTP filters, and after the connection is established it behaves more like a raw TCP proxy.
You can use an upgrade filter chain to use normal HTTP filters before the data hits the tcp_proxy layer.
I think your best bet would be to write a custom filter that sets dynamic metadata with some business logic, and then uses the normal HTTP rate limit filter to rate limit on dynamic metadata.
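To make "rate limit on a key extracted by business logic" concrete, here is a small Python sketch of a per-key token bucket. It is only a stand-in for the decision an external rate limit service would make; it does not use Envoy's dynamic metadata or rate limit APIs, and `TokenBucket`, `FrameLimiter`, and the key strings are all hypothetical names chosen for illustration. Time is passed in explicitly so the behavior is deterministic.

```python
from typing import Dict


class TokenBucket:
    """Allows bursts of up to `burst` events, refilled at `rate_per_sec`."""

    def __init__(self, rate_per_sec: float, burst: float, now: float) -> None:
        self.rate = rate_per_sec
        self.burst = burst
        self.tokens = burst  # start with a full bucket
        self.last = now

    def allow(self, now: float, cost: float = 1.0) -> bool:
        # Refill according to elapsed time, capped at the burst size.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class FrameLimiter:
    """Keeps one bucket per key (e.g. a user ID decoded from a frame)."""

    def __init__(self, rate_per_sec: float, burst: float) -> None:
        self.rate = rate_per_sec
        self.burst = burst
        self.buckets: Dict[str, TokenBucket] = {}

    def allow_frame(self, key: str, now: float) -> bool:
        bucket = self.buckets.get(key)
        if bucket is None:
            bucket = TokenBucket(self.rate, self.burst, now)
            self.buckets[key] = bucket
        return bucket.allow(now)
```

In the setup suggested above, the custom filter would supply the key and the rate limit service would own the counting; the sketch merges both roles purely to show the per-key granularity.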
@mattklein123 Thanks a lot. I will try it out and see.
@mattklein123 @alyssawilk I created a custom HTTP filter and am trying to send the WebSocket frames through it after the upgrade. The connection gets upgraded successfully, but the custom upgrade filters are not applied; only the usual HTTP filter chain is applied. Is anything wrong with the configuration? Thanks
static_resources:
  listeners:
  - address:
      socket_address:
        address: 0.0.0.0
        port_value: 10000
    filter_chains:
    - filters:
      - name: envoy.http_connection_manager
        config:
          upgrade_configs:
          - upgrade_type: websocket
          - filters:
            - name: sample
              config:
                key: via
                val: sample-filter-upgrade
            - name: envoy.router
          stat_prefix: http_proxy
          codec_type: AUTO
          route_config:
            name: all
            virtual_hosts:
            - name: allbackend_cluster
              domains:
              - '*'
              routes:
              - match: { prefix: "/admin"}
                direct_response: {status: 403, body: { inline_string: "Forbidden, yo"} }
              - match: { prefix: "/"}
                route: { host_rewrite: echo.websocket.org, cluster: ws_cluster, upgrade_configs: {upgrade_type: websocket, enabled: true}}
          http_filters:
          - name: sample
            config:
              key: via
              val: sample-filter
          - name: envoy.router
  clusters:
  - name: ws_cluster
    type: logical_dns
    dns_lookup_family: V4_ONLY
    connect_timeout: 10s
    lb_policy: ROUND_ROBIN
    hosts: [{ socket_address: { address: echo.websocket.org, port_value: 80 }}]
The custom filter just logs the key and the value passed so I can know which filter is applied.
That looks correct to me at a glance, would have to defer to @alyssawilk for any missing details.
I was confused too - we have regression tests for this!
After a whole bunch of frustrating debugging, it turns out the difference is between
- upgrade_type: websocket
  filters:
  - name: sample
    config:
      key: via
      val: sample-filter-upgrade
Which configures a custom filter chain for "websocket"
and
- upgrade_type: websocket
- filters:
  - name: sample
    config:
      key: via
      val: sample-filter-upgrade
Which configures a websocket upgrade with no custom filter chain, and a custom filter chain for upgrade type "" due to the extra "-". Removing the "-" in front of filters should fix things for you.
For Envoy, cc @htuch, I know tweaking the config after the fact is disallowed, so how can we disallow empty upgrade types in Envoy? Can we just enforce in Envoy and comment in the API it should be non-empty?
Actually cc @envoyproxy/api-shepherds since there's an @ for that.
Would adding a min_len PGV constraint work?
Yeah IMO we should just add the constraint as it is implied and broken without it.
+1, but I wasn't sure if adding constraints after the fact was "legal"
Agree it's broken if you're doing any zero length fields, just not sure if failing to accept on invalid-but-previously-accepted-config is OK.
I think in this case it is ok to just add the constraint. I would not expect anyone to use empty upgrade type.
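As an illustration of why the extra "-" goes unnoticed and what a minimum-length constraint on the upgrade type would catch, the sketch below writes out by hand what the two YAML snippets compared earlier in the thread parse to, and applies a simple non-empty check. This is plain Python, not PGV or Envoy code; `validate_upgrade_configs` and the constant names are hypothetical.

```python
from typing import Any, Dict, List

SAMPLE_FILTER = {"name": "sample",
                 "config": {"key": "via", "val": "sample-filter-upgrade"}}

# "filters" indented under the same list item: one entry with both keys.
CORRECT: List[Dict[str, Any]] = [
    {"upgrade_type": "websocket", "filters": [SAMPLE_FILTER]},
]

# Extra "-" before "filters": two entries, the second with no upgrade_type.
EXTRA_DASH: List[Dict[str, Any]] = [
    {"upgrade_type": "websocket"},
    {"filters": [SAMPLE_FILTER]},
]


def validate_upgrade_configs(configs: List[Dict[str, Any]]) -> List[str]:
    """Return one error per entry whose upgrade_type is missing or empty."""
    errors = []
    for i, entry in enumerate(configs):
        # An absent string field defaults to "", which is the broken case.
        if len(entry.get("upgrade_type", "")) < 1:
            errors.append(f"upgrade_configs[{i}]: upgrade_type must be non-empty")
    return errors
```

With a check like this, the second form is rejected at load time instead of silently attaching the filters to an upgrade type of "".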
@mattklein123 @alyssawilk As Matt suggested, I am trying to set a custom header and use that header to send the WebSocket frames to an external rate limit service through the global rate limit filter (see the upgrade filter chain below).
But only the initial upgrade request hits the global rate limit filter. The WebSocket frames contain the custom header "via: sample-filter-upgrade", and I have configured a rate limit action on the request header "via", but the frames do not hit the global rate limit filter. Do I need to configure anything else?
I tested with normal HTTP requests and it works fine. I can provide any additional info and traces. Thanks
static_resources:
  listeners:
  - address:
      socket_address:
        address: 0.0.0.0
        port_value: 10000
    filter_chains:
    - filters:
      - name: envoy.http_connection_manager
        config:
          upgrade_configs:
          - upgrade_type: websocket
            filters:
            - name: sample
              config:
                key: via
                val: sample-filter-upgrade
            - name: envoy.rate_limit
              config:
                domain: rl
                request_type: external
                stage: 0
                rate_limited_as_resource_exhausted: true
                failure_mode_deny: false
                rate_limit_service:
                  grpc_service:
                    envoy_grpc:
                      cluster_name: ws_cluster
            - name: envoy.router
          stat_prefix: http_proxy
          codec_type: AUTO
          route_config:
            name: all
            virtual_hosts:
            - name: allbackend_cluster
              domains:
              - '*'
              routes:
              - match: { prefix: "/service1"}
                route:
                  cluster: app1_cluster
                  rate_limits:
                  - actions:
                    - request_headers:
                        header_name: "via"
                        descriptor_key: "foo"
              - match: { prefix: "/"}
                route:
                  host_rewrite: echo.websocket.org
                  cluster: ws_cluster
                  rate_limits:
                  - actions:
                    - request_headers:
                        header_name: "via"
                        descriptor_key: "foo"
          http_filters:
          - name: sample
            config:
              key: via
              val: sample-filter
          - name: envoy.rate_limit
            config:
              domain: rl
              request_type: external
              stage: 0
              rate_limited_as_resource_exhausted: true
              failure_mode_deny: false
              rate_limit_service:
                grpc_service:
                  envoy_grpc:
                    cluster_name: ws_cluster
          - name: envoy.router
  clusters:
  - name: app1_cluster
    connect_timeout: 1s
    type: strict_dns
    lb_policy: round_robin
    load_assignment:
      cluster_name: app1_cluster
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: 127.0.0.1
                port_value: 8000
  - name: ws_cluster
    type: logical_dns
    dns_lookup_family: V4_ONLY
    connect_timeout: 10s
    lb_policy: ROUND_ROBIN
    hosts: [{ socket_address: { address: echo.websocket.org, port_value: 80 }}]
Regarding access logging, I think if you want per-WebSocket logging you're going to want to include that in your filter. Both HTTP and TCP proxy sessions log relevant information per-stream, and there's nothing saying your filter (which is doing the framing) couldn't determine the relevant loggers and logger->log(per-WebSocket info) on each frame.
@alyssawilk Thanks a lot. I went through the code and still couldn't figure out how to include an access logger in my filter and call its logger_->log() method.
This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.
This issue has been automatically closed because it has not had activity in the last 37 days. If this issue is still valid, please ping a maintainer and ask them to label it as "help wanted" or "no stalebot". Thank you for your contributions.