Rate limit service
No request from Envoy ever reaches my rate limit service
Hello, I have been struggling with this for a long time and had to give up. As the rate limit service I am using the reference implementation, Lyft's ratelimit, running on localhost.
I cut my config down to small files so I can demonstrate the issue.
This is my rate limit service config:
domain: rate_per_ip
descriptors:
  - key: remote_address
    rate_limit:
      unit: minute
      requests_per_unit: 3
and this is my simple Envoy config.yaml:
admin:
  access_log_path: /tmp/admin_access.log
  address:
    socket_address: { address: 127.0.0.1, port_value: 9901 }
static_resources:
  listeners:
  - name: listener_0
    address:
      socket_address: { address: 0.0.0.0, port_value: 10000 }
    filter_chains:
    - filters:
      - name: envoy.http_connection_manager
        config:
          use_remote_address: true
          stat_prefix: ingress_http
          route_config:
            name: local_route
            virtual_hosts:
            - name: local_service
              domains: ["*"]
              rate_limits:
              - stage: 0
                actions:
                - remote_address: {}
              routes:
              - match: { prefix: "/" }
                route:
                  host_rewrite: www.google.com
                  cluster: service_google
          http_filters:
          - name: envoy.rate_limit
            config:
              stage: 0
              domain: rate_per_ip
          - name: envoy.router
  clusters:
  - name: rate_limit_cluster
    type: STATIC
    connect_timeout: 0.25s
    lb_policy: ROUND_ROBIN
    hosts: [{ socket_address: { address: 127.0.0.1, port_value: 8081 }}]
  - name: service_google
    connect_timeout: 0.25s
    type: LOGICAL_DNS
    # Comment out the following line to test on v6 networks
    dns_lookup_family: V4_ONLY
    lb_policy: ROUND_ROBIN
    hosts: [{ socket_address: { address: google.com, port_value: 443 }}]
    tls_context: { sni: www.google.com }
rate_limit_service:
  grpc_service:
    envoy_grpc:
      cluster_name: rate_limit_cluster
    timeout: 0.25s
I intend this to apply a simple per-IP rate limit and then proxy me on to google.com, but Envoy never even tries to contact my rate limit service.
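If I read the docs right, the remote_address action should make Envoy send the rate limit service a request roughly like this for each matching request (a YAML-ish sketch of the gRPC RateLimitRequest message; 203.0.113.7 is just a placeholder for the client IP):

# Sketch of what I expect Envoy to send per request (placeholder IP):
domain: rate_per_ip
descriptors:
- entries:
  - key: remote_address
    value: 203.0.113.7

so the remote_address key in my ratelimit config should match what Envoy sends.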
I have spent literally hours in the docs, so sorry if I am missing some newbie thing.
Thanks for any response.
@HappyStoic from your config, I don't see anything odd. You say "it never even tries to access my rate limit service"; what stats are you looking at to assert this? Can you start up your setup, send a request, and then print out a dump of the stats and cluster output?
@junr03 Yeah, I can run this setup and my request to localhost:10000 really does get me to Google, but I am not limited to 3 requests per minute. Also, the debug stats of the rate limit service (localhost:6070/stats) say that my domain rate_per_ip has 0 hits.
I even tried pointing Envoy's rate limit cluster at a special port where I had socat relaying traffic to the real ratelimit port so I could monitor the requests, and nothing from Envoy ever shows up there (I expected at least some unreadable data, since it is gRPC).
@HappyStoic sorry, what I meant by my last question ("Can you start up your setup, send a request, and then print out a dump of the stats and cluster output?") was: could you do that and post here the output of your Envoy admin /stats and /clusters dumps (the admin interface is on 127.0.0.1:9901 in your config)?
Well, this led me to my point of failure. Since Envoy was running in a Docker container, I had to correct the addresses in the config to reach the host machine. But now I am stuck on the fact that Envoy and the Lyft rate limit service seem to use different protocols by default?
See the command-line logs on the rate limit service:
2018/05/16 15:34:07 transport: http2Server.HandleStreams received bogus greeting from client: "POST /pb.lyft.ratelimit."
and on the Envoy Docker container:
[2018-05-16 15:34:07.097][13][info][client] source/common/http/codec_client.cc:117] [C145] protocol error: http/1.1 protocol error: HPE_INVALID_CONSTANT
/clusters
rate_limit_cluster::default_priority::max_connections::1024
rate_limit_cluster::default_priority::max_pending_requests::1024
rate_limit_cluster::default_priority::max_requests::1024
rate_limit_cluster::default_priority::max_retries::3
rate_limit_cluster::high_priority::max_connections::1024
rate_limit_cluster::high_priority::max_pending_requests::1024
rate_limit_cluster::high_priority::max_requests::1024
rate_limit_cluster::high_priority::max_retries::3
rate_limit_cluster::added_via_api::false
rate_limit_cluster::10.0.4.75:8081::cx_active::0
rate_limit_cluster::10.0.4.75:8081::cx_connect_fail::15
rate_limit_cluster::10.0.4.75:8081::cx_total::96
rate_limit_cluster::10.0.4.75:8081::rq_active::0
rate_limit_cluster::10.0.4.75:8081::rq_error::96
rate_limit_cluster::10.0.4.75:8081::rq_success::0
rate_limit_cluster::10.0.4.75:8081::rq_timeout::0
rate_limit_cluster::10.0.4.75:8081::rq_total::81
rate_limit_cluster::10.0.4.75:8081::health_flags::healthy
rate_limit_cluster::10.0.4.75:8081::weight::1
rate_limit_cluster::10.0.4.75:8081::region::
rate_limit_cluster::10.0.4.75:8081::zone::
rate_limit_cluster::10.0.4.75:8081::sub_zone::
rate_limit_cluster::10.0.4.75:8081::canary::false
rate_limit_cluster::10.0.4.75:8081::success_rate::-1
/stats
cluster.rate_limit_cluster.bind_errors: 0
cluster.rate_limit_cluster.internal.upstream_rq_503: 96
cluster.rate_limit_cluster.internal.upstream_rq_5xx: 96
cluster.rate_limit_cluster.lb_healthy_panic: 0
cluster.rate_limit_cluster.lb_local_cluster_not_ok: 0
cluster.rate_limit_cluster.lb_recalculate_zone_structures: 0
cluster.rate_limit_cluster.lb_subsets_active: 0
cluster.rate_limit_cluster.lb_subsets_created: 0
cluster.rate_limit_cluster.lb_subsets_fallback: 0
cluster.rate_limit_cluster.lb_subsets_removed: 0
cluster.rate_limit_cluster.lb_subsets_selected: 0
cluster.rate_limit_cluster.lb_zone_cluster_too_small: 0
cluster.rate_limit_cluster.lb_zone_no_capacity_left: 0
cluster.rate_limit_cluster.lb_zone_number_differs: 0
cluster.rate_limit_cluster.lb_zone_routing_all_directly: 0
cluster.rate_limit_cluster.lb_zone_routing_cross_zone: 0
cluster.rate_limit_cluster.lb_zone_routing_sampled: 0
cluster.rate_limit_cluster.max_host_weight: 0
cluster.rate_limit_cluster.membership_change: 1
cluster.rate_limit_cluster.membership_healthy: 1
cluster.rate_limit_cluster.membership_total: 1
cluster.rate_limit_cluster.retry_or_shadow_abandoned: 0
cluster.rate_limit_cluster.update_attempt: 0
cluster.rate_limit_cluster.update_empty: 0
cluster.rate_limit_cluster.update_failure: 0
cluster.rate_limit_cluster.update_no_rebuild: 0
cluster.rate_limit_cluster.update_success: 0
cluster.rate_limit_cluster.upstream_cx_active: 0
cluster.rate_limit_cluster.upstream_cx_close_notify: 0
cluster.rate_limit_cluster.upstream_cx_connect_attempts_exceeded: 0
cluster.rate_limit_cluster.upstream_cx_connect_fail: 15
cluster.rate_limit_cluster.upstream_cx_connect_timeout: 0
cluster.rate_limit_cluster.upstream_cx_destroy: 0
cluster.rate_limit_cluster.upstream_cx_destroy_local: 0
cluster.rate_limit_cluster.upstream_cx_destroy_local_with_active_rq: 81
cluster.rate_limit_cluster.upstream_cx_destroy_remote: 0
cluster.rate_limit_cluster.upstream_cx_destroy_remote_with_active_rq: 0
cluster.rate_limit_cluster.upstream_cx_destroy_with_active_rq: 81
cluster.rate_limit_cluster.upstream_cx_http1_total: 96
cluster.rate_limit_cluster.upstream_cx_http2_total: 0
cluster.rate_limit_cluster.upstream_cx_idle_timeout: 0
cluster.rate_limit_cluster.upstream_cx_max_requests: 0
cluster.rate_limit_cluster.upstream_cx_none_healthy: 0
cluster.rate_limit_cluster.upstream_cx_overflow: 0
cluster.rate_limit_cluster.upstream_cx_protocol_error: 81
cluster.rate_limit_cluster.upstream_cx_rx_bytes_buffered: 0
cluster.rate_limit_cluster.upstream_cx_rx_bytes_total: 1184
cluster.rate_limit_cluster.upstream_cx_total: 96
cluster.rate_limit_cluster.upstream_cx_tx_bytes_buffered: 0
cluster.rate_limit_cluster.upstream_cx_tx_bytes_total: 25839
cluster.rate_limit_cluster.upstream_flow_control_backed_up_total: 0
cluster.rate_limit_cluster.upstream_flow_control_drained_total: 0
cluster.rate_limit_cluster.upstream_flow_control_paused_reading_total: 0
cluster.rate_limit_cluster.upstream_flow_control_resumed_reading_total: 0
cluster.rate_limit_cluster.upstream_rq_503: 96
cluster.rate_limit_cluster.upstream_rq_5xx: 96
cluster.rate_limit_cluster.upstream_rq_active: 0
cluster.rate_limit_cluster.upstream_rq_cancelled: 0
cluster.rate_limit_cluster.upstream_rq_maintenance_mode: 0
cluster.rate_limit_cluster.upstream_rq_pending_active: 0
cluster.rate_limit_cluster.upstream_rq_pending_failure_eject: 15
cluster.rate_limit_cluster.upstream_rq_pending_overflow: 0
cluster.rate_limit_cluster.upstream_rq_pending_total: 96
cluster.rate_limit_cluster.upstream_rq_per_try_timeout: 0
cluster.rate_limit_cluster.upstream_rq_retry: 0
cluster.rate_limit_cluster.upstream_rq_retry_overflow: 0
cluster.rate_limit_cluster.upstream_rq_retry_success: 0
cluster.rate_limit_cluster.upstream_rq_rx_reset: 0
cluster.rate_limit_cluster.upstream_rq_timeout: 0
cluster.rate_limit_cluster.upstream_rq_total: 81
cluster.rate_limit_cluster.upstream_rq_tx_reset: 0
cluster.rate_limit_cluster.version: 0
I don't want to turn this into a debugging session here, but the Envoy docs don't say anything about choosing a protocol for the rate limit service; gRPC is the only one supported, right?
@HappyStoic ah! I actually just realized what might be the problem. Thanks for the data, it helped me see it. The ratelimit cluster is not specified as an HTTP/2 (h2) cluster, which the gRPC rate limit calls require; your stats show it (upstream_cx_http1_total: 96 vs upstream_cx_http2_total: 0). Sorry I missed it in my original read of the config.
You want to add http2_protocol_options: {} to your cluster definition, like:
- name: rate_limit_cluster
  type: STATIC
  connect_timeout: 0.25s
  lb_policy: ROUND_ROBIN
  hosts: [{ socket_address: { address: 127.0.0.1, port_value: 8081 }}]
  http2_protocol_options: {}
@junr03 Yeah, that was the problem! Thank you very much.
Hi HappyStoic,
I am trying to achieve the rate limiting functionality, and I am a little confused about the config below. The sample configuration you provided (as below), where should it go?
domain: rate_per_ip
descriptors:
  - key: remote_address
    rate_limit:
      unit: minute
      requests_per_unit: 3
Can you please share your config?
Thanks,
Asisranjan Nayak
I have this exact problem. Is it solved yet? Where should the configuration go?
Hi, I'm having the same problems. I could not understand how to combine those configs.
Please, someone share an example of configuring the rate limiter. There is nothing helpful on Google regarding this.
Thanks!
@Asisranjan @ggalihpp @gleb-s
Hi guys, sorry for my late response.
The configuration you're asking about is the configuration of the rate limit service itself (the service Envoy connects to), in this case the reference implementation by Lyft. You can see how to proceed with its configuration in their documentation.
Basically, if you start the rate limit service following the steps in the Building and Testing section, the configuration is supposed to go under /home/user/src/runtime/data/ratelimit, roughly as in the sketch below. I hope I'm not mistaken; it's been a long time and I don't have this setup available anymore.
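If it helps, the layout should look roughly like this (a sketch from memory, assuming the default RUNTIME_ROOT/RUNTIME_SUBDIRECTORY layout described in the ratelimit README; adjust the path to your own setup):

# e.g. /home/user/src/runtime/data/ratelimit/config/config.yaml
# (if I remember correctly, the service picks up *.yaml files under
#  {RUNTIME_ROOT}/{RUNTIME_SUBDIRECTORY}/config/)
domain: rate_per_ip            # must match the "domain" in Envoy's envoy.rate_limit filter config
descriptors:
  - key: remote_address        # must match the descriptor key produced by Envoy's remote_address action
    rate_limit:
      unit: minute
      requests_per_unit: 3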
I hope I helped at least a little. :)
I documented wiring up a minimal working example of this: https://medium.com/dm03514-tech-blog/sre-resiliency-bolt-on-sidecar-rate-limiting-with-envoy-sidecar-5381bd4a1137
This issue was crucial in getting the information I needed. Thank you!