Envoy: REST xDS API with HTTP GET requests

Created on 25 Sep 2018 · 29 Comments · Source: envoyproxy/envoy

Description:

In some cases, Envoy users want to use an object storage service like Amazon S3 as a control plane. This is mainly for simplicity, since most Envoy configuration is almost static in that case. For example, if one wants to use only Envoy's failure recovery features, one can deliver that configuration to Envoy instances via RDS and CDS. The CDS response can point to the DNS name of a central (server-side) load balancer, so it's not very dynamic. Likewise, RDS can respond with a route config containing timeouts and retries, which are not updated frequently in this case.

Considerations:

- URL mapping
- How to encode DiscoveryRequest into HTTP GET requests
  - v2 xDS has a version communication mechanism (as far as I know)

I don't know yet whether we can come up with a good design and implementation for this feature, so this needs comments and further investigation.

Relevant Links:

Conversation on Slack: https://envoyproxy.slack.com/archives/C78M4KW76/p1537878857000100
One use case of using object storage as a v1 CDS/RDS API: https://blog.envoyproxy.io/service-mesh-and-cookpad-ba4d5d915dbd

enhancement help wanted

taiki45

🚀2 ❤1

All 29 comments

@taiki45 is the issue here basically that the v2 xDS REST mappings use POST? Is that the only issue?

mattklein123 on 25 Sep 2018

@mattklein123 Yes, the current v2 xDS REST mappings only support POST requests. This issue proposes adding GET request support alongside the current POST support.

taiki45 on 25 Sep 2018

OK, sure that sounds reasonable to me and should be a really simple change w/ configuration.

mattklein123 on 25 Sep 2018

Can someone who understands the nuance of REST HTTP verb mapping chime in? Is it valid to use GET if we have stateful changes? My high-level understanding of this topic indicates maybe not. Would we offer a simplified version of the API without the version + ACK/NACK dance?

htuch on 25 Sep 2018

TBH it really doesn't make any difference, it just angers hard core REST people. The HTTP standard says that GET requests can have a body, etc. IMO we should just add a config flag to switch the method and document why it's there.

mattklein123 on 25 Sep 2018

This would be really nice, we'd like to use S3 as the control plane too. One potential gotcha might be auth, which I don't think is supported by the REST client? (correct me if I'm wrong). Which means the S3 bucket needs to be publicly readable, which isn't great, especially for secrets. Unless there's a workaround?

tekumara on 26 Sep 2018

@tekumara is this just the ability to add some extra headers with static tokens in the Envoy config for auth? That exists today for gRPC but not REST as you say. Or, do you need something more sophisticated, to deal with token lifetimes, refresh, integration with AWS SDK etc?

htuch on 26 Sep 2018

To @tekumara's point, we need to calculate the value for the Authorization header[1], and it seems it has a lifetime, as pointed out by @htuch.

Do you think it is worthwhile to let Envoy somehow get the value from, e.g., a file (that is, map the value of a header entry from an external source)?

[1] The format would be something like: Authorization: <Algorithm> Credential=<Access Key ID/Scope>, SignedHeaders=<SignedHeaders>, Signature=<Signature>. Some of these values have to be calculated (and I think that can be done outside the Envoy process).
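
To illustrate why the Signature value cannot be a static token, here is a minimal Python sketch of the AWS Signature Version 4 signing-key derivation. The secret key, date, and string-to-sign used with it are placeholders, and building the canonical request and string-to-sign is left out, so treat this as an illustration of the moving parts rather than a complete signer.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, msg: str) -> bytes:
    """Return the raw HMAC-SHA256 digest of msg under key."""
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key: str, date: str, region: str, service: str) -> bytes:
    """Derive the SigV4 signing key. `date` is YYYYMMDD, so the key changes daily."""
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def sign(signing_key: bytes, string_to_sign: str) -> str:
    """Return the hex signature that goes into Signature=<Signature>."""
    return hmac.new(signing_key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()
```

Because the date and the request contents both feed into the result, the header has to be recomputed per request, which is the reason a static header entry in the Envoy config is not sufficient.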

dio on 26 Sep 2018

At some point you want to be as expressive as https://github.com/envoyproxy/envoy/blob/master/api/envoy/api/v2/core/grpc_service.proto#L35, including the ability to write arbitrary extension plugins for credential handling. I'm wondering if we can head towards this incrementally and what the minimum needed would be?

htuch on 26 Sep 2018

I missed one important point in the PR description: I think changing the HTTP method of xDS API requests from Envoy is not enough. As far as I know, most object storage HTTP endpoints don't parse the request body. Instead, we have to encode Envoy's node identifier and/or the xDS API type into the HTTP request path, like v1 xDS does. Something like: /v2/discovery:clusters/${cluster_name}/${node_id}. In the v2 xDS API we have more node information than cluster name and node id, so we might need another mechanism for encoding node information into the request path.

taiki45 on 27 Sep 2018

For authorization, it's good for Envoy to have that extensibility, but for now I think we can work around it with source-IP-based restrictions (S3 has this feature) or a simple HTTP proxy which handles authorization.

taiki45 on 27 Sep 2018

👍1

@htuch yes as mentioned by @dio for AWS at least, as per https://docs.aws.amazon.com/AmazonS3/latest/API/sig-v4-authenticating-requests.htm, additional authorization headers aren't static and involve generating a signature from elements of the request, using your credentials (which in the case of instance credentials, are fetched from the metadata endpoint)

@taiki45 makes a good point tho, in our case restricting S3 buckets to a specific VPC (or VPC endpoint) is a reasonable workaround.

tekumara on 27 Sep 2018

BTW @taiki45 when using S3 with v1 xDS, did you ever run into issues with S3's eventual consistency guarantees?

tekumara on 27 Sep 2018

@tekumara No, envoy's eventual consistency model covers S3's limitation well.

taiki45 on 28 Sep 2018

👍1

I'm planning to take this. First, I'll make a design proposal for URL mapping and request parameter encoding.

taiki45 on 4 Oct 2018

👍2

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or other activity occurs. Thank you for your contributions.

stale[bot] on 4 Nov 2018

Working on this but probably I’ll take another approach.

taiki45 on 4 Nov 2018

@taiki45 thoughts on this other approach?

htuch on 5 Nov 2018

Sorry for the confusion, I'm still just wondering.

The other approach I mentioned is a proxy approach. That is, we prepare a community-made proxy, and users who want to convert POST to GET use that proxy.

The reason is that I'm not sure we can make a well-designed feature for Envoy's standard, compared with the current gRPC/POST REST design. In other words, I'm wondering whether this feature is worth merging into mainline.

Upsides:

With this feature, users can set up xDS more easily than by writing a control plane.

Downsides:

Limitation on which xDS APIs are effective: the feature works well for CDS and RDS (probably also for LDS), but not for EDS etc. because of how dynamic they are.
Limitation of node specification: the gRPC/POST REST version can use core.Node information effectively, but the GET version has to use a specific core.Node encoding, such as using only Node.cluster.
- I noticed we can make this configurable, so this point can be ignored.

NOTE: to check feasibility, I'm writing a proxy which translates v2 CDS/RDS POST requests to GET requests, and it works, including xDS version negotiation. So I think it's OK.

taiki45 on 7 Nov 2018

@taiki45 why not have Envoy itself be the proxy and configure a loopback for xDS requests? :)

htuch on 7 Nov 2018

👍1

stale[bot] on 7 Dec 2018

Sorry for being late, but I'm still working on this.

taiki45 on 7 Dec 2018

stale[bot] on 7 Jan 2019

This issue has been automatically closed because it has not had activity in the last 37 days. If this issue is still valid, please ping a maintainer and ask them to label it as "help wanted". Thank you for your contributions.

stale[bot] on 15 Jan 2019

Oops, I'd like to post the design proposal as a comment in a few days.

taiki45 on 15 Jan 2019

@taiki45 given the work that AWS is doing to have direct AWS signature integration, do you think this is still necessary (https://github.com/envoyproxy/envoy/pull/5580)? cc @lavignes

mattklein123 on 15 Jan 2019

I think https://github.com/envoyproxy/envoy/pull/5580 is for authentication, not for the xDS protocol translation described in this issue. The PR's implementation can probably help build a better authentication mechanism in this area.

taiki45 on 16 Jan 2019

@htuch I'm so sorry for bothering you, but could you remove the "no stalebot" label from this issue and add the "help wanted" label so that someone else can take this issue? I can't take this issue due to a change in my situation.

I'm leaving my design note for this:

Summary of previous conversation

a. To enable Envoy to get its xDS config from object storage like Amazon S3, we need some translation to embed the current REST POST request body data (node id, cluster, etc.) into the HTTP request URI.
b. In this issue, authentication/authorization extensions to xDS REST are out of scope, because we can handle this with object storage source-IP restrictions etc.

Design considerations

We have some considerations in designing the REST-GET xDS API.

This design note is based on an xDS POST-GET translation server implementation: https://github.com/cookpad/itacho

Request translation

Currently, REST-POST sends the following information:

Request URI: /v2/discovery:clusters

Request body

{
  "version_info": "...",
  "node": "{...}",
  "resource_names": [],
  "type_url": "...",
  "response_nonce": "...",
  "error_detail": "{...}"
}

https://www.envoyproxy.io/docs/envoy/v1.9.0/api-v2/api/v2/discovery.proto#envoy-api-msg-discoveryrequest

In REST-GET xDS, we can't use the request body, so we have to translate that information and embed it into the request URI. We can extract specific information like node.cluster, or it might be configurable.

Request URI: /v2/discovery/${xds_type}/${node.cluster}

e.g. /v2/discovery/clusters/user
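
The mapping above could be sketched roughly as follows in Python. The function name `discovery_get_path` is hypothetical, and the sketch assumes the xDS type is taken from the suffix of the REST-POST path (`/v2/discovery:clusters`) and that only `node.cluster` is embedded; a configurable encoding would need more than this.

```python
import json
from urllib.parse import quote

POST_PREFIX = "/v2/discovery:"


def discovery_get_path(post_path: str, request_body: str) -> str:
    """Translate a REST-POST xDS request into a REST-GET request path.

    Hypothetical helper: uses only the xDS type and node.cluster.
    """
    if not post_path.startswith(POST_PREFIX):
        raise ValueError("not a v2 xDS REST path: " + post_path)
    xds_type = post_path[len(POST_PREFIX):]

    node = json.loads(request_body).get("node", {})
    node_cluster = node.get("cluster")
    if not node_cluster:
        raise ValueError("DiscoveryRequest has no node.cluster")

    # Percent-encode both segments so that "/" in a value cannot add path levels.
    return "/v2/discovery/{}/{}".format(
        quote(xds_type, safe=""), quote(node_cluster, safe="")
    )
```

For a request to `/v2/discovery:clusters` whose body contains `"node": {"id": "id_1", "cluster": "user"}`, this yields `/v2/discovery/clusters/user`, matching the example above.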

xDS version negotiation

As far as I know, Envoy does not use the xDS version returned in responses. So we can respond immediately to Envoy's requests even if the responses don't contain any updates. The downside is increased traffic.

Translating this version negotiation from gRPC to REST is not obvious in the case where an Envoy instance that already knows the latest xDS version sends an xDS request. In that case, a gRPC xDS server just waits until the xDS contents change, but a REST xDS server can't wait because most REST environments don't support long polling (AFAIK). The current go-control-plane implementation responds with an internal server error in that situation: https://github.com/envoyproxy/go-control-plane/blob/c15526a457a61287f6ab2561195577b261fcadb8/pkg/server/gateway.go#L83-L86
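
One way to picture the "respond immediately" option is a stateless responder that derives `version_info` from the content itself, so an unchanged config always carries the same version. This is only a sketch under that assumption; `build_discovery_response` and `has_update` are hypothetical names, and using a content hash as the version is my choice for illustration, not something the v2 API requires.

```python
import hashlib
import json


def build_discovery_response(type_url: str, resources: list) -> dict:
    """Build a DiscoveryResponse-shaped dict whose version is a content hash."""
    canonical = json.dumps(resources, sort_keys=True, separators=(",", ":"))
    version = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
    return {"version_info": version, "type_url": type_url, "resources": resources}


def has_update(request: dict, response: dict) -> bool:
    """True if the version the client reports differs from the current one."""
    return request.get("version_info", "") != response["version_info"]
```

A static object in S3 behaves like the first function evaluated ahead of time: every poll gets the full response, and the extra traffic mentioned above is the cost of never holding a request open.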

Supported xDS type

I think only CDS and RDS should be supported, given the typical use cases. Other xDS APIs like v2 SDS could be supported, but I don't have a specific use case now.

taiki45 on 29 Jan 2019

Since this is still being worked out, this is my (a bit dirty) workaround:
WARNING: since it's forwarding to its own listener, the dynamic listener will start after the 2nd config request.

node:
  id: id_1
  cluster: test
static_resources:
  listeners:
  # the following listener will:
  # - rewrite a HTTP POST to a HTTP GET
  # - forward traffic to some url (currently S3)
  - name: config_proxy
    address:
      socket_address:
        address: 127.0.0.1
        port_value: 8000
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        config:
          stat_prefix: config_proxy
          route_config:
            name: local_route
            virtual_hosts:
            - name: config_proxy
              domains: ["*"]
              routes:
              - match:
                  prefix: "/"
                route:
                  host_rewrite: <bucket-name>.s3-<zone>.amazonaws.com
                  cluster: s3
          http_filters:
          - name: envoy.filters.http.lua
            typed_config:
              "@type": type.googleapis.com/envoy.config.filter.http.lua.v2.Lua
              # this takes a http request object
              # and performs a small transformation on it
              # everything will be written to use HTTP GET
              # headers prefixed with ":" are special case headers
              inline_code: |
                function envoy_on_request(request_handle)
                  request_handle:headers():replace(":method", "GET")
                end
          - name: envoy.router
  # this is the actual listener
  # it will try to get its configuration from config_proxy
  - name: reverse_proxy
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 10000

    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        config:
          access_log:
            name: envoy.access_loggers.file
            config:
              path: /dev/stdout
          stat_prefix: ingress_http
          rds:
            route_config_name: vacansoleil_dynamic_routes
            config_source:
              initial_fetch_timeout: 5s
              api_config_source:
                api_type: REST
                refresh_delay:
                  # first request fails because it's pointing to a listener which hasn't started yet
                  seconds: 5 
                cluster_names:
                - config_proxy
          http_filters:
          - name: envoy.router
  clusters:
  - name: config_proxy
    connect_timeout: 0.25s
    type: LOGICAL_DNS
    dns_lookup_family: V4_ONLY
    lb_policy: ROUND_ROBIN
    hosts:
     - socket_address:
        address: 127.0.0.1
        port_value: 8000
  - name: s3
    connect_timeout: 0.25s
    type: LOGICAL_DNS
    dns_lookup_family: V4_ONLY
    lb_policy: ROUND_ROBIN
    hosts:
     - socket_address:
        address: <bucket-name>.s3-<zone>.amazonaws.com
        port_value: 443
    tls_context:
      sni: <bucket-name>.s3-<zone>.amazonaws.com
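
For a setup like this, the bucket has to hold an object that looks like a v2 DiscoveryResponse, at the key matching the path Envoy requests for RDS over REST (`/v2/discovery:routes`). A minimal Python sketch for generating such an object is shown here. The function name `rds_response`, the virtual host name `default`, and the upstream cluster name `some_backend` are assumptions for illustration; the upstream cluster would also have to be defined in the Envoy config.

```python
import json


def rds_response(route_config_name: str, upstream_cluster: str, version: str) -> str:
    """Return a DiscoveryResponse JSON document carrying one RouteConfiguration."""
    route_config = {
        "@type": "type.googleapis.com/envoy.api.v2.RouteConfiguration",
        # Must match rds.route_config_name in the listener config.
        "name": route_config_name,
        "virtual_hosts": [
            {
                "name": "default",  # hypothetical virtual host name
                "domains": ["*"],
                "routes": [
                    {"match": {"prefix": "/"}, "route": {"cluster": upstream_cluster}}
                ],
            }
        ],
    }
    return json.dumps({"version_info": version, "resources": [route_config]}, indent=2)


if __name__ == "__main__":
    # Upload the printed document to the bucket under the key v2/discovery:routes.
    print(rds_response("vacansoleil_dynamic_routes", "some_backend", "1"))
```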