Could we add a PostgreSQL-aware filter? It would be especially useful for handling connection pooling (the infamous "FATAL: remaining connection slots are reserved for non-replication superuser connections" error often happens when clients try to pool connections themselves).
Similar to:
It's certainly possible to implement a PostgreSQL filter. Let's use this ticket to put together a set of features you'd like to see and to gauge interest.
I'd like to see what crunchy-proxy has; the most wanted are perhaps these two:
I would love to see support for SSL (Postgres calls it SSL instead of TLS), so that operators of PostgreSQL do not need to manage and rotate the TLS certificates. Technically, SSL would be done by istio, but istio relies on support for different filters in envoy.
@f21 so in that case the client would be just any postgres client doing SSL, i.e. no client side proxy? With sidecar this wouldn't be postgres specific, would it? I.e. plain text to proxy, TLS between proxies and then plain text again to postgres (or any other) server.
I have started a very early design doc here: https://docs.google.com/document/d/15iXAIJN1QFV3Dz86yybyqhgEMpY7wae2bZeKPyFZG5Y
@loewenstein At the moment, I am just interested in having TLS between proxies, because automating certificate issuance and rotation for database services in Kubernetes is not an easy problem to solve. I was not able to find anything in the envoy or istio docs, but does istio's TLS still work if the layer 7 protocol does not currently have a filter (e.g. the postgres or mysql protocol)?
@dio Will the design work with distributed DBs that use the postgres or mysql protocols, but are clustered and distributed themselves? For example, CockroachDB uses the postgres protocol and TiDB uses the mysql protocol.
@F21 hopefully those systems could benefit from having a proxy in front of them. Query routing is probably not a concern there, but things like stats, connection pooling, and query rewriting might be a plus. If you have comments, input, or anything else, please let me know. It would be much appreciated and helpful. Thanks!
Anyone else working on an implementation? I'm nearly done adding load balancing for reads and writes (coincidentally needed that feature for an experiment). More than happy to open up a PR and work on additional features.
@sjvanrossum please do so. Happy to help to review them 😄
This proposal feels like putting wings on coelacanthiforms! Very very nice! Read the @dio design doc! Just do it! :wink:
Working on it. The experimental extension mentioned above was scrapped and what I'm writing for the PR is a connection manager, like HttpConnectionManager and ThriftProxy, with a filter chain to leave room for extension (enabling us to address design goals 3 and 7 in the doc as sub-extensions).
Expect the base of that (MVP as suggested by Matt Turner) to be opened as a PR within the next two weeks. ;)
I took a peek under the hood at TCP and HTTP connection pooling, but haven't come up with a solid plan for implementation yet.
Thought I'd mention this here … this Dropbox blog post details their gRPC framework and associated infrastructure. Apart from noticing how desperately Envoy is missing from their infrastructure (😬), I noticed they are using SQL comments to carry metadata in proxied requests:
This context can travel even outside of the RPC layer! For example, our legacy MySQL ORM serializes the RPC context along with the deadline into a comment in the SQL query. Our SQLProxy can parse these comments and KILL queries when the deadline is exceeded. As a side benefit, we have per-request attribution when debugging database queries.
I see using SQL comments mentioned once in @dio's design document, but thought I'd point out this instance of it being used in the wild. I think this mainly covers situations in design goals 2, 4, and 5.
To the best of my knowledge there's nothing in the PostgreSQL wire protocol regarding attaching metadata, but what if users could set metadata in comments on a per-statement level similarly to how they use HTTP headers with Envoy today? That seems a little friendlier for users than serializing a context object. E.g.
-- x-envoy-retry-on: connect-failure
-- x-envoy-max-retries: 5
select * from big_table
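To make the idea concrete, here is a minimal sketch of how a proxy might split such comment "headers" from the statement. It is only an illustration: the `-- name: value` syntax is the one proposed above, not something Envoy or PostgreSQL defines, and `split_sql_headers` is a hypothetical helper name.

```python
import re

# Assumed (hypothetical) header syntax: "-- name: value" lines that precede the statement.
_HEADER_RE = re.compile(r"^--\s*([A-Za-z0-9-]+)\s*:\s*(.*?)\s*$")


def split_sql_headers(query):
    """Return (headers, statement) for a query with leading '-- name: value' lines."""
    headers = {}
    lines = query.splitlines()
    index = 0
    for index, line in enumerate(lines):
        stripped = line.strip()
        if not stripped:
            continue  # tolerate blank lines between headers
        match = _HEADER_RE.match(stripped)
        if match is None:
            break  # first non-header line starts the actual statement
        headers[match.group(1).lower()] = match.group(2)
    else:
        index = len(lines)  # every line was a header or blank
    statement = "\n".join(lines[index:]).strip()
    return headers, statement


headers, statement = split_sql_headers(
    "-- x-envoy-retry-on: connect-failure\n"
    "-- x-envoy-max-retries: 5\n"
    "select * from big_table"
)
```

Header values stay as strings here, so a real filter would still need to validate and convert them (for example, the retry count) before acting on them.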
In terms of replies, I know the PostgreSQL wire protocol specifies error and notice message fields that could be used in a synthetic response injected by the proxy. That could include upstream "headers" added to the multi-line detail fields. I haven't read the docs closely enough to know if a successful response can also be annotated in a similar way.
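For the error case, a rough sketch of what a synthetic reply could look like on the wire. The framing follows the protocol's ErrorResponse message (type byte `E`, an Int32 length that includes itself, then typed null-terminated fields ending with a zero byte). Everything else is an assumption for illustration: `build_error_response` is a hypothetical helper, and putting proxy "headers" in the detail field is just the idea floated above.

```python
import struct


def build_error_response(severity, sqlstate, message, detail=None):
    """Encode a PostgreSQL ErrorResponse message as bytes."""
    # Field type codes: S = severity, C = SQLSTATE code, M = message, D = detail.
    fields = [(b"S", severity), (b"C", sqlstate), (b"M", message)]
    if detail is not None:
        fields.append((b"D", detail))
    body = b"".join(code + value.encode("utf-8") + b"\x00" for code, value in fields)
    body += b"\x00"  # a zero byte terminates the field list
    # The length counts itself (4 bytes) plus the body, but not the type byte.
    return b"E" + struct.pack("!i", len(body) + 4) + body


# Illustrative values only: 08006 is the connection_failure SQLSTATE,
# and the detail text stands in for a proxy-added "header".
reply = build_error_response(
    "ERROR",
    "08006",
    "upstream unavailable",
    detail="x-envoy-attempt-count: 3",
)
```

A client that surfaces the detail field would then show the proxy's annotation alongside the error message.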
Anyhow, just a couple of thoughts, I apologize if this is covering ground folks have already discussed at length.
Have we made any progress on this FR? I am planning to work on this as we need it for my current project. If someone has started work, please reach out so that we can collaborate.
Yep, I'm implementing this. I had planned to wrap up the PR two weeks back, but got caught up in my day to day.
@hagmonk that syntax is a neat way to hint or specify envoy specific connection properties.
@dio do we have any solid requirements for specifying envoy connection properties? Seeing as we can stuff them in different places, we could allow various implementations of routers to exist.
@sjvanrossum is there a repo containing your work so far that some of us can poke at? I'm sure whatever state it's in, we have enough interested parties here to provide some feedback :)
@sjvanrossum I'm afraid we don't have them yet. I think we can specify them. As noted by @hagmonk, maybe we can start collaborating on the code? 😄 (Sorry, I was out sick.)
If you repurpose the MySQL filter, you could stuff this retry info in the SQL comments and do the appropriate thing in the SQL filter.
See https://github.com/envoyproxy/envoy/issues/9107 for a stats only filter proposal. I would suggest we start there and then extend with other features as resources become available.
Given that we landed initial support in https://github.com/envoyproxy/envoy/pull/10642 I think it makes sense to close this and we should open individual issues for discrete future features. Awesome work everyone! cc @cpakulski