Caddy: Improve performance of FastCGI transport (and clean up code)

Created on 18 Oct 2020 · 14 Comments · Source: caddyserver/caddy

The current fastcgi implementation is functional, but clunky and slow.

Go does not have a standard FastCGI client implementation (note that we are not a _responder_; we are a client), so in January 2015 we forked the code from http://bitbucket.org/PinIdea/fcgi_client (which is forked from https://code.google.com/p/go-fastcgi-client/) -- and I am very grateful to the original authors for publishing it under a liberal license, so I did not have to go learn the nitty-gritty of the FastCGI protocol and implement it myself. However, the code was never really intended for heavy use.

But now, Caddy is used in heavy production environments where performance matters. Most of our code base could benefit from optimizing, but the FastCGI client in particular should have plenty of low-hanging fruit: significant optimizations that can be made without too much trouble. For example, we could reduce buffering and pool buffers.

The client.go file needs the most attention.

This file would benefit _greatly_ from a refactoring to clean it up, and I've determined that a refactor will be necessary to support optimizations like buffer pooling.

I don't know how soon I will be able to get around to it, so if someone wants to take up the challenge of working on this file, please feel free to comment below and discuss your plans. The FastCGI protocol is not _particularly_ complex in its most basic form, so you may find that rewriting the client from scratch and using the existing code (and the spec) as a guide could be the most straightforward solution.
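To give a sense of how small the basic wire format is, here is a minimal sketch of the fixed 8-byte record header defined by the FastCGI 1.0 spec (version, type, request ID, content length, padding length, reserved). The function and constant names are made up for illustration and are not taken from Caddy's client.go.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Values from the FastCGI 1.0 specification; the Go names are illustrative.
const (
	fcgiVersion1     = 1
	typeBeginRequest = 1
	typeParams       = 4
	typeStdin        = 5
)

// encodeHeader builds the 8-byte header that precedes every record.
// Multi-byte fields are big-endian.
func encodeHeader(recType uint8, requestID, contentLen uint16) [8]byte {
	var h [8]byte
	h[0] = fcgiVersion1
	h[1] = recType
	binary.BigEndian.PutUint16(h[2:4], requestID)
	binary.BigEndian.PutUint16(h[4:6], contentLen)
	// Pad the content up to a multiple of 8 bytes, as the spec recommends.
	h[6] = uint8(-contentLen & 7)
	// h[7] is reserved and stays zero.
	return h
}

func main() {
	h := encodeHeader(typeStdin, 1, 13)
	fmt.Printf("% x\n", h[:])
}
```

Everything else in a request (begin-request body, name/value params, stdin stream) is carried as the content of records framed this way.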

This code is used by a lot of people in a lot of important environments. Benchmark tests (and regular unit tests) should support any changes.

Thank you to anyone who helps!

help wanted optimization

All 14 comments

For instance, the newWriter() function is extremely inefficient currently:

https://github.com/caddyserver/caddy/blob/97caf368eea8d2c33a7786fbe3471b83b5b294dc/modules/caddyhttp/reverseproxy/fastcgi/client.go#L310-L314

It appears at the top of all memory profiles as a huge allocator, because the buffer isn't pooled. (Look closely and you'll see why using sync.Pool is impossible without a refactor.)
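For anyone unfamiliar with the pattern, here is a minimal sketch of buffer pooling with sync.Pool, shown in isolation. It does not reproduce newWriter() or address the reason the current code can't use a pool; bufPool, countingWriter, writeFramed, and the one-byte length framing are all invented for the example.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"sync"
)

// bufPool is a hypothetical package-level pool of reusable buffers.
var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

// countingWriter is a stand-in destination that only counts bytes written.
type countingWriter struct{ n int }

func (c *countingWriter) Write(p []byte) (int, error) {
	c.n += len(p)
	return len(p), nil
}

// writeFramed assembles a one-byte length prefix plus the payload in a
// pooled buffer, writes it to w in one call, and returns the buffer to
// the pool. The framing is invented; it is not the FastCGI format.
func writeFramed(w io.Writer, payload []byte) error {
	if len(payload) > 255 {
		return fmt.Errorf("payload too long: %d bytes", len(payload))
	}
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset() // a pooled buffer may still hold data from its last use
	defer bufPool.Put(buf)

	buf.WriteByte(byte(len(payload)))
	buf.Write(payload)
	_, err := w.Write(buf.Bytes())
	return err
}

func main() {
	var dst countingWriter
	for i := 0; i < 3; i++ {
		if err := writeFramed(&dst, []byte("hello")); err != nil {
			fmt.Println("error:", err)
			return
		}
	}
	fmt.Println("bytes written:", dst.n)
}
```

The pattern only works when there is a clear point at which nothing references the buffer any more, so that it can safely be handed back with Put.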

Hey

I've been following caddy since the release of 2.0, hoping to contribute here over time and where possible.

Would love to start with the low-hanging fruit and work my way through with guidance. 💻⚙️📍

First question, how exactly are we profiling Caddy?

@raygervais there's a pprof endpoint that's built in to the admin endpoint - at http://localhost:2019/debug/pprof by default

Yep, using that would be easiest; and it might also not hurt to write benchmark tests as well: https://golang.org/pkg/testing/#hdr-Benchmarks

Doing a profile and benchmark before changes, then a profile after them (doing identical things) should be a good indication as to the improvement.
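As a rough sketch of that before/after comparison: a real benchmark would normally live in a _test.go file as func BenchmarkXxx(b *testing.B) and be run with go test -bench, but the example below uses testing.Benchmark so it runs standalone. The two functions being compared, the buffer size, and all names are assumptions made up for the example, not Caddy code.

```go
package main

import (
	"fmt"
	"sync"
	"testing"
)

const bufSize = 64 << 10 // arbitrary size chosen for the example

var sink []byte // forces the fresh buffer onto the heap

var pool = sync.Pool{
	New: func() interface{} {
		b := make([]byte, bufSize)
		return &b
	},
}

// useFresh allocates a new buffer on every call ("before").
func useFresh() {
	b := make([]byte, bufSize)
	b[0] = 1
	sink = b
}

// usePooled borrows a buffer from the pool and returns it ("after").
func usePooled() {
	bp := pool.Get().(*[]byte)
	(*bp)[0] = 1
	pool.Put(bp)
}

// allocsPerOp runs fn under the standard benchmark harness and reports
// the average number of heap allocations per call.
func allocsPerOp(fn func()) int64 {
	res := testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			fn()
		}
	})
	return res.AllocsPerOp()
}

func main() {
	fmt.Println("fresh  allocs/op:", allocsPerOp(useFresh))
	fmt.Println("pooled allocs/op:", allocsPerOp(usePooled))
}
```

Running the same measurement on the old and new code paths, with identical inputs, gives numbers that can be compared directly alongside the pprof profiles.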

Thanks for the advice. I'm going to look into how the code works and will try to improve it starting next week (currently AFK). If you take it on before I do, I'd be happy to help in any way and learn as I go.

Happy to assist in any way I can. I'm not very familiar with FastCGI, but I'm willing to help with implementation and testing!

I have looked at this in the past and found a few interesting projects that do something similar:

  1. https://golang.org/pkg/net/http/fcgi/ which links to this page with a nicely formatted spec
  2. roadrunner.dev is a Go project for a PHP runtime with a lot of functionality. Not quite sure if it uses a FastCGI or a PSR-7 implementation
  3. https://github.com/yookoala/gofast is another FastCGI implementation in Go

We might be able to find some inspiration in one of these projects.

@jasonmccallister Thanks for the links!

https://golang.org/pkg/net/http/fcgi/

Note that this is a responder, not a client.

roadrunner.dev is a Go project for a PHP runtime with a lot of functionality. Not quite sure if it uses a FastCGI or a PSR-7 implementation

It's PSR-7.

https://github.com/yookoala/gofast is another FastCGI implementation in Go

That's good to know, I haven't seen that one.

@mholt I did some "light" reading on the FastCGI spec last night; I had not realized there was a difference between a responder and a client.

Reading the net/http/fcgi docs and seeing that the FastCGI site and the spec are both offline makes me wonder about the future of FastCGI, or whether I should even worry/care :).

Regardless, hope those help!

Actually yookoala/gofast looks very interesting, it has some features some people have asked for like FastCGI Authorizers. Since it's basically just a handler, maybe we could swap it in? We'd need to make sure it conforms to the assumptions we've made in our implementation though (note the path manipulation logic in the handler code)

/cc @yookoala FYI, if you'd be interested in helping out!

Ideally, what we want is a RoundTripper from a FastCGI client library. All it has to do is take a request and give us a response.
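To make that shape concrete, here is a minimal sketch assuming "RoundTripper" means Go's http.RoundTripper interface. The fcgiTransport type and its exchange field are hypothetical; exchange stands in for the actual FastCGI conversation with the backend, which is stubbed out here, and only three CGI variables are populated.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"strings"
)

// fcgiTransport is a hypothetical adapter from an HTTP request to a
// FastCGI exchange. exchange is a placeholder for the real client.
type fcgiTransport struct {
	exchange func(env map[string]string, body io.Reader) (int, string, error)
}

// Compile-time check that the adapter satisfies http.RoundTripper.
var _ http.RoundTripper = (*fcgiTransport)(nil)

// RoundTrip takes a request and gives back a response, nothing more.
func (t *fcgiTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	env := map[string]string{
		"REQUEST_METHOD":  req.Method,
		"REQUEST_URI":     req.URL.RequestURI(),
		"SERVER_PROTOCOL": req.Proto,
	}
	status, body, err := t.exchange(env, req.Body)
	if err != nil {
		return nil, err
	}
	return &http.Response{
		StatusCode: status,
		Status:     fmt.Sprintf("%d %s", status, http.StatusText(status)),
		Proto:      "HTTP/1.1",
		ProtoMajor: 1,
		ProtoMinor: 1,
		Header:     make(http.Header),
		Body:       io.NopCloser(strings.NewReader(body)),
		Request:    req,
	}, nil
}

// demo sends a GET through a stubbed transport; no network is touched.
func demo() (int, string, error) {
	t := &fcgiTransport{
		exchange: func(env map[string]string, _ io.Reader) (int, string, error) {
			return http.StatusOK, env["REQUEST_METHOD"] + " " + env["REQUEST_URI"], nil
		},
	}
	client := &http.Client{Transport: t}
	resp, err := client.Get("http://example.test/index.php")
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()
	b, err := io.ReadAll(resp.Body)
	return resp.StatusCode, string(b), err
}

func main() {
	status, body, err := demo()
	fmt.Println(status, body, err)
}
```

With this shape, anything that accepts an http.RoundTripper can use the FastCGI client without knowing about the protocol underneath.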

Actually yookoala/gofast looks very interesting, it has some features some people have asked for like FastCGI Authorizers. Since it's basically just a handler, maybe we could swap it in? We'd need to make sure it conforms to the assumptions we've made in our implementation though (note the path manipulation logic in the handler code)

/cc @yookoala FYI, if you'd be interested in helping out!

I'd love to. The only catch is that I couldn't find any FOSS implementation that uses the authorizer, so we'd need some way to verify our implementation. (I could have misread the specification.)

Oh, okay; that's fine, I wasn't trying to be specific about the authorizers, just an example of something your implementation has that ours doesn't. Thought we could compare notes and hoping that you might be interested in helping us improve our implementation 🙂

Apologies for the delay; work and other obligations are taking quite a bit of time. If someone else wants to take it, by all means. I'll try to contribute to it if I get a chance.
