Go: net/http/httptrace: add ServerTrace hooks

Created on 8 Feb 2017 · 59Comments · Source: golang/go

Problem

A very common pattern in Go HTTP servers is to implement an http.ResponseWriter that wraps another http.ResponseWriter and captures the status code. This is often used for logging and metrics collection.

For example,

type statusCaptureWriter struct {
    http.ResponseWriter
    status int
}

func (scw *statusCaptureWriter) WriteHeader(status int) {
    scw.status = status
    scw.ResponseWriter.WriteHeader(status)
}

type loggedHandler struct {
    handler http.Handler
    logger SomeLogger
}

func (h *loggedHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
    scw := &statusCaptureWriter{ResponseWriter: w}
    h.handler.ServeHTTP(scw, r)
    h.logger.Logf("status=%d ...", scw.status)
}

I've written something like this a bunch of times. You can find examples in nearly every Go web framework out there. One example is https://github.com/urfave/negroni/blob/master/response_writer.go#L13-L26.

There are some issues with this approach. For instance, my statusCaptureWriter doesn't implement other interfaces like http.Flusher, http.CloseNotifier or http.Pusher. I can't determine at compile time whether the underlying http.ResponseWriter implementation implements any of these interfaces, so if I choose to implement them I might lie to callers at higher levels of the stack and inadvertently break things. (This particularly a problem with CloseNotifier.)

Proposal (rejected, see below)

I'd like to propose an additional interface, http.Statuser (better name welcome) that exposes the status code within a http.ResponseWriter implementation. The internal http.(*response) implementation already tracks the status code written, so this can just be exposed and it will automatically implement this interface.

Software could avoid wrapping the http.ResponseWriter by instead type asserting it to http.Statuser and getting the status as needed there. (And it could optionally continue to wrap the ResponseWriter as needed until this is widely deployed.)

type loggedHandler struct {
    handler http.Handler
    logger SomeLogger
}

func (h *loggedHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
    // Could check for http.Statuser implementation here and wrap http.ResponseWriter 
    // if necessary, but omitting for brevity

    h.handler.ServeHTTP(w, r)

    status := 0
    if s, ok := w.(http.Statuser); ok {
        status = s.Status()
    }

    h.logger.Logf("status=%d ...", status)
}

Alternative proposal

Implement an httptrace.ServerTrace struct that is analogous to the ClientTrace already there. See https://github.com/golang/go/issues/18997#issuecomment-279611928 for more info.

FeatureRequest early-in-cycle

Source

joeshaw

👍7

Most helpful comment

Rather than a dozen optional methods, then, I'd prefer to see something like a State() method that returned a concrete struct type to which we could add fields over time.

bradfitz on 8 Feb 2017

👍13

All 59 comments

More examples:

https://github.com/gobuffalo/buffalo/blob/5b5e01469cfd064cad0a56b1eee4728a41d2ec46/response.go

https://github.com/gin-gonic/gin/blob/2dae550eb5392006a4582ce9c90016a9b5a74e8b/response_writer.go#L19-L48

https://github.com/goji/glogrus2/blob/master/writer_proxy.go

https://github.com/rikonor/go-status-code-tracker

joeshaw on 8 Feb 2017

These exist apparently, but why?

Code should know what it did previously.

bradfitz on 8 Feb 2017

@bradfitz The example I posted is a simplified instance: you create a wrapper that handles logging of all requests and then the individual handlers don't need to do or understand logging themselves. They are free to just call w.WriteHeader(http.StatusOK) and things get logged at a higher level. This plays nicely with middleware, too.

Other things that might get logged there are the URL/path, the duration of the call, etc. It would be a pain to have to do that at essentially every exit site of every middleware or handler. I think it's also a valid separation of concerns.

joeshaw on 8 Feb 2017

👍1

Another example beyond logging is collection of metrics. You need to know when you serve 5xx errors, for instance. Again, you don't necessarily want to be doing this everywhere it happens (and in the case of panics, I'm not sure offhand it's possible.)

joeshaw on 8 Feb 2017

Rather than a dozen optional methods, then, I'd prefer to see something like a State() method that returned a concrete struct type to which we could add fields over time.

bradfitz on 8 Feb 2017

👍13

CL https://golang.org/cl/36647 mentions this issue.

gopherbot on 9 Feb 2017

Given that the primary target of this feature is logging, I suggest that the length of the response body would be a good addition to the output of State()

urandom on 12 Feb 2017

What if we just provide one canonical ResponseWriter wrapper instead?

And if this is about logging, we should (re-)consider the HTTP logging issues/proposals.

bradfitz on 13 Feb 2017

👎1

@bradfitz the thing that worries me about providing a wrapper type is also wrapping the other interfaces that may optionally be supported by the underlying type. For instance, http.Pusher is something that's only supported by the http2 ResponseWriters. This wrapper type either has to eschew support for Pusher or it has to "lie" about supporting it.

joeshaw on 13 Feb 2017

I've written a wrapper type like this as well, except instead of just logging the response code, my wrapper type also logged the number of response body bytes written, an md5sum of the response body, the time between the first and last Write call, and a few other miscellaneous things. Skimming through the examples in this comment, I see that I'm not alone -- many of the wrappers provide more than just the response code. I worry that adding types like http.Statuser is an endless game of whack-a-mole. Even with Brad's State() idea, I wouldn't expect the http library to compute an md5sum of the response body.

@joeshaw, have you considered designing a server-side version of httptrace? We considered this at one time, but never worked through the details. Also see #18095 and #3344.

These exist apparently, but why?
Code should know what it did previously.

It is convenient to put all logging code in one place. One way to do this is by wrapping the http.ResponseWriter. This approach becomes more compelling when the logger tracks a bunch of things such as timing info or a hash of the response body, rather than just the response code (which is easy to remember by itself).

tombergan on 14 Feb 2017

👍4

@tombergan i hadn't considered an httptrace-equivalent, but perhaps that is a better way to go in order to support different use cases like hashing response bodies.

joeshaw on 14 Feb 2017

CL https://golang.org/cl/37010 mentions this issue.

gopherbot on 14 Feb 2017

I just uploaded a CL that has the very start of this by introducing an httputil.ServerTrace struct. At the moment it only implements a WroteHeader hook. It'd be nice if people could take a look and give an opinion on the general approach.

I expect to add StartHandler and EndHandler hooks as well. Since it's implemented in the server, it could contain connection-level hooks as well.

One things that's a little strange about it is that the data isn't accessible from the handlers, which I imagine could be a problem if logging code wanted to include additional request-specific data. One possibility might be to pass the http.Request into the hooks, have handlers add values to the req.Context and have an EndHandler hook extract and log there?

joeshaw on 14 Feb 2017

I think you're right that the server has both connection-level events and request-level events, and those may need to be handled separately. We could start by supporting request-level events only, since that seems to be what is wanted the most.

A nice thing about ClientTrace is that it's easy to create a new ClientTrace struct for each request. For example, here's how I use ClientTrace:

info := &reqInfo{}
req = req.WithContext(httptrace.WithClientTrace(req.Context(), &httptrace.ClientTrace{
  GotConn: info.GotConn,
  ...
}))
transport.RoundTrip(req)
...
log(info)

This lets me accumulate all info about a single request into a single struct (reqInfo). It would be ideal if I could a similar pattern for server traces. I think we could do this be changing your proposal slightly:

packge http
type Server struct {
  ...
  NewTrace func() *httptrace.ServerTrace   // optional callback to create a trace for a new request
}

One things that's a little strange about it is that the data isn't accessible from the handlers, which I imagine could be a problem if logging code wanted to include additional request-specific data. One possibility might be to pass the http.Request into the hooks, have handlers add values to the req.Context and have an EndHandler hook extract and log there?

That would work. Another idea is to add an opaquely-typed field to ServerTrace that can be set by NewTrace. Perhaps UserData interface{}. Then, the request handler could extract the trace info using httptrace.ContextServerTrace(req.Context()).

tombergan on 14 Feb 2017

@joeshaw It is theoretically possible to dynamically decide, whether to wrap pusher or not:

func wrap(wrap, with http.ResponseWriter) http.ResponseWriter {
    if p, ok := wrap.(http.Pusher); ok {
        return struct {
            http.ResponseWriter
            http.Pusher
        }{ with, p }
    }
    return with
}

The trouble is a) the combinatoric explosion. You have to provide a branch for every combination of interface that could exist. And b) that the wrapping code must know about all possible optional interfaces. You can get around these by providing generic wrapping-code, but that also only solves the issue for all the optional interfaces in net/http itself, not potential third-party packages using the same trick.

This is why I'm so troubled by "optional interfaces" and the fact that they are used in net/http so pervasively. It's a very leaky abstraction and sets a bad example.

Merovius on 27 Feb 2017

@Merovius wow! that is a very interesting hack, and one i hadn't ever considered. I agree that it's not scalable, but thanks for pointing that out.

joeshaw on 27 Feb 2017

@tombergan,

Then, the request handler could extract the trace info using httptrace.ContextServerTrace(req.Context())

Putting things on the context might be interesting. I don't know whether that crosses the line of acceptable uses of context, though. I'm fine with it, but I think others might balk. I think it's better than more optional interfaces at least.

bradfitz on 27 Mar 2017

Context makes some sense for client tracing, in that the client causes it to trace all requests being made, but that's arguably OK. On the server, though, you want to scope it to one request, not all the requests being handled by the server. It's unclear that context can be made to work here.

It sounds like there's not a concrete proposal here at this point and that someone needs to write a design doc that incorporates the feedback and feature discussion above.

Or we can move this issue out of the proposal process and leave it open as a regular issue until someone does want to write a proposal.

rsc on 3 Apr 2017

I think it makes sense to move it out of the proposal process for now, since my original proposal wasn't well-liked.

Using httptrace sounds like a better way to go, but I think more prototyping needs to be done before it's ready to be a formal proposal.

joeshaw on 4 Apr 2017

On the client, we can trace a single request like this:

req = req.WithContext(httptrace.WithClientTrace(req.Context(), &httptrace.ClientTrace{...}))
transport.RoundTrip(req)

On the server, we already create a context for every request. We could add a server hook into that context creation point so that callers can inject a trace object. Something like:

packge http
type Server struct {
  ...
  // A better name is needed.
  UpdateRequestContext func(ctx context.Context) context.Context
}

// server.go:972
ctx, cancelCtx := context.WithCancel(ctx)
if c.server.UpdateRequestContext != nil {
  ctx = c.server.UpdateRequestContext(ctx)
}

package main
...
&http.Server{
  UpdateRequestContext: func(ctx context.Context) context.Context {
    return httptrace.WithServerTrace(req.Context(), &httptrace.ServerTrace{...}))
  },
}

That feels like a direct mirror of how httptrace is used on the client. In practice we may need to move the context creation point into readRequest so that httptrace can capture errors reading the request, but that's an implementation detail.

As a side-effect, this change would allow callers to arbitrarily modify the ctx used for each request. I haven't thought deeply about whether or not this is a bad idea. In practice, I don't imagine it would be useful beyond httptrace, although I'm sure it could be abused.

I also haven't thought much about which methods we'd like on httptrace.ServerTrace -- that's a good question for @joeshaw and others who are hoping to use this feature.

tombergan on 6 Apr 2017

👍1

As a side-effect, this change would allow callers to arbitrarily modify the ctx used for each request.

There was actually a very similar proposal to add such a set-the-initial-context hook earlier, but it didn't go in because we weren't quite sure what the use case would be. But this seems like a good use case.

bradfitz on 6 Apr 2017

👍1

Maybe we should limit the ability to modify the context by doing that behind the scenes in the server code, and have the hook on the http.Server simply return an httptrace.ServerTrace value.

type Server struct {
    // ...
    ServerTrace() *httptrace.ServerTrace
}

// server.go
ctx, cancelCtx := context.WithCancel(ctx)
if c.server.ServerTrace != nil {
    ctx = httptrace.WithServerTrace(ctx, c.server.ServerTrace()) // do we even need to export WithServerTrace?
}

And inside your handler you could still extract the trace from req.Context with a helper function in httptrace if you needed it.

Edit: oh, heh, I see that was your recommendation above, Tom. I think I like the idea of setting a trace more than setting a context value. It's more explicit and doesn't preclude the ability to set an initial context in a later revision.

joeshaw on 6 Apr 2017

I suggested that earlier, but forgot that http cannot import httptrace, so that doesn't work.

tombergan on 6 Apr 2017

The canonical wrapper mentioned by @bradfitz in https://github.com/golang/go/issues/18997#issuecomment-279522713 is actually impossible to do reliably. Not only do you need to consider a host of different combinations, but you also need to check whether the ResponseWriter that you want to wrap already implements those methods. I've had to wrap ResponseWriter's five times so far and the logic can vary depending on the different cases, although there is certainly some overlap. See: 1, 2, 3, 4, and 5. @joeshaw if you restrict yourself to the standard library, the number of combinations isn't too bad; it does, somewhat, scale.

The idea of a httptrace.ServerTrace sounds like a really good solution to this. As for what fields it should capture, the only information I want, so far, is the status code and the response size.

tmthrgd on 15 Apr 2017

How should we best track response size? I don't think we want to expose every write to a ServerTrace call, so maybe it should be gathered up inside the net/http machinery instead.

joeshaw on 21 Apr 2017

👍1

I don't think we want to expose every write to a ServerTrace call

Why not? What if someone wants to scan the response body is some way other than computing the size, such as computing a hash of the response body or tee'ing the response body to a log? Hooking each Write call seems like the most straightforward way to achieve this although I'd be open to other ideas. There should be nearly zero overhead when a Write hook is not installed (just a branch) or very low overhead when a hook is installed (branch + indirect function call).

How about something like:

type ServerTrace struct {
  // Received a bad request (e.g., see errToLarge in net/http/server.go).
  // The ServeHTTP handler will not be called.
  // BadRequestInfo has the status code of the response (the current implementation
  // can return 431 or 400) and perhaps also the response body, which is an error string.
  // This addresses https://github.com/golang/go/issues/18095
  GotBadRequest(BadRequestInfo)

  // Called when receiving a request, just before calling the ServeHTTP handler.
  // RequestInfo would likely include the URL and Headers of the request (with caveats
  // about not mutating those values).
  // This would satisfy https://github.com/golang/go/issues/3344 -- see the linked camlistore code.
  GotRequest(RequestInfo)

  // Called when the handler calls WriteHeader.
  // WriteHeaderInfo includes the status and maybe also the headers (with caveats about
  // not mutating the headers). Or perhaps this is (status, headers) instead of WroteHeaderInfo.
  // This addresses the current bug.
  WroteHeader(WroteHeaderInfo)

  // Called each time the handler calls Write. This is the data fed to the ResponseWriter,
  // e.g., before any transfer encoding. Includes the return values of the Write call.
  // Caveats about mutating data.
  // This addresses the current bug.
  WroteBodyChunk(data []byte, n int, err error)

  // TODO: WroteTrailers?

  // Called when the ServeHTTP handler exits.
  HandlerDone(panicked bool /* maybe? or part of HandlerDoneInfo? */)
}

tombergan on 24 Apr 2017

👍1

I think we're converging on an actual proposal. https://github.com/golang/go/issues/18997#issuecomment-292039169 proposes how to create a ServeTrace object for each request received by the server, and https://github.com/golang/go/issues/18997#issuecomment-296722404 proposes a set of hooks to resolve this bug and two others.

@bradfitz, do you want a more formal proposal or are the above two comments sufficient? I think @joeshaw volunteered to help with the implementation.

tombergan on 24 Apr 2017

SGTM at a high level. I might tweak certain names/docs/signatures, but not much. We can discuss that during the code review too.

bradfitz on 24 Apr 2017

👍

https://github.com/felixge/httpsnoop/blob/master/wrap_generated_gteq_1.8.go
https://github.com/prometheus/client_golang/blob/master/prometheus/promhttp/delegator.go

This needs to stop.

nvartolomei on 12 Jul 2017

😕1 👍1

@joeshaw Are you still interested in helping with the implementation? We could start reviewing the implementation at any time, although we shouldn't submit anything until the Go 1.10 window opens (in August I believe).

tombergan on 12 Jul 2017

@tombergan I am at GopherCon this week and hoping to spend some time on this.

joeshaw on 12 Jul 2017

http cannot import httptrace, so that doesn't work.

Why not? Other than example_test.go, nothing in httptrace imports http so a circular dependency isn't an issue.

I could see us wanting to make the http.Request (perhaps a copy of it?) available in the GotRequest trace hook, though.

joeshaw on 12 Jul 2017

net/http already imports httptrace.

bradfitz on 12 Jul 2017

Here's a sketch of how you might use it in practice:

type requestInfo struct {
    start time.Time
    path string
    statusCode int
    h hash.Hash
}

s := &http.Server{
    UpdateRequestContext: traceRequest,
}

func traceRequest(ctx context.Context) context.Context {
    info := requestInfo{
        h: sha1.New()
    }

    trace := &httptrace.ServerTrace{
        GotRequest: func(gri httptrace.GotRequestInfo) {
            info.start = time.Now()
            info.path = gri.URL.Path
        },
        WroteHeader: func(whi httptrace.WroteHeaderInfo) {
            info.statusCode = whi.StatusCode
        },
        WroteBodyChunk: func(data []byte, n int, err error) {
            if err != nil {
                // ???
            }

            info.h.Write(data)
        },
        HandlerDone: func(hdi HandlerDoneInfo) {
            log.Printf(
                "path=%s status=%d duration=%s hash=%x",
                info.path,
                info.status,
                time.Since(info.start),
                info.h.Sum(nil),
            )
        },
    }

    return httptrace.WithServerTrace(ctx, trace)
}

I'm still not 100% convinced that tying this in with an UpdateRequestContext function is the right way to go, especially if we can't come up with other circumstances to change the request context. A ServerTrace-specific function on the Server seems cleaner to me.

joeshaw on 13 Jul 2017

For my specific use-case(s), the trace would be added by a middleware. Actually, multiple tracing hooks by different middlewares.

Unfortunately this will leave outside GotBadRequest and GotRequest hooks proposed by @tombergan.

nvartolomei on 13 Jul 2017

After discussing use cases last night, I've flip-flopped on using context for that. And that's because you might want to inject things into the context and use them both in the server trace hooks and middleware/handlers. Building on my example from earlier:

type requestInfo struct {
    start time.Time
    path string
    statusCode int
    h hash.Hash
    size int64
}

s := &http.Server{
    UpdateRequestContext: traceRequest,
}

func traceRequest(ctx context.Context) context.Context {
    reqID := generateRequestID()
    ctx = withRequestID(ctx, reqID)

    info := requestInfo{
        h: sha1.New()
    }

    trace := &httptrace.ServerTrace{
        GotRequest: func(gri httptrace.GotRequestInfo) {
            info.start = time.Now()
            info.path = gri.URL.Path
        },
        WroteHeader: func(whi httptrace.WroteHeaderInfo) {
            info.statusCode = whi.StatusCode
        },
        WroteBodyChunk: func(data []byte, n int, err error) {
            if err != nil {
                // ???
            }

            info.h.Write(data)
            info.size += n
        },
        HandlerDone: func(hdi httptrace.HandlerDoneInfo) {
            log.Printf(
                "req-id=%s path=%s status=%d duration=%s size=%d hash=%x",
                reqID,
                info.path,
                info.status,
                time.Since(info.start),
                info.size,
                info.h.Sum(nil),
            )
        },
    }

    return httptrace.WithServerTrace(ctx, trace)
}

This way the request ID is not only logged, but it's passed into the context so handlers can extract it and log it, pass it to other services, etc.

joeshaw on 13 Jul 2017

👍1

For my specific use-case(s), the trace would be added by a middleware. Actually, multiple tracing hooks by different middlewares.

Could you elaborate a little more on this? Are you thinking that the trace would be added in the execution of middleware? Or that setting up the middleware would also attach something to the server?

The reason why I ask is that this hook function is on the http.Server and called only once per request -- it won't be possible to attach to it in the middle of middleware _execution_. For things that the middleware might want to attach, using the provided http.Request WithContext() method is a better approach.

joeshaw on 13 Jul 2017

http cannot import httptrace, so that doesn't work.
Why not? Other than example_test.go, nothing in httptrace imports http so a circular dependency isn't an issue.

You're right. I have no idea what I was thinking. Looking forward to the CL!

tombergan on 13 Jul 2017

Now that we're looking at a partial implementation, I'm not sure that getting all three of data []byte, n int, err error is the right move for WroteBodyChunk. I think that getting both len(byte) and n is confusing in cases where they differ, and it makes it harder to correctly write the two use cases we're considering:

1) Compute a hash of the response body (would need to slice byte[:n]).
2) Calculate the response length (would be a mistake to look at byte).

So maybe what WroteBodyChunk should receive is the slice of bytes that was successfully written.

That made me wonder whether it also needs to get the write error, especially since that error is already being delivered to the application. I feel like it's really weird to get the same error twice. Countering that, @joeshaw suggests you might want to (as one example) do something different with the response hash if the connection was closed by the client.

pteichman on 14 Jul 2017

👍3

That made me wonder whether it also needs to get the write error, especially since that error is already being delivered to the application. I feel like it's really weird to get the same error twice. Countering that, @joeshaw suggests you might want to (as one example) do something different with the response hash if the connection was closed by the client.

Just to clarify, I think checking for an error is helpful in that you could, for example, not log a confusing partial hash value. But I think the most value in getting the write error is to give one the ability to log it in the context of the rest of the request info.

joeshaw on 14 Jul 2017

Panicked is also really appealing in HandlerDone but given that the handler recover() is scoped to the whole connection it doesn't really make sense unless we add a recover() to each ServeHTTP call.. And the application doesn't need any trace support to do that already.

pteichman on 14 Jul 2017

So maybe what WroteBodyChunk should receive is the slice of bytes that was successfully written.

This SGTM.

given that the handler recover() is scoped to the whole connection

This is not entirely true. In HTTP/2, the recover() wraps ServeHTTP:
https://github.com/golang/net/blob/master/http2/server.go#L2037

In HTTP/1, recover() does wrap the connection, but only because of the nature of HTTP/1 -- it's impossible to reuse a connection in HTTP/1 unless the prior request/response was completed exactly, but after a panic, we don't know the state of the connection and must assume it cannot be reused.
https://github.com/golang/go/blob/master/src/net/http/server.go#L1690

That said, the godoc for Handler actually says that a panic will hang up the connection, which is not what's implemented in HTTP/2. Perhaps a godoc update is in order?
https://golang.org/pkg/net/http/#Handler

/cc @bradfitz

tombergan on 14 Jul 2017

Sent you CL 48551 for doc update.

bradfitz on 14 Jul 2017

CL https://golang.org/cl/48551 mentions this issue.

gopherbot on 14 Jul 2017

I have a counterproposal that might resolve some of the root issues.

It sounds like the necessary information can be obtained by wrapping the http.ResponseWriter .. the problem is that wrapping http.ResponseWriter implementations is impractical because new features are exposed by adding methods to the underlying value, so a particular http.ResponseWriter might additionally implement http.Flusher or http.Pusher. A wrapper would need to implement exactly the right set of methods at compile time, since the success or failure of the type assertion is often the only way to know if a particular feature (like flushing) is supported. Furthermore, these wrappers need to be kept up to date with changes to Go to support new features (like http/2 push).

My proposal is to add one last method, which if present will return a struct. The struct would have a field for each optional feature, which could be left nil or set to a function value.

package http // import "net/http"

type ResponseOptionser interface {
  ResponseOptions() *ResponseOptions
}

type ResponseOptions struct {
  CloseNotify func() <-chan bool
  Flush       func()
  Hijack      func() (net.Conn, *bufio.ReadWriter, error)
  Push        func(target string, opts *PushOptions) error
  WriterTo    io.WriterTo
}

This would not allow adding new behaviors outside of net/http, since each feature would need a field on this struct. Is it common to add behaviors in non-core packages?

An alternative that allows for that is to attach the new behaviors as Values on the http.Request's Context, in the same way that *http.Server is currently available:

func myHandler(w http.ResponseWriter, r *http.Request) {
  f, ok := r.Context().Value(http.FlusherContextKey).(http.Flusher)
  if ok {
    f.Flush()
  }
}

Experience report:

I maintain a closed-source package for distributed request tracing that works for Go http clients and servers. I've implemented it as a http.Handler wrapper that in turn wraps the http.ResponseWriter.

It reads some of the inbound http headers to get the request id and places that in the http.Request's Context. When calling the underlying http.Handler, it passes a copy of the *http.Request that has a Context including some additional Values. As soon as its ServeHTTP method is called, it checks the current time and emits an instrumentation event including the timestamp, request id, and portions of the url and headers. When an http.Handler panics, it prints the stack trace with the request id prefixed to each line. On a call to WriteHeader (or the first call to Write), it emits an instrumentation event including the status code. It also counts the number of bytes sent in the response (measured by the arguments to calls to Write).

It used to do a bunch more, before Go 1.7 included context and before Go 1.8 fixed the lifespan of the Context to match the request (vs matching the lifespan of the call to ServeHTTP).

rhysh on 15 Jul 2017

👎5

@rhysh This is an interesting proposal. There are a few things I would tweak about it, like returning interface values for the individual items in the ResponseOptions struct. (ie, Flush func() would be Flusher http.Flusher, but in both cases the values would be nil if the interface weren't implemented.)

There are a few things I like about it:

It makes it explicit what interfaces the ResponseWriter implementation supports.
ResponseWriter wrappers need _only_ to implement the ResponseWriter and ResponseOptionser interfaces.

The things I am not sure how I feel about:

It does not address the very widespread use of ResponseWriter wrappers for common problems like storing status codes, calculating response sizes, or handler durations. (More on this below.)
It effectively deprecates type asserting the ResponseWriter to other interfaces, except ResponseOptionser. In other words, in a handler it's no longer sufficient to do:

if p, ok := rw.(http.Pusher); ok {
    p.Push(...)
}

which is a very standard, idiomatic thing to do. (It would still succeed much of the time, since the underlying private ResponseWriter implementation would support it, but wrappers would not.)

Instead, you have to do:

if ro, ok := rw.(http.ResponseOptionser); ok {
    if ro.ResponseOptions().Pusher != nil {
        ro.ResponseOptions().Pusher.Push(...)
    }
}

which is a little longer and less natural.

Back to the proliferation of ResponseWriter wrappers. Is this a real problem?

It's likely impossible to eliminate the need for ResponseWriter wrappers entirely in all cases. But the vast majority of the duplication out there are to track status codes and, to a lesser extent, things like timings and response body sizes. Perhaps the standard library should provide these in a new State method and interface as @bradfitz first suggested?

This counterproposal doesn't address the proliferation of wrappers, whereas the ServerTrace one likely would.

Unless people are vehemently opposed to this, I may take a stab at it.

joeshaw on 16 Jul 2017

Back to the ServerTrace implementation, the separation of http and httptrace packages is really making the usability of ServerTrace ugly because of circular dependencies. GotRequest would really benefit from having access to http.Request, but instead most of the fields on it have to duplicated. Same for http.Header. It makes me worried about the long-term viability of doing it this way.

To see @pteichman's and my progress on ServerTrace thus far: https://github.com/pteichman/go/tree/servertrace

joeshaw on 16 Jul 2017

An alternative would be, to enable reflect to dynamically create methods (via #15924 and #16522). That way, people who want to create wrappers with optional interfaces could do so, e.g. by

func wrap(res http.ResponseWriter, statusCallback func(int)) http.ResponseWriter {
    st := reflect.StructOf{
        reflect.StructField{
            Type: reflect.TypeOf(res),
            Anonymous: true,
        },
    }
    nt := reflect.NamedOf(st, "wrapped", []reflect.Method{
        Name: "WriteHeader",
        Type: reflect.TypeOf(func(interface{}, int) {}),
        Func: reflect.ValueOf(func(_ interface{}, code int) { statusCallback(code) }),
    }
    v := nt.New()
    v.Field(0).Set(reflect.ValueOf(st))
    return v.Interface().(http.ResponseWriter)
}

Not terribly ergonomic (and I papered over the detail of how NamedOf would work and what the argument of that function would be, given that the struct-type is dynamically created); but it would work.

Merovius on 16 Jul 2017

👎4 👍1

Stepping back a bit, the reason we originally added httptrace.ClientTrace was to expose info that could not be tracked by simply wrapping a RoundTripper. If we focus solely on better ways to wrap a ResponseWriter in this issue, then we lose sight of exposing info that is not available to ResponseWriter. For example, see #18095, which was dup'd into this issue. That issue is about logging bad requests, which we _cannot_ do with ResponseWriter wrappers because bad requests are never forwarded to request handlers in the first place. Further, we might imagine adding HTTP/2 specific hooks in the future, or hooks to get the exact headers written on the wire, which can change slightly compared to what the caller passes to WriteHeader.

There is also #16220, which was not very well motivated in isolation, but would be fixed as a side-effect of adding UpdateRequestContext.

There is also #20956, which is yet another example of people trying to get at the underlying net.Conn (especially see this comment).

It used to do a bunch more, before Go 1.7 included context and before Go 1.8 fixed the lifespan of the Context to match the request (vs matching the lifespan of the call to ServeHTTP).

@rhysh, can you say more about this? It sounds like you're currently unable to do something that you'd like to do as a result of the Context change.

But the vast majority of the duplication out there are to track status codes and, to a lesser extent, things like timings and response body sizes. Perhaps the standard library should provide these in a new State method and interface as @bradfitz first suggested?

I am strongly opposed to this. There is no way the standard library will cover all kinds of stats that every user wants to track. Rather than getting ourselves into the business of tracking stats, we should provide APIs that allow users to track whatever stats they need.

tombergan on 17 Jul 2017

👍3

can you say more about this? It sounds like you're currently unable to do something that you'd like to do as a result of the Context change.

On the contrary: Many of the features that had to be implemented from scratch by that internal package are now done by the standard library. For instance it used to use http.CloseNotifier to learn that the client had disconnected prematurely; as of Go 1.8, the request-scoped Context now provides that signal.

As of Go 1.8, Context in net/http has been working well. The focus of the internal package has narrowed, thanks to improvements in the standard library.

httptrace provides a lot of information about the request lifecycle and very little of it overlaps with information available to http.RoundTripper implementations.

The partial overlap I've encountered is that my internal package would additionally wrap the request body to learn when the full request had been written. This forced chunked transfers, which unfortunately didn't work with HTTP/1.0 servers, but otherwise worked well enough. After the addition of httptrace, I was able to use the WroteRequest hook to get that signal.

Because there's so little overlap, what are the problem statements that this issue should address? Here are some possibilities:

It's not possible [for *http.Server / http.Handler code] to learn about invalid/malformed requests
It's not possible to measure how long it took to read the request headers, from first byte to full headers received
It's not possible to measure how long a client's connection was idle before it sent the first byte of headers, or to determine if the connection was reused from a previous (or concurrent http/2) request.
Measuring the status code returned by a handler is inconvenient
Measuring the wall clock time at which the status code was set is inconvenient
Writing code to measure the status code returned by a handler is likely to break features like http/2 push or sendfile (for http/1.1).

The first three seem like they'd be addressed well with a server-side equivalent to the current httptrace.ClientTrace hooks. The later three can be done by wrapping http.ResponseWriter values as middleware, but it's hard currently too hard to do well.

rhysh on 19 Jul 2017

That is a nice summary. I think all six of your listed use cases could be handled by httptrace.ServerTrace. For 4 & 5, we'd pass through the status code (and possibly headers) from the ResponseWriter to the ServerTrace. 6 should "just work" since ServerTrace doesn't wrap the ResponseWriter.

tombergan on 19 Jul 2017

I think all six of your listed use cases could be handled by httptrace.ServerTrace
It _could_ do any of them. Where should it focus? Where should the discussion focus?

httptrace.ClientTrace doesn't allow getting the status code; that's trivial to do in a http.RoundTripper. But it also doesn't allow getting the sizes of the request body or the response body—both possible, but inconvenient, to get from a http.RoundTripper wrapper.

ClientTrace focused on giving access to net/http internals that it alone could provide.

In the current implementation draft of ServerTrace, the GotBadRequest hook gives access to information that would not otherwise be available to http.Handler authors. This looks like a good way to give access to that information.

The WroteHeader and WroteBodyChunk callbacks give access to data that http.Handler authors already have. It's not very convenient access because of the pattern of adding secret bonus methods on http.ResponseWriters, but it's access.

Should ServerTrace have a callback for when Flush is called and for when http/2 push is used? If sendfile(2) is activated, should a hook be called at the beginning/middle/end of that transfer?

I think it would be helpful to split the discussion into 1) information present in the internals of net/http that is not exposed to users, and 2) information that users have, but need better access to.

Use of the http.Server.UpdateRequestContext field looks clunky. It's called before the http.Request is available, meaning that users would need to coordinate between the code that runs during that hook and a top-level http.Handler in order to match the ServerTrace hooks to an http.Request.

Since it's attached to the http.Server, it can't be used by middlewares. It's set once at the top level by the owner of the service, making it inaccessible to the maintainers of middleware packages. They're likely to continue wrapping http.ResponseWriters, since that would be the only way the code they control is able to access the data they need.

I have a proposal for an alternate design, if we split the functionality more or less into the two types of data I identified above:

1) A new field on http.Server points to a struct that contains the GotBadRequest hook (and maybe some others). Consider this logically as a structured extension of the http.Server.ErrorLog field.

2) A function in net/http that takes the Context value from a http.Request (or a *http.Request directly) and something like a httptrace.ServerTrace struct, registering it for callbacks as the request is served. Any layer of http.Handler middleware could call the function, just as any layer of http.RoundTripper middleware can with ClientTrace.

(The function wouldn't return a Context like the ClientTrace equivalent does; the http.Server implementation would use a private context key to store and access a function of type func(httptrace.ServerTrace). The net/http package's function would extract and call it.)

rhysh on 20 Jul 2017

👎2 👍1

[UpdateRequestContext is] called before the http.Request is available, meaning that users would need to coordinate between the code that runs during that hook and a top-level http.Handler in order to match the ServerTrace hooks to an http.Request.

Not sure I followed? The hooks will be available via http.Request.Context(). A pattern like the following should work:

s := &http.Server{
  UpdateRequestContext: func(ctx context.Context) context.Context {
    sh := newServerHooks()  // a user-defined struct
    ctx = context.WithValue(ctx, key, sh)
    return httptrace.WithServerTrace(ctx, &httptrace.ServerTrace{
      GotRequest: sh.GotRequest,
      ...
    })
}

The handler can then use ctx.Value(key) to extract sh.

Since [UpdateRequestContext is] attached to the http.Server, it can't be used by middlewares. It's set once at the top level by the owner of the service, making it inaccessible to the maintainers of middleware packages.

The middleware needs some entry point in order to apply its http.ResponseWriter wrapper. Couldn't the middleware use a similar entry point to install its UpdateRequestContext wrapper? I am admittedly not familiar with many such libraries (aside from the few I've written myself for internal use), so I am very interested in hearing about libraries where UpdateRequestContext will be difficult to use.

A function in net/http that takes the Context value from a http.Request ...

Can you explain this with a pseudocode example? I did not follow.

A new field on http.Server points to a struct that contains the GotBadRequest hook ...
I think it would be helpful to split the discussion into 1) information present in the internals of net/http that is not exposed to users, and 2) information that users have, but need better access to.

That is a useful distinction for this discussion, however, my main concern is API clutter. I would like to minimize the API surface and avoid cluttering http.Server with callback fields, if possible. I fear we are heading towards API clutter unless the new functions/callbacks/fields we add are sufficiently general.

I am promoting two APIs: (1) Wrapping http.ResponseWriter. This will be the canonical way to affect how the response is produced. (2) UpdateRequestContext. This will be the canonical way to get a read-only view of each request for logging. There may need to be a long list of read-only hooks, but at least those hooks will be hidden in net/http/httptrace rather than cluttered in net/http.

tombergan on 8 Aug 2017

Considering the case where a user wants to log the status code of all responses, here are three possible approaches:

1) No changes to net/http. Logging package authors wrap the http.ResponseWriter, providing a function with signature func(inner http.Handler) http.Handler. Application owners can call it any time they have a http.Handler, sometime before setting their http.Server.Handler field.

2) Add a field to http.Server, UpdateRequestContext. Logging package authors write two pieces of code: their usual middleware (to access the URI path from the http.Request), and a function to add to the UpdateRequestContext field (to access the status code). Application owners call one of the functions from their http.Handler, and the other where they set up the http.Server (which may be in a different file or package).

3) Add a function in http designed to be called from an http.Handler. Logging package authors convert their http.ResponseWriter wrappers to use the new function with no change in the API they provide to users. Applications owners can call it any time they have a http.Handler, sometime before setting their http.Server.Handler field.

The shortcoming of (1) is that it blocks access to additional methods on the http.ResponseWriter. The shortcoming of (2) is that it requires users to coordinate between two different parts of the net/http API. Users who only have access to an http.Handler cannot use it (and would be forced to use the existing option 1).

Code for each follows. Does example 2 match the API design you have in mind? Can you give a more concise demonstration of correct use of the new field?

1) No changes to net/http:

type statusRecorder struct {
    status int
    w      http.ResponseWriter
}

func (srw *statusRecorder) Header() Header {
    return srw.w.Header()
}

func (srw *statusRecorder) Write(p []byte) (int, error) {
    if srw.status == 0 {
        srw.status = http.StatusOK
    }
    return srw.w.Write(p)
}

func (srw *statusRecorder) WriteHeader(status int) {
    srw.status = status
    srw.w.WriteHeader(status)
}

func logStatusMiddleware_ResponseWriter(inner http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        srw := &statusRecorder{w: w}

        defer func() {
            log.Printf("path=%q status=%d", r.URL.Path, srw.status)
        }()

        inner.ServeHTTP(srw, r)
    })
}

2) Add a field to http.Server, UpdateRequestContext:

func logStatusMiddleware_AndServerUpdate(inner http.Handler, srv *http.Server) http.Handler {
    ctxKey := new(int)

    prevFn := srv.UpdateRequestContext
    newFn := func(ctx context.Context) context.Context {
        if prevFn != nil {
            ctx = prevFn(ctx)
        }
        var (
            mu   sync.Mutex
            info httptrace.WroteHeaderInfo
        )
        ctx = httptrace.WithServerTrace(ctx, &httptrace.ServerTrace{
            WroteHeader: func(i httptrace.WroteHeaderInfo) {
                mu.Lock()
                defer mu.Unlock()
                info = i
            },
        })
        ctx = context.WithValue(ctx, ctxKey, func() httptrace.WroteHeaderInfo {
            mu.Lock()
            defer mu.Unlock()
            return info
        })
    }
    srv.UpdateRequestContext = newFn

    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        defer func() {
            info := r.Context().Value(ctxKey).(func() httptrace.WroteHeaderInfo)()
            log.Printf("path=%q status=%d", r.URL.Path, info.StatusCode)
        }()

        inner.ServeHTTP(w, r)
    })
}

3) Add a function in http designed to be called from an http.Handler:

func logStatusMiddleware(inner http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        http.AddServerTrace(r.Context(), &httptrace.ServerTrace{
            WroteHeader: func(info httptrace.WroteHeaderInfo) {
                log.Printf("path=%q status=%d", r.URL.Path, info.StatusCode)
            },
        })

        inner.ServeHTTP(w, r)
    })
}

rhysh on 3 Oct 2017

If I'm reading this correctly, ServerTrace proposal doesn't allow modification of the response body prior to writing. Therefore it will not allow things like compressing the response body prior to it being sent, for example.

Presumably, the ServerTrace struct will need to define another handler: WillWriteBody(body []byte, header http.Header) []byte

Such a handler would need to receive the body before it's being written, as well as the headers, if there's a need to modify them. And it should return the new body for actual writing. I assume that the request headers are already somehow available from within that handler already, since they also usually play a part here.

urandom on 4 Oct 2017

👎2

You make a good point @urandom , that packages like https://github.com/NYTimes/gziphandler are currently implemented by wrapping the ResponseWriter with special behavior for Write and Flush. If packages like that one continue to wrap the ResponseWriter, new features exposed as methods will continue to be shadowed.

That package also has special behavior for the Push method: it ensures that the synthetic request will have the "Accept-Encoding: gzip" header. And, the handler disables connection hijacking. It needs to intercept several methods in order to change their behavior.

Will the ServerTrace struct expand to allow modifying all additional ResponseWriter methods? When new behaviors are added to the ResponseWriter, will they be added to the ServerTrace struct as well? That seems like a lot of API duplication.

rhysh on 6 Oct 2017

Looks like this didn't happen before the Go 1.10 freeze. Bumping to Go 1.11.

bradfitz on 1 Nov 2017

Another data point of a package which has to deal with the combinatorial explosion of optional interfaces:
https://github.com/prometheus/client_golang/blob/1cdba8fdde81f444626530cce66e138d2f68ae1c/prometheus/promhttp/delegator.go#L100

stapelberg on 24 Nov 2017

I have an alternative proposal for avoiding the combinatorial explosion of optional interfaces. It is inspired by the causer interface from github.com/pkg/errors (interface{ Cause() error }). The stdlib documents that if you wrap a http.ResponseWriter you should also implement this interface

type Wrapper interface {
    Wrap() http.ResponseWriter
}

Then for example, if we want to get at the http.Hijacker we can use code like this:

func GetHijacker(w http.ResponseWriter) http.Hijacker {
    for w != nil {
        if h, ok := w.(http.Hijacker); ok {
            return h
        }
        wrapper, ok := w.(http.Wrapper)
        if !ok {
            break
        }
        w = wrapper.Wrap()
    }
    return nil
}

This doesn't solve the problem of getting at additional server event data that ServerTrace would, but would help with the original problem proposed in this issue and the issues with wrapping ResponseWriter in general.