Go-ipfs: commands: Rework `Response` interface (also affects `Command`)

Created on 19 Dec 2016 · 22 comments · Source: ipfs/go-ipfs

The Response type does not allow access to the underlying transport, even though that would be quite important, e.g. for flushing. Right now there is also code that works hard to copy the output of a command into the transport. If we just passed the transport to the response and made it write-only from the command side, things would be much easier (and we could actually flush properly). That is what this issue aims at.

At the moment the Response interface looks like this:

type Response interface {
    Request() Request

    // Set/Return the response Error
    SetError(err error, code ErrorType)
    Error() *Error

    // Sets/Returns the response value
    SetOutput(interface{})
    Output() interface{}

    // Sets/Returns the length of the output
    SetLength(uint64)
    Length() uint64

    // underlying http connections need to be cleaned up, this is for that
    Close() error
    SetCloser(io.Closer)

    // Marshal marshals out the response into a buffer. It uses the EncodingType
    // on the Request to choose a Marshaler (Codec).
    Marshal() (io.Reader, error)

    // Gets an io.Reader that reads the marshalled output
    Reader() (io.Reader, error)

    // Gets Stdout and Stderr, for writing to console without using SetOutput
    Stdout() io.Writer
    Stderr() io.Writer
}

This means it is both read from and written to, which is not very nice. Also, there is only one concrete type implementing this interface, which is an indicator that the code is not very clean.

The Go stdlib has a very prominent example of how to do handlers: net/http. Here, a handler is passed a request and a response writer. I propose to use a similar approach.
Currently the first thing Command.Call(req) does is create a Response. I propose to instead create a ResponseEmitter beforehand and pass that to Call. A ResponseEmitter could look like this:

type ResponseEmitter interface {
    // closes http conn or channel
    io.Closer

    // Set/Return the response Error
    //SetError(err error, code ErrorType)
    // Automatically convert err to error if it isn't already.
    // Doing this manually is a hassle.
    SetError(err interface{}, code ErrorType)

    // Gets Stdout and Stderr, for writing to console without using SetOutput
    Stdout() io.Writer
    Stderr() io.Writer

    // send value
    // if the value is an io.Reader, we just copy it to the connection
    // other values are marshalled
    Emit(value interface{}) error
}

There will be concrete types for each underlying transport (e.g. http.ResponseWriter or io.Writer). The type for http requests will also have functions like Flush or SetLength.
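
For illustration, a minimal writer-backed emitter could look roughly like this. This is just a sketch: the writerResponseEmitter name and the enc field are assumptions, and SetError/Stdout/Stderr are left out for brevity.

import (
    "encoding/json"
    "io"
)

// Sketch of a ResponseEmitter backed by a plain io.WriteCloser.
type writerResponseEmitter struct {
    w   io.WriteCloser
    enc *json.Encoder // assumed: picked according to the request's encoding option
}

func (re *writerResponseEmitter) Close() error {
    return re.w.Close()
}

func (re *writerResponseEmitter) Emit(v interface{}) error {
    // io.Readers are copied to the transport verbatim, everything else is marshalled
    if r, ok := v.(io.Reader); ok {
        _, err := io.Copy(re.w, r)
        return err
    }
    return re.enc.Encode(v)
}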

There are only a few places where cmd.Call is called: the http handler, cmd/ipfs, and some tests, so in that respect we don't need to change that much.
NewResponse is also not called very often: in cmd.Call(), in the http client and in a test. So again, not that much to do.

The most work will be to get the ResponseEmitter working and apply changes in core/commands to use the new interface - but even that should be manageable.

So the concrete changes I propose are (a rough sketch of the resulting call-site wiring follows the list):

  • use func (c *Command) Call(req Request, resem ResponseEmitter) instead of func (c *Command) Call(req Request) Response.
  • use func NewResponseEmitter(req Request, w io.WriteCloser) ResponseEmitter instead of func NewResponse(req Request) Response
  • move that call out of Call() and pass the resulting ResponseEmitter to it.
  • remove the pass-empty-object-as-a-flushing-request from PubsubSubCmd.Run and instead type-assert to HTTPResponseEmitter, which has a Flush method
  • remove flushCopy and all that stuff
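
To make this concrete, the call-site wiring from the http handler's point of view could look roughly like this. This is only a sketch; parseRequest, newHTTPResponseEmitter and the handler struct are hypothetical names for illustration, not existing code.

import "net/http"

type handler struct {
    root *Command
}

func (h *handler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
    // build a cmds Request from the http request, roughly as the handler does today
    req, err := parseRequest(r)
    if err != nil {
        http.Error(w, err.Error(), http.StatusBadRequest)
        return
    }

    // wrap the transport in an emitter and hand it to Call;
    // the command emits its output instead of returning a Response
    re := newHTTPResponseEmitter(req, w)
    defer re.Close()

    h.root.Call(req, re)
}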

Questions

When I mentioned this before, people said that sometimes the output of a command is used directly. When is that the case? I couldn't find anything regarding that. Anyway, we could make a chanResponseEmitter that sends values passed to Emit to a channel. No problemo.

Are you okay with the general idea here? I'm happy to make changes to the proposal, but I really think the idea of passing the output to NewResponseEmitter and then passing that to cmd.Call is the way to go.


Yesterday Lars and I met and talked about this. Here is a gist that resulted from that session: https://gist.github.com/keks/450b31fd2032ffc65a01fc911e70b77d.

Phew, long issue. I hope you're still reading :) If you like this I can do it in January.
Looking forward to your comments!


All 22 comments

bump @whyrusleeping @jbenet

sorry for bumping all the time @whyrusleeping

@keks thanks for being persistent about getting review on this :)

Overall I like the idea. My only hesitation is that this will be a very large changeset (if I understand correctly, it will touch every single command). To get around that, we could temporarily make the run function on the command objects an interface, and then type-check it to either the old-style function or the new-style function behind the scenes (similar to how urfave/cli does it here).

This will allow us to make the changes we want, and then upgrade things incrementally to the new code (while using the new code for any new commands we write).

One thing to also consider (which might be a completely separate issue) is how this would interact with future commands we want to do, like ipfs pipe, where there is a bidirectional stream of input and output. I think this would work fine with something like that, but I just want to make sure it's at least thought about.

Final bit: while we're changing the interface, there's this little thing I've wanted to change for SOOOO long, and it's to make the SetError method accept an interface{} instead of an error, then internally create an error type from the input if it's not already an error. There are soooooo many places where we do: res.SetError(fmt.Errorf("Some String"), cmds.ErrNormal) when it would be just that much more pleasant to write: res.SetError("Some String", cmds.ErrNormal).

Thanks for the great writeup! I think we can start moving forward with this soon :)

Overall I like the idea. My only hesitation is that this will be a very large changeset (if I understand correctly, it will touch every single command). To get around that, we could temporarily make the run function on the command objects an interface, and then type-check it to either the old-style function or the new-style function behind the scenes (similar to how urfave/cli does it here).

This will allow us to make the changes we want, and then upgrade things incrementally to the new code (while using the new code for any new commands we write).

Yeah that would work. I'd be fine doing that even though I'm not sure about the performance impact of type switches.
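
For concreteness, the behind-the-scenes dispatch could look roughly like this. Just a sketch: NewResponse exists today, but the copyResponseToEmitter shim and the exact function signatures are assumptions.

import "fmt"

// Sketch of dispatching on an interface{}-typed Run field, similar to urfave/cli.
func callRun(run interface{}, req Request, re ResponseEmitter) {
    switch f := run.(type) {
    case func(Request, Response):
        // old-style command: build a Response, run it, then copy the output to the emitter
        res := NewResponse(req)
        f(req, res)
        copyResponseToEmitter(re, res) // assumed shim
    case func(Request, ResponseEmitter):
        // new-style command: emits directly
        f(req, re)
    default:
        re.SetError(fmt.Errorf("unsupported Run function type %T", run), ErrNormal)
    }
}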

One thing to also consider (which might be a completely separate issue) is how this would interact with future commands we want to do, like ipfs pipe, where there is a bidirectional stream of input and output. I think this would work fine with something like that, but I just want to make sure it's at least thought about.

This totally should work, as long as you use multipart requests. Just keep reading the multipart reader and emit to the ResponseEmitter. New feature: you can flush in between!
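
A rough sketch of such a read-and-emit loop, assuming the handler hands the command a *multipart.Reader (the runPipe name and the signature are made up for illustration):

import (
    "io"
    "mime/multipart"
)

// Sketch: stream each incoming multipart part straight back out through the emitter.
func runPipe(req Request, re ResponseEmitter, mr *multipart.Reader) {
    for {
        part, err := mr.NextPart()
        if err == io.EOF {
            return
        }
        if err != nil {
            re.SetError(err, ErrNormal)
            return
        }
        // a part is an io.Reader, so Emit copies it to the transport directly
        if err := re.Emit(part); err != nil {
            return
        }
    }
}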

Final bit: while we're changing the interface, there's this little thing I've wanted to change for SOOOO long, and it's to make the SetError method accept an interface{} instead of an error, then internally create an error type from the input if it's not already an error. There are soooooo many places where we do: res.SetError(fmt.Errorf("Some String"), cmds.ErrNormal) when it would be just that much more pleasant to write: res.SetError("Some String", cmds.ErrNormal).

Yeah that makes sense. I'll update the initial proposal so I don't forget that.
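
The conversion inside the emitter would be small; roughly something like this (a sketch, assuming the existing Error struct with Message and Code fields):

import "fmt"

// Sketch: normalize whatever was passed to SetError into the existing Error type.
func toError(err interface{}, code ErrorType) *Error {
    switch e := err.(type) {
    case *Error:
        return e
    case error:
        return &Error{Message: e.Error(), Code: code}
    default:
        return &Error{Message: fmt.Sprint(e), Code: code}
    }
}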

performance impact of type switches is negligible, especially on user facing code (network latencies and process spawning times VASTLY exceed this cost).

Swapping out my thoughts about this [here].

Thanks for pushing for this 👍 commands lib needs much love ❤️ (Or much fire 🔥 to spring anew 🌱)

I'm still digesting the proposed changes.

Some immediate warnings:

  • making sure everything checks out the same way will be non-trivial. I think the commands are undertested, and thus changing them may break functionality. I would suggest that adding tests for weird edge-case conditions in the use of the API would be an important first step (useful in its own right, and it enables a much safer/more comfortable refactor)
  • reminder that we want this to work over non-HTTP things (like RPC) as well. I think this is super clear and the proposed change here doesn't change that, but it's something to keep in mind before my next point.

Let me propose something even more radical:

  • the core/commands stuff right now contains a lot of the implementation of core API stuff. This really doesn't belong in "commands"; instead the functionality should all be pushed down into the core node functionality (i.e. a set of functions called on a node, not functions responding to a command), and then the core/commands call should just be the glue between the commands lib and that nice API. We need to rework this.
  • The commands lib is a bit of a disaster. We should consider rewriting it with better interfaces, cleaner functionality, and easier usage. (right now it's super cumbersome).

    • It's still the case that our commands lib does a lot of very valuable stuff that other commands libs do not, and I would really like to use it for other tools, but it's just SUCH a pain in its present form... :(

    • we should redesign it with UX in mind, and extract it out of go-ipfs into its own package.

This may be much more than you bargained for @keks -- but what do you think?

Thanks @jbenet!

making sure everything checks out the same way will be non-trivial. I think the commands are undertested, and thus changing them may break functionality. I would suggest that adding tests for weird edge-case conditions in the use of the API would be an important first step (useful in its own right, and it enables a much safer/more comfortable refactor)

Good point. I am thinking about how best to implement the legacy API layer. I have a pretty concrete idea, but it seems quite cumbersome and it looks like it's easy to mess things up. The problem is that we will face this problem even when we rewrite and rip out the commands lib.

reminder that we want this to work over non-HTTP things (like RPC) as well. I think this is super clear and the proposed change here doesn't change that, but it's something to keep in mind before my next point.

totally!

the core/commands stuff right now contains a lot of the implementation of core API stuff. This really doesn't belong in "commands"; instead the functionality should all be pushed down into the core node functionality (i.e. a set of functions called on a node, not functions responding to a command), and then the core/commands call should just be the glue between the commands lib and that nice API. We need to rework this.

I think @lgierth started working on that, right? Maybe those commands could be the first ones in core/commands that we update to the new API as they should be the easiest? Together with pubsub because of that flushing hack...

The commands lib is a bit of a disaster. We should consider rewriting it with better interfaces, cleaner functionality, and easier usage. (right now it's super cumbersome).

Actually it is not only cumbersome but some stuff just doesn't work. That was my motivation for these changes.

It's still the case that our commands lib does a lot of very valuable stuff that other commands libs do not, and I would really like to use it for other tools, but it's just SUCH a pain in its present form... :(

I don't really know many other commands libs (I worked with codegangsta's cli package once, but that was it), so I'm not 100% sure what the big features are that commands gives us. I guess handling different transports is important here.
Maybe I just take them for granted and I'm missing some. It would be a shame if I refactored those away.

we should redesign it with UX in mind

Should I look at more than just Response? I mean I'll have to touch Command.Call anyway as it currently returns a Response and I want that gone. Also I'll touch the code that calls Command.Call.

I was trying to fix the one issue I ran into while using it, and not break anything else. I agree that is not always the best approach, but it seemed ok here. I'm up for a rewrite though.

and extract it out of go-ipfs into its own package.

Yeah, that'd be neat! In a way it already mostly is its own package residing in the main tree.

fyi: work happens at the feat/responseemitter branch.

I ripped the commands/ subtree out into its own git repo (go-ipfs-cmds). It is currently private, as there is a lot of code committed by others and I wasn't sure if that is okay copyright-wise. I copied the license and changelog from the go-ipfs repo. If you think that is enough, please tell me and I'll make the repo public @whyrusleeping @jbenet.

Splitting it into a separate package allows me to reuse names currently in use in the go-ipfs/commands package. That way we don't end up with types like LegacyFunction.

I'm currently doing a big API redesign and would like to hear your pain points with the current implementation.

Also, I decided not to get rid of the Response type, as we need it for things like transmitting errors and the content length. Instead, each Response is coupled to a ResponseEmitter. They may be on the same machine (using e.g. a channel in between) or on distinct machines, e.g. using the http API. The client would receive a Response while the http handler passes a ResponseEmitter to cmd.Call().
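
For the in-process case, the coupling could be roughly as simple as this sketch (the names, the Next signature and the omission of the remaining interface methods are all assumptions):

import "io"

// Sketch: a ResponseEmitter and a Response connected by a channel, for in-process use.
func NewChanResponsePair(req Request) (*chanResponseEmitter, *chanResponse) {
    ch := make(chan interface{})
    return &chanResponseEmitter{ch: ch}, &chanResponse{req: req, ch: ch}
}

type chanResponseEmitter struct {
    ch chan<- interface{}
}

func (re *chanResponseEmitter) Emit(v interface{}) error {
    re.ch <- v
    return nil
}

func (re *chanResponseEmitter) Close() error {
    close(re.ch)
    return nil
}

type chanResponse struct {
    req Request
    ch  <-chan interface{}
}

func (r *chanResponse) Request() Request { return r.req }

// Next blocks until the command emits the next value or closes the emitter.
func (r *chanResponse) Next() (interface{}, error) {
    v, ok := <-r.ch
    if !ok {
        return nil, io.EOF
    }
    return v, nil
}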

hey! I saw you ask for comments during the call.

So just so you keep it in mind: ipfs-http-api-docs uses the code you are trying to move (https://github.com/ipfs/ipfs-http-api-docs/blob/master/endpoints.go#L12) to generate the docs.ipfs.io API documentation. This is done by walking the commands tree from the Root and extracting the information from the fields. I understand this will break it. That's fine, but it would be really nice to still be able to autogenerate API docs after your changes, from whatever new code defines the go-ipfs API. As I say, just something to keep in mind that you may not have known.

Automatic docs and automatic tests are not far from each other, and this was in part an effort in that direction.

@keks the extraction should be ok license wise. DCO should still apply.

@hsanjuan thanks for the heads-up! I'll try to make sure that things will work smoothly afterwards.
Having the requirement that go-ipfs-cmds should play well with legacy commands introduces quite a lot of interface{} where a *Command used to be (because it could be either a *"go-ipfs/commands".Command or a *"go-ipfs-cmds".Command). That is annoying, but I'll try to mitigate it later, maybe by finding an interface that makes sense for both structs. Other than that I currently don't see a lot of trouble coming up, but we'll see :)

@Kubuxu understood. I'll make it public.

@keks just to be clear: this isn't related intrinsically to https://github.com/ipfs/go-ipfs/issues/2647, correct?

@RichardLitt no, not at all. Great to know that exists though. I'll make that branch as well when the package is done.

@keks the test for it is already written: https://github.com/ipfs/go-ipfs/pull/2648

So @diasdavid and I just had a chat. He pointed out that commands/http currently is neither very friendly to browser clients nor to cli clients. We had the idea to add an rpc subpackage that provides a generic RPC API (you pass it an encoding like json, xml or protobuf) that can be run over any socket; we'd let cli run over that (unix socket, tcp, whatever) and also expose an HTTP API interface that is neat to use for browsers. We can also use the RPC interface over WebSockets. One note, however: if we introduce a new API we need tooling for documentation.

Also relevant here: https://github.com/ipfs/http-api-spec/issues/116 and https://github.com/ipfs/http-api-spec/issues/122

The question here is: If we move to a true RPC API, for what cases do we need an HTTP API? We can use RPC in native code and websockets in browser code. Am I missing something?

Maybe we could provide a more RESTful API though, I guess that would be better for apiary? Should we try to include all the commands, but more RESTy? Or just a subset that maps well to REST? @hsanjuan what do you think?

Also ping @whyrusleeping @jbenet

Hey everyone, I've been working on this for a while now and just wanted to give a quick update.

The ipfs command builds fine, but the sharness tests fail. This is mostly due to the compatibility layer, which tries to mock legacy interfaces so we don't have to reimplement all of core/commands at once.

One problem I had was that I replaced Marshalers, which are passed a Response and return an io.Reader, with Encoders, which are passed an io.Writer and an interface{}. What I didn't expect was that existing Marshalers might use e.g. res.Request(), so now the Encoders field in a Command struct is a func(Response) func(io.Writer) Encoder, where Encoder is defined as follows:

type Encoder interface {
  Encode(interface{}) error
}

This is automatically implemented by e.g. "encoding/json".Encoder.
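
As an example, a JSON entry for the Encoders field could then look roughly like this (a sketch; only the shape of the function matters here):

import (
    "encoding/json"
    "io"
)

// Sketch: a JSON entry for the Encoders field. Anything that needs the request
// (e.g. output-formatting flags via res.Request()) is read in the outer closure,
// so the inner closure only needs the writer.
var jsonEncoder = func(res Response) func(io.Writer) Encoder {
    return func(w io.Writer) Encoder {
        return json.NewEncoder(w)
    }
}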

Anyway, I feel like my current approach is very slow and tedious. One idea I had was to simplify some Marshalers to not use res.Request() anymore, but I'm not sure that's possible. Another idea would be to not use Encoder at all but make them a ResponseEmitter that we can send values, errors and lengths to.

Oh yes, and finally I considered removing SetError and instead just doing Emit(&cmdsutil.Error{}) - maybe adding API sugar like EmitError(msg interface{}, code cmdsutil.ErrorType). The reason is that Set calls usually need to happen before emitting, because they set header values (like SetLength does). Errors can always happen, even after calls to Emit, so I feel it shouldn't be a Set function.
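
The sugar itself would be little more than a helper on top of Emit; roughly this sketch (assuming cmdsutil.Error has Message and Code fields):

import "fmt"

// Sketch of the proposed EmitError sugar on top of Emit.
func EmitError(re ResponseEmitter, msg interface{}, code cmdsutil.ErrorType) error {
    return re.Emit(&cmdsutil.Error{Message: fmt.Sprint(msg), Code: code})
}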

I would love some feedback @Kubuxu @whyrusleeping

go-ipfs branch
go-ipfs-cmds

Re: emitting errors, I think that makes sense. It also means that we can emit multiple errors during the course of a command, right?

About the marshalers using the request object: this is rather hard to get around. The problem is that we use the request object to check flags to see how things should be printed for the user. You might make the encoders take a request, a writer, and an interface{}, which would be a decent shim to avoid having to change/rewrite a ton of code.

It also means that we can emit multiple errors during the course of a command, right?

Well, it could mean that; however, commands/http aborts the connection after SetError and sends the error as a trailer. So we need to behave the same way for compatibility. When doing e.g. a cmds/rpc later on we can obviously decide to diverge from that and send multiple errors, but for now I guess we're stuck with that.

You might make the encoders take a request, a writer, and an interface{}, which would be a decent shim to avoid having to change/rewrite a ton of code.

Okay, I'll revisit the Encoders then. Taking only an io.Writer looked so good from a clean-API standpoint, but I understand it might be too limiting.

Sorry for being very late in the game, I should have noticed this issue earlier.

I am not at all happy with the current interface, and the new one doesn't seem that much better.

In many of the commands that I have written, I have been stressing the current infrastructure by streaming data as it comes in and outputting multiple errors. I have been trying to write my commands in the style of traditional unix commands such as "find" or "ls", and I would like any rewrite to better support this style. Examples are "ipfs block rm" and "ipfs filestore verify". Here are some of the requirements:

1) Output data as it comes in

2) Ability to output multiple non-fatal errors as they come in; the errors should be allowed to have some structure to them, for example in "block rm" the error should also indicate the key that ipfs was unable to remove.

3) Natively support other output to stderr besides errors; examples might be warnings and notes.

4) Ability to communicate a specific exit code

5) Ability to cancel, for example if the client receives a SIGINT.

6) Ideally, the command code should not have to worry about connection errors, unless it needs to abort. That basically means that it should be able to just blindly send data and errors, and only have to check whether it should abort.

This iteration doesn't necessarily have to support these requirements, but I would like this to be a short- to mid-term goal. (A sketch of how the emitter style could cover points 1 and 2 follows.)
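
A sketch of how the emitter style could cover at least points 1 and 2 above (RemovedBlock, keysToRemove and removeBlock are made-up names for illustration):

// Sketch of a streaming Run function: output values as they are produced and
// report structured, non-fatal errors without aborting the whole command.
type RemovedBlock struct {
    Hash  string
    Error string `json:",omitempty"`
}

func runBlockRm(req Request, re ResponseEmitter) {
    for key := range keysToRemove(req) { // assumed helper yielding the requested keys
        if err := removeBlock(key); err != nil { // assumed helper doing the actual work
            // non-fatal: name the key that could not be removed and keep going
            re.Emit(&RemovedBlock{Hash: key, Error: err.Error()})
            continue
        }
        re.Emit(&RemovedBlock{Hash: key})
    }
}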

Closed via #3856.
