Cosmos SDK has been pre-1.0 for a long time and it's probably about time to at least start the conversation about v1.0. This issue proposes a definition for what v1.0 would mean and how we might get there.
To me, v1.0 indicates that things will be stable for an appreciable amount of time. The Cosmos SDK has understandably been in a lot of flux, as has Tendermint Core. The introduction of protocol buffers in v0.40, and in particular their support for backwards compatibility via schema evolution and breaking change detection, makes this a reasonable time to start thinking about stability.
Here is what, in my mind, would define v1.0 readiness: a version of the SDK which we can commit to that allows app, module and client developers to use v1.X versions without breaking changes to their code while still receiving upgrades for a meaningful length of time (probably > 1yr).
Here's my proposal for how we get there.
For v0.40 we marked almost all protobuf packages as v1beta1. In the near future after Stargate we want to migrate all of these to v1 and enable breaking change detection as a required CI check. We postponed jumping to v1 right away because major changes might still be needed for large pieces. For instance, #7242 came in quite late in the game and we may want to give #7122 some consideration.
Stabilization of v1 proto packages can happen one package at a time. As soon as a package reaches v1, client developers get stability for that package.
AppModule interfaces
AppModule underwent a lot of flux in v0.40. It is still a mixture of legacy amino and newer protobuf methods, and I think there are ways it could be greatly simplified. Arriving at a stable set of AppModule interfaces will bring some stability for module developers, although module dependencies like x/bank or x/staking still need to be addressed.
We could consider creating a nested go module cosmos-sdk/base or core that reaches v1.0 before the other pieces and just stabilizes AppModule and other "core" pieces.
Beyond stable v1 proto definitions (covered above), module stability to me primarily means stable backwards compatible Keeper interfaces, NewKeeper constructors, and other module specific wiring. This would bring stability to app and module developers.
One approach to stabilizing modules is to separate them out into separate go modules within this repository. This could allow us to stabilize other pieces like AppModule interfaces and baseapp in base or core modules. Then individual modules could be separately versioned and reach their own v1.0 on their own time frame. One big challenge with this approach is the reliance on simapp for testing. If we had simpler AppModule interfaces and the overall wiring up of apps were easier, it would be more feasible for each module to set up its own minimal simapp-like test harness that includes only the modules it needs for testing. This requires some research though.
AppModule wiring in general
This will bring stability to app developers and may largely be covered above. Generally though we want to aim for straightforward but extensible wiring of all this stuff and I think we could improve a lot. Basically if we looked at simapp/app.go and simapp/simd and said wow that's really elegant... to me that would feel like a sign we're close to a v1.0 on that stuff.
/cc @ethanfrey @alessio @clevinson
I think this issue is worth being pinned.
Thank you for this issue @aaronc. I look forward to a clear roadmap towards stability (for client and app developers), while iterating on new features rather than the foundations.
Approach 2) consider alternate keeper <-> keeper wiring
I think this deserves much more consideration. In fact the same loose coupling idea is already used in IBC (only defining the data format between endpoints), in CosmWasm (returning messages/queries that are then dispatched by a router), and I even built something similar for "Inter-App Communication" back in Cosmos SDK 0.7/0.8, where modules don't call keepers directly, but rather communicate via the router with the same message types clients send (but with different "Signer" addresses).
This will allow us to define the public API of one module via a clear Protobuf API and avoid import cycles, dependency on foreign implementations, and other troubles that have plagued the app composition in the SDK (and break with every update). It also allows us to provide a very clear security perimeter around each module, rather than potentially any module calling any public method, which allows better OCap or other protections.
There are many ways to proceed along this path, and some items do become more difficult to model this way - particularly hooks, where we need to register a listener to receive change events. If there is movement towards this idea, I am happy to participate in design sessions / discussions, ideally real-time as it is hard to get into such depth in issue comments.
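To make the router idea above a bit more concrete, here is a minimal sketch of one module reaching another through a router instead of calling its keeper. Everything in it is hypothetical and for illustration only: `Msg`, `Router`, `MsgSend`, `Bank`, and the `module/distribution` address are invented names, not SDK types, and a real version would use protobuf-defined messages.

```go
package main

import (
	"errors"
	"fmt"
)

// Msg is a hypothetical stand-in for a protobuf message.
// Route names the module that should handle it.
type Msg interface {
	Route() string
}

// Handler processes a message on behalf of the given sender address.
type Handler func(sender string, msg Msg) error

// Router dispatches each message to the handler registered for its route.
type Router struct {
	handlers map[string]Handler
}

func NewRouter() *Router { return &Router{handlers: map[string]Handler{}} }

func (r *Router) Register(route string, h Handler) { r.handlers[route] = h }

func (r *Router) Dispatch(sender string, msg Msg) error {
	h, ok := r.handlers[msg.Route()]
	if !ok {
		return fmt.Errorf("no handler for route %q", msg.Route())
	}
	return h(sender, msg)
}

// MsgSend is a hypothetical bank message, the same type a client would send.
type MsgSend struct {
	From, To string
	Amount   int64
}

func (MsgSend) Route() string { return "bank" }

// Bank keeps its balances private; other modules only see its handler.
type Bank struct {
	balances map[string]int64
}

func (b *Bank) Handle(sender string, msg Msg) error {
	send, ok := msg.(MsgSend)
	if !ok {
		return errors.New("bank: unknown message type")
	}
	// The security perimeter: only the account owner may spend.
	if send.From != sender {
		return errors.New("bank: sender may not spend from this account")
	}
	if send.Amount <= 0 || b.balances[send.From] < send.Amount {
		return errors.New("bank: invalid amount or insufficient funds")
	}
	b.balances[send.From] -= send.Amount
	b.balances[send.To] += send.Amount
	return nil
}

func main() {
	bank := &Bank{balances: map[string]int64{"module/distribution": 100}}
	router := NewRouter()
	router.Register("bank", bank.Handle)

	// A module pays out by dispatching a normal MsgSend, signed with its
	// own module address, instead of calling a bank keeper method.
	err := router.Dispatch("module/distribution",
		MsgSend{From: "module/distribution", To: "alice", Amount: 40})
	fmt.Println(err, bank.balances["alice"])
}
```

In this sketch the calling module needs to import only the message type, not the bank implementation, which is the property that would avoid import cycles.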
I'd love to hear what people think about the possibility of giving some API stability guarantees for top-level packages and their subdirs too, at least for some of them, e.g. client, server, version. At Tendermint we have even developed a tool to easily build automated CI checks that prevent API breakages from being introduced in new releases that are meant to be API-backward compatible.
The more I think about it the more I am inclined to believe that #7093 and #7122 are the right way to go. There are benefits for API stability within the SDK and for module developers, for CosmWasm and likely IBC. I can't see a clearer path to get to these sorts of goals across the ecosystem.
Can you say a bit more about hooks @ethanfrey ?
I'd love to hear what people think about the possibility of giving some API stability guarantees for top-level packages and their subdirs too, at least for some of them, e.g. client, server, version. At Tendermint we have even developed a tool to easily build automated CI checks that prevent API breakages from being introduced in new releases that are meant to be API-backward compatible.
I think a tool to prevent API breakage would be great. Are you thinking that we would provide guarantees for just sub-packages of a module @alessio ? What do you think about the idea of separately versioned go modules within this repo?
Can you say a bit more about hooks @ethanfrey ?
The staking module and the slashing module in the sdk use them to wire together. I have not used them, just know they need to be set in app.go and this is one item that is a different paradigm than process message or process query.
Maybe we don't need them, but the alternate design must have an easy upgrade path for this critical codebase. Probably only @alexanderbez really understands them
I looked at the StakingHooks you're talking about. I think this could be done with a similar message passing approach. The staking module would just need to know the addresses of the modules to send hook messages to. That might actually be a pattern that becomes useful in a number of scenarios.
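A rough sketch of what that could look like, assuming hooks are just messages sent to a configured list of module addresses. All names here (`HookMsg`, `HookBus`, `Staking`, `Slashing`, the `validator_bonded` event) are made up for illustration and do not mirror the actual StakingHooks interface.

```go
package main

import "fmt"

// HookMsg is a hypothetical message announcing a staking event.
type HookMsg struct {
	Event     string
	Validator string
}

// HookReceiver handles hook messages addressed to one module.
type HookReceiver func(from string, msg HookMsg) error

// HookBus delivers messages by module address.
type HookBus struct {
	receivers map[string]HookReceiver
}

func (b *HookBus) Send(from, to string, msg HookMsg) error {
	r, ok := b.receivers[to]
	if !ok {
		return fmt.Errorf("no module at address %q", to)
	}
	return r(from, msg)
}

// Staking knows only the addresses of modules that want hook messages,
// not their Go types.
type Staking struct {
	bus         *HookBus
	subscribers []string
}

func (s *Staking) BondValidator(val string) error {
	// Staking's own state changes would happen here.
	for _, addr := range s.subscribers {
		msg := HookMsg{Event: "validator_bonded", Validator: val}
		if err := s.bus.Send("staking", addr, msg); err != nil {
			return err
		}
	}
	return nil
}

// Slashing tracks validators it has been told about.
type Slashing struct {
	tracked []string
}

func (s *Slashing) OnHook(from string, msg HookMsg) error {
	// Only accept hook messages that come from the staking module.
	if from != "staking" {
		return fmt.Errorf("slashing: unexpected hook sender %q", from)
	}
	if msg.Event == "validator_bonded" {
		s.tracked = append(s.tracked, msg.Validator)
	}
	return nil
}

func main() {
	slashing := &Slashing{}
	bus := &HookBus{receivers: map[string]HookReceiver{"slashing": slashing.OnHook}}
	staking := &Staking{bus: bus, subscribers: []string{"slashing"}}

	if err := staking.BondValidator("val1"); err != nil {
		fmt.Println("hook failed:", err)
		return
	}
	fmt.Println(slashing.tracked)
}
```

The subscriber list would still have to be configured somewhere (today that is app.go), but it becomes plain data rather than a Go interface that staking has to define and slashing has to implement.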
One other thing this message passing approach opens up is an easier path to first-class modules in other languages, either via CosmWasm as we've discussed @ethanfrey or via sub-processes as I've discussed with @afdudley.
I like the idea of subprocesses, with some local abci-like interface between the router and a module. This can be implemented by a local adaptor (a call to Go) or via a grpc connection. It should be considered part of the app and always be on localhost, just as Tendermint and an ABCI app can run as two processes.
The major issue there is storage! Every process would need its own storage/db, which is fine, but... how do we do atomic commits then? Or do we have the module call back to the main process over this grpc-like interface to make all storage queries?
It may be too hard to design such in the near future, and we can just focus on the local version and stabilize interfaces without getting too complex.
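For the local version, one option for the storage question is the call-back variant: the main process owns the only store, modules see a narrow store interface, and writes are buffered until the host commits them together. The sketch below illustrates that option only. `ModuleStore`, `HostStore`, and `Tx` are hypothetical names, and in an out-of-process setup the `ModuleStore` implementation would be a grpc client rather than a local struct.

```go
package main

import "fmt"

// ModuleStore is the hypothetical storage API a module sees. A local
// adaptor implements it here; a remote module would use a grpc client.
type ModuleStore interface {
	Get(key string) (string, bool)
	Set(key, value string)
}

// HostStore lives in the main process and holds committed state.
type HostStore struct {
	committed map[string]string
}

// Tx buffers writes from all modules so they can be applied together.
type Tx struct {
	host    *HostStore
	pending map[string]string
}

func (h *HostStore) Begin() *Tx {
	return &Tx{host: h, pending: map[string]string{}}
}

// ForModule returns a view that confines a module to its own key prefix.
func (t *Tx) ForModule(name string) ModuleStore {
	return &prefixStore{tx: t, prefix: name + "/"}
}

// Commit applies every buffered write. In this single-process sketch the
// copy is trivially all-or-nothing; a real store would need a batch write.
func (t *Tx) Commit() {
	for k, v := range t.pending {
		t.host.committed[k] = v
	}
	t.pending = map[string]string{}
}

type prefixStore struct {
	tx     *Tx
	prefix string
}

// Get reads pending writes first, then falls back to committed state.
func (p *prefixStore) Get(key string) (string, bool) {
	k := p.prefix + key
	if v, ok := p.tx.pending[k]; ok {
		return v, true
	}
	v, ok := p.tx.host.committed[k]
	return v, ok
}

func (p *prefixStore) Set(key, value string) {
	p.tx.pending[p.prefix+key] = value
}

func main() {
	host := &HostStore{committed: map[string]string{}}
	tx := host.Begin()
	tx.ForModule("bank").Set("alice", "60")
	tx.ForModule("staking").Set("val1", "bonded")

	_, visible := host.committed["bank/alice"]
	fmt.Println("visible before commit:", visible)

	tx.Commit()
	fmt.Println("after commit:", host.committed["bank/alice"], host.committed["staking/val1"])
}
```

Dropping the `Tx` without calling `Commit` discards the writes of every module at once, which is the atomicity property in question. The cost is a round trip per storage call once modules move out of process.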
Well FOAM got a gRPC API added to iavl to address this part: https://github.com/cosmos/iavl/pull/296 and this sub-process idea came up in reference to FOAM's use case (/cc @blinky3713). But I agree it's a separate scope, just noting that this architecture would support it.
At this point I'm leaning towards drafting ADRs for the approach discussed here and in #7093 and #7122. It seems like there are quite a few upsides and I haven't heard any serious downsides besides the effort.