Silverstripe-framework: Epic: Scaling decoupled CMS usage (GraphQL performance)

Created on 24 Feb 2019 · 3 Comments · Source: silverstripe/silverstripe-framework

Hypothesis

By making SilverStripe's GraphQL faster, early adopters are more likely to use it repeatedly for high-profile projects.

Goals

  • Make GraphQL reasonably fast without relying on specialised infrastructure (100ms to 500ms response times)
  • Make GraphQL a solid foundation for exposing all data via a CMS Content Management API, even with 100+ models on larger installations.
  • Make GraphQL very fast through caching on specialised infrastructure such as Apollo Engine (10ms to 100ms response times)
  • Minimise impact on developers and existing deployment workflows
  • Increase the chances of GraphQL case studies in the wider SilverStripe ecosystem which will reassure decision makers

Notes

Epic

Most helpful comment

Sorry, everyone. Didn't realise this was public. The reason it got moved is because the team that is doing this work operates in more of a business context than an OSS one, and a lot of the goals/outcomes we're after are more company focused and include references to internal personnel and plans.

That said, we're very mindful of how this will affect OSS as well, and I'm happy to include a redacted version of the epic here:

Hypothesis

We believe that improving the performance and overall developer experience of the GraphQL module will remove a key hindrance to more widespread adoption of GraphQL in the CMS and also in high-profile projects. This will allow developers to deliver more competitive, high-performing solutions to clients and give content authors a user experience that is more competitive with the peers of Silverstripe CMS.

Idea and approach

The GraphQL module has a sordid history that makes it an outlier from many of its predecessors. It was originally spun up in October 2016 to serve the emergent needs of a newly decoupled paradigm in asset-admin. As features were tacked on, most of its evolution was unplanned and sporadic, until it was finally, somewhat inappropriately, tagged stable ahead of the core 4.0.0 release. This error in judgment applied a permanence to the nascent GraphQL API that made future, necessary adaptations rigid and slow.

This epic therefore begins with a thorough review of what we have built and maintained over the last four years, what's good about it, and where the limitations are. Many of the performance bottlenecks are a direct result of an overly opinionated API that is tightly coupled to the Silverstripe ORM. As previous experiments have shown, a key ingredient of a high-performing GraphQL server is a cacheable, deterministically produced schema. Achieving this will require challenging our assumptions in many corners of the API.
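
To illustrate what "cacheable, deterministically produced" could mean in practice, here is a minimal sketch in Python. It is not the module's API: the `SchemaCache` class, the model dictionaries, and the SDL-style output are all hypothetical and exist only to show the idea. The same model definitions always produce the same schema text and the same cache key, so the expensive build step runs once per distinct configuration.

```python
import hashlib
import json


def schema_fingerprint(models):
    """Hash the model definitions in canonical (sorted) form.

    Equal definitions yield the same key regardless of insertion order.
    """
    canonical = json.dumps(models, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def build_schema(models):
    """Render hypothetical type definitions in sorted order.

    Sorting type and field names keeps the output independent of the
    order in which models happened to be registered.
    """
    blocks = []
    for type_name in sorted(models):
        fields = models[type_name]
        lines = ["  {}: {}".format(name, fields[name]) for name in sorted(fields)]
        blocks.append("type " + type_name + " {\n" + "\n".join(lines) + "\n}")
    return "\n\n".join(blocks)


class SchemaCache:
    """In-memory cache keyed by the fingerprint of the model definitions."""

    def __init__(self):
        self._store = {}
        self.build_count = 0  # how many times the schema was actually built

    def get(self, models):
        key = schema_fingerprint(models)
        if key not in self._store:
            self.build_count += 1
            self._store[key] = build_schema(models)
        return self._store[key]
```

In a real deployment the cache would more likely live on disk or in a shared store and be generated at build time rather than per request. The property that matters is the same: the schema depends only on its declared inputs, never on request state.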

The inevitable outcome of this epic should be a set of API changes that contribute largely to a new major release of the module (4.0.0).

Validation

Most of the evidence that GraphQL performance is a problem can be gathered objectively through benchmarking. Previous research shows that this module is in no position to scale in its current form, with some outlandishly slow response times (up to 30s) well within the realm of possibility for a large project.

More anecdotally, we've heard from agencies and members of the community that adopting GraphQL proved costly in the long run, as the surprisingly poor performance wasn't apparent until the project had been scaled up in the lead-up to a launch date. This hasn't been a good look for Silverstripe CMS.

We also know through our community research that making the CMS faster is at the top of the wishlist for CMS users and developers alike. While most of the admin still does not rely on GraphQL, we chose it as a solution because it was best suited to mitigating the performance problems in the CMS and leading us to a more API-driven approach. As of now, that investment is dead on arrival, because expanding the GraphQL API is synonymous with slowing its performance.

Recent experiments with static generation (Gatsby) have shown that the module cannot expose all CMS data to a static generator without major performance limitations. While this hasn't been benchmarked, we know from direct conversations with Gatsby that a maximum five-second load time is required to implement previews (a non-negotiable feature of statically generated sites), and we're nowhere near delivering that kind of performance reliably.

Out of Scope

  • A new stable release of the module
  • Tutorials / upgrade paths
  • Blockers to GraphQL adoption that are not related to performance/DX (e.g. security pitfalls)

cc: @ScopeyNZ @ntd @christopherdarling

All 3 comments

Private :(
