Graphql-ruby: [Interpreter] AST Analyzers

Created on 28 Aug 2018 · 5Comments · Source: rmosolgo/graphql-ruby

Query Analyzers are very useful to validate queries with custom rules, to avoid running queries and wasting resources.

However, they come with a performance tradeoff. Currently the tradeoff looks kind of like this:

Slow down queries with analyzers to prevent wasting resources during execution for invalid queries.
Skip the internal representation traversal but use runtime errors to validate the query, potentially wasting resources that could've been statically caught.

The performance problem comes a lot from the fact that it has to traversal the internal rep (rewritten query) of the query, but also just from the plain fact that a large query takes some time to traverse, AST or internal representation.

With the new interpreter used for query execution https://github.com/rmosolgo/graphql-ruby/pull/1394, which traverses the AST, and ditches the internal representation, this potentially gives us the opportunity to move on from analyzers (very tied to irep right now) and use another concept to achieve its goals: validators.

Merging analyzers into validators allows us to remove one traversal from the execution lifecycle, which can be very impactful on large queries. Conceptually, they also aren't really that different! Allowing uses to add validation rules (Post spec rules probably?) could give us greater performance, and less concepts to think about.

Source

xuorig

Most helpful comment

I think an API like that would be great! There's only one change I would make from lessons learned with analyzers. visit_type makes sense from an implementation point of view, but it causes ugly user code.

How about, instead of passing visit_type as an argument, we use two _different_ methods, for example def on_enter_field and def on_leave_field ?

Another option, since we don't have super like you mentioned, is to use yield from inside a method to continue flow. I don't have a preference between the two but I thought I'd mention it.

rmosolgo on 6 Sep 2018

👍3

All 5 comments

Gave this a few more thoughts:

Analyzers are a bit different than analyzers conceptually, they may not return pass or fail, but rather collect data, send metrics to a 3rd party system, or anything that is simply observational.
Some analyzers need to run even with validate: false.
Do analyzers assume the query is otherwise valid? It may be hard to make them fit during the same AST traversal as the basic rules, since we don't know for sure the query will be valid.

xuorig on 4 Sep 2018

A generic solution here would be to add an optional _middle pass_ to execution, that is:

first pass is validation
optional second pass is analysis
third pass is execution

But, if want to run your analyzers during validation, you can. Maybe we need something more pluggable so that we have more granular control than validate: false 🤔

rmosolgo on 4 Sep 2018

👍1

My thoughts also @rmosolgo! Will be working on initial work for the standard 3 pass today, trying to see how we could model this flexible enough so analyzers can be "plugged in" to a different pass for performance.

xuorig on 4 Sep 2018

Here's an API i thought about yesterday, curious to hear your thoughts on it @rmosolgo :

An Analyzer inherits from GraphQL::Analysis::Analyzer, and responds to hooks from an AST visitor:

class Analyzer
  def initialize(query)
    @query = query
  end

  def on_field
    # To be implemented by analyzers
  end

  def result
  end
end

class QueryDepth < Analyzer
  def initialize(query)
    super
    @max_depth = query.max_depth
    @current_depth = 0
  end

  def on_field(visit_type, node, parent, visitor)
    @current_depth += 1 if visit_type == :enter
  end

  def result
    AnalysisError.new("too deep") if @current_depth > @max_depth
  end
end

Each hook receives the visit_type (enter, leave) since it doesn't have the same module system as validators, we can't use super here. It also gets the visitor as the last argument, which can be helpful to get information such as the current type, current field, etc.

An analyzer also responds to result, the equivalent of finished_value in the old API. Since the API is so close to visitors, the analyzer class could be instanciated during any "pass", but the first implementation of this would be during it's own separate pass, using a Analysis::Visitor (very similar to validation's BaseVisitor)

The visitor is in charge of calling the hooks on analyzer classes:

      def on_operation_definition(node, parent)
        object_type = @schema.root_type_for_operation(node.operation_type)
        @object_types.push(object_type)
        @path.push("#{node.operation_type}#{node.name ? " #{node.name}" : ""}")
        call_analyzers(:enter, __method__, node, parent)
        super
        call_analyzers(:leave, __method__, node, parent)
        @object_types.pop
        @path.pop
      end

xuorig on 6 Sep 2018

How about, instead of passing visit_type as an argument, we use two _different_ methods, for example def on_enter_field and def on_leave_field ?

Another option, since we don't have super like you mentioned, is to use yield from inside a method to continue flow. I don't have a preference between the two but I thought I'd mention it.

rmosolgo on 6 Sep 2018

👍3

Was this page helpful?

0 / 5 - 0 ratings