Relay: Implement internal "GraphMode" response writer

Created on 18 Mar 2016 · 14 Comments · Source: facebook/relay

Relay currently applies the results of queries, mutations, and subscriptions by traversing the query and payload in parallel. The payload cannot be processed in isolation because it lacks sufficient context - for example, consider the following payload:

{
  ... {
    friends: {
      edges: [...]
    }
  }
}

What does friends.edges signify? It could be a plain List[Friend], it could be the first: 10 friends in a connection, or it could be the first: 10, after: foo - a pagination result that should be _merged_ with any existing edges. Currently, a payload can _only_ be interpreted correctly in the context of a query. This process isn't optimal: a given field such as id may be queried multiple times by sibling fragments, and therefore has to be repeatedly processed. Further, the same object may appear multiple times in the response payload (e.g. the same person authored multiple comments), again causing duplicate processing.

Goals

The primary goal of this proposal is to define a data format that can be efficiently applied to a normalized client-side cache. The format should be capable of describing _any_ change that could be applied to a normalized object graph: i.e. both the results of queries as well as mutations and subscriptions.

Specifically, we have found the following characteristics to be important to ensure efficient processing of query/mutation results:

Normalized data: avoiding duplication of data in the response reduces the time spent processing it.
Data-driven: queries themselves may have duplication (i.e. the same fields may be queried numerous times by sibling or nested fragments). The payload should be self-describing in order to reduce duplicate effort in processing.
First-class support for describing partial results, e.g. to allow pagination without loading all items of a list up-front.
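
To illustrate the "normalized data" goal, here is a minimal sketch (not Relay's implementation) that flattens a tree response into a map of records keyed by `id`, replacing each nested record with a `{__ref: id}` reference. The helper name `normalizeTree` and the sample data are hypothetical. The sketch assumes any object with a string `id` is a node, and leaves everything else inline.

```javascript
// Hypothetical helper, not part of Relay: flatten a tree response into
// a map of records keyed by `id`.
function normalizeTree(value, nodes = {}) {
  if (Array.isArray(value)) {
    return value.map(item => normalizeTree(item, nodes));
  }
  if (value === null || typeof value !== 'object') {
    return value; // scalars pass through unchanged
  }
  const record = {};
  for (const key of Object.keys(value)) {
    record[key] = normalizeTree(value[key], nodes);
  }
  if (typeof record.id !== 'string') {
    return record; // assumption: no `id` means the record stays inline
  }
  // Merge repeated occurrences of the same object into one record.
  nodes[record.id] = Object.assign(nodes[record.id] || {}, record);
  return {__ref: record.id};
}

// Sample data (made up): the same author appears under two comments.
const nodes = {};
const root = normalizeTree(
  {
    id: '1',
    comments: [
      {text: 'first', author: {id: '2', name: 'Ann'}},
      {text: 'second', author: {id: '2', name: 'Ann'}},
    ],
  },
  nodes,
);
console.log(JSON.stringify(root)); // {"__ref":"1"}
console.log(Object.keys(nodes).length); // 2
```

The author is stored once even though it appears under both comments, so later processing touches each record a single time.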

Non-goals include:

Reducing byte-size over the wire in server -> client communication.
Defining a fully generic data response format. This proposal is specifically targeted at describing changes to a normalized object graph with the capabilities necessary for typical client-side applications.

Specification Strawman

We're still figuring this out, but we'd prefer to develop this specification in the open and with input from the community. We'll continue to update this as we iterate, but here's a commented example with Flow annotations:

Example Query:

Relay.QL`
  query {
    node(id: 123) {
      ... on User {
        id
        friends(first: "2") {
          count
          edges {
            cursor
            node {
              id
              name
            }
          }
          pageInfo {
            hasNextPage
          }
        }
      }
    }
  }
`

Standard "Tree" Response:

{
  node: {
    id: '123',
    friends: {
      count: 5000,
      edges: [
        {
          cursor: '...',
          node: {...},
        },
        ...
      ],
      pageInfo: {
        hasNextPage: true,
        ...
      },
    },
  },
}

GraphMode Response:

[
  {
    op: 'root',
    field: 'node',
    identifier: '123',
    root: {__ref: '123'},
  },
  {
    op: 'nodes',
    nodes: {
      123: {
        id: '123',
        friends: {
          __key: '0', // <- can refer back to this in range operations
          count: 5000,
        }
      },
      node1: {
        ...
      }
    }
  },
  {
    op: 'edges',
    args: [{name: 'first', value: 2}],
    edges: [
      {
        cursor: '...',
        node: {
          __ref: 'node1',
        },
      },
      ...
    ],
    pageInfo: {
      hasNextPage: true,
    },
    range: '0', // <- refers to the point in `nodes` with `__key = '0'`
  },
]

Where the shape of the response is:

type GraphModePayload = Array<GraphOperation>;
type CacheKey = string;
type GraphOperation =
  RootOperation |
  NodesOperation |
  EdgesOperation;
type RootOperation = {
  op: 'root';
  field: string;
  identifier: mixed;
  root: GraphRecord | GraphReference;
};
type NodesOperation = {
  op: 'nodes';
  nodes: {[dataID: DataID]: GraphRecord};
};
type EdgesOperation = {
  op: 'edges';
  args: Array<Call>;
  edges: Array<?GraphRecord>;
  pageInfo: PageInfo;
  range: CacheKey;
};
type GraphRecord = {[storageKey: string]: GraphValue};
type GraphReference = {
  __ref: DataID;
};
type GraphScalar = ?(
  boolean |
  number |
  string |
  GraphRecord |
  GraphReference
);
type GraphValue = ?(
  GraphScalar |
  Array<GraphScalar>
);
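
As a rough illustration of how a payload of this shape might be applied, here is a minimal sketch. It is not the proposed implementation: the store layout (`roots`, `records`, `ranges`), the function names, and the sample values are all assumptions made for the example. The `edges` handler simply replaces the range's edges instead of merging them according to `args`.

```javascript
// Hypothetical store shape, for illustration only (not Relay's store):
//   roots:   root field + identifier -> data ID
//   records: data ID -> record
//   ranges:  `__key` -> the connection record it labels
function createStore() {
  return {roots: {}, records: {}, ranges: {}};
}

// Copy `source` fields onto `target`, registering any `__key` markers.
function mergeRecord(store, target, source) {
  for (const key of Object.keys(source)) {
    const value = source[key];
    const isInline =
      value !== null && typeof value === 'object' &&
      !Array.isArray(value) && !('__ref' in value);
    if (key === '__key') {
      store.ranges[value] = target;
    } else if (isInline) {
      target[key] = mergeRecord(store, target[key] || {}, value);
    } else {
      target[key] = value; // scalars, arrays, and references
    }
  }
  return target;
}

function applyGraphMode(store, payload) {
  for (const operation of payload) {
    switch (operation.op) {
      case 'root':
        // Assumes `root` is a reference; an inline root record is not handled.
        store.roots[`${operation.field}(${operation.identifier})`] =
          operation.root.__ref;
        break;
      case 'nodes':
        for (const dataID of Object.keys(operation.nodes)) {
          store.records[dataID] = mergeRecord(
            store, store.records[dataID] || {}, operation.nodes[dataID]);
        }
        break;
      case 'edges': {
        const range = store.ranges[operation.range];
        if (!range) throw new Error('Unknown range: ' + operation.range);
        // Simplification: replaces edges rather than merging by `args`.
        range.edges = operation.edges;
        range.pageInfo = operation.pageInfo;
        break;
      }
      default:
        throw new Error('Unknown op: ' + operation.op);
    }
  }
  return store;
}

// Sample payload modeled on the example above; the values are made up.
const store = applyGraphMode(createStore(), [
  {op: 'root', field: 'node', identifier: '123', root: {__ref: '123'}},
  {op: 'nodes', nodes: {
    123: {id: '123', friends: {__key: '0', count: 5000}},
    node1: {id: 'node1', name: 'Ann'},
  }},
  {op: 'edges', args: [{name: 'first', value: 2}], range: '0',
   edges: [{cursor: 'c1', node: {__ref: 'node1'}}],
   pageInfo: {hasNextPage: true}},
]);
console.log(store.records['123'].friends.edges.length); // 1
```

Each operation is handled without consulting the query, which is the property the format is meant to provide.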

Next Steps

[ ] Implement proof-of-concept GraphMode response handler and use it in some real applications.
[ ] Refine the specification.
[ ] Use GraphMode for handling existing operations:
- [ ] Transform and apply query payloads via GraphMode.
- [ ] Transform and apply mutation & subscription responses via GraphMode.
[ ] Expose a public method on RelayEnvironment for applying GraphMode payloads to the store (as part of #559).

enhancement

josephsavona

👍3

All 14 comments

In your example I think you have employees and friends intermixed. I think you just mean friends everywhere, right?

wincent on 18 Mar 2016

👍1

An important aspect of this is that we don't have redundant data. Imagine this query:

Relay.QL`
  query {
    node(id: "123") {
      id
      name
      ... on User {
        cousins {
          edges {
            node {
              id
              name
              ... on User {
                cousins {
                  edges {
                    node {
                      id
                      name}}}}}}}}}}
`;

The result tree will have lots of duplicates, since my cousins have me as a cousin, and most have each other as cousins. In Graph Mode, we'll only have a single instance of each User.

craffert0 on 18 Mar 2016

👍1

An important aspect of this is that we don't have redundant data.

Surprisingly perhaps not quite as important as you may think, because gzip ends up eating up the redundancy for breakfast.

wincent on 18 Mar 2016

👍1

Sorry if I'm missing this part but would GraphMode require MutationConfig for adding edges or that information could be captured in GraphMode itself?

eyston on 18 Mar 2016

would GraphMode require MutationConfig for adding edges or that information could be captured in GraphMode itself?

@eyston Great question - the idea is GraphMode could describe mutations w/o any additional config. For example a range add might be described with:

{
  ...
  nodes: {
    123: {
      friends: {
        $type: 'connection',
        $data: [
          {calls: 'append', value: {$ref: 'addedID1'}},
          {calls: 'append', value: {$ref: 'addedID2'}}, // <-- append multiple edges at once
        ]
      }
    },
    addedID1: {
      ...
    },
    addedID2: {
      ...
    },
  },
}

josephsavona on 18 Mar 2016

🎉3

What is this...Falcor?!?!?

:trollface:

skevy on 18 Mar 2016

Another question -- would non-node objects be embedded? e.g.

query {
  viewer {
    birthday {
      month day year
    }
  }
}

nodes: {
  123: {
    birthday: {
      month: 1,
      day: 1,
      year: 2000
    }
  }
}

kind of like no $type means interpret literally?

eyston on 18 Mar 2016

@eyston yes, id-less records are inline

josephsavona on 18 Mar 2016

After discussion with @leebyron I started looking for ways to avoid the special $type/$data keys. The main challenge is that connections simply can't be handled as-is: edges almost never just replaces the local edges and always requires some custom processing. A similar constraint holds for root fields, which are currently handled specially.

Here's an example query that demonstrates these challenges and an updated proposal for the data format:

query {
  me {
    id
    name
    address {
      city
    }
    bestFriend {
      name
    }
    friends(first: 2, after: $foo) {
      edges {
        node {
          id
          name }}}}}

The results could be described using operations similar to those in JavaScript Object Notation (JSON) Patch but with semantics specifically tailored to describing GraphQL/Relay object graphs:

[
  {
    // The `root` operation describes how a root field maps to an id.
    // This concept may not be necessary once Relay supports arbitrary
    // root fields.
    // In this case, `me` corresponds to id `123`:
    op: 'root',
    data: {
      field: 'me',
      arguments: null,
      id: '123',
    },
  },
  {
    // The `add` operation denotes new data that should be added/merged into the object graph.
    // This describes scalar fields, plain lists, references (linked records), and lists of references.
    // Other field types such as pagination cannot be represented inline.
    op: 'add',
    data: {
      123: {
        name: '...',
        address: {
          city: '...', // no cache identifier (`id`), so value is inline
        },
        bestFriend: {$ref: '456'}, // single key `$ref` indicates a reference to another record
      },
      456: {
        name: '...',
      },
      friend1: {
        ... // first friend in the connection - note that it isn't linked to within this operation, that's ok
      },
      friend2: {
        ... // second friend in the connection - note that it isn't linked to within this operation, that's ok
      },
    },
  },
  {
    // The `connection` operation describes portions of a list that should be merged into
    // the existing list. It may be necessary to change the `id` key to a "path" in order to
    // allow updating connections on records without an `id`.
    op: 'connection',
    data: {
      id: '123',
      field: 'friends',
      arguments: {first: 2, after: 'fooCursor'},
      edges: [...], // includes `$ref`s to friends 1 and 2
      pageInfo: {...},
    },
  },
]

Note that the add operation does not include the friends field on record 123, because no scalar fields are fetched. The data for the friends field is supplied in a subsequent connection operation.
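
One way the `connection` operation above could be merged into an existing list is sketched here. This is only an illustration under stated assumptions, not Relay's range logic: the function name `mergeConnection` is hypothetical, each edge is assumed to carry a unique `cursor`, and edges already stored past the `after` cursor are replaced by the incoming page.

```javascript
// Hypothetical merge for a `connection` operation; not Relay's algorithm.
// Assumes each edge carries a unique `cursor`.
function mergeConnection(existingEdges, operation) {
  const {arguments: args, edges} = operation.data;
  if (!args || args.after == null) {
    return edges.slice(); // no `after` cursor: treat as the start of the list
  }
  const index = existingEdges.findIndex(edge => edge.cursor === args.after);
  if (index === -1) {
    throw new Error('Cursor not found: ' + args.after);
  }
  // Keep everything up to and including the `after` edge, then append the
  // new page, skipping edges whose cursors are already kept.
  const kept = existingEdges.slice(0, index + 1);
  const seen = new Set(kept.map(edge => edge.cursor));
  return kept.concat(edges.filter(edge => !seen.has(edge.cursor)));
}

// Sample data (made up): two edges already in the store, two arriving.
const existing = [
  {cursor: 'a', node: {$ref: 'friend0'}},
  {cursor: 'fooCursor', node: {$ref: 'friend00'}},
];
const merged = mergeConnection(existing, {
  op: 'connection',
  data: {
    id: '123',
    field: 'friends',
    arguments: {first: 2, after: 'fooCursor'},
    edges: [
      {cursor: 'c1', node: {$ref: 'friend1'}},
      {cursor: 'c2', node: {$ref: 'friend2'}},
    ],
  },
});
console.log(merged.map(edge => edge.cursor).join(',')); // a,fooCursor,c1,c2
```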

EDIT: I updated the issue description with a modified version of this proposal.

josephsavona on 20 Mar 2016

sorry, more questions...

Where do you envision the translation from GraphQL query + payload into GraphMode happening?
Any thoughts on how this affects tracking queries -- or is that not related at all? When I say tracking I'm thinking about two scenarios which may not really be tracking (I am hazy on this part of Relay): diff'ing a query and intersecting the fat query. For diff'ing a query isn't type information necessary due to polymorphic fields? For instance just because field age is in the store doesn't mean ... on User { age } is satisfied by the store? or maybe it does? And the second thing is the fat query -- if I insert data directly into the store without the corresponding query wouldn't it be at risk of not being considered intersecting the fat query which could lead to stale data?

thanks!

eyston on 21 Mar 2016

Where do you envision the translation from GraphQL query + payload into GraphMode happening?

For the foreseeable future this transform would happen on the client, possibly on another thread.

Any thoughts on how this affects tracking queries -- or is that not related at all? ... if I insert data directly into the store without the corresponding query wouldn't it be at risk of not being considered intersecting the fat query which could lead to stale data?

Yes, inserting data w/o a query could lead to stale data with the _current_ approach to diffing and mutations. To prevent this, initially only Relay internals will use GraphMode, and we will use a pre/post traversal to update tracked queries along with every payload. We're also exploring an alternate approach to constructing mutation queries that avoids the need to store tracked queries.

josephsavona on 21 Mar 2016

👍1

operations _similar_ to those in JavaScript Object Notation (JSON) Patch but with semantics specifically tailored to describing GraphQL/Relay object graphs

I'm a bit worried about the potential confusion caused by making something that is similar-but-still-different. What's the value of getting rid of $data/$type special keys (but still keeping $ref) if it's only to move to something that isn't actually JSON Patch? We've gotten rid of two special keys, but only at the cost of adding two custom op values.

wincent on 21 Mar 2016