Apollo-tooling: Proposal: Types aren't generated without operations defined

Created on 1 Aug 2017 · 18 Comments · Source: apollographql/apollo-tooling

When a GraphQL file contains no operation definitions (or operations are server side and defined under Query/Mutation on schema) then apollo-codegen outputs a blank file.

In the apollo-codegen compiler it looks for operations in order to find "used types" -- however, there won't be any operations (e.g. mutation/subscription/query) in a server-side graphql file unless they're defined under a type that's assigned to the "schema" definition. apollo-codegen doesn't look for operations via the "schema" tag and won't emit any code even if the JSON schema file has the operations defined in it.

Example to reproduce:

type Query {
version:String
}

type Mutation {
version:String
}

schema {
query: Query
mutation: Mutation
}

$(npm bin)/apollo-codegen introspect-schema schema.graphql --output ./schema.json
$(npm bin)/apollo-codegen generate schema.graphql --target typescript --schema ./schema.json --output ./schema.ts

This seems like it would be a basic scenario for apollo-codegen so the graphqls files can be used within the project itself for type safety and not just by a client who might need to introspect it on the consumer side of an API.

Perhaps an option to generate regardless of finding a type in use or otherwise support locating operations via server-side definition rather than client-side introspection.
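
For the reproduction schema above, the kind of output this proposal asks for might look like the sketch below. The interface names (QueryRoot, MutationRoot) and the mapping of nullable fields are assumptions made for illustration; this is not output from any released apollo-codegen version.

```typescript
// Hypothetical schema-derived output for the reproduction schema above.
// `version: String` is nullable in GraphQL, so it maps to `string | null`.
interface QueryRoot {
  version: string | null;
}

interface MutationRoot {
  version: string | null;
}

// With such types available, server code can be checked against the
// schema even though no client operation exists anywhere.
const queryResult: QueryRoot = { version: "1.0.0" };
const mutationResult: MutationRoot = { version: null };
```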

cli feature


matthewerwin

👍10

Most helpful comment

Server-side types are also my biggest wish atm, especially for isomorphic services in TypeScript. And if the generator eventually expands to other languages, so you can keep the server in sync/consistent with the client, based on the schema itself.

@matthewerwin did you end up dropping the work, or has there been any progress since?

Tehnix on 8 Jun 2018

👍4

All 18 comments

@matthewerwin codegen focuses on generating types for the client side so you obviously need operations in order to do so. We may reconsider your proposal after shipping the 1.0 milestone. I'm going to keep the issue open in order to track a proposal.

rricard on 2 Aug 2017

@rricard thanks for the follow up. I have almost finished writing it and can issue a PR. I've gated any new logic with flags passed in via CLI to avoid side-effects against existing functionality.

The server schema (--schema) is required in apollo-codegen generate (i.e. via previous introspection) so it's surprising that this information isn't then used to properly set up the generated types using the object definitions and return types. apollo-codegen v0.16.2 doesn't output any custom GraphQLObjectType structures which results in generating all the fields (vs the proper type). These are of course limitations if you ONLY had the client scripts (GQL) and didn't know or couldn't introspect the server schema, but the "generate" function requires the schema, so much more sophistication is possible.

Here is what I've added thus far:

Option to generate ALL custom struct/enum types regardless of whether they're used in a query/mutation/subscription
Option to generate USED custom struct/enum types based on an operation present (even if defined only in server schema) that references them
Option to use interfaces instead of types
Option for Pascal-case interface/type names
Option to generate Typescript using ONLY server-side schema (i.e. no "operations" needed or previous introspection)
Option for schema to be glob to support case where server splits across files
Support for "extend type"
Option to include any arguments defined by the --schema and utilizes the schema defined return type instead of the expanded list of properties.

A substantial benefit to this is that on the server-side we can use a file generated via this tool against graphql files and the output file can be used on the resolver Typescript files and eliminate possible bugs trying to keep the schema and the resolvers in sync (right now that's a runtime error thrown by apollo-server if no match is found and no error when args don't match...you just get 'undefined').
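
As a rough illustration of that benefit, the sketch below types a resolver map against schema-derived interfaces. Every name here (RoleRecord, GetRoleArgs, QueryResolvers, and the getRole field itself) is hypothetical and stands in for whatever a generator would emit; the resolver signature is simplified and is not the apollo-server resolver signature.

```typescript
// Hypothetical generated types for a schema field such as
//   getRole(id: ID!): Role
interface RoleRecord {
  id: string;
  name: string;
}

interface GetRoleArgs {
  id: string;
}

// Hypothetical generated resolver contract: argument and return types both
// come from the schema, so a misspelled argument or a missing resolver is a
// compile-time error rather than an `undefined` at runtime.
interface QueryResolvers {
  getRole(args: GetRoleArgs): RoleRecord | null;
}

const roleTable: RoleRecord[] = [{ id: "1", name: "admin" }];

const resolvers: QueryResolvers = {
  getRole: (args) => roleTable.find((role) => role.id === args.id) || null,
};
```

Renaming `id` in the schema and regenerating would make `args.id` fail to compile, which is the synchronization guarantee described above.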

Obviously this could also be easily expanded to auto-generate a proxy pattern for clients (e.g. an Angular service), thereby keeping everything in sync server-to-client without just getting a compilation error b/c your interface/type file changed and then manually having to correct it.

Anyways, the PR will be in within a few days and if you guys want to merge before/after 1.0 milestone is up to you.

matthewerwin on 3 Aug 2017

@matthewerwin: Trying to make codegen useful on the server sounds awesome!

It would be great to hear more about the changes you've been working on however. I'm in the process of making substantial changes to the compiler, so I want to make sure we don't run into conflicts that would prevent us from using your work. The compiler and IR are very much geared towards the client-side use case, so I suspect it makes sense to split things up if we also want to target the server.

I'm a little confused by some of your comments:

Option to generate USED custom struct/enum types based on an operation present (even if defined only in server schema) that references them

What do you mean by an operation only present in the server schema? It may just be a matter of terminology, but operations (queries, mutations) are not part of the schema and are always specified by the client.

The server schema (--schema) is required in apollo-codegen generate (i.e. via previous introspection) so it's surprising that this information isn't then used to properly set up the generated types using the object definitions and return types. apollo-codegen v0.16.2 doesn't output any custom GraphQLObjectType structures which results in generating all the fields (vs the proper type). These are of course limitations if you ONLY had the client scripts (GQL) and didn't know or couldn't introspect the server schema, but the "generate" function requires the schema, so much more sophistication is possible.

It's not really a matter of sophistication, but a conscious choice to generate types for the client use case. We don't want to generate types based on all fields defined in the schema, but we generate query-specific types that give you type safety for results.

For example, we'd want you to only access name and friend's names from the results of this query, even though the schema may contain dozens of fields for those types:

query Hero {
  hero {
    name
    friends {
      name
    }
  }
}
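
A query-specific type for the query above might look roughly like this. The type name HeroQuery and the decision to treat every field as non-null are simplifying assumptions for illustration, not actual apollo-codegen output.

```typescript
// Only the fields selected by the query appear in the type, however many
// fields the schema defines on the underlying object types.
type HeroQuery = {
  hero: {
    name: string;
    friends: Array<{ name: string }>;
  };
};

// Accessing anything other than `name` on the hero or on a friend would be
// rejected by the compiler, because the query never asked for it.
function friendNames(result: HeroQuery): string[] {
  return result.hero.friends.map((friend) => friend.name);
}

const sampleResult: HeroQuery = {
  hero: { name: "R2-D2", friends: [{ name: "Luke" }, { name: "Leia" }] },
};
```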

martijnwalraven on 3 Aug 2017

@martijnwalraven thanks for the feedback. Most of the changes are outside the IR -- i.e. if the client operations aren't supplied then they're just generated off the schema (so in the case operations are supplied it works only generating as you mentioned above)...in the expanded case it generates interfaces for all operations/types. The enhancement to the above is that the client will get the proper type rather than a separate type for every operation (e.g. getAllX():[x] and getX():x will use the same type). Since x will already either have nullable fields or not, the generated type will work even if the operation is limited to one or two fields.

To add clarity on the statement you're confused by. In essence that would be kind of like tree-shaking -- if types are defined in the graphql files on the server which aren't referenced by any Query/Mutation/Subscription then the codegen tool won't output them. The operations (client-side) are defined by the schema (server-side) when the entry point is detected (i.e. "schema { query:Query, mutation: Mutation, subscription: Subscription }")
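
The tree-shaking idea can be sketched as a reachability walk from the root types named in the schema block. The SchemaGraph shape below is a deliberately simplified stand-in (type name mapped to the type names its fields return) and is an assumption for illustration, not how apollo-codegen or the linked branch represents a schema.

```typescript
// Simplified stand-in for a schema: type name -> names of the types its
// fields return. Scalars are simply left out of the map.
type SchemaGraph = Record<string, string[]>;

// Collect every type reachable from the given root types.
function reachableTypes(schema: SchemaGraph, roots: string[]): Set<string> {
  const seen = new Set<string>();
  const stack = [...roots];
  while (stack.length > 0) {
    const name = stack.pop() as string;
    const known = Object.prototype.hasOwnProperty.call(schema, name);
    if (seen.has(name) || !known) continue;
    seen.add(name);
    for (const next of schema[name]) stack.push(next);
  }
  return seen;
}

const schemaGraph: SchemaGraph = {
  Query: ["Hero"],
  Mutation: ["Review"],
  Hero: ["Hero", "Episode"],
  Episode: [],
  Review: ["Episode"],
  Unused: ["Episode"], // never referenced from a root type
};

const usedTypes = reachableTypes(schemaGraph, ["Query", "Mutation"]);
```

Here `Unused` is defined but never reachable from Query or Mutation, so a generator following this rule would skip it.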

Because of that you can match any supplied operation (i.e. "query Hero { hero { ... } }") to the associated implementation and thereby get the full-signature (return type, nullability, etc) that isn't present in an operation. That massively improves the quality of the generated code.
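
A minimal sketch of that matching step, assuming the root fields of the schema have already been indexed by operation kind. The RootFieldSignature and RootFieldIndex shapes and the findRootField helper are hypothetical and are not taken from apollo-codegen or from the branch discussed in this thread.

```typescript
// Hypothetical, simplified view of a root field as declared in the schema.
interface RootFieldSignature {
  args: Record<string, string>; // argument name -> GraphQL type
  returnType: string; // GraphQL return type, e.g. "Character"
}

type OperationKind = "query" | "mutation" | "subscription";
type RootFieldIndex = Record<OperationKind, Record<string, RootFieldSignature>>;

// Find the schema-side signature for the top-level field an operation selects.
function findRootField(
  index: RootFieldIndex,
  kind: OperationKind,
  fieldName: string
): RootFieldSignature | undefined {
  const fields = index[kind];
  return Object.prototype.hasOwnProperty.call(fields, fieldName)
    ? fields[fieldName]
    : undefined;
}

const rootFields: RootFieldIndex = {
  query: { hero: { args: { episode: "Episode" }, returnType: "Character" } },
  mutation: {},
  subscription: {},
};

// `query Hero { hero { ... } }` selects the root field `hero`.
const heroSignature = findRootField(rootFields, "query", "hero");
```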

Most of the IR/Compiler pieces already pass the schema, operations & options so my code is simply additional methods rather than modifying existing code.

Does that clarify?

matthewerwin on 3 Aug 2017

in the expanded case it generates interfaces for all operations/types. The enhancement to the above is that the client will get the proper type rather than a separate type for every operation (e.g. getAllX():[x] and getX():x will use the same type). Since x will already either have nullable fields or not, the generated type will work even if the operation is limited to one or two fields.

I think the term operation is a bit confusing here, because operations in GraphQL mean queries or mutations (or subscriptions) and are always specified by the client. I think you want to say field instead of operation here.

Maybe I misunderstand what you mean, but it seems generating types from the schema requires you to make all fields optional in the generated code, even fields that aren't nullable in the schema. That's because it depends on the particular query which fields are selected. That isn't necessarily a problem, but it does force you to deal with the possibility of a field being undefined everywhere.
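
To make that concern concrete, here is a small sketch of a schema-derived type used on the client. The SchemaHero interface and its fields are assumptions for illustration only.

```typescript
// Hypothetical schema-derived type. Even if `name` is non-null in the schema,
// it has to be optional here, because a given query may not select it.
interface SchemaHero {
  name?: string;
  appearsIn?: string[];
}

// Every access needs a guard, whether or not the query at hand selected the field.
function heroLabel(hero: SchemaHero): string {
  return hero.name !== undefined ? hero.name : "(name not selected)";
}
```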

martijnwalraven on 4 Aug 2017

👍1

@martijnwalraven hopefully the code will clarify. Yes, in the schema definition the "operations" are field definitions but the operations (client-side) can be auto-generated from the schema.

I'm not issuing a PR b/c I haven't written in all the test cases but you can see here what's been done by comparing against master branch:

https://github.com/snaptech/apollo-codegen/tree/server-schema-support

Of particular relevance to our discussion are the methods createDocumentFromServerSchema and createClientOperationFromSchema in generate.ts, which make supplying client-schema optional...i.e. you can use this tool now to generate interfaces for the GraphQL server for use with the resolvers.

There is also a new method called getOperationSchemaDef( ) in compilation.ts which looks up the matching server-side schema definition for a client-side operation, thereby allowing type re-use rather than generating separate types for each custom selection of fields in the operations.

Thanks for taking the time to follow what I'm saying. I made sure it had all your latest from /master. Happy to incorporate any thoughts/feedback especially around making any of these changes compatible with the new IR flow you're working on.

As an aside, if you do pull my code down to experiment with it the following will quickly show you what's possible related to the server-side schema:

apollo-codegen generate --schema ./**/*.graphqls --output ./graphql.schema.ts --target typescript --server-schema-only

You can install my build directly using the following:

npm install [email protected] --registry https://myget.org/F/snaptech/npm

matthewerwin on 5 Aug 2017

I still feel I'm missing something, but why go through the trouble of synthesizing operations if we could generate server-side types from the schema directly? Wouldn't that be easier and less error prone?

martijnwalraven on 7 Aug 2017

@martijnwalraven certainly you can generate types for both server/client from the schema. However, the tool is architecturally set up as you described earlier to use client-side operations -- writing it in the way I did allowed both scenarios to follow the same general code-path and gaining benefits in situations which are not all-client or all-server.

For our situation we will be using this tool to auto-generate interfaces to both make the server resolvers (and downstream methods) strongly-typed as well as output custom client operations & auto-generated Angular v4 services to wrap them (i.e. variations on fields requested are config-driven). Obviously these scenarios are different from the React-style fragment composition approach where generating client-side beyond the type definition would be difficult due to the spread of fragments across files. I still would make the case there that using the proper type from the schema is better than custom generating a type for each operation as apollo-codegen does now b/c with composition you really can't say what combination of fields will be requested when dealing with the composition approach especially if it's dynamic.

Back on your question though:
If client operations are supplied then the schema should limit its generated file accordingly. In other words, the operations+schema are complementary to each other for many scenarios vs a full dump of server schema, and just writing a cleaner "server only" scenario doesn't give flexibility for some more powerful scenarios.

Notably compileToIR() requires both schema and an operations DocumentNode to even be useful so having access to any of the intelligence and capability downstream of the IR requires the operations to be generated. As for error-prone...GraphQL operation generation is pretty clean especially since fragments are unnecessary (i.e. it's a two-function implementation) and as I mention above will be very useful for auto-generating operations in a config-driven manner. As schema morphs over time this significantly cuts down programming maintenance and errors in client software due to gql(``) templates becoming out of sync with the API in a centralized api-interface approach vs the composition approach.

matthewerwin on 7 Aug 2017

I have a hard time understanding your use case, but I get the feeling you're using GraphQL in a very different way from most people. In particular, I'm wondering what you mean by 'auto-generating operations in a config-driven manner'.

To clarify where I'm coming from, the usual expectation is that clients write GraphQL queries, recursively selecting just the fields they need. Queries may also contain fragments, which are reusable pieces of data that among other things could represent the data needs of a particular component. The types apollo-codegen generates are specific to queries because that allows us to type check data access, as we know exactly what data will be returned for a particular query.

Obviously these scenarios are different from the React-style fragment composition approach where generating client-side beyond the type definition would be difficult due to the spread of fragments across files. I still would make the case there that using the proper type from the schema is better than custom generating a type for each operation as apollo-codegen does now b/c with composition you really can't say what combination of fields will be requested when dealing with the composition approach especially if it's dynamic.

I don't know what you mean by 'you really can't say what combination of fields will be requested', because that is exactly what GraphQL queries give you. Fragment composition semantics are well defined, and knowing what merged fields are selected (for particular runtime object types) is exactly what apollo-codegen has been designed for.

As schema morphs over time this significantly cuts down programming maintenance and errors in client software due to gql(``) templates becoming out of sync with the API in a centralized api-interface approach vs the composition approach.

Could you explain what you mean by this? Both gql templates and .graphql files are usually validated as part of your development workflow, so that means they are checked against the schema and you get descriptive error messages in case a schema introduces backwards incompatible changes (which is not recommended, but may still happen, especially in development).

martijnwalraven on 8 Aug 2017

Thanks for taking the time to try to understand. Also, just noticed you're from the Netherlands - one of my favorite places on earth!

Maybe I'm misunderstanding so thank you for your patience. I really should've separated my suggestion to use server-types versus custom-types into a separate topic/issue from this proposal to add support for proxy pattern and server-schema only.

So first, addressing custom-types/server types and I will add a second comment below to cover the proxy pattern consideration.

Below is purely a suggestion about the current implementation related to type generation.

When I say 'you really can't say what combination of fields will be requested' I'm referring to React/Relay composition. Fragments are spread across JSX (or external graphql files) and depending on what components are visible/active in a rendered screen can change what fragments (and therefore fields) make up the final query. This is what prevents overfetching/underfetching in the Relay design -- generating custom types doesn't seem to offer anything here versus using the full types defined by the server schema. We wouldn't know at the time apollo-codegen runs what query may ultimately be assembled for a particular state of the application.

In that case (as apollo-codegen does today):
https://github.com/apollographql/apollo-codegen#typescript-and-flow

Why output:

export type CharactersQuery = {
  characters: Array<{
    __typename: 'Human',
    name: string,
    homePlanet: ?string
  } | {
    __typename: 'Droid',
    name: string,
    primaryFunction: ?string
  }>
}

When you could output:

interface Character {
  name: String!
}

interface Human extends Character {
  homePlanet: String
}

interface Droid extends Character {
  primaryFunction: String
}
export type CharactersQuery = {
   characters:Array<Character>
}

or for that last part:

export type CharactersQuery = {
   characters:Array<Human|Droid>
}

depending on how it's defined at the server/introspection.

Then it wouldn't matter in which situation the query is composed: you can always be certain that the same type definition is used everywhere, rather than generating a different definition when the "primaryFunction" field is included in one operation and not in another.

Related to that it seems interfaces would be preferable to types in the generated code giving you access to extends/implements and merged declarations which static types don't have.
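
A short sketch of the two interface features mentioned here, using hypothetical names (CharacterBase, HumanCharacter, heightInMeters). For comparison, TypeScript type aliases can be combined with intersections (`A & B`), but an alias cannot be reopened by a second declaration the way an interface can.

```typescript
interface CharacterBase {
  name: string;
}

// `extends` reuses the base definition.
interface HumanCharacter extends CharacterBase {
  homePlanet: string | null;
}

// Declaration merging: a second declaration with the same name adds members
// to the existing interface instead of conflicting with it.
interface HumanCharacter {
  heightInMeters?: number;
}

const luke: HumanCharacter = {
  name: "Luke",
  homePlanet: "Tatooine",
  heightInMeters: 1.72,
};
```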

All this should really be a separate discussion from the proxy pattern or ability to generate code without supplying operations via CLI.

matthewerwin on 8 Aug 2017

Back on the primary topic! :-)
Proxy pattern vs composition (in essence Angular service/separation of concerns approach vs React fragments embedded in JSX approach).

This I suppose is "using GraphQL in a very different way" as you describe which is subject to overfetching/underfetching but a more direct replacement for REST and definitely a separation-of-concerns style implementation. In effect all GQL operations are then defined within services (Angular instead of React) versus scattered across the codebase (i.e. you wouldn't embed GQL fragments in the components unlike JSX...though you could with reflection). For that case it's possible to generate the service itself (including all GQL operations) based on a configuration file. The config file could dictate anything we wanted to customize including adding directives @include(if:) onto specific non-scalar sub-types or even function argument-driven composition of fragments. Again overfetching is possible here but also the data/api layer is cleanly abstracted, cache-busting is possible, generating the bulk of client-side code is possible. Both models have their advantages for different enterprise solutions and code maintenance vs data-efficiency.

Here would be an example of client side-generated code:

    interface RolePermission {
      id: string,
      role_id?: string | null,
      feature_id?: string | null,
      permission_id?: string | null,
      key?: string | null,
      name?: string | null,
      utc_date_created?: string | null,
      utc_date_deactivated?: string | null,
    }

    interface Role {
      id: string,
      key: string,
      name: string,
      app_id: string,
      feature_id?: string | null,
      instance_id?: string | null,
      template: boolean,
      default: boolean,
      permissions?: Array<RolePermission | null> | null,
      utc_date_created: string,
      utc_date_deactivated?: string | null
    }

    interface GetRolesQuery {
      getRoles(appKey:string | null,featureKey:string | null,instanceId:string | null) : Observable<ApolloQueryResult<Role|null>>
    };

    class GqlEndpoints {
      static getRoles = gql
        `query Security_getRoles($appKey:String,$featureKey:String, $instanceId:ID, $w_permissions:Boolean!) {
            getRoles(appKey:$appKey, featureKey:$featureKey, instanceId:$instanceId) {
              id, key, name, app_id, feature_id, instance_id, template, default, utc_date_created, utc_date_deactivated,
              permissions @include(if:$w_permissions) { id, role_id, feature_id, permission_id }
          }
        }`
    }


    @Injectable()
    export class UserService implements GetRolesQuery {
      constructor(private apollo: Apollo) {
      }

      public getRoles(appKey: string | null = null, featureKey: string | null = null, instanceId: string | null = null,
                      w_permissions: boolean = false): Observable<ApolloQueryResult<Role | null>> {
        return this.apollo.query({
          query: GqlEndpoints.getRoles,
          variables: {
            appKey,
            featureKey,
            instanceId,
            w_permissions
          }
        });
      }
    }
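
The selection set inside GqlEndpoints.getRoles above could, in principle, be rendered from a configuration object. The sketch below shows one way that might work; the FieldConfig shape and the renderSelection helper are hypothetical and are not part of apollo-codegen or the linked branch.

```typescript
// Hypothetical config entry describing one selected field.
interface FieldConfig {
  name: string;
  includeIf?: string; // variable name used in @include(if: $variable)
  children?: FieldConfig[]; // nested selection for non-scalar fields
}

// Render a list of field configs as a GraphQL selection (without outer braces).
function renderSelection(fields: FieldConfig[]): string {
  return fields
    .map((field) => {
      const directive = field.includeIf ? ` @include(if: $${field.includeIf})` : "";
      const nested = field.children ? ` { ${renderSelection(field.children)} }` : "";
      return `${field.name}${directive}${nested}`;
    })
    .join(" ");
}

const rolesSelection = renderSelection([
  { name: "id" },
  { name: "name" },
  {
    name: "permissions",
    includeIf: "w_permissions",
    children: [{ name: "id" }, { name: "role_id" }],
  },
]);
```

Because the operation text is derived from configuration, a schema change only requires regenerating, rather than hand-editing each gql template.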

The gist of what I'm getting at is that there are a number of patterns (the React-composition approach, the service/proxy approach, etc.), each of which has merits for different problem sets. Apollo-codegen could provide generated code for both without much enhancement.

Lastly, based on the code I referenced much earlier in this thread, I'm able to generate the full schema definition as interfaces, which then allows us to use them on the API itself for the resolvers and downstream methods. For this, operations/DocumentNode are of course totally unnecessary except to take advantage of everything downstream from the compileToIR() method in the current apollo-codegen implementation.

Sorry for so much content @martijnwalraven. Doing my best to cover the gap in understanding between us. Really appreciate what you've done with the apollo-codegen project -- Apollo as a whole is quite remarkable & we're very grateful to have GraphQL available in Angular v4 and robustly running server-side made possible by this.

matthewerwin on 8 Aug 2017

That takes some time to digest... But I think I can see where our vantage points differ now.

To start with your first point, I'm afraid that is largely based on a misunderstanding:

When I say 'you really can't say what combination of fields will be requested' I'm referring to React/Relay composition. Fragments are spread across JSX (or external graphql files) and depending on what components are visible/active in a rendered screen can change what fragments (and therefore fields) make up the final query. This is what prevents overfetching/underfetching in the Relay design -- generating custom types doesn't seem to offer anything here versus using the full types defined by the server schema. We wouldn't know at the time apollo-codegen runs what query may ultimately be assembled for a particular state of the application.

While Relay 1 did rely on dynamically composing queries, Apollo Client never has. And Relay Modern has also abandoned that approach. So queries in modern clients are always statically analyzable. Fragments are included through fragment spreads as part of the query syntax, not based on runtime information.

That means we always know exactly what fields are requested when we generate the types, and that is what gives query-specific types their power. We can use the type system to make sure we only access data that has been requested by a particular query.

The problem with outputting something like:

interface Character {
  name: String!
}

interface Human extends Character {
  homePlanet: String
}

interface Droid extends Character {
  primaryFunction: String
}
export type CharactersQuery = {
   characters:Array<Character>
}

is that this doesn't reflect the reality of the data we get back from the server (or the local cache for that matter). All properties would have to be optional, because we can never be sure they are requested in a particular query. And in practice, schema types could include dozens (or hundreds!) of fields. What's even more important is that fields may take arguments, and we can use aliases to include a field multiple times. Those are just some of the subtleties that disallow generating a faithful representation of a query result from the schema.

martijnwalraven on 9 Aug 2017

The requirement to use operations to generate client-side types seems largely orthogonal to where you decide to put your queries. You don't have to put your queries in individual components, I can see why a service layer makes sense for some approaches. But you would still need to define the queries in your service layer, and you can use those definitions to generate query-specific types (or even complete service classes).

I also see the benefit of generating types for the server side, but the requirements seem totally different from those of the client, and I don't think there's a good reason to attempt to reuse too much of the code. So I would suggest generating resolver types directly from the schema.

martijnwalraven on 9 Aug 2017

Agreed, using operations for generating interfaces is orthogonal at best and superfluous at worst. For me it was of practical utility to generate them to make use of compileToIR, which was operations-based, in line with the original design of apollo-codegen.

Perhaps I will re-visit when the IR-design changes you spoke of are complete if you're not okay making these types of changes to the codebase (or don't feel it's in line with apollo-codegen purpose). Currently it's doing a great job for us internally (pre-build step on the server/client) but as you mention may be worth streamlining without code-reuse. I think it's nice though that I was able to take advantage of the IR and code-generation for the server without much interference with the existing code-base.

Thank you for highlighting some of the edge cases; very helpful for me! Here are my quick thoughts on that:

Aliases
These are just a custom container type. Use of common server types would still be preferable except for the alias structure.

{
  empireHero: hero(episode: EMPIRE) {
    name
  }
  jediHero: hero(episode: JEDI) {
    name
  }
}

is still just

interface Hero {
    name: String,
   ...
}
interface MyAliasedOperation {
    empireHero: Hero,
    jediHero: Hero
}

No matter how nested or crazy things get the types defined by the server would still add value & the additions needed related to the operations are the "custom types" required when aliases are in use.

Colocation
Relay Modern uses a compiler to address co-location issues but that is on the basis that components are always statically assembled (a design choice). I can see how static query assembly is the primary focus of many tools because it's a great initial target vs worrying about edge cases (dynamic/attributed/reflection-based). I believe a big consideration there too is that statically compiled/centralized queries eliminate issues with massive GQL queries being passed vs a token representing them (i.e. a mobile concern) though not an enforcement by Relay Modern (yet). I'm glad to see everyone moving towards statically typed data & thanks for highlighting this related to my comments around combinations of fields.

Field Masking
This is a very valuable consideration that you mention. Should a component have intellisense showing fields visible that aren't specified by a fragment/query relevant to that component? I think the data protection side is taken care of (only fields the component fragment specified will be populated). There is a trade-off either way. Using custom interfaces you then have to coerce/cast and suffer a lack of readability distinguishing common types when dealing with more complex programming scenarios. For common types you suffer from having to mark all the properties optional, and the programmer must deal with potential nulls if the assumption is that the field is present when it's not queried by the fragment.

Specific to the Typescript output you could use the server type (without all fields set to "?") and then provide:

type Partial<T> = {
    [P in keyof T]?: T[P];
}
type CustomHero = Partial<Hero>;

for cases where client-side wishes to treat some required fields as nullable.
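
Partial ships with TypeScript's standard library, so it does not need to be declared by hand. The built-in Pick utility offers a related option that sits between the two positions in this thread: reuse the server type but narrow it to the fields an operation selects. The sketch below applies that to the aliased hero query from earlier in this comment; ServerHero, HeroNameOnly, and AliasedHeroes are hypothetical names, and this is one possible approach rather than something apollo-codegen does.

```typescript
// Hypothetical server-side type with schema nullability preserved.
interface ServerHero {
  id: string;
  name: string;
  appearsIn: string[];
}

// View of the server type limited to the fields the operation selects.
type HeroNameOnly = Pick<ServerHero, "name">;

// Container type for the aliased operation: one property per alias.
interface AliasedHeroes {
  empireHero: HeroNameOnly;
  jediHero: HeroNameOnly;
}

function heroNames(result: AliasedHeroes): string[] {
  return [result.empireHero.name, result.jediHero.name];
}

const aliasedResult: AliasedHeroes = {
  empireHero: { name: "Luke Skywalker" },
  jediHero: { name: "R2-D2" },
};
```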

Ultimately, I'd love to see apollo-codegen cater to different design patterns including both client operations & server resolver code generation. Perhaps an unopinionated approach that supports different output based on combination of flags passed in -- much like generating Typescript, Flow, Swift for different target platforms. The end-user dictates whether they want all custom-interfaces or to prefer full server-interfaces.

matthewerwin on 9 Aug 2017

I agree with @matthewerwin and would really appreciate the server resolver types - I'm implementing the backend with TypeScript, too and therefore those types would help me a lot.

blissi on 1 Jan 2018

I too came here seeking server-side type support. Writing the graphql server in typescript doesn't seem that unusual these days, and having to manually define typescript types for resolver function inputs and return values is redundant given a strongly typed graphql schema.

Defining types based on operations is exactly what I need for the client, and I hope that doesn't change. But hopefully it is clear why generating an exhaustive set of types is also useful.