I'm building a new source plugin for Sanity.io, and I'm trying to find a suitable workaround for the problem of missing fields and node types. This has been discussed in other issues, such as #3344, but given that we have a GraphQL schema available, I thought maybe there was a way to "trick" Gatsby and solve this in a less generic way than what those discussions usually revolve around.
So far, I've worked on several different paths:
addThirdPartySchema
This works, but you miss out on certain features. For instance, date formatting is applied in the inferring step, so date fields defined this way don't get the formatting arguments. If the type-date.js file was exported, one might be able to grab the defined args for the type and apply them to date fields in the schema.
Also, the usual node fields are no longer queryable - id, parent, children, internal etc. I _suppose_ one could define a GatsbyInternal schema type and define these fields explicitly, but obviously one would have to keep this up to date with any new fields added to Gatsby. Perhaps Gatsby could define a type for this so we could reference it?
This approach also does not solve the case of missing collections. For instance, if I know there is a blogPost type but there are no blog posts created yet, I would still like to be able to query for all blog posts, but currently the allBlogPost query field is not added without there being one or more nodes of that type available.
setFieldsOnGraphQLNodeType
By adding the missing fields in this method, you _might_ be able to improve the situation, but you would have to do quite a bit of lifting yourself (figure out all the defined fields, find the ones that are missing, create GraphQL fields and schema types for them...)
This approach obviously only works for node types that are present, and you once again miss out on things applied in the inferring step (this is a problem since queries using an argument applied in the inferring step will not work when the field is manually created here)
Mocking a node for each schema type with all possible fields and adding the node in sourceNodes, then removing the node once the schema has been generated.
The problem with this approach is that I can't find any lifecycle hooks between the point where the schema has been generated and when pages are created - so you end up returning the mock node when querying for data while building pages.
If you delete the node prior to the page build, pages fail because the collection queries for the schema type no longer run (allSomething is not added to the query anymore).
Is there anyone with enough insight on the Gatsby codebase that might be able to find any other approaches, or figure out what we need to put in place for something like this to be a reality?
I was informed of https://github.com/gatsbyjs/gatsby/issues/4261#issuecomment-442549881 which would easily solve all of these issues. It's a bit larger in scope than what I outlined here, so while I hope to be proven wrong, I suspect it might take a while to land. If anyone can come up with a simple stopgap solution, I'd be interested in hearing your thoughts!
You have listed pretty much all currently available options. There currently isn't a nice way to do this - the issue you linked and https://github.com/gatsbyjs/gatsby/issues/4261 are tickets that, among other things, document what we need to do to make it possible.
EDIT (27 Aug 2020): Please don't use the solution listed below - use https://www.gatsbyjs.com/docs/schema-customization/ instead
But with some hackery (not advised), you can make mocked nodes work - see https://github.com/pieh/gatsby-mocked-nodes/blob/master/gatsby-node.js for a minimal example. As noted in the comments there, it relies on some private Gatsby APIs and on an "accidental" API hook (onPreExtractQueries), but it kind of does work.

@pieh You are a lifesaver! This hack solved all my problems!
I'm grabbing the GQL schema from a remote source in onPreBootstrap, then generating example values for all the types using easygraphql-mock and caching them. In sourceNodes and onPreExtractQueries, I pick up those values and create/delete the nodes.
I make sure to check if emitter is returned to me before I attempt to use it. Worst case scenario, you're back to missing fields, or you'll have an additional node you have to remove. I think this will work nicely until the new API is in place so we can do this in a more proper fashion. Thanks a bunch <3
For anyone that lands on this issue now - please don't use the solution (or rather hacky workaround) I provided a few comments above ( https://github.com/gatsbyjs/gatsby/issues/10856#issuecomment-451701011 ). We now (for some time actually) have a proper API to handle this problem ( https://www.gatsbyjs.com/docs/schema-customization/ ) which should be used instead.