Hey. I was looking for a lot of information on how to protect against malicious requests, and as a result I found several common approaches:
All this approaches we can implement by meddleware and custom backend. But it will be cool if this solution is out of the box.
Also, graphene don't have information about security into the documentation.
ps: I can try to help, but if u have no reason why this is a bad idea.
I have recently been looking into exactly these same things. It would be great to have something built in available, and if not, at least some examples of how someone might best implement these with graphene.
@Arfey this is an area where there aren't any official answers in Graphene at the moment but I'll give you my opinion on the approaches that you've listed:
In another issue I've written up some sample code (not thoroughly tested) that will implement a basic max query depth check: https://github.com/graphql-python/graphene/issues/772#issuecomment-400094405 Something similar could be used for query cost calculations but it's unclear to mean how you would determine the cost of particular fields.
I think there is opportunity for experimentation in user space for this.
The backend functionality is also the place to implement any kind of persisted/query whitelisting. I actually don't think this feature is within the scope of the Graphene project though because it's going to be tightly coupled to the rest of your application. You would need to define where the persisted queries are stored, how the front end part of your app adds new queries, how your server interprets static queries etc. Again this is a place where experimentation can happen in userland and Graphene already exposes the right hooks through backends to implement it.
This sounds like a reasonable suggestion and I think it could be implemented. Any thoughts @graphql-python/core @graphql-python/governors ?
@jkimbo @Arfey Solution 4 sounds good. Easy to implement and will address this vulnerability in a timely manner. I don't see a reason why we would need introspection in production. @Arfey mind submitting a PR with this approach?
yes, i will make pull request some time later
Just my completely unasked for $.02:
About 3, I think it would help if the types that are automatically generated (from SQLAlchemy or Django ORM models) had their fields explicitly included instead of explicitly excluded. Much like how Django's own ModeField works.
Instead of
class MyType(SQLAlchemyObjectType):
class Meta:
model = MyModel
exclude_fields = ('scary_field',)
this:
class MyType(SQLAlchemyObjectType):
class Meta:
model = MyModel
include_fields = ('safe_field',)
while making the absence of both include_fields or exclude_fields an error.
Hiding introspection won't help much if the attacker has access to a copy of a client application, as she would be able to guess enough of the schema to write custom queries - just one cycle in the graph is enough for an attack.
But it would be useful for hiding upcoming features from competitors :-)
I use middleware for that right now
class HideIntrospectMiddleware:
"""
This middleware should use for production mode. This class hide the
introspection.
"""
def resolve(self, next, root, info, **args):
if info.field_name == '__schema':
return None
return next(root, info, **args)
@Arfey this is an area where there aren't any official answers in Graphene at the moment but I'll give you my opinion on the approaches that you've listed:
1 and 2: query cost or resource limitations + limiting query depth
In another issue I've written up some sample code (not thoroughly tested) that will implement a basic max query depth check: #772 (comment) Something similar could be used for query cost calculations but it's unclear to mean how you would determine the cost of particular fields.
I think there is opportunity for experimentation in user space for this.
3. query whitelisting
The backend functionality is also the place to implement any kind of persisted/query whitelisting. I actually don't think this feature is within the scope of the Graphene project though because it's going to be tightly coupled to the rest of your application. You would need to define where the persisted queries are stored, how the front end part of your app adds new queries, how your server interprets static queries etc. Again this is a place where experimentation can happen in userland and Graphene already exposes the right hooks through backends to implement it.
4. hide introspection for production mode
This sounds like a reasonable suggestion and I think it could be implemented. Any thoughts @graphql-python/core @graphql-python/governors ?
Hiya, thanks for the sample code but how can I plug this into graphene please? Is it a setting I can add, something like:
GRAPHENE = {
"SCHEMA": "path.to.my.schema",
"BACKENDS": ["path.to.my.backends.DepthAnalysisBackend"],
}
?
I've also tried adding it to my GraphQL view, which gives me an error when I run a query through graphiql (using the core backend)
graphql_views.GraphQLView.as_view(graphiql=True, backend=GraphQLCoreBackend)
the error is:
document_from_string() missing 1 required positional argument: 'document_string'
Update - sorted it out, needed to provide an instance not the class. I'll leave this here in case it helps someone.
so to add the backend, I've done:
`
from .graphql import backends
....
urlpatterns = [
path(
"v1/graphql",
graphql_views.GraphQLView.as_view(
graphiql=True, backend=backends.DepthAnalysisBackend()
)
)
]
My $.02 for how I handle this:
I have different types in the Schema with different related-node-information based on where the type is being used. This artificially sets a query depth which is dependent on the query.
E.g. If authors have one more books:
Author has field books of type AuthorsBookAuthorsBook _doesn't_ have field author, as that would create a cycleThis works for small to medium sized applications and doesn't take much thought/code to get working.
Thanks @thejcannon - agreed, it makes a lot of sense just design the api not to have cycles in it. Still nice to know no one is able to run queries that join many distinct tables on my server.
@jkimbo - can I check something on the code sample you provided please:
...
# We are only interested in queries
if definition.operation != 'query':
continue
...
This suggests we're only interested in queries, but since a mutation could return an object (like an authentication mutation might return a user object) does it not make sense to run this on both? Wondering if there is some subtlety I'm missing.
Thanks
@sandwichsudo that is a good point, I can't remember why I added that line but I think you could probably just omit it and it would still work.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
I am using this validation rule to disable schema introspection:
class NoIntrospection(ValidationRule):
def enter_Field(self, node, key, parent, path, ancestors):
field_name = node.name.value
if field_name == "__schema" or field_name == "__type":
self.context.report_error(
GraphQLError(u"GraphQL introspection is not allowed", [node])
)
@sboisson I found this snippet:
export const introspectionTypes = Object.freeze([
__Schema,
__Directive,
__DirectiveLocation,
__Type,
__Field,
__InputValue,
__EnumValue,
__TypeKind,
]);
Seems like there is more introspection keywords - I think you should be checking for those too.
I think it might be easier to just check for startsWith("__").
On a more general note: More security tools for this would be nice.
Either included or as attachable middleware.
Adding to the initial list:
@FWirtz Nice find!
But checking just for startsWith("__") might be too much.
It seems that__typename is used a lot, by Apollo client for example.
@sboisson yeah I found that catch afterwards as well haha.
I think it would be best to rather check the query name (i.e. root node) whether it starts with __ instead of the fields. Afaik __typename is always just a nested field. Not entirely sure how to do this though.
Any update on this matter? I found this repo of someone trying to work on this, for now it implements the least interesting part of it (limiting query depth). I'm really new to this so I don't know if this could help.
By the way, is it possible to add a middleware directly to a schema instead of having it plugged to schema.execute, I ask this because in some cases (for example when using Graphene with Starlette), we don't have access to schema.execute as it is performed internally. I opened an issue on that question on Starlette side, not sure what's the best solution here.
Can anyone enlighten me why schema introspection is a flaw?
I use graphene-django, and I only expose non-private fields using include_fields. For us, our React frontend dynamically generates Forms using schema introspection.
@melvinkcx it's not strictly a flaw, it's a risk:
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Not stale
there are two projects which aim to solve this problem:
Most helpful comment
I use middleware for that right now