Prisma1: Hide existence of records the API user has no permissions to access

Created on 11 Sep 2017  ·  14Comments  ·  Source: prisma/prisma1

_(I'm new to GraphQL, so please excuse me if what I propose below runs counter to how GraphQL should behave)_

Scenario

Using the Relay API, it is possible to get the count of any type, even if permissions do not give you access to the items you are querying.

For example, in the screenshot below I'm getting the total count of users for an app which has all API access restricted to authenticated users. (Note: the individual user records are correctly hidden.)

screen shot 2017-09-11 at 18 01 00

When used in combination with filters, it is possible to check for the existence of very particular records in the DB. For example, you can check if a user has any account with an app using allUsers(filter: {email: "[email protected]"}) {

screen shot 2017-09-11 at 18 04 26

Impact

  • The existence of user accounts is revealed, which can have privacy and (somewhat dubious) security implications.
  • Depending on the data model of a given app, sensitive business data could be revealed such as number of paying customers.

Proposal

I have read two related proposals (Extend permissions types ; Meta should be its own permission) that help mitigate this problem by adding permissions to more data. However, I don't believe they address the problem that an authenticated user can get the counts of data which matches queries they define, despite not having access to that data.

A better behaviour, in my opinion, would be:

  1. Do not return null edges for nodes which cannot be accessed due to permissions.
  2. Counts should only include notes which can be accessed.

Most helpful comment

Hi @jonathanheron The issue you bring up is valid and your proposal does not run counter to how GraphQL should behave.

But it won't work :-) Let me explain why ...

Permission queries is a powerful abstraction that allow you to specify the permission for a node by traversing related data. Performing this validation has to happen for each node.

If we were to simply filter out null nodes we run into a problem when you have many nodes. Consider a relation with 1M nodes where you have access to the last 10. For us to deliver just the 10 nodes to you, we would have to check permissions on all one million nodes. We are likely to add support for permissions based on serverless functions, which would make this issue even worse.

The same applies for counts.

The way we will be tackling this issue is to allow you as a developer to completely remap your GraphQL API.

A common case will be to remove relation fields, including the top-level allX fields and replace them with custom fields that include extra filtering logic to only return nodes that the user has access to.

I would love to get your take on this solution.

All 14 comments

Hi @jonathanheron The issue you bring up is valid and your proposal does not run counter to how GraphQL should behave.

But it won't work :-) Let me explain why ...

Permission queries is a powerful abstraction that allow you to specify the permission for a node by traversing related data. Performing this validation has to happen for each node.

If we were to simply filter out null nodes we run into a problem when you have many nodes. Consider a relation with 1M nodes where you have access to the last 10. For us to deliver just the 10 nodes to you, we would have to check permissions on all one million nodes. We are likely to add support for permissions based on serverless functions, which would make this issue even worse.

The same applies for counts.

The way we will be tackling this issue is to allow you as a developer to completely remap your GraphQL API.

A common case will be to remove relation fields, including the top-level allX fields and replace them with custom fields that include extra filtering logic to only return nodes that the user has access to.

I would love to get your take on this solution.

Still, there have been a number of different FR's, all coming down to roughly the same question: I don't want to get any results that the user is not permitted to see.

Especially with the introduction of permission functions, applying the same filter to a query as the filter in the PQ is next to impossible, and the current implementation (for both API's) will leave you with a problem.

If I define on the server side which nodes a user is permitted to see, I would simply want no information about other nodes returned. No null values, no counts, nothing.

Every database engine implements all query parts over the entire set, and then applies grouping, top(x), etc.

I don't want to remove allX query fields, I just want them to adhere to the permissions I have set.

@sorenbs Thanks for explaining. I understand now what I was missing in my proposal…

sidneyharris_miracleweb

;-)

Having granular control to rework my GraphQL API would give me a solution to the problem, which is good.

Some thoughts:

👍 I like that your approach keeps things quick to get started out-of-the box.

👍 I like that it will be possible to solve the problem by reworking my GraphQL API to suit my exact needs.


👎 The fact permission queries don't prevent information leaking is not intuitive. I see plenty of scope for subtly leaking data. To echo @kbrandwijk, I would favour:

If I define on the server side which nodes a user is permitted to see, I would simply want no information about other nodes returned. No null values, no counts, nothing. 💯

👎 To take your million nodes example… if there are 999,990 null nodes, what's point in being able to get back 10 proper nodes? I haven't encountered a situation yet where many null values being returned presents a problem for my app. That's likely because I'm typically grabbing parent › list of children › lists of children, and the permissions model of my app is currently quite simple. But I can see how paging through null values on the client side would be a big hassle.

👎 I'm concerned it might take a lot of extra work to get the expected behaviour.

Please also refer to this: https://github.com/graphcool/graphcool/issues/314 for a description of a painful information leak, and a proposal.

It finally occurred to me that I've mis-using permissions.

In my naive, new-to-the-product way of thinking, I've being thinking of permissions as a combination of both _access control_ and _filtering_.

I realise now that if I want to do some thinking like "show me all the _Things_ in a given account", I need to use:

  1. Filters to get just the nodes I want.
  2. Permission queries to ensure someone can't abuse the API to retrieve nodes they shouldn't have access to.

In some ways, this feels like I need to do double-work, but it makes sense now that I think about it.

Suggestion

In your docs you have an example permission query for _"only friends of the author can be collaborators"_

screen shot 2017-09-11 at 22 27 46

I think it would be useful to provide another example for _"only friends of the author can view private posts"_ and include both a permission query and a query with filters to show how this could be implemented in practice.

I realise I'm going a little off-topic. I still don't want to leak null nodes and counts. But maybe this info will be useful for other people who wrongly (like me) thing of permissions as working for both filtering and access control.


Random thought

Perhaps there's a more constrained version of something like permission queries that could work for access control and also ensure no null / count values, by having implicit filters? (Sorry if I'm talking nonsense. I have only half an idea in my head.)

That's what I was referring to, when I said that it's not always possible to 'translate' a PQ into a filter, especially when it's possible to define permission functions.

And yes, that filter only works when you define it in your client. There's no way to prevent a user to query the endpoint directly, so the 'leaking' part is still valid.

While I haven't used Scaphold.io, their "relation base permissions" feature sounds like it ticks some good boxes:

  • There's a UI to set up the relation based permissions. You don't need to write code.
  • The UI constrains you to only create permissions that link the user to the record in question.
  • I _think_ they both filter and restrict permissions with the one solution.

That's the sort of approach I'd love to see Graphcool take. In the meantime, I now know I need to use both filters and permission queries to get the results I want, so my immediate need is taken care of.

@jonathanheron I think it would be valuable to add a way to define filters that are always applied to a query.

Our current understanding is that this will be possible in a GraphQL proxy that wraps your Graphcool data API. But it could be interesting to add a feature right into Graphcool similar to Permission Queries.

I imagine you specify a filter that is merged into every query for that model.

In your example you could define the following serverside filter for the User model:

{id: $current_user}

Then when you issue this query from the client:

{viewer{allUsers{count}}}

It will automatically be transformed to:

{viewer{allUsers(filter:{id: $current_user}){count}}}
And this query:

{viewer{allUsers(filter:{age_gt: 54}){count}}}

would be transformed to:

{viewer{allUsers(filter:{age_gt: 54, id: $current_user}){count}}}

The same filter would be applied in all cases where the User model is queried. Do you think that would be a compelling system?

ps. I got intrigued by your suggestion to check out Scaphold, so I just spent 20 minutes trying to get their relation permissions to work. I just get a Permission Error, no matter what I do. If you can make it work, i'd love to know how it behaves :-)

@jonathanheron @sorenbs I mentioned that relation type here: https://github.com/graphcool/graphcool/issues/276.

@sorenbs How did you try to query the type you set up the relationship for. Because with Scaphold, if you enable Relation permission on a Type, you can no longer query through allXXX. You can only query the type through the parent relation you have defined. The idea behind it is great, but it's basically nothing more than shorthand for:

query {
   SomeParentExists(filter: { userFieldOnParent: $user_id})
}

So if you have an Owner field on Post, and you query:

query {
  allPosts {
     comments
  }
}

You would get all Posts, but only see comments for posts you own.

Ahh, that makes sense. Thanks for the explanation @kbrandwijk!

Changing the query to go through the parent type fixed it.

I actually think it would be really cool to enable server-side filter merging as described above. That's basically a more generic implementation of what Scaphold is providing.

We will have to think hard about how to communicate this feature separately from permission queries, but I think that is doable.

@sorenbs I like the server-side filtering idea, but there's some overlap with permission queries, as they both define 'what data the user can see'. It's not clear to me at this point how those two concepts should interact. I do believe however, that the permission system has other shortcomings, for example that it's whitelisting only, as opposed to a system that specifically allows or denies access. I also feel that the fact that permissions are not inheritable is a shortcoming, and that's something that the inherit or relation permission types address effectively.

What do you mean by inherit permission type?

@sorenbs Typo, relation permissions and inherited permissions is the same idea.

This issue has been moved to graphcool/graphcool-framework.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

schickling picture schickling  ·  3Comments

sorenbs picture sorenbs  ·  3Comments

schickling picture schickling  ·  3Comments

AlessandroAnnini picture AlessandroAnnini  ·  3Comments

sedubois picture sedubois  ·  3Comments