graphql-tools 🚀 - Authorization

Another possible avenue (as suggested by @fubhy) rather than magic .pre/.post properties could be to simply pass a resolver executer, and let the user take care of running validations before/after:

const permissionMap = {
  Post: {
    notes: (resolverExecuter, args, context) => {
      if (!isLoggedIn(context.user)) return false;
      // Execute the resolver
      const data = resolverExecuter();
      return isAuthor(data);
    }
  }
}

resolverExecuter here is a closure around each resolver that takes care of passing the right variables to it and to return the unchanged data back to GraphQL. (otherwise the permissionMap could manipulate the data) It would also keep track of it's executed state, and if it's not been executed and the permission function returned true it'd execute the resolver itself, something to this effect:

let executed = false;
let data;

const resolverExecuter = () => {
  data = field.resolver();
  executed = true;
  return data;
}

if (field.permission(resolverExecuter, args, context)) {
  if (!executed) data = field.resolver();

  return data;
}

That way the developer has full control over when to execute which validations and which data they need, rather than relying on magic properties.

mxstbr on 13 Apr 2017

👍2

Hey @mxstbr, I think that's a really cool idea. I like the first syntax for defining it (permissionsMap), because it's much easier to read. You're right that something like .pre and .post will be needed, which would make it slightly less elegant. An alternative would be to have a permission be defined on a type itself along the lines of:

{
  Query: {
    post: isLoggedIn, // just an example
  }, {
    Post: {
      __self: and(isLoggedIn, or(isAuthor, isPublished)),
      notes: isAuthor,
  },
}

The __self permissions would be applied to every resolver that returns an object of that type, and it would get that object and the context as input, but not the args.
The non-__self permissions would get the usual data that the resolver gets.

I think this could look really neat, and would be great to have in graphql-tools. Do you have time to help us make it happen?

Another approach would be to actually push permissions down to the model layer (which is what FB does internally). We're thinking about a next version of graphql-tools that would organize schema and resolvers a little differently and lets you define the models along with it. I think @thebigredgeek has been thinking about stuff like this, so he might have some thoughts to share here.

Also, checking permissions is just a form of validation. Some validations (as you pointed out) can be done statically, some can only be done during execution. It would be really neat if the static ones could be run before the query even begins execution. I recently had a conversation with @JeffRMoore about this, so it would be really interesting to have his take here as well. I think it could be relatively easy to create a validation rule that checks "static" permissions before executing if we mark them somehow.

helfer on 13 Apr 2017

I like the first syntax for defining it (permissionsMap), because it's much easier to read.

The second syntax is still the same, except rather than having .pre/.post you have a function that lets you do your thing!

__self sounds like a great idea though, I like it.

Do you have time to help us make it happen?

I'm exactly 0 familiar with the codebase, but I'd be happy to dig in seeing as I need to add permissions to my server. :wink: Let's wait and see what @thebigredgeek and @JeffRMoore have to say!

mxstbr on 13 Apr 2017

Yeah those two have been thinking about this a lot!

stubailo on 14 Apr 2017

I saw @mxstbr tweet about this thread and wanted to chime in with how we do auth on the bustle graphQL API. I extracted the code below but this is the general setup:

Add a currentUser object to the context that is passed to graphQL. The setup and auth check for this occurs outside the scope of graphQL (We use a token based auth system)
Extend the field object to include a authorize key. This expects a function. The func signature is the same as resolve
Walk the built schema and wrap any resolvers in the authorize function

import {
  defaultFieldResolver,
  GraphQLID,
  GraphQLNonNull,
  GraphQLError,
  GraphQLObjectType,
  GraphQLString,
  GraphQLSchema
} from 'graphql'

const User = new GraphQLObjectType({
  name: 'User',
  fields: {
    id: {
      type: new GraphQLNonNull(GraphQLID)
    },
    email: {
      authorize: (root, args, context) => {
        if (root.id !== context.currentUser.id) {
          throw new GraphQLError('You are only authorized to access field "email" on your own user')
        }
      },
      type: new GraphQLNonNull(GraphQLString)
    }
  }
})

// Walk every field and swaps out the resolvers if an `authorize` key is present
function authorize (schema) {
  Object.keys(schema._typeMap).forEach((typName) => {
    const typ = schema._typeMap[typName]
    if (!typ._fields) {
      return
    }
    Object.keys(typ._fields).forEach((fieldName) => {
      const field = typ._fields[fieldName]
      if (field.authorize) {
        typ._fields[fieldName] = wrapResolver(field)
      }
    })
  })
  return schema
}

// wrap resolver with authorize function
function wrapResolver (field) {
  const {
    resolve: oldResolver = defaultFieldResolver,
    authorize,
    description: oldDescription
  } = field

  const resolve = async function (root, args, context, info) {
    await authorize(...arguments)
    return oldResolver(...arguments)
  }

  const description = oldDescription || 'This field requires authorization.'
  return { ...field, resolve, description, authorize, isAuthorized: true }
}

const query = new GraphQLObjectType({
  name: 'Query',
  fields: {
    user: {
      type: User,
      args: {
        id: { type: new GraphQLNonNull(GraphQLID) }
      },
      resolve: (root, args, context) => context.db.find('user', args.id)
    }
  }
})

export default authorize(new GraphQLSchema({ query }))

southpolesteve on 14 Apr 2017

👍5

Thanks of chiming in, that's remarkably similar to what I'm thinking! 👌

Seeing that example of using root to avoid having to run authorization checks before and after, maybe we don't even need .pre/.post or anything like that?

const isLoggedIn = (_, _, { user }) => user;
// Check if the user requesting the resource is the author of the post
const isAuthor = (root, _, { user }) => user.id === root.author;

const permissionMap = {
  Query: {
    post: isLoggedIn
  },
  Post: {
    __self: isLoggedIn,
    notes: isAuthor
  }
}

mxstbr on 14 Apr 2017

What are the benefits to using a permissions map compared to, say, composable resolvers that simply throw errors if state preconditions aren't met? This is what I am already doing and it works really well. I don't know if additional API surface area is necessary to solve this challenge

thebigredgeek on 15 Apr 2017

👍6

Sure, one could do that (it's what @fubhy is doing right now and suggested I do) but I found it makes the whole codebase more complex, unnecessarily so.

You can't write resolvers inline anymore, you have to understand how to compose functions (you and I do, but why do the juniors in your team have to?), permission functions are this limbo that kind of live with resolvers but kind of don't etc. (in my opinion, of course. Happy to be proven otherwise!)

It also means you have to define totally unnecessary resolvers for scalar types, as post.notes might just be a string that doesn't need a custom resolver, but now I have to write a custom resolver because I can't handle permissions otherwise:

const resolverMap = {
  Post: {
    // post.notes is just a String, but now I need to define a custom resolver just to add permissions even though GraphQL could get this scalar type perfectly fine without the custom resolver
    notes: (root, args, context) => isAuthor(root, args, context) && root.notes,
  },
};

Resolvers are a mapping from a path (post.notes) to a database query (db.getNotes(postId)) and, kind of like controllers in a MVC system, I've found that keeping them as small as possible helps keeping the codebase easy to grok. (types: output to/input from the user, resolvers: mapping of that input/output to db queries, models: database handling)

Now, one could argue that this means permissions should be pushed down to the model layer (skinny controller, fat model in traditional MVC-speak) and handled there, but that's unwieldy because you have to pass the context and root of the resolver down to the model for every. single. query. in every. single. resolver:

const Post = require('../models/Post');

const resolverMap = {
  Post: {
    // One would have to pass { root, context } to every query 👇 in every resolver
    notes: (root, args, context) => Post.getNotes(args.postId, { root, context }),
  },
};

That (again) adds a lot of complexity when working in the codebase because one has to remember which queries get passed that and which don't, and that you have to pass them all the time.

That's my logic and why I think permissions should live on a separate layer from resolvers and models. I don't know if permissionMap is the best solution, maybe there's a better one, but I feel like there needs to be a different way of doing this.

mxstbr on 15 Apr 2017

but that's unwieldy because you have to pass the context and root of the resolver down to the model for every. single. query. in every. single. resolver:

Why would the root be needed, when resolving the data? I mean, I can only imagine the user (user context) being needed, similar to what Dan Schafer points out in his talk about how Facebook goes about authorization.

Scott

smolinari on 15 Apr 2017

Like @southpolesteve, I saw @mxstbr tweet about this thread. I think the solution given by @southpolesteve is the way to go. However, this is too much dedicated to Authorization. This new layer should be more generic and be able to execute an array of functions whatever their purpose.

To give you an example, on a mutation, sometimes you have to format the incoming data or check if the email is available or not. In this case, you have to execute functions which are not specific to Authorization and we have to execute these functions before the query/mutation resolver.

IMO, we have two levels:

Before the query or mutation resolver
Before the field resolver

I kept the exact same example but I replaced the authorize key by a new one called policies and I'm passing an array of functions which are executed in series before the resolver. Maybe, we should have the exact same thing for executing an array of functions after the resolver, and this solution becomes closest to the pre/post one of @mxstbr.

Note: I haven't tested this code but it's just to give you an idea of how we could implement it.

import {
  defaultFieldResolver,
  GraphQLID,
  GraphQLNonNull,
  GraphQLError,
  GraphQLObjectType,
  GraphQLString,
  GraphQLSchema
} from 'graphql';

import {
  isAuthenticated,
  isOwner
} from './policies';

const User = new GraphQLObjectType({
  name: 'User',
  fields: {
    id: {
      type: new GraphQLNonNull(GraphQLID)
    },
    email: {
      policies: [isOwner],
      type: new GraphQLNonNull(GraphQLString)
    }
  }
})

function policies (schema) {
  Object.keys(schema._typeMap).forEach((typName) => {
    const typ = schema._typeMap[typName]
    if (!typ._fields) {
      return
    }
    // Be able to execute functions before the query/mutation resolver.
    if (typ.policies) {
      typ.resolve = wrapResolver(type);
    }

    Object.keys(typ._fields).forEach((fieldName) => {
      const field = typ._fields[fieldName]
      if (field.policies) {
        // Be able to execute functions before the field resolver.
        typ._fields[fieldName] = wrapResolver(field);
      }
    })
  })
  return schema
}

// wrap resolver with authorize function
function wrapResolver (field) {
  const {
    resolve: oldResolver = defaultFieldResolver,
    policies
  } = field

  const resolve = async function (root, args, context, info) {
    // Execute the array of functions (policies) in series.
    for (let i = 0; i < policies.length; ++i) {
      await policies[i](...arguments);
    } 

    return oldResolver(...arguments)
  }

  return { ...field, resolve, policies}
}

const query = new GraphQLObjectType({
  name: 'Query',
  fields: {
    user: {
      type: User,
      args: {
        id: { type: new GraphQLNonNull(GraphQLID) }
      },
      policies: [isAuthenticated],
      resolve: (root, args, context) => context.db.find('user', args.id)
    }
  }
})

export default policies(new GraphQLSchema({ query }))

The policy (function) could look like this:

// ./policies/isAuthenticated.js
export default (root, args, context, info) => {
  if (root.id !== context.currentUser.id) {
    throw new GraphQLError('You are only authorized to access field "email" on your own user')
  }
}

Aurelsicoko on 15 Apr 2017

Why would the root be needed, when resolving the data?

What about only the author of a post being able to read the notes? (see the OP) Can't do that based on the context and the args alone, you need to have the data for that.

This new layer should be more generic and be able to execute an array of functions whatever their purpose.

I like that idea, rather than being specific to permissions it could be a generic _validation_ layer, no matter if format or permissions or anything else! The API could still stand as discussed above, except renaming it to validationMap?

mxstbr on 15 Apr 2017

What about only the author of a post being able to read the notes? (see the OP) Can't do that based on the context and the args alone, you need to have the data for that.

Hmm... I would have thought the data would be outside the resolver, like Facebook does with DataLoader.

Scott

smolinari on 15 Apr 2017

Sure, that might be the case, but I feel like if we have the data why not pass it in and let people decide how to handle it? Also, not everybody might be using DataLoader etc. (it's also a pretty fine detail of a pretty big API 😉)

mxstbr on 15 Apr 2017

👍1

I like that idea, rather than being specific to permissions it could be a generic validation layer, no matter if format or permissions or anything else! The API could still stand as discussed above, except renaming it to validationMap?

Yes, it makes sense to use a more generic name. validationMap seems great to me!

Based on my personal experience, it was a nightmare to pass the context into the DataLoader because it only accepts one argument... Despite this, I agree with @mxstbr if we have it, we should probably pass the data in the context and let people decide how to handle it.

Aurelsicoko on 15 Apr 2017

Pulling back and looking at this from an overarching 1000 mile up in the sky view, I'm of the opinion that the authorization should be on top of the business logic and not in the API. So, this goes along with what @helfer said in his first post with:

Another approach would be to actually push permissions down to the model layer

My main reasoning for this is based on the assumption that at some point in the evolution of GraphQL (and I'll be money Facebook is doing it now), I believe the creation of the API will be mostly an automated process. Having permissions mixed in the API layer would make that automation programming a lot more difficult.

This graphic is best practice according to the makers (and how I see it too):

Also, in this great article from @helfer, he also says this:

Permission checks are best implemented on nodes

Permission checks are best separated from data fetching logic to make sure they are applied consistently, regardless of how the data is fetched.

I'd also actually improve number 2.

Permission checks are best separated from data fetching and API logic to make sure they are applied consistently, regardless of how the data is fetched or requested.

Which basically means, the authorization could be available for any type of API (as the image shows too) and not be concerned with data fetching concerns.

Such "business details" as authorization is opinionation and again, should be avoided in any API. Sure, it means no standard for authorization within (Apollo's) GraphQL implementation. But, that is, again in my opinion, a good thing.

What should first be agreed upon is where authorization should take place or where it shouldn't. I say, it should definitely not be within the GraphQL API layer.

That is just my (last) 2 cents. 😄

Scott

smolinari on 16 Apr 2017

👍2

In that graphic you posted there's two layers:

Business logic layer
Persistence layer

The "persistence layer" are the models, they persist stuff to the database. Pushing the business logic into that seems to even be discouraged by that graphic, as far as I can tell.

What I'm trying to figure out here is basically where that "business logic" layer lives—I don't think resolvers are the right place for that, models apparently aren't either, which leaves us with.. not much :wink:

That's why I think adding an explicit API to graphql-tools is helpful.

mxstbr on 16 Apr 2017

To me models are only the business logic. Theoretically, a good model system should not know any details about the databases/ data sources it connects to. Because, one day you might be using MySQL, then realize some other database(s) are even better solutions. You'll notice, any good ORM can connect to multiple SQL databases. This abstracted out connectivity to any database/ source is the persistence layer. It is also the layer where a tool like DataLoader would do very well. 😄

Obviously, we're talking about a lot of abstraction and for a relatively simple system, it is probably a good bit of overkill. However, this separation of concerns is how I believe (and Facebook also teaches) it needs to be, if your system is going to be future-proof.

Scott

smolinari on 16 Apr 2017

👍1

I want to weigh in here, because this has the danger of setting up some hasty de facto standards, and hasty is the last thing we should be when it comes to security.

I've built GraphQL servers on top of existing micro-service platforms, and in new projects where GraphQL and "the platform" live in the same app instance. Once thing i've never done is treat the GraphQL server as trusted. I can make every API that my resolvers call into a public one without breaking the security of my system. In the case of the existing systems, this was necessary (all our APIs are externally accessible); in the new systems it just seemed like a best practice. In both types of implementation (micro-service vs monolith), the way i've done the GraphQL server has been identical - resolvers do some relatively trivial unpacking they're parameters, than call a function that's been defined elsewhere as the interface to "the business layer". Everything behind those functions can change as much as it likes, I can switch databases and require no code changes to my GraphQL server. This is good.

Permissioning is a concern that all developers need to think about, and I think most will be nervous about having the outermost layer of the stack be responsible for handling it.

However, there's also talk of pre/post behaviour when evaluating Types. i.e. whenever I have a Product, regardless of how I get to it. There's value in exploring some ideas here, but in a more general way than specifically tackling permissions or validation. I looked into this very thing when trying to find a generalised solution for the look-ahead problem (i.e the final part of what I talk about in this article: https://dev-blog.apollodata.com/optimizing-your-graphql-request-waterfalls-7c3f3360b051). Ultimately I found other ways to do it (I wrapped dataloader), and felt better about not trying to bolt on additional unnecessary API to my GraphQL server, but I still think it's worth playing around with the idea to see what other utility it could bring.

One other parting thought. I believe the sweet spot of GraphQL is largely as @smolinari has pointed out. It takes relatively little effort to get to this type of separation, even in a monolithic codebase, and the benefits it brings for long-term maintainability are real. I think effort would be better spend producing examples of medium-sized codebases that follow these patterns.

AndrewIngram on 16 Apr 2017

👍9 ❤2

To me models are only the business logic. Theoretically, a good model system should not know any details about the databases/ data sources it connects to.

You're right, good call!

I can make every API that my resolvers call into a public one without breaking the security of my system.

This is a very interesting point. You're right that we loose this property with the current API suggestion, I agree that's suboptimal. I don't think I agree that this means we shouldn't think about adding an API for this, it just needs to be a different one.

I'd really like to avoid having to write a ton of resolvers _just_ for validation purposes because the default one would suffice normally. (e.g. notes: (root, args, ctx) => validate(root, args, ctx) && root.notes) Even more so I don't want to drop down to the model layer for that. (e.g. notes: (root, args, ctx) => getNotes(root, args, ctx) which just does the above thing, root => validate() && root.notes) That seems like a lot of unnecessary complexity and code. (maybe it isn't?)

I also want an abstraction that keeps validation separate from the general "business logic" layer rather than randomly sprinkled throughout the codebase. I'm pretty sure this makes it easier to figure out what's not being validated correctly/what's been forgotten and also make it easier to spot bugs since every responsibility (types, mapping types -> models, database handling, permissions/validation,...) is handled at a specific place.

Oh, I get some data I shouldn't be able to? Hop into validations/ and find the path!
Oh, the data I get back is wrong? Check the resolver of that path to make sure it maps to the right model call, then make sure that function does the right database queries.
Oh, I'm unable to get some data I should be able to via a GraphQL query? Let me fix the types!
Etc.

If I have validations somewhere inside my models (or not), I have to start digging a lot more to figure out which specific LoC is broken.

I don't have any good ideas for other APIs right now, but I think it's still worth finding one?

mxstbr on 17 Apr 2017

https://github.com/thebigredgeek/apollo-resolvers < This has made authorization a very simple matter for me and my team, at least with Monoliths.

That said, when working with microservices, I think it often makes sense to delegate authorization logic to the service responsible for the domain of a given resource, so the GraphQL server should probably only care about authentication and perhaps high-level authorization (is the user logged in or not, what is the high-level role of the user, etc).

I think you can accomplish more granular authentication rules by simply passing user data (such as their ID) to your data layer (models + connectors OR separate services, depending on your paradigm) and letting said data layer decide what the user should and shouldn't be able to do.

Ultimately, I think this conversation is more about how to handle authorization within the domain of GraphQL and not within the domain of the data layer, so I believe a line has to be drawn regarding best practices. The challenges facing GraphQL authorization are congruent with the challenges facing RESTful authorization, with the exception of data traversal restrictions that are necessary in GraphQL but often not with REST. I think we should avoid reinventing the wheel, treat GraphQL as a transport and traversion DSL, and reuse as many paradigms from the REST world as possible.

thebigredgeek on 17 Apr 2017

👍7

@AndrewIngram I think it's good to bring that up - I do think it's valuable to explore what other paradigms might look like, though.

this has the danger of setting up some hasty de facto standards, and hasty is the last thing we should be when it comes to security.

I agree we should make sure that there is some consensus among a wide range of people before this moves into the sphere of being some kind of recommendation or best practice, but I think that always starts with a discussion that considers a wide range of approaches.

stubailo on 17 Apr 2017

👍1

Another thing to consider - Even if you decide not to handle permission checks themselves in GraphQL, you'll definitely want to handle the effects of them.

Much like I prefer to handle validation errors from mutations as part of the schema design itself (I prefer reserving error handling for actual errors, not expected validation failures), a similar argument could be made for failed auth checks.

Let's say we have a screen on our app, and the data is fetched via a static query (as is becoming best practice), there may be some parts of that data that are only accessible to us when we have the right permissions. Schema-wise I think it's messy to just allow those parts of the schema to be nullable, because it forces us to overload what null means. I tend to advocate going modelling auth restrictions into the schema itself.

type User implements Node {
  id: ID!
  username: String!
  articles: [Article]
  private: RestrictedPrivateUser
}

type PrivateUser {
  email: String
  secret: String
}

type InsufficientPermissions {
  message: String!
}

union RestrictedPrivateUser = InsufficientPermissions | PrivateUser

I appreciate that this is a lot more verbose than just allowing email and secret to be nullable (and I hope that as graphql evolves we gain a more expressive type system to allow things like anonymous unions). But If I were to move those fields onto User itself, a null value for either of them doesn't tell me enough - i'm forced to look at the query error to figure out what the null means. But why is this an issue? Well mainly because I don't considered expected behaviour to be an error. Remember, we have a screen powered by a single static query, a query that works fine for one user might be full of errors for another, and for a permission-heavy app this might mean that a significant amount of front-end complexity is dedicated to pulling things out of the errors object.

Another reason I like this approach, is that it forces the front-end developers to handle these cases, much like we'd be doing if we were using a language like Elm. It's a total pain in the arse, but it leads to better software.

AndrewIngram on 18 Apr 2017

👍2

@AndrewIngram don't you have to implement a lot of duck typing on the client to properly handle whether or not a PrivateUser or an InsufficientPermissions is returned?

thebigredgeek on 18 Apr 2017

You'd just query for and branch on __typename

AndrewIngram on 18 Apr 2017

Ah interesting. Have you had any issues working with that kind of model? I've been keeping all of my errors in the errors array, but I've been thinking about placing them in context-specific places for a while. Can you speak on benefits as well?

thebigredgeek on 18 Apr 2017

It's mainly inspired by patterns I see in more functional languages, where most errors are treated as data rather than as a bailout. It feels like a better way of encouraging API consumers (in our case, client app developers) to properly handle the various ways things can go wrong.

Why model into the schema rather than the errors array? Essentially, you're going to end up with a variety of error type; timeouts, missing data, failed auth, form validation, each of which is likely to want to evolve over time to capture more nuance. A data structure suitable for modelling a timeout is unlikely to be identical to one suitable for modelling a form validation error. When we just throw all of these into the errors array, we're requiring that clients are aware of all the various types of errors and what they look like - otherwise how do they pull the data out? You end up with your own ad-hoc mini schema just for errors, but largely separate from the very technology aimed at allowing client-data coupling to be done cleanly - i.e. GraphQL.

This is why I prefer the errors array to represent errors within the GraphQL server itself, rather than errors exposed by the underlying data model.

Drawbacks? It's initially a lot more work to do it this way, the downsides of using the errors array aren't initially obvious (and you may reach the point where it's an issue). It can also be awkward to have to wrap scalars in object types just so they can be used in a union. If you want to move fast and break things, you're going to be very resistant to the approach i'm suggesting. I've mainly only tried this with toy problems, where it's the most painful (I don't have a large suite of error types built up in my schemas) to implement, so my feeling that it gets easier over time is based on inference rather than real-world experience.

AndrewIngram on 18 Apr 2017

👍6

To get back to the original conversation about authorization, I think there's a lot of value in having a clear place where permissions checks happen. I agree that they're only one kind of query and parameter validation, but they're an extremely common kind so I think it helps to give guidance around that.

@AndrewIngram 's point about many applications already having authorization in some backend or other is certainly true, but I think we shouldn't take architecture choices made prior to the addition of GraphQL as evidence that this is the right thing to do in GraphQL. I would argue that the opposite is true. Taking GraphQL as a given (which makes sense for a repo called graphql-tools), we should figure out a good way of doing authorization.

As mentioned earlier in the conversation, I think it makes sense to check permissions in a layer that's separate from resolvers, and also separate from loading data from the backends. Why? Because that gives you maximum flexibility when it comes to caching. Writing a permissions-aware caching layer is (in my humble opinion) a fool's errand. Assuming that we want the cache to live as close to the edge as possible for performance reasons, the cache should be in or near the GraphQL layer, and permissions should be checked after retrieving items from the cache (with some exceptions where it's trivially possible to determine permissions without materializing the object).
Strictly speaking this would mean pushing authorization to the model layer, but I think there are plenty of GraphQL servers which won't need a lot of business logic and which can do validation / authorization in a simpler way. I think the ideas that @mxstbr had would be a great fit for that kind of application.

But rather than just talking about it, I think it's most useful if we try out our different hypotheses with a couple of code examples like @mxstbr 's PR #315.

helfer on 19 Apr 2017

👍1

To clarify, my points aren't based on the assumption of building GraphQL on top of an existing system, they could apply equally to a new system. It's more based on the assumption that GraphQL isn't the exclusive route for data access.

What i'm arguing against is an explicit API for permissions (or input validation, but I find it easier to buy in to that one) in graphql-tools. @mxstbr's solution could easily be implemented as a generic preResolve and postResolve API, which brings all the benefits, but none of the connotations that this is the one true way to handle access control.

AndrewIngram on 19 Apr 2017

👍1

I went ahead and created a rough implementation of the API that was suggested above in #315. This is what it looks like:

# A written text by an author
type Post {}

type Query {
  # Get a post
  post(id: ID!): Post
}

schema {
  query: Query
}

makeExecutableSchema({
  validations: {
    Query: {
      // Returning false here will not execute the resolver when post is queried
      // Can also return a Promise if necessary
      post: (root, args, context, info) => true
    }
  }
})

Note: It only stops executing the resolver, but doesn't return an error to GraphQL just yet because I haven't figured out how to do that)

I think that works pretty well with what @AndrewIngram is saying, since it's not explicitly for permissions, it's just generally "Do you want to execute the resolver at this field yes/no", right?

mxstbr on 19 Apr 2017

👍6

@mxstbr I really like this approach. It doesn't mean: this is the guideline to handle permissions. It's just a new feature provided by graphql-tools to execute or not the resolver. More than that, people who don't want to use this pre/post feature to handle permissions could easily use others patterns without having the feeling that they are not following the guidelines.

It's generic and it could be used for others things such as input validations, etc.

Aurelsicoko on 19 Apr 2017

👍2

@mxstbr shouldn't the example be for pre rather than post? I'd assume with pre we're asking "should I continue resolving?", whilst with post we're asking "I have the data, do we need to do anything with it before returning?"

AndrewIngram on 19 Apr 2017

This runs before the resolver, check the code in the PR, I haven't implemented anything that runs after the resolver yet as I'm not 100% on what the API for that should look like yet.

mxstbr on 19 Apr 2017

I just realised what @AndrewIngram is referring to and It's my bad for a horrible example! post
in the example above refers to a defined query, not the .pre/.post thing:

# A written text by an author
type Post {}

type Query {
  # Get a post
  post(id: ID!): Post
}

schema {
  query: Query
}

I've updated the post above to include the schema!

mxstbr on 19 Apr 2017

😄1

Oh, haha!

AndrewIngram on 19 Apr 2017

Wow! There is a lot to follow here and some pretty cool ideas.

I don't think authorization will ever be a "solved" problem, nor should it be. My experience is that is a good way to tip a library over into being a framework. I mean that in a negative sense.

And yet, both validation and authorization are extremely important and there needs to be a good story or stories around them.

I advocate that there are three basic approaches to validation in APIs (Try and fail, dedicated api, and metadata) and that any sufficiently complex API will sprout all three forms.

I think the same can happen in Auth. I could easily see a schema driven auth system that is completely applied as a validation on a query AST. I could also see a schema driven system that lives as resolver decorators, or in the business logic backend. Or imperative checks, hand coded into resolvers, or somewhere in the interface with http. Or all of the above.

Auth can be really simple. Or it can be a nightmare.

Rather than adding auth-specific behaviors, I'd like to see the tooling around extension points in the query process become more robust. I really like the idea of expanding graphql-tools as a way to do that. I think I'm agreeing with @thebigredgeek on that.

Maybe a solution lies is some sort of support for directives in graphql-tools, where directives in the schema can help choose a composition of resolvers specified separately from the schema as a parameter to makeExecutableSchema?

That might help reduce boilerplate, help make the code more audit-able, but not impose a specific approach, leaving that to plugins.

JeffRMoore on 20 Apr 2017

👍1

I'm not sure there is a need to distinguish between pre and post, only the order in which resolver middleware functions are called. Pre or post is a feature of the actual implementation.

Fleshing this out with some hasty code:

# I want posts to only be readable by logged-in users
type Post 
@isLoggedIn()
{
  id: ID
  content: String
  author: User
  notes: [String]  # I want these to only be accessible by the author
  @isAuthor()
}

makeExecutableSchema({
  ...
  directives: {
    isAuthor: () => { ... }
    isLoggedIn: () => { ... }
  }
})

A pre- resolver middleware:

async (request, next) => { pre(); return next(); }

A post-resolver middleware:

async (request, next) => { result = await next(); return post(result); }

This presumes the function signature of resolve has been turned into single request object.

The order should just be the order in which they are declared.

JeffRMoore on 20 Apr 2017

👍5 👎1

@mxstbr my main concern with doing something like

makeExecutableSchema({
  validations: {
    Query: {
      // Returning false here will not execute the resolver when post is queried
      // Can also return a Promise if necessary
      post: (root, args, context, info) => true
    }
  }
})

is that I now have to define a separate deep permissions object that looks almost identical to my resolver structure and then keep those two things in sync with one another. I can already foresee headaches with this in my own personal process (albeit, I have really bad OCD with things that feel duplicitous in structure haha). I can imagine myself incorrectly mapping things by accident. In my opinion, if ACL is going to be handled by GraphQL it should probably be colocated / composed with the actual resolvers OR types.

@JeffRMoore I REALLY like the directive approach. I can't stress enough how happy doing something like that would make me feel. However, I am wondering how easily composition would actually be. It seems like I might end up with 100 different ACL directives as my codebase grows? It seems like it might actually be difficult to make the directives highly reusable between different types. If I can't reuse them, it seems like they are simply syntactically cool (really, it looks dope) but without adding much benefit?

thebigredgeek on 21 Apr 2017

After thinking about this some more, I can't stress enough how much I believe how bad of a mistake it would be to inject a permissions system inside of a GraphQL API. It's simply the wrong place for it. It is badly shortsighted. I'll give some reasons and yes, I am sort of repeating myself from above. But, it seems my last comment didn't make much of a dent in the contemplations here. Here are the reasons why GraphQL should never see or know about permissions:

Because permissions are business logic. In a business, one or more people FROM the business are going to be making the decisions on who is going to see and do things in the system. And that means, it shouldn't take a programmer to change permissions. The decision to put permissions in the API would mean a never ending story for the programmer. He or she or they would be the opposite of the 10X programmer. When a type needs to be declared, the programmer has to also run to the business or remember to always ask, "who should have access to this data?" OMG! Programming should always be fire and forget and the goal should be to never have to touch the programming again and certainly not for permission changes. Ok, the creation of the API might be automated too, so see point 4.
There are different depths of permission needs. Some organizations might want a simple system. Some might want a very, very complicated system with very high access control fidelity. Any solution found MUST account for ALL of those permission use cases. That is another bad headache to try and solve at the API layer level. It means it is on the Apollo GraphQL community's back to make sure all the possibilities are covered and that is a responsibility we really, really shouldn't have. At least not now and with GraphQL. We've got other much more pertinent problems that need solving. Making the internal API for permissions something abstract so permissions may or may not be added isn't the answer either. Or rather, if there is a need for this internal API, it shouldn't be for permissions, period.
Some use cases of a system might employ multiple APIs (like in the image from Facebook I offered above). This kind multi-API system might also happen through evolution. Introducing permissions inside GraphQL might seduce start-ups to do their permissions inside their GraphQL API. When they decide to go multi-API, they will be quite a bit screwed. In other word, injecting permissions into the API closes the door or causes headaches for the use of other APIs. That alone is a no-go.
The argument or thinking might be, it is a just schema. It is metadata. So why not inject permission information there too? Ok. But, at some point the GraphQL API creation will have to be automated, simply because it can and should shadow the true model graph of the business, which is also created in the business logic. It is duplication of sorts. In other words, the model of the business will be created first, within the business logic layer (and in most cases again, with the least bit of programmer help, i.e. through a smart backend) and the API should only represent an automated gateway into that logic. Adding permissions into the API means, whoever has to build this automation will have to include the authorization logic too, which is already in the business logic!!! That would be duplicated and thus unnecessary work.

I am so against this idea now. It is simply a bad one at too many levels, when thinking in longer more evolved terms. And, these evolutions will happen for sure. During the F8 talks, one of the devs mentioned that FB's GraphQL API is automated (in a roundabout way) as I had guessed. I had a good internal giggle, when she said it. I thought to myself, "Ha! I knew it!". 😄

Facebook knows what is right or rather smart, because they have already evolved way further than what we are seeing or know ourselves. And again, Facebook suggests to push permissions to the business logic level. I wholeheartedly agree. I think we should avoid talking about adding anything to do with permissions in GraphQL directly, at all.

I also believe the word "resolver" causes beginners or inexperienced devs to think that is where the data fetching from the source should occur. That is incorrect thinking. There must be a layer of business logic in between the resolver and the data sources. In other words, if anyone is putting REST calls or database queries in a resolver, they are doing GraphQL completely wrong from the start or rather, they are building up a shortsighted and problematic to evolve API. Adding permissions into the API layer is along those same lines of thinking.

What I'd like to see from Apollo is a smart business logic layer, which plays really well with a GraphQL API. And I'd be glad to help with the plug-in-ability of permission system add-ons. 😄 We should be working towards a GraphQL platform and not a Leatherman kind of "all-in-one" API layer. That won't get us very far in the long run.

If anyone wants to continue with suggesting the API should have permissions, then please make arguments that out-trump my arguments. Please tell us how doing permission in the API will be good for everyone and in what ways. In other words, we shouldn't be discussing solutions at all at this point, but rather, we should first discuss the smarts, the advantages and disadvantages, behind the decision to add permissions to Apollo's GraphQL API.

Scott

smolinari on 21 Apr 2017

👍12

I like the idea of middlewares and directives as well. I didn't have time to read in detail, but just wanted to point out that directives come after the thing they apply to, so the example from @JeffRMoore 's comment would have to be written as follows:

# I want posts to only be readable by logged-in users
type Post @isLoggedIn {
  id: ID
  content: String
  author: User()
  notes: [String]  @isAuthor # I want these to only be accessible by the author
}

helfer on 21 Apr 2017

👍2

@helfer Thanks for the clarification, I updated my prior comment.

@thebigredgeek I thought you would like it. It fits with the architectural style you've been promoting (which I like very much). As for your concern about combinatorics, I don't think that will happen. Or at least, if one finds oneself running into that issue, its probably an indicator of something else wrong. I did several authorization iterations for an enterprise SaaS product. Each one ended up simpler and faster, and in the last, we were able to accommodate almost everything with less than a dozen primitive patterns. (That system was schema based, but auth information was represented as boilerplate rather than metadata. Next time around, I would do metadata.)

Let me clarify with some types. Again this is hasty. Tell me if I'm screwing up.

type Post 
{
  id: ID
  content: String
  author: User
  notes: [String]  @restrictToAuthor()
}

makeExecutableSchema({
  ...
  directives: {
    restrictToAuthor: (params: GraphQLDirectiveParameters): GraphQLResolvingMiddleware => 
      {
        return restrictToAuthor;
        async function restrictToAuthor(request: GraphQLResolvingRequest, next: ContinueMiddleware): mixed {
        if (!AuthorCheck(/* whatever this looks like */)) {
           throw new ValidationError('Denied');
        }
        return next();
      }
    }
  }
})

Because the directives can have parameters, each time the directive appears in the schema, a distinct resolving function is created based on those parameters. That will kill a lot of combinatorics.

This solution lets you implement a lot of different kinds of things, like validation, authorization, or even ORM or backend fetching hints.

There are a lot of devils in the details of what GraphQLResolvingRequest looks like. Experimentation with performance implications is warranted.

If those details pan out, it may make sense to move that upstream into graphql-js. GraphQLResolvingRequest is a stronger concept than GraphQLResolveInfo. The benefit of enabling middleware may warrant a BC break someday. Until then, an adapter is needed.

JeffRMoore on 21 Apr 2017

👍1

@JeffRMoore I like it. Especially now that you've mentioned that you've iterated towards this pattern (with enterprise, no less! haha), I am sold on the DX / code cleanliness aspects. I am also curious, as you said, about perf impact. It might be nice to prototype this approach and bench mark it against non-authenticated with sync authentication functions just to see what kind of initial overhead this pattern would introduce. With async, I think the pattern of using something like redis or memcached (just like with every other type of API when it comes to highly-accessed data) is the right solution. Were there any particular cases that make you feel concerned about perf?

thebigredgeek on 21 Apr 2017

I like that approach a lot @JeffRMoore! Definitely much easier than duplicating the type tree again. Can that live in graphql-tools?

mxstbr on 23 Apr 2017

I'd like to point out a principle of introducing any permissions or access control to a system. When you set such a system up, it must always offer the ability to deny all access first, to then give access, only where it is applicable. Please think about that, when coming up with any permission or access control solution within the GraphQL API, if it absolutely has to be done.

Scott

smolinari on 23 Apr 2017

👍3

@JeffRMoore can we dive into your solution a bit, because whilst I like the idea of directives as an API for a meta-schema, I'm skeptical that this is the right use for them. The main benefit seems to be that it's an aesthetically-pleasing API.

The first issue is that I'm struggling to see how you'd have a generic directive for @restrictToAuthor. It seems like it'd either depend on a consistent pattern in your data, so that author is always accessible via the same field on an object (possible, but too magical for my tastes). Or it would need to be parameterised eg @restrictToAuthor(authorField="author"), but again, this feels like tying the schema to the data, and there's no contract that says the author field would be available before you're trying to resolve notes.

There's also the issue that this still feels like the wrong place to be defining this restriction. What if we want admins to be able to access everyone's notes, we now need to change our implementation at multiple layers of the stack.

So far, @helfer's concerns about caching are the main compelling argument as to why a GraphQL server might need to have some kind of permission check functionality. But this also seems to be highly implementation-specific. I'd be very interested to hear how Facebook has solved distributed caching in GraphQL without messing up the nice clean architecture.

AndrewIngram on 23 Apr 2017

Facebook doesn't let the request through, if the "viewer" doesn't have the proper permissions. The application caching happens after permission checking. Dan Schafer did a presentation about Facebook's way of doing cache and authorization (roughly).

https://www.youtube.com/watch?v=etax3aEe2dA

Of course, he says clearly that what he presents isn't necessarily THE way to run a GraphQL server. But, it certainly is smart to me. He also says the "thin" GraphQL API has turned out valuable to them. 😉

Scott

smolinari on 23 Apr 2017

👍1

@thebigredgeek I was thinking mostly of the overhead of constructing the GraphQLRequest object and the closures for applying the middleware chains. Its possibly not a big deal. Something to double check, nothing more.

@AndrewIngram, In the schema, its just meta-data. Its not until you supply the functions with GraphQL Tools that it becomes an implementation. That's the purpose of GraphQL Tools, no? To supply a resolving implementation, separating the description from the implementation?

As for whether that auth information should be co-located with the schema? Should GraphQL queries be co-located with their components or be separate? Should html be co-located or separate? Its just another application of principle of separation of concerns.

If for a particular case it doesn't make sense to co-locate auth metadata with the schema, don't. If it does, do.

I'm pretty sure there's no one right way.

The change cases @AndrewIngram mentions brings up an important point. Maybe there are some BC implications with evolving the API? How does one preserve the "no versioning" convention and evolve the authorization design?

I've used this pattern with validation and it works very well. Think name @maxLength(length=31). The challenge with validation is that you can only make the rules looser, not tighter. If I change to length=63, no problem. If I change my declaration to length=15 any data that has been previously accepted is not valid by the current rule declaration.

Changing validation rules often caused unexpected consequences. For example phone number going from optional to required. Now, what should we do with the user records that have no phone number? Especially when they edit their account info? Should we make them enter a phone number to change their email (also on the same record)? Frustrating. Should we allow partially valid records? Dangerous? Should we only validate changed fields and not consider the validity of the record as a whole? Now we're pushing conditionals into downstream code. We used a variety of approaches to handle this including data migration and versioning the record, and rules like "required only on create or if already specified."

There was a lot of value for us in making constraints a part of the meta-data description of the API. We could do things like user defined fields and apply validation rules in multiple contexts with different implementations. I would want to use this pattern again.

Is there value in the auth realm? Possibly. I might be able to use that information to hide or allow operations rather than allowing them to fail. Or rather than duplicating a hard coded check into the client code, a check that would have to be maintained in parallel with one on the server side, use the metadata to apply the same check both on the client and the server.

JeffRMoore on 24 Apr 2017

@smolinari That video is good.

Think of "Viewer" as "Authorization Context." For Facebook, that's Viewer. For other APIs, it might be different. In the Facebook example, the action "see" is baked into the function name. That's a part of the concept of authorization context and might be part of a parameter to check authorization instead in a generic authorization library. The action could be "delete" for example or even more application specific. Other environmental factors might be considered part of the context, like the requesting IP address or even time of day. "Joe is authorized to change grades only from an on-campus IP and only during office hours."

They also separate the point of policy enforcement (CheckCanSee) and the point of making the decision (The implementation of CheckCanSee). The example is a really minimal implementation of an important abstraction. This way resolver implementation should not change when Authorization policy changes.

Love how they build audibility into the structure of the system. Really important.

Isn't FB code-generating their resolver implementations?

Schema decoration is compatible with this approach?

JeffRMoore on 25 Apr 2017

Why don't we just build all of the business logic into the API's abilities? Thick resolvers, thin or even no model. Yeah. That would be so flexible and smart. Right?

The answer from everyone should be an emphatic, "Whoah Scott! No. no. no. Not at all."

Before I continue, I hope we can all agree on one thing. Authorization (or even validation) is business logic. Right?

So why are we discussing authorization as a part of the API? Where is the line drawn or where should it be drawn? FB obviously drew the line very close to the API and said, "let's keep the API layer as thin as possible. There will be very little to no business logic in it! Period!" Why are we fighting this "way of doing it"? Convenience? That is a stupid reason for doing anything and I am only guessing that is the reason. Please don't confuse convenience with simplicity. They aren't always the same and usually aren't. You must all also agree, adding business logic into the API isn't simple either.

What other reasons are there for adding business logic in the API layer? I have yet to hear one plausible reason except for "it's a natural fit". I've offered plenty of reasons why adding such logic into the API itself is a bad idea.

If Apollo wants to build out the server (and they should!), it needs to happen after the API layer and completely separated from it. That is my 2 cents on this (obviously). The "backend" should be a simple as possible set of plug-in-able interfaces for different business logic systems, including but not limited to validation and authorization systems. That's all that is needed. The rest should be done (and has been done to certain extents) by the rest of the Node community.

As we are speaking, teams are nailing these things together. Look at GraphCool. What Apollo should be saving us from is the work of nailing the system together. It should be "server glue".

And yet the system must still be way open to customization and evolution. Think about this. If I have a simple role based authorization system and I decide I want to add sharing rules, peppering the API with authorization directives is going to make my life a nightmare and so much so, I am probably never going to change it and thus, such a solution is clearly a dead end street. It's not good customizability, if that might have been thought of as a benefit.

I've been ignored for the most part in my views here (it seems) and that means either my argumentation is either so wrong, I'd look foolish, if someone points out the obvious I am not seeing and no one wants to make me look that foolish. No worries. I've been badly wrong before and I would love to be corrected or be wrong too. That only means I have a chance to learn.

Or the other possibility is, nobody has any decent argumentation on the benefits of peppering the API with business logic (no matter what logic it may be). Sorry, when that sounds a bit sarcastic, but from my current 1000 mile high view of this, that is what the suggestions given look like to me. Maybe that is my different viewpoint here. I am not an in-the-trenches developer. I am more a business manager or rather that is my real background.

Let's also be clear. If I change any constraints to an application, i.e. max number of characters for a field's size or permissions to a field or function, it isn't API breaking, it is application breaking or rather application constraining. In other words, these BC or constraint contemplations have nothing and should have nothing to do with the API's evolution and thus, isn't argumentation for or against having business logic in the API. The API is (or should be) working properly, no matter what I decide in terms of business or even programming constraints, and no matter which kind of API I have. This also shows, these concerns don't belong together.

Remember SoC? Why are we forgetting this generally accepted as good to follow rule of programming now?

Thanks @JeffRMoore for bringing the discussion more up to the level where I feel it should be. Again, we should be discussing the merits and advantages of adding business logic into the API layer. I don't see any currently, in fact, I see it as a really bad decision. Facebook didn't see any either.

What is the return on investment for adding business logic in the API layer? I've pointed out some risks.

Please answer that question for me and anyone else investing time and effort into any solution going forward, and not only as a solution to be worked on, but as part of the workflow for future developments using this API system. Because, without clear answers to that question, that work might become a complete waste and probably will be, because this direction wasn't properly thought out at the business level. And we certainly don't want people wasting their time, do we? 😄

Oh, and the argumentation "you can take it or leave it, if you want" isn't one. 😉

So, what are the benefits and advantages, the pros, for this direction/ suggestion?

Scott

smolinari on 25 Apr 2017

@smolinari about your earlier comment: I think you're misunderstanding the presentation by Dan. The caching that he talked about is client caching. Caching on the server happens below the permissions layer and below the model layer. In FB's case, the gen function always fetches from a caching layer, which forwards the request if necessary.

As for your most recent post, I'm not going to try and address every point (I'm not even sure what the main ones are), but I'd like to point out that most people aren't going to be running stacks with 5 or more layers like Facebook. The most common case will be either GraphQL directly to database, or GraphQL to REST microservice/monlith to database. In the first case, which I think is not unreasonable, your business logic is going to run alongside or in GraphQL. If having a simple and declarative way of defining validation rules and permissions is the most expedient way for someone to build their app, I don't think we should get in their way. Quite the opposite, I think we should make it easier for them.

In many cases, the perfect is the enemy of the good, and it sounds like your argument against most of the ideas in this thread is that they will break in some hypothetical future where you want to do something else. The fact of the matter is that if you worry too much about coding yourself into a corner, you're never going to write any code. I agree with you that we should avoid horrible mistakes, but I don't think you've made good argument why middlewares or adding metadata to the schema would be a horrible idea.

helfer on 25 Apr 2017

Just to be clear, I personally think that for a greenfield GraphQL server the best way to build it is to have almost no logic in resolvers, do all fetching through a model layer which checks permissions before or after fetching from the cache (which can be a mix of per-object or per-field caching), and which delegates data loading to data loaders (which I sometimes call connectors).

For GraphQL servers built on top of existing REST APIs or microservices, I don't think those rules necessarily apply, so it might very well make sense to do things differently.

I also think that the graphql schema is a great starting point for developing an app and could be used to get a proof of concept off the ground extremely quickly (You could check permissions and do data fetching purely with directives, or you could use the schema and directives to generate code in the model layer). It might well be possible to build something solid enough for production, so I think rather than arguing against it for philosophical reasons, we should just try it out and see how it works.

helfer on 25 Apr 2017

👍2

I think you're misunderstanding the presentation by Dan. The caching that he talked about is client caching.

Yes, I understand and within the server and per request (at least that is the suggested TTL by FB for DataLoader).

most people aren't going to be running stacks with 5 or more layers like Facebook

It that an argumentation for mixing concerns? I don't think so. Everyone will have the same concerns or similar as Facebook and being they are conveniently mixed available to customize within the API isn't the answer for them.

. The most common case will be either GraphQL directly to database, or GraphQL to REST microservice/monlith to database. In the first case, which I think is not unreasonable, your business logic is going to run alongside or in GraphQL. If having a simple and declarative way of defining validation rules and permissions is the most expedient way for someone to build their app, I don't think we should get in their way. Quite the opposite, I think we should make it easier for them.

And possibly also build up a dead end to future evolution? Not smart at all. It is very short sighted. Are "outside the API" solutions less convenient?

in some hypothetical future

Sorry, but nothing I am talking about is hypothetical. They are philosophies and best practices that have been proven very good many a time, over many years. The solution here to include business logic as being a "good thing" is the hypothetical. That is why it is being asked for and not already done.

I don't think you've made good argument why middlewares or adding metadata to the schema would be a horrible idea.

Did you miss this comment?

Just to be clear, I personally think that for a greenfield GraphQL server the best way to build it is to have almost no logic in resolvers, do all fetching through a model layer which checks permissions before or after fetching from the cache (which can be a mix of per-object or per-field caching), and which delegates data loading to data loaders (which I sometimes call connectors).

Cool, then we are two! 😄

For GraphQL servers built on top of existing REST APIs or microservices, I don't think those rules necessarily apply, so it might very well make sense to do things differently.

Why? Please explain. Why wouldn't I want my business logic layer in front of external REST API's? What makes them any different as a data resource than a database, other than possibly needing a lot less logic to process the data? I'd still want the control my business logic layer offers, in front of the data sources, no matter what data source it might be.

I also think that the graphql schema is a great starting point for developing an app and could be used to get a proof of concept off the ground extremely quickly

I also agree with this. But, is it really an argument for mixing concerns? Why can't Apollo offer a simple server solution, and let other more complicated needs be plug-and-playable? If business logic is built into the API, wouldn't the "plug-and-playability" of future needs be lost? I think it would be. You would be thinking you are building in simple flexibility for start-ups, but really, you are building complex convenience that ends in pain.

Scott

smolinari on 25 Apr 2017

👍1

@graphcool announced Permissions Queries - https://www.graph.cool/blog/2017-04-25-graphql-permission-queries-oolooch8oh/

it looks very scalable and customizable

sibelius on 25 Apr 2017

👍3

@sibelius if I understand this correctly, it's a generalised pattern for fields that essentially answer "can I do X?" ?

AndrewIngram on 26 Apr 2017

Thanks a lot for sharing our blog post @sibelius! We actually linked to this conversation since the exact right questions are being asked here. Seems like I'm joining this party a bit late...

I really like this suggestions so far and like @helfer said, I think there is room for multiple approaches and architectures. I totally agree that in the beginning it should be as easy as possible, so GraphQL directives seem like a great idea to get started.

As mentioned in the blog post we've taken a bit of a different approach on authorization by letting developers declare permission rules as GraphQL queries. An example permission query might look like this:

query permitUpdatePosts(
  $node_id: ID!, 
  $user_id: ID!
) {
  somePostExists(filter: {
    id: $node_id
    collaborators_some: {
      id: $user_id
    }
  })
}

We've taken this authorization approach to provide a better level of abstraction for developers using this system for the price of more implementation complexity on our side. Hopefully this concept can be a bit of inspiration throughout this conversation!

schickling on 26 Apr 2017

I'm actually struggling to grasp from just the examples on the blog post what the pattern is here, how much of the fields in that query is automagically generated from the schema? This might be because i'm not that familiar with how your product works.

Can you also clarify on where you say Facebook also uses this pattern?

AndrewIngram on 26 Apr 2017

👍2

In the end, permissions are a state within the application. State changes should be made due to code, but never (or as little as possible) because of code changing. Building authorization in GraphQL is baking state change into the code. And, from my personal perspective, whenever code needs to change, in order to change the state, _it is a design mistake_!!!

Scott

smolinari on 26 Apr 2017

how much of the fields in that query is automagically generated from the schema

The schema is generated individually for each permission, operation and type. Each query will be executed before a mutation is performed or before data is read. Based on the result of the query (which is a conjunction of booleans based on the query fields) the permission is enforced. You can find more about the concepts behind this in the respective docs.

Can you also clarify on where you say Facebook also uses this pattern?

To my knowledge Facebook's permission system is structured in a similar way, that by default you can either allow or deny a certain operation (whitelist vs blacklist). On top of that you can specify explicit permission rules to allow or deny a specific case. I'm not to sure about the implementation details of the rules though.

schickling on 26 Apr 2017

Oh, it read like you were saying this is how they implement it specifically with GraphQL, rather than the pattern of rule-based permissions in general (which is the only way i'm familiar with, how else would you do it?).

AndrewIngram on 26 Apr 2017

I would like to add my 5 cents into this great discussion. The concerns raised here are beyond authorization and validation needs and this is really good.

GraphQL schema is a contract between API implementation and users of the API. This contract however also specifies properties (field names, types, etc) of entities used in the implementation, for example Post, User, referred below as entities.

The developer might want to express some application-specific properties of entities used in GraphQL queries, such as constraints like maxLength, or validation type, or authorization rules, etc. These application-specific constraints (metadata) need not be processed by GraphQL engine. Application might extract this metadata from entities defined in GraphQL schema itself where needed and use this metadata.

If this option to attach metadata to schema is not available, the developer will have to:

Either create another schema inside the application with app-specific entities constraints and keep it in sync with GraphQL schema
Or have to use its own primary schema that has enough flexibility to express entities constraints and generate GraphQL schema from it.
Or have to imperatively force the application-specific entities constraints inside functions, instead of doing this declaratively in some sort of schema.

GraphQL decorators from the first look seem like a good way to attach metadata to schema and then process it by application. Here I think I share vision with @JeffRMoore

We certainly want all the GraphQL tools to continue working when we attach metadata to schema. And it looks doable, because these tools need not bother about this metadata at all. At the same time we want to fully express properties of schema entities, both GraphQL-specific properties and app-specific properties too to have it all in one place.

larixer on 26 Apr 2017

👍3

One other perspective to consider:

Schema languages exist to insure conformity to a specification of data structure. Schemata are contracts between producers and consumers that essentially say: "If you want to request data from, or wish to save data to my service, you must do so in this manner." Injecting identification/authentication into this contract is pulling business logic where it doesn't belong, IMO.

That being said, what is the role of a resolver? In the MVC world, I would categorize it as a Controller, as it augments and marshals data to and from the Model to match the View's (GrapqhQL client's) request. Logic of resolvers should be left to this task. I personally prefer the middleware approach, where you can inject authentication as a stage in request propagation, so having validation (though I think it should be able to be named anything) as a step in the request handling is fine. This is the convention with which web servers like express or koa approach authentication, and I think it works well. So why not just keep it simple, using more generic middleware slots in this manner?

Something along these lines:

const schema =  `
  type User {
    id: ID!
    posts: [Post]
  }
  type Post {
    title: String!
    content: String!
    owner: ID!
  }
  input PostInput {
    user: User!
    title: String!
    content: String!
  }
`
const query =  `
  type Query {
    post(id: ID!): Post
  }
`
const mutation = `
  type Mutation {
    addPost(input: PostInput!): Post
  }
`
const resolverMiddleware = {
  User: {
    posts: async ({ownerId: id}) =>  models.post.getAll(ownerId)
  },
  Query: {
    post: async (_, {id}) => models.post.get(id)
  },
  Mutation: {
    addPost: async (_, {input}) => {
      const post = models.post.add(input)
      models.user.addPost(post)
      return post
    }
  }
}
const authMiddleware = {
  Query: {
    post: async (source, {id}, ctx, next) => {
      if (!isLoggedIn(ctx.tokens) || models.post.isPrivate(id)) {
        throw new GraphQLAuthError("You do not have permission for this request.")
      }
      return await next(...arguments)
    }
  }
  Mutation: {
    addPost: async (source, {input}, ctx, next) => {
      if (!isLoggedIn(ctx.tokens) || !models.post.isOwner(input.id, ctx.tokens.authToken.user)) {
        throw new GraphQLAuthError("You do not have permission for this request.")
      }
      return await next(...arguments)
    }
  }
}
const makeExecutableSchema({
  schema,
  query,
  mutation,
  middleware: [authMiddleware, resolverMiddleware]
})

Panoplos on 3 Jul 2017

👍4

Adding yet another 2c to this conversation, but I am of the sentiment that _authorization does not belong as part of the schema_, and rather the business layer handles this.

I'm currently handling GraphQL authorization in three ways:

The /graphql endpoint requires a JWT. This is less about GraphQL, and more about securing API access (same process is used for the REST API).
The top-level query/mutation resolvers check to see if the viewer (ie. JWT user) has permission to query/mutate the requested field. Side note, this is one way to "protect" sub-fields, by only adding them to the parent field. So for instance, publicProject by schema design, returns a whitelisted subset of the full project field. If the user (ahem "viewer") is authorized to query publicProject, then we don't need to worry about any of the sub-fields. Again, we use this same high-level check for a REST endpoint (does the viewer have CRUD permission on this resource?).
For more complex conditional fields, the sub-field resolvers reach into the business layer, which use a series of voters to determine if the user can access the field. I tend to like the idea of repurposing HTTP errors, so attempting to access a protected field returns errors: [{message: "Permission Denied", code: 403}]. Graphcool seems to have their own code system.

So far, I haven't really found any issues with this approach, other than possibly performance (but that's a different topic). With the approach above, our REST API and GraphQL API are basically utilizing the same authorization logic.

bjunc on 4 Jul 2017

👍4

The suggestion above is exactly what I think should be absolutely avoided. It has nothing to do with being a Gateway into business logic. The suggestion is adding business logic into the Gateway. And, that is a total NO GO from my perspective!

Scott

smolinari on 13 Jul 2017

👍6

Then perhaps the middleware approach is better, as it leaves opinionative design choices up to the developer.

Panoplos on 13 Jul 2017

Another totally different approach is to make use of the Pipes and Filters Design Pattern. A pipeline is made out of processing filters (in this case implemented as promisses or async functions) and the output of one filter is the input for the next filter (one parameter). With this approach all the logic is in the resolver (maybe going against previous posts in this issue). The p-waterfall npm package is one implementation of this pattern. Then you can have for instance:

import processingPipe from 'p-waterfall';
........
........
async function pipelineResolver(parent, args, ctx, ast) {
  // Build one parameter accepted as input and returned by each filter
  let input = {parent: parent, args: args, ctx: ctx, ast: ast};
  let filters = [
            value => sanitize(value),
            value => audit(value),
            value => authenticate(value),
            value => authorize(value),
            value => validate(value),
            value => operation(value)  // The real work the resolver is supposed to do 
  ];
  return await processingPipe(filters, input);

leoncc on 8 Aug 2017

👍4 👀1

I thought I'd share my approach to authorization while developing an acl plugin for my project graphql factory which uses the acl library for authorization

In factory, there are before and after resolver function middleware hooks. For the acl plugin I take advantage of the before hook to analyze the fieldNodes to create a list of request paths

Request paths can contain schemas, mutations, queries, selections, and arguments.

An example would be

Foo.query.listFoo.selection.bar

where Foo is the schema, listFoo is a query field in that schema and bar is a field returned by the query

Additionally inherited permissions are specified with an * so a complete list of potential field paths for the query

query FooQuery {
    listFoo {
        bar
    }
}

would look like

*
Foo.*
Foo.query.*
Foo.query.listFoo.*
Foo.query.listFoo.selection.*
Foo.query.listFoo.selection.bar

This list is checked against the user id of the requestor (passed in rootValue.jwt as a json web token that the plugin knows the secret to verify) using the acl method acl.allowedPermissions. Each selection/argument requested is checked against the acl.allowedPermissions result by reducing it to a list of permissions for that selection or argument. Those permissions are then checked against the permission set on the field itself Example Here

If the user does not have the correct permission on any of the requested fields, the request is considered unauthorized and the before hook/middleware immediately returns an error result without executing the resolver function.

Additionally explicit denies which take priority over allows are achieved by prefixing the request path with a ! (i.e. !Foo.query.listFoo.selection.baz)

bhoriuchi on 17 Oct 2017

👍2

@bhoriuchi i completely agree with you with use ACL for graphql permissions, actually ACL achieve all requirements business application grade, i works for many years for usage in every day that approach and i was happy when i saw first time graphql it was easy to see schema as
ACO (access controled object)

in ACL your aco can be many things depends of your implementations/decisions, it can be
graphql.request.query.type.... (recursion)
rest.resource
(MVC)
controller.action (most popular b4 SPA)
form.fieldset.fieldset.input
model.id

if u have aco based on b-tree implementation you are in home actually cause ACL is super designed and you can do under access list:

check is ARO - acces request object (can be user,group, even device... whathever with linked to authorised object - certificate, user_id, jwt_payload)
has permissions (permissions table with eg. aro_id,aco_id, _mutate(boolean), _subscribe(boolean), _read(bolean), _write(bolean) ...) for
ACO - access control object

basically with use process gql request with function waterfall @leoncc idea and place inside value => authorization(...), method with check permission in acl mechanism i have totally independent flexibility with manage of rights to fields in graphql schema even rest, windows domain by one authorization engine (in my case sql with aro, aco, permission) one of limitation for me its many requests by ACL to DB during by request for process permission, but still its better what @graphcool proposed with them grpahql authorization queries and put logic under graphql

one more i thought i consider generate schemas per user with limit avaiable resources but i dont thing is worth of it

there is many implementation on os layer, osi network, web, frameworks, languages first time i used ACL in @cakephp project - acl plugin :) 🍺

agborkowski on 26 Oct 2017

Posts that have resonated the most with me:

There are different depths of permission needs. Some organizations might want a simple system. Some might want a very, very complicated system with very high access control fidelity
(--@smolinari )

Rather than adding auth-specific behaviors, I'd like to see the tooling around extension points in the query process become more robust. I really like the idea of expanding graphql-tools as a way to do that. I think I'm agreeing with @thebigredgeek on that.

Maybe a solution lies is some sort of support for directives in graphql-tools, where directives in the schema can help choose a composition of resolvers specified separately from the schema as a parameter to makeExecutableSchema?
(--@JeffRMoore)

I personally prefer the middleware approach, where you can inject authentication as a stage in request propagation, so having validation (though I think it should be able to be named anything) as a step in the request handling is fine. This is the convention with which web servers like express or koa approach authentication, and I think it works well.
(--@Panoplos)

My two bits:
I think support in graphql-tools for resolver middleware is the way to go for adding authorization functionality to a GraphQL server.

Middleware makes possible different execution flows. With respect to authorization:
- One could execute authorization before every resolver, or after every resolver.
- One could execute authorization and return an error, and not execute the resolver.
- One could execute both authorization and the resolver, filter the result, and return only authorized data.
Built-in resolver middleware support would encourage the growth of an ecosystem of plugins to quickly compose commonly needed functionality to create a GraphQL server. With respect to authorization:
- No single authorization model would be assumed or implied. One could use user-based, role-based, attribute-based access control, or something else as needed.
- Middleware support would promote a clean separation of concerns when combining functionality like authorization, dataloader caching, and logging.

This kind of functional composability support was a big win for Apollo Client 2.0 when it provided Apollo Link for client creation; it would be great if graphql-tools had something like that for server creation.

edelreal on 3 Nov 2017

My last comment was in July, pertaining my experience so far. Circling back now in Nov, I've since built two more GraphQL APIs in Phoenix (Elixir) and Symfony (PHP), both used in production environments (as well as a few smaller sandbox tests in Node, etc.). I can say that my last comment has worked well. To recap and extend a bit (note, this is higher level than apollo, or graphql-tools):

Endpoint auth via token. Is this viewer authorized to even access /graphql? Essentially the same as what you'd do for a REST endpoint.
Load the viewer into the query's context, and check their permissions against the top-level field (this is basically ACL). For instance, if the field is updateUser(id: 15), you'd check your business logic permissions for the viewer to update that particular user. This setup creates separation between your GraphQL layer, and business layer. It also allows you to reuse this logic for REST. You _would not_ bake this into your schema, and you'd limit the complexity of your resolve function. Anything beyond something like isGranted('UPDATE_USER', user) and you're likely putting too much intelligence in your resolver (IMO).
Sub-fields can be checked as well, but you'll need to be careful about performance. If you're not careful about your batching, you could end up with n+2 scenarios (every returned list item (n) gets the association (+1), but also does permission checks on that association (+1, assuming it only takes you 1 db call to verify permissions...). I've found it easier to expose different types that have already white-listed the allowed fields, but that may not work in your situation. That way, you check the top-level field (#2 above), and be done with it.

bjunc on 3 Nov 2017

👍2

Excellent discussion above, really appreciate it.

I found this discussion because an endpoint I have that was completely protected by a user session now needs "anonymous" user access. I need to forbid all mutations for this anonymous user and allow about 90% of the objects to be queried with existing ACL logic intact. Currently users not authorized are forbidden from doing anything (at the server layer), but that won't work anymore.

I really don't want to write the logic into all my ORM layer ACLs to forbid mutation access when viewer is falsey, I want to do it once. However, my ORM layer doesn't have middleware. So I was hoping to do this in my graphql layer.

But as noted above graphql is the wrong layer to do this, and furthermore, the fact that my ORM layer doesn't have middleware isn't your fault, it's my fault :-)

thanks for warning me off from a solution that will be problematic in the long term, if for example I add a RESTful API on top of my ORM layer, or a different schema of graphQL for a newer version of the API, or ...

TL;DR: Keep ACLs in the business logic. If your business logic isn't extensible, fix that problem in the business logc.

pszabop on 1 Dec 2017

I think if you want to do something simple like "forbid all mutations for X user" then it might be reasonable to do that in GraphQL, maybe? But there's a big risk that the rule will get more complex over time :]

stubailo on 1 Dec 2017

I have been using resolver middleware in my gateways, and I have to say that applying authentication and authorization functionality using resolver middlewares has been quite a good experience. I use a Koa-style pattern for my middlewares, so I can run logic both before ('route based') and after ('result based') the actual resolver.

To me, this is a clean way of implementing this functionality. When exposing an existing GraphQL endpoint in a Gateway, there's really no other place to put this logic, and I don't see why this would be considered bad practice at all. To me, resolver middleware is equivalent to the router-level middleware in your traditional Express/Koa web server, so it's a natural fit.

kbrandwijk on 1 Dec 2017

👍1

There's also some wisdom that says router middleware isn't a good place to put permissions logic though, right? At least back when I used Rails the mantra was "thin controllers, fat models", and all permissions was recommended to be done in the model, not the route handlers/controllers.

stubailo on 1 Dec 2017

👍1

@stubailo What would the 'model layer' be when working with existing GraphQL endpoints and schema stitching?

kbrandwijk on 1 Dec 2017

With schema stitching, that probably means in the underlying GraphQL services is best, and with GraphQL subscriptions you call your regular data loading code, right? I guess what I mean is that your security shouldn't be transport or API dependent, in my opinion. Since there are often different ways to access the same data (for example, an object showing up in a subscription and a query shouldn't require two separate security rules)

stubailo on 1 Dec 2017

@stubailo I don't agree with you on that. I'd like to compare it to having a bunch of internal REST API's that you are exposing through a Gateway. The REST API's themselves don't implement user-level security. The Gateway that exposes these as a public API needs to do that, as the API's themselves might not even share the (same) concept of a 'user'. This is a scenario where it's very common to have this logic in the Gateway.

I see the same happening with GraphQL endpoints. You can't always control the implementation of the GraphQL API you are exposing (more commonly not actually).

kbrandwijk on 1 Dec 2017

👍1

That's a great point - I think there isn't a clear answer yet, so we should consider all the options for sure.

stubailo on 1 Dec 2017

And as I write my code, and find yet another internal API that wasn't checking for a valid viewer before writing. So bug prone.

I wish I had the middleware to just prevent writes by all non-valid viewers. I note I didn't say where the middleware should go...

pszabop on 7 Dec 2017

👍1

Hey 👋 I created a helper tool called graphql-shield. Its purpose is to abstract away permission layer and make it super easy to implement any permission functionality you want. I would love to get any of you involved and get as much feedback as possible 🙂

https://github.com/maticzav/graphql-shield

maticzav on 14 Feb 2018

🎉12

Hey everyone! I'm going to close this issue, as it's been stale for a while (and frankly, a lot to digest as we try to clean up this repo).

I believe many of the issues raised in here may actually be solvable using schema directives, since they are as customizable as needed, and still leave auth at the schema level if that's something you want.

JakeDawkins on 13 Jul 2018

👍3

Graphql-tools: Authorization

Most helpful comment

All 79 comments

Related issues