Lighthouse: Order by nested relation, and aggregation results

Created on 10 Jul 2020  路  15Comments  路  Source: nuwave/lighthouse

I will gladly provide a PR for this issue

What problem does this feature proposal attempt to solve?

When you use OrderBy, you can't order by anything that isn't a direct column, and also you can't order by aggregations like min, max, and so on.

Which possible solutions should be considered?

In the OrderBy directive, create support for relation, and aggregation

Notes

  • Maybe it can somehow relate or use to another issue I made regarding aggregations.

  • When using relation, and not putting a column, it will order by the aggregation of the relation.

  • If this idea and the other one with the aggregation will be accepted, there would now be 3 things that depend on relations; so I think this proposal should also be considered, for easier usage of thosw functionalities.

  • Also maybe it should be a different a different directive, so it wouldn't break the original directive, and it would still be available for easier use.

Request Example
```graphql{
products(
orderBy: {
priorities: [
{
relation: 'purchases'
column: 'payed'
aggregation: AVG
sort: ASC
}
]
}
){
name
}

**Generated Type**
```graphql{
input OrderByClauses
{
  priorities: [OrderByClause!]
}

input OrderByClause
{
  relation: String

  column: String

  aggregation: AggregationFunction

  sort: SortOrder = DESC
}

enum AggregationFunction
{
  MIN

  MAX

  AVG

  SUM

  COUNT

  DISTINCT
}

enum SortOrder
{
  ASC

  DESC
}
discussion enhancement

Most helpful comment

I'm also looking for sorting on relations. In my case, there's a grid of rates that belong to various other tables like parts or countries. I'd like my grid to be able to sort by parts.sku or countries.name.

All 15 comments

I am a bit wary of reimplementing such SQL-like features in GraphQL, is it really necessary?

I think sorting by data inside relations is pretty crucial for databases with certain complexity. That's also where I was coming from with the has condition PR. Would be kind of inconsistent in my opinion if we had relation driven filters, but not sorting, the two things that in my opinion should compliment and work parallel with each other.

I like that you are pushing the boundaries and am curious at least. Can you give some realistic example use cases?

Well, there are pretty simple and straightforward examples, like; If you have a users table with hasMany relation to a donations table, and a donation contains a value for the amount donated, and you would like to sort users by how much they donated on total, or in average. Or much more complex things, like sorting by how much reaction a user has generated by proxy, by going summing the amount of ratings made on comments of posts the user has created (eg. SUM of user.posts.comments.ratings). Another thing that also came to mind, is the ability to create conditioning, for example, in the example before there could be a case where you want to sort only by positive reaction, so you would need to SUM only ratings that are positive. At which point, to just give full spectrum of control, we probably should provide an implementation of there existing whereCondition. Sure this would get pretty complex input wise, but in the end it is not complexity we enforce on users, but optional tools that exists there to be used on demand for a more complex oriented requirements.

I get why you would need such a query in principle, but am not yet convinced that adding more and more dynamic query capabilities to GraphQL is the way to go.

Another way to solve this issue is to offer specialized queries to the clients. Given your example, something like topDonators: [User!]! would work. This requires some coordination between client and server, but enables the server to optimize or cache expensive queries.

If you are in a scenario with multiple clients, how would they learn about the possible queries? For instance, how would they know which relations can be ordered or aggregated upon for a given query? How can we reflect that information in the schema?

I'm also looking for sorting on relations. In my case, there's a grid of rates that belong to various other tables like parts or countries. I'd like my grid to be able to sort by parts.sku or countries.name.

I have the same problem, in my case I have authors and I need to display a table, and sort using: name, created_at and posts_count.

The query that you suggests @spawnia topDonators in my case will not help because I need to sort the main Model Author using the relation with Post.

Also with directive withCount cannot be used in the root Field so I can't workaround the problem pre-fetching the relation and load column count in the main query.

I have no plans to implement this feature in Lighthouse, but am open to accepting pull requests.

I made some tests, before open a PR I want to discuss a bit the possibile solution:

The order query builder is here

If i need to sort by name, created_at, and posts_count in schema the directive will be:

allAuthors(
  orderBy: _ @orderBy(columns: ["name", "created_at", "posts_count"])
): [Author!] @paginate

In orderBy directive definition can be add an new param countRelations (it can be better) to have a whitelist of relations to pass to the withCount method.

Before the orderBy in the handleBuilder will be added this code:

$relationsWhitelist = $this->directiveArgValue('countRelations', []);

$relations = collect($value)
    ->map->column
    ->map(function ($column) use ($relationsWhitelist) {
        if (! Str::endsWith($column, 'count')) {
            return;
        }

        $guessRelationMethod = Str::camel(Str::replaceLast('count', '', $column));

        return in_array($guessRelationMethod, $relationsWhitelist)
            ?  $guessRelationMethod
            : null;
    })
    ->filter();

    if ($relations->isNotEmpty()) {
        $builder->withCount($relations->toArray());
    }

and the schema will be:

allAuthors(
  orderBy: _ @orderBy(columns: ["name", "created_at", "posts_count"], countRelations: ["posts"])
): [Author!] @paginate

This solution is less complex and less powerful in comparison with initial solution, but this resolve some use case.

The source schema definition would be more explicit as @orderBy(columns: ["name"], countRelations: ["posts"]). No need for the developer to know the magic _count convention and have a redundant schema definition.

Since it recently came up in another issue, how about ordering on other aggregates such as sum?

The are 5 available options: withCount, withSum, withMin, withMax, and withAvg.

~The first solution is to create for each type a parameter {function}Relations.~
This is not possible because withSum, withMin, withMax, and withAvg need an extra parameter column.

The other is to use a collection to define on each item relation, function and column (if needed).

allAuthors(
  orderBy: _ @orderBy(
    columns: ["name", "created_at"], 
    relations: [
      { relation: "posts", function: COUNT },
      { relation: "ratings", function: AVG, column: "value" }
    ]
  )
): [Author!] @paginate

Looks good. I propose the following naming:

@orderBy(
  columns: ["name", "created_at"], 
  relations: [
    { relation: "posts", aggregate: COUNT },
    { relation: "ratings", aggregate: AVG, column: "value" }
  ]
)

We might also omit the aggregate argument and just define allowed relations and aggregable columns:

@orderBy(
  columns: ["name", "created_at"], 
  relations: [
    { relation: "posts" }, # Only COUNT possible
    { relation: "ratings", columns: ["quality", "design", "performance"] } # COUNT and other aggregates dynamically available
  ]
)

How would the generated input look like?

An example of query can be this one:

allAuthors(
  orderBy: [
    { column: POSTS_COUNT, order: DESC },
    { column: RATINGS_AVG_DESIGN, order: DESC }
  ]
) {
    title
    created_at
    posts_count
    ratings_avg_quality
    ratings_avg_design
    ratings_avg_performace
  }

The generated columns for the posts relation:

{ relation: "posts" }
---
POSTS_COUNT

Instead if the item have the columns param will be generated for each column all options using
{RELATION}_{FUNCTION}_{COLUMN} and the base {RELATION}_COUNT

{ relation: "ratings", columns: ["quality", "design", "performance"] }
---
RATINGS_COUNT
RATINGS_AVG_QUALITY
...
RATINGS_MIN_QUALITY
...
RATINGS_MAX_QUALITY
...

Without knowing the naming convention it might not be obvious what the different columns do. A more explicit and modular approach:

authors(
  orderBy: [
    { posts: { aggregate: COUNT }, order: DESC }
    { ratings: { aggregate: AVG, column: DESIGN }}
  ]
)

This is nicer for autocompletion, where the column name based approach might be filled with a lot of entries pretty soon, given the amount of permutations is relation * column * 5 (i guess there are 5 aggregate variants?).

It also allows targeted exposure of certain dynamic variables on the client side:

query ($ratingColumn: AuthorsOrderByRatingsColumn!) {
  authors(
    orderBy: [
      { ratings: { aggregate: AVG, column: $ratingColumn }}
    ]
  ]
)

Yes, i think this can be more explicit and powerfu!

Was this page helpful?
0 / 5 - 0 ratings

Related issues

alexwhb picture alexwhb  路  4Comments

mehranabi picture mehranabi  路  3Comments

sadhakbj picture sadhakbj  路  4Comments

let-aurn picture let-aurn  路  3Comments

caizhigang97 picture caizhigang97  路  3Comments