Prisma1: Slow Performance of Connection Aggregate Queries

Created on 7 May 2018  路  8Comments  路  Source: prisma/prisma1

Bug Report

Given a query graphql query like so....

query {
  labelsConnection(
    where: {
      dataRow: {
        dataset: {
          projects_some: {id: "cjdtaxrkrznmq0194h91zw9lb"},
          deleted: false}
      }
    }
  ) {
    aggregate {
      count
      __typename
    }
    __typename
  }
}

Prisma 1.7 generates this SQL query

mysql> select count(*) from Label WHERE  (
    ->   exists (
    ->     select * from `default@default`.`DataRow` as `DataRow_Label`
    ->       inner join `default@default`.`_DataRowLabels`
    ->       on `DataRow_Label`.`id` = `default@default`.`_DataRowLabels`.`A`
    ->       where `default@default`.`_DataRowLabels`.`B` = `default@default`.`Label`.`id` and (
    ->         exists (
    ->           select * from `default@default`.`Dataset` as `Dataset_DataRow_Label`
    ->           inner join `default@default`.`_DataInDataset`
    ->           on `Dataset_DataRow_Label`.`id` = `default@default`.`_DataInDataset`.`B`
    ->           where `default@default`.`_DataInDataset`.`A` = `DataRow_Label`.`id` and (
    ->             `default@default`.`Dataset_DataRow_Label`.`deleted` = 0 and exists (
    ->               select * from `default@default`.`Project` as `Project_Dataset_DataRow_Label`
    ->               inner join `default@default`.`_ProjectDatasets`
    ->               on `Project_Dataset_DataRow_Label`.`id` = `default@default`.`_ProjectDatasets`.`B`
    ->               where `default@default`.`_ProjectDatasets`.`A` = `Dataset_DataRow_Label`.`id` and (
    ->                 `default@default`.`Project_Dataset_DataRow_Label`.`id` = 'cjdtaxrkrznmq0194h91zw9lb'
    ->               )
    ->             )
    ->           )
    ->         )
    ->       )
    ->   )
    -> );

+----------+
| count(*) |
+----------+
|      449 |
+----------+
1 row in set (23.12 sec)

But as you see it takes 23 seconds to run against my ~4gb of data

Compare that to a query I wrote by hand that takes 0.04 seconds to execute

mysql> SELECT count(*)
    -> FROM Label
    ->   # datarow
    ->   LEFT join _DataRowLabels
    ->   on Label.id = _DataRowLabels.B
    ->   LEFT join DataRow
    ->   on DataRow.id = _DataRowLabels.A
    ->
    ->   # dataset
    ->   LEFT join _DataInDataset
    ->   on DataRow.id = _DataInDataset.A
    ->   LEFT join Dataset
    ->   on Dataset.id = _DataInDataset.B
    ->
    ->   # project
    ->   LEFT join _ProjectDatasets
    ->   on Dataset.id = _ProjectDatasets.A
    ->   LEFT join Project
    ->   on Project.id = _ProjectDatasets.B
    -> WHERE Dataset.deleted = false
    ->   and Project.id = 'cjdtaxrkrznmq0194h91zw9lb';
+----------+
| count(*) |
+----------+
|      449 |
+----------+
1 row in set (0.04 sec)

I understand that the generated query by prisma needs to handle much more complexity then my naive sql query. However, if my query takes 23 seconds I'm somewhat forced to change my datamodel or write the SQL myself.

areengine statustale

Most helpful comment

@danielrasmuson Have you tried the query again on a more recent version? 1.14 and up maybe? We did some changes to the query generation and it would be interesting to hear what the results are in your case.

All 8 comments

Hey, @danielrasmuson thanks for the detailed reproduction. We plan to soon rewrite the parts of our query generation that nest the SQL statements to generate flatter statements like the one you hand wrote. We'll probably do this once we have the passive SQL connectors and have a better overview over what capabilities and structure we want for our query generation.

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

Any update on this issue?

if you want to resolve the problem until prisma improves it, you can write your own custom queries.

try this one:
https://www.youtube.com/watch?v=YUjlBuI8xsU
the example is with postgresql

for mysql you can use: https://github.com/mysqljs/mysql

Yep. That is what I'm doing. The pain point for writing the SQL by hand is the connection tables are little difficult to manage.

select TableOne.field, TableTwo.field
from TableOne
left join ConnectionTable on TableOne.id = ConnectionTable.B
left join TableTwo on TableTwo.id = ConnectionTable.A;

Also for deletes you need to clear out all the relationship tables as well.

And if the datamodel changes the SQL scattered throughout the project is going to break.

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@danielrasmuson Have you tried the query again on a more recent version? 1.14 and up maybe? We did some changes to the query generation and it would be interesting to hear what the results are in your case.

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

Was this page helpful?
0 / 5 - 0 ratings