Describe the bug
I have a copy of my MySQL db deployed on AWS Aurora.
When I am doing a query, asking for the first X of a table (where x>max number of lines for this query), the query never ends, therefore it times out.
For example, my mission with id "cjcgea18taaze0124ff7j25zs" has 236 images linked to it.
By querying the first 235 images, I have the result in less than a second, the executed query is the following :
{
images(first:235 where: { mission: { id: "cjcgea18taaze0124ff7j25zs" } }) {
id
}
}
Which executes on Aurora :
SELECT *
FROM `sterblue@dev`.`Image` AS `Top_Level_Alias`
WHERE (((EXISTS
(SELECT *
FROM `sterblue@dev`.`MissionExecution` AS `MissionExecution_Top_Level_Alias`
INNER JOIN `sterblue@dev`.`_MissionImages`
ON `MissionExecution_Top_Level_Alias`.`id` = `sterblue@dev`.`_MissionImages`.`B`
WHERE `sterblue@dev`.`_MissionImages`.`A` = `Top_Level_Alias`.`id`
AND (((`MissionExecution_Top_Level_Alias`.`id` = 'cjcgea18taaze0124ff7j25zs'))) ))))
ORDER BY `Top_Level_Alias`.`id` ASC
LIMIT 0,
236
BUT if I ask for 236 lines :
{
images(first:236 where: { mission: { id: "cjcgea18taaze0124ff7j25zs" } }) {
id
}
}
SELECT *
FROM `sterblue@dev`.`Image` AS `Top_Level_Alias`
WHERE (((EXISTS
(SELECT *
FROM `sterblue@dev`.`MissionExecution` AS `MissionExecution_Top_Level_Alias`
INNER JOIN `sterblue@dev`.`_MissionImages`
ON `MissionExecution_Top_Level_Alias`.`id` = `sterblue@dev`.`_MissionImages`.`B`
WHERE `sterblue@dev`.`_MissionImages`.`A` = `Top_Level_Alias`.`id`
AND (((`MissionExecution_Top_Level_Alias`.`id` = 'cjcgea18taaze0124ff7j25zs'))) ))))
ORDER BY `Top_Level_Alias`.`id` ASC
LIMIT 0,
237
md5-db2fdbe0385e58998b75f0fc7b4648a6
SELECT I.id
FROM `sterblue@dev`.`Image` I, `sterblue@dev`.`MissionExecution` ME,
`sterblue@dev`.`_MissionImages` MI
WHERE MI.A = I.id AND MI.B = ME.id AND ME.id = 'cjcgea18taaze0124ff7j25zs'
LIMIT 0,
237
As my knowledge of database performance (index, ...) is limited, I am interested in knowing why you are using this kind of queries, instead of a simple one like the one above.
I would like to know if this issue comes from the use of Aurora, and how it is possible to fix it.
Expected behavior
Return the lines in a decent amount of time.
Versions (please complete the following information):
1.11.0Post-Scriptum : our image table has around 200000 lines.
As I can't divulge our schema or db, I will try to provide a way to reproduce the bug as soon as possible.
I can give you a quick reproduction of the MySQL DB case (not the Aurora one) :
type MissionExecution {
id: ID! @unique
createdAt: DateTime!
updatedAt: DateTime!
images: [Image!]! @relation(name: "MissionImages")
}
type Image {
id: ID! @unique
createdAt: DateTime!
updatedAt: DateTime!
mission: MissionExecution @relation(name: "MissionImages")
}
(I used http://filldb.info/ but it is a bit buggy, he didn't allow me to import as much data as I wanted)
In my case with 70000 Images, 1200 MissionExecutions and 47000 relations (number are really low compared to our real db)
SELECT I.id FROM Image I, MissionExecution ME, _MissionImages MI WHERE MI.A = I.id AND MI.B = ME.id AND ME.id = 'cjca062y65c298s34f0o78w90' LIMIT 0, 80000
Takes 0.0022 seconds
SELECT * FROM Image AS `Top_Level_Alias` WHERE (((EXISTS (SELECT * FROM MissionExecution AS `MissionExecution_Top_Level_Alias` INNER JOIN _MissionImages ON `MissionExecution_Top_Level_Alias`.`id` =_MissionImages.`B` WHERE _MissionImages.`A` = `Top_Level_Alias`.`id` AND (((`MissionExecution_Top_Level_Alias`.`id` = 'cjca062y65c298s34f0o78w90'))) )))) ORDER BY `Top_Level_Alias`.`id` ASC LIMIT 0, 80000
Takes 1.77 seconds
I found the root of the problem.
Whenever I query my table Image, prisma is querying ALL the fields. But our image entity has a long string field (~10000 char).
As Prisma seems to always query * fields, the query never ends as it has to go through a lot of data..
I will send you in a private message a reproduction including two prisma services, with and without the said field.
8840a46 fixes the bug for count, but not for a simple query.
Here is the repository you can use to reproduce the issue https://github.com/crubier/PrismaBug
Ignore the src, it only contains a script used to generate some dummy data for the reproduction.
All the credentials are in the docker-compose
Right now our aurora has two prisma services deployed with the following schema :
type MissionExecution {
id: ID! @unique
createdAt: DateTime!
updatedAt: DateTime!
images: [Image!]! @relation(name: "MissionImages")
}
type Image {
id: ID! @unique
createdAt: DateTime!
updatedAt: DateTime!
mission: MissionExecution @relation(name: "MissionImages")
metadata: String
}
default@default : queries should work fine here as metadata field is empty for all the entities.
default@bug : the metadata field has been filled with data (according to our database, this field usually has ~10000 char).
You can see that querying the first 100 image ids is much longer on the bug stage.
This issue is really blocking us right now.
We have been waiting to migrate from Graphcool to Prisma for several months, and this is the only issue remaining. Prisma is unusable for our use case because of this bug which degrades performance by 100x on many of our queries.
It would be really, really cool if this was solved :-)
Keep up the good work !
Thanks a lot for your detailed report. Only querying selected fields is now available on 1.15-alpha: https://github.com/prismagraphql/prisma/pull/2839.
I'm closing this as duplicate from https://github.com/prismagraphql/prisma/issues/2727.
Most helpful comment
I can give you a quick reproduction of the MySQL DB case (not the Aurora one) :
1. Deploy a service using this schema
2. Import a lot of data
(I used http://filldb.info/ but it is a bit buggy, he didn't allow me to import as much data as I wanted)
3. Do the query in the post above
In my case with 70000 Images, 1200 MissionExecutions and 47000 relations (number are really low compared to our real db)
Takes 0.0022 seconds
Takes 1.77 seconds