If you suspect this could be a bug, follow the template.
play.dgraph.io
Yes, it still fails
MacOS but also play.dgraph.io, so probably irrelevant
Run a shortest path query and use first or offset to get a specific part of the path.
For instance, in order to find movies directed by Michael Bay where Megan Fox appears,
we can run this query on play.dgraph.io:
{
path as shortest(from: 0x5526c6, to: 0x652d3f, depth: 2) {
<director.film>
<actor.film>
<starring>
<performance.actor>
}
names(func: uid(path)) {
uid
name@.
}
}
This wils return one of the movies matching the description:
{
"extensions": {...},
"data": {
"names": [
{
"uid": "0x5526c6",
"name@.": "Michael Bay"
},
{
"uid": "0x40b2f3",
"name@.": "Transformers: Revenge of the Fallen"
},
{
"uid": "0x4d6d44"
},
{
"uid": "0x652d3f",
"name@.": "Megan Fox"
}
],
"_path_": [
{
"director.film": [
{
"starring": [
{
"performance.actor": [
{
"uid": "0x652d3f"
}
],
"uid": "0x4d6d44"
}
],
"uid": "0x40b2f3"
}
],
"uid": "0x5526c6"
}
]
}
}
Now, if I want to get only the name of the movie, I know that this is the second element of the path, so first:1, offset: 1 should work, but it doesn't.
Depending on the combination you might get different results.
If I set offset: 1 I expect to see the name of the movie first (we skipped the director) followed by an anonymous node (the performance) and finally Megan Fox.
{
path as shortest(from: 0x5526c6, to: 0x652d3f, depth: 2) {
<director.film>
<actor.film>
<starring>
<performance.actor>
}
names(func: uid(path), offset: 1) {
uid
name@.
}
}
Randomly, I get either the name of the movie and the performance but no Megan Fox:
[...]
"data": {
"names": [
{
"uid": "0x7111ab",
"name@.": "Bad Boys II"
},
{
"uid": "0x733703"
}
],
"_path_": [
{
"director.film": [
{
"starring": [
{
"performance.actor": [
{
"uid": "0x652d3f"
}
],
"uid": "0x733703"
}
],
"uid": "0x7111ab"
}
],
"uid": "0x5526c6"
}
]
}
}
But also, sometimes, I'll get just Megan Fox
[...]
"data": {
"names": [
{
"uid": "0x652d3f",
"name@.": "Megan Fox"
}
],
"_path_": [
{
"director.film": [
{
"starring": [
{
"performance.actor": [
{
"uid": "0x652d3f"
}
],
"uid": "0x466767"
}
],
"uid": "0x1053f0"
}
],
"uid": "0x5526c6"
}
]
}
}
If I set the request to use first: 1 the query works correctly and only returns Michael Bay.
But ... if I set it to two, randomness comes back and sometimes I'll get the right result:
{
path as shortest(from: 0x5526c6, to: 0x652d3f, depth: 2) {
<director.film>
<actor.film>
<starring>
<performance.actor>
}
names(func: uid(path), first: 2) {
uid
name@.
}
}
[...]
"data": {
"names": [
{
"uid": "0x5526c6",
"name@.": "Michael Bay"
},
{
"uid": "0x7111ab",
"name@.": "Bad Boys II"
}
],
"_path_": [
{
"director.film": [
{
"starring": [
{
"performance.actor": [
{
"uid": "0x652d3f"
}
],
"uid": "0x733703"
}
],
"uid": "0x7111ab"
}
],
"uid": "0x5526c6"
}
]
}
}
And sometimes I'll get Michael Bay and Megan Fox, skipping both the movie and the performance!
[...]
"data": {
"names": [
{
"uid": "0x5526c6",
"name@.": "Michael Bay"
},
{
"uid": "0x652d3f",
"name@.": "Megan Fox"
}
],
"_path_": [
{
"director.film": [
{
"starring": [
{
"performance.actor": [
{
"uid": "0x652d3f"
}
],
"uid": "0x466767"
}
],
"uid": "0x1053f0"
}
],
"uid": "0x5526c6"
}
]
}
}
The query I wanted to perform to get the second element was first: 1, offset: 1.
{
path as shortest(from: 0x5526c6, to: 0x652d3f, depth: 2) {
<director.film>
<actor.film>
<starring>
<performance.actor>
}
names(func: uid(path), offset: 1, first: 1) {
uid
name@.
}
}
This will again sometimes work:
[...]
"data": {
"names": [
{
"uid": "0x7111ab",
"name@.": "Bad Boys II"
}
],
"_path_": [
{
"director.film": [
{
"starring": [
{
"performance.actor": [
{
"uid": "0x652d3f"
}
],
"uid": "0x733703"
}
],
"uid": "0x7111ab"
}
],
"uid": "0x5526c6"
}
]
}
}
But sometimes it won't:
[...]
"data": {
"names": [
{
"uid": "0x652d3f",
"name@.": "Megan Fox"
}
],
"_path_": [
{
"director.film": [
{
"starring": [
{
"performance.actor": [
{
"uid": "0x652d3f"
}
],
"uid": "0x4d6d44"
}
],
"uid": "0x40b2f3"
}
],
"uid": "0x5526c6"
}
]
}
}
I had a deeper look into this and the problem here is that we don't preserve the order of UIDs in the uidMatrix when populating UIDs from a variable that was the result of the shortest path query. We sort the list of UIDs before doing pagination. In general all uid variables are supposed to contain sorted lists whereas in the case of the shortest path, they are not guaranteed to and the order matters. We'll have to add a way to find out the cases in which we should preserve the order.
This is the step where the order of elements in the uidMatrix is changed
https://github.com/dgraph-io/dgraph/blob/857e30fd11556b5f5948b521fb29da2a4b338c5a/query/query.go#L2295
No need to work on this right now, but it's good to keep track of what we need to improve regarding k-shortest paths.
Seems tricky. We'd need to make variables switch from Go maps to some sorted map implementation, which would be slower.
I do understand this is not easy to fix, but something needs to be done.
Maybe we should simply disable the usage of first and other pagination on variables coming from shortest path?
Or maybe the variables coming from shortest path should simply be different?
Github issues have been deprecated.
This issue has been moved to discuss. You can follow the conversation there and also subscribe to updates by changing your notification preferences.
