Azure-cosmos-dotnet-v2: Issue when using order by

Created on 16 Oct 2015  ·  32Comments  ·  Source: Azure/azure-cosmos-dotnet-v2

Some queries return blank pages when using order by
For instance the query

select * from c where c.CustomerAreaId=”1” order by c.DocumentId

returns blank pages and a very high request charge. The response will indicate there is more data and if you keep getting the next page the data eventually starts to come back although due to the high request charge this is impractical.

However, if the queryis sorted desc then the data comes back immediately

select * from c where c.CustomerAreaId=”1” order by c.DocumentId desc

bug

Most helpful comment

It seems I found the solution to the problem. As per the documentations here . It says

If you would like to perform Order By queries against string properties, then you must specify a precision of -1 for the corresponding paths.

The default indexing policy sets "kind=Hash, precision=3" by default. If it is changed into "kind=Range, precision=-1". Order by on string datatype works as expected.

I recommend anybody who faces these inconsistencies to have a custom indexing policy to suit the query patterns. Cheers!

All 32 comments

We're aware of this problem in certain scenarios and are working on performance improvements so that order by queries return results immediately

Is there any update on when a complete fix for this will be available? I have just run into another situation similar to this where a query with order by doesn't return the same total number of results as one with without an order by.

Do you get fewer results when adding the Order By clause?

The most common reason for different number of results returned when including Order By is the following ...

in a query such as:
SELECT * FROM c

when adding Order By c.age the query that is executed changes slightly to the following;

SELECT * FROM c WHERE IS_DEFINED(c.age) ORDER BY c.age

so the reason there would be differences in number of results is because the first query may well be returning documents that don't have an age property defined.

could you do some checks and confirm if this is the case.

I was performing a query like
select c.id,c.TenantId,c.ObjectType from c where c.TenantId="James Industries PLC" order by c._ts asc

so that shouldn't be the case. I also noticed that when I performed the query

select c.id,c.TenantId,c.ObjectType from c where c.TenantId="James Industries PLC" order by c._ts desc

only 2 results were returned with the HasMoreResults flag set to false. There are 397343 objects that match this query in the database. I confirmed this particular query in the portal query explorer which also returned 2 results.

When paging through all the results without sorting I was getting 397343 as opposed to 396879 when the order by asc was on the query.

Unfortunately today I cannot repeat the results - we did have a publishing run which added more documents to the database so maybe that had an influence on it.

If issues like this occur again do you want me to log a new issue?
Should this particular issue be closed or is it still work in progress?

I do hate no repeatable issues. But i am glad you're not getting strange results currently.
This different results issue is a different issue to the original one, so in future a separate issue would be great.

Thanks!

Just managed to repeat the discrepancy with the order by desc this time only returning 3 results.
(Some documents were deleted through an automated expiry process last night which may have triggered it?)
I have some screen shots and am currently making a copy of the database to hopefully reproduce it on there... I will log a new issue soon and put all the details in there.

Thanks

When I add the order by clause to my simple select * from c query, I consistently get zero results back. Without the order by clause I consistently get the correct 33 results back.
I've managed to work around most of the issues I've had so far with DocumentDB but this has stopped my project dead in it's tracks.
I know that every document has a valid value for the property I am using to order by.

@tomeastham could you confirm two things for me please;
1) if the path in your Order By clause does actually exist in the documents. if the path is not there, then there will be no results.
2) if you switch Order By to Order By Desc does this change the result?

@ryancrawcour thanks for the reply.

  1. I used a number of different fields, all of which I knew contained values only once did I get results and that was when I changed everything to lowercase but only one execution gave me results. I gave up on the SQL approach at that point and used LINQ with exactly the same (equivalent) query which works fine.
  2. I tried ASC and DESC in SQL but neither gave results.

I'm going to use LINQ for this query thanks.

@tomeastham are you saying LINQ with Order By works as expected, but the SQL you wrote does not?

@ryancrawcour yes that is correct.

@tomeastham this seems very strange indeed, because all LINQ queries are translated to SQL. could you do me a favor please, if you do a .ToString() on the queryable object before walking it for results you should get the SQL it is going to send (or capture the SQL in Fiddler) and compare that to your handcrafted SQL. there must be a subtle difference somewhere.

@ryancrawcour I got over the previous issue I had with orderby. However, I have another issue that I can't find a work around for. I'm using LINQ and have 3000 objects in the collection. I do a query that filters it down to match around 1300 objects. In a console app, I output a date property and it shows a valid date for all 1300 objects. When I add an OrderBy(i => i.Added) it gives me zero results consistently. if I change it to Name with is a string property it give zero results. If I change to State which is an integer, it returns all 1300 results and then Exists which is a boolean it returns 1300 results. For everyhting I've tried this seems to be consistent behaviour. If I change to filter down to a smaller set of just 10, OrderBy works fine on all the properties.
Are you aware of this issue?
Thanks,
Tom.

with a query that filters it down to around 600 results, the orderby still returns zero results. When filtered down to around 300 results, the orderby works fine.

@tomeastham note that the DocumentDB query execution plan might be different based on the query and parameters. When you retrieve a larger page size, the engine might choose to retrieve documents from the index vs. sorting in memory. The query will always return the correct results when you paginate via the continuation tokens.

@arramac thanks for the response but all I want is the latest 100 items ordered by a date column which at the moment gives me zero results. The continuation token doesn't exist for this initial query. Are you suggesting there could be something wrong with the index for my documents? if so, how might i go about diagnosing that?
To re-iterate, my query selects documents where the ObjectType = 1 (there are 3000 of these) OrderBy the Added property (DateTime) I use MaxItemCount = 100 in FeedOptions. When I call AsDocumentQuery(), the HasMoreResults property is set to true, the ExecuteNextAsync() has a ResponseContinuation of null, RequestCharge of 29.46 and yields zero results.
If I change the OrderBy to the State property (Integer) I get 100 results ordered by the State property.
If I change the OrderBy back to Added and to ObjectType = 2 (there are 300 of these) I get 100 results ordered by the Added property.
I just want the latest 100 (ObjectType=1)documents.
Thanks,
Tom.

I am having this same issue - ordering by a string field yields no results - seems like a pretty severe problem.

I deleted my collection and recreated it. After that OrderBy started working and I was able to modify the indexing policy to further fit my needs.
I then moved onto problems ordering on partitioned collections which was unsupported until 1.9.2 of the .Net SDK. I'm nearly there now but it's been a lot of hard work...

Hi! I'm also a victim of this problem. Everything is working fine without ordering, but gets crazy with oerdering (ASC and DESC). HasMoreResults is false when I use ordering. What's more it sounds to work fine when I use new aggregate function count().
image
image
image
image
image

Same problem, I had to cleanup the whole database

PS: Using Node.js client

Yeah Me too!

I think the issue is that the default index policy does not include a range index for string. It seems it should throw an error telling us this rather than return 0 results, like what happens when you try to sort on something entirely unindexed.
Look at this page for a bit of explaination: https://docs.microsoft.com/en-au/azure/documentdb/documentdb-indexing-policies

I am facing the same issue. Ordering by a String datatype which I am sure exist, does not return any results. Whereas the same query without Order By returns results. A pretty severe issue. This seems to be an issue with DocumentDb query engine. I tried from the Azure DocumentDb Studio and I am able to reproduce this issue. As per comments from others, I feel if I delete the whole db and start again it may work. But this issue is really serious when it happens in production. Any luck with resolving this?

Another update, I am able to get results for the same query with different parameters. For example

select * from c where c.type='message' and c.userId=1 order by c.createdOn

Does not return result, whereas

select * from c where c.type='message' and c.userId=1

returns results with all the documents containing string datatype attribute "createdOn"

However, if I try the same query for a different user
select * from c where c.type='message' and c.userId=2 order by c.createdOn

This returns correct results. This seems to be an issue in indexing where some of the documents are left out of the range indexes.

Somebody has an update on this?

Another update. I read in the Azure documentation that for ordering by string datatype requires the index to be configured with Precision: -1 (maximum). I am trying the same.

+1, I get the same problem. I've had to remove ordering from all DocumentDb calls.

It seems I found the solution to the problem. As per the documentations here . It says

If you would like to perform Order By queries against string properties, then you must specify a precision of -1 for the corresponding paths.

The default indexing policy sets "kind=Hash, precision=3" by default. If it is changed into "kind=Range, precision=-1". Order by on string datatype works as expected.

I recommend anybody who faces these inconsistencies to have a custom indexing policy to suit the query patterns. Cheers!

I'm experiencing this problem whenever adding a calculated property and then ordering by it. For example, the normal properties are "id" and "Name" and then I add on "Rank" = STARTSWITH(Name, query) ? 1 : 0.

So this works:
SELECT VALUE {"Id": root["id"], "Name": root["Name"], "Rank": STARTSWITH(root["Name"], "a") ? 1 : 0} FROM root WHERE CONTAINS(root["Name"], "a")

But this does not:
SELECT VALUE {"Id": root["id"], "Name": root["Name"], "Rank": STARTSWITH(root["Name"], "a") ? 1 : 0}
FROM root WHERE CONTAINS(root["Name"], "a")
ORDER BY root["Rank"] DESC

Well of course it doesn't - root doesn't have a property named "Rank". But all of my attempts to order on an aliased version don't work either.
Alias without Order:
SELECT {"Id": root["id"], "Name": root["Name"], "Rank": STARTSWITH(root["Name"], "a") ? 1 : 0} AS Test
FROM root WHERE CONTAINS(root["Name"], "a")

Alias with Order:
SELECT {"Id": root["id"], "Name": root["Name"], "Rank": STARTSWITH(root["Name"], "a") ? 1 : 0} AS Test
FROM root WHERE CONTAINS(root["Name"], "a")
ORDER BY Test["Rank"] DESC

Error: : {"code":400,"body":"{\"code\":\"BadRequest\",\"message\":\"Message: {\\"errors\\":[{\\"severity\\":\\"Error\\",\\"location\\":{\\"start\\":159,\\"end\\":163},\\"code\\":\\"SC2001\\",\\"message\\":\\"Identifier 'Test' could not be resolved.\\"}]}\r\nActivityId: 03dbbe43-a032-4e72-a5c0-6596f721a929, Microsoft.Azure.Documents.Common/1.22.0.0\"}","activityId":"03dbbe43-a032-4e72-a5c0-6596f721a929"}

I am seeing an empty result set with ORDER BY, but only when using STARTSWITH in the WHERE clause.

I just ran into this issue with OrderBy and DateTime. I found this article https://docs.microsoft.com/en-us/azure/cosmos-db/working-with-dates where they talk about adding RangeIndex. In my case I extended the document after I created the collection. And I couldn't get OrderBy on DateTime to work. I then deleted the collection and recreated it and....suddenly orderby on datetime worked, no additional RangeIndex added. Wonder how this is possible?

I just experienced a weird bug with the ORDER BY clause :

I have 16 documents in my database but 2 of them do not have any 'foo' property. If i do a simple
SELECT VALUE c FROM c
I get 16 documents no problem. But if i do
SELECT VALUE c FROM c ORDER BY c.foo
I only get 14 documents !
I think i should still be getting 16 documents with the 2 items without 'foo' property at the end or the beginning ?

You will find that is by design. Plenty of people have requested that the documents should be returned in this scenario but I don't know if they plan to change it or not....

It makes no sense to me to FILTER the result based on the ORDER BY clause. This is the role of the WHERE clause. I would call this a really bad design. I think that it's probably because they are using the index to order and since a missing property is not part of that index, they are not able to get the information.

The original issue is no longer a problem for me....
Please open a new issue in future

Was this page helpful?
0 / 5 - 0 ratings

Related issues

Lihan-Chen picture Lihan-Chen  ·  6Comments

wingfeng picture wingfeng  ·  8Comments

eiriktsarpalis picture eiriktsarpalis  ·  5Comments

felschr picture felschr  ·  5Comments

bowen5 picture bowen5  ·  4Comments