Azure-docs: High RUs when query does not find document in collection and in Document Explorer in Portal

Created on 10 Jun 2020  Â·  7Comments  Â·  Source: MicrosoftDocs/azure-docs

As I was reading/exercising as in this learning module...
https://docs.microsoft.com/en-us/learn/modules/monitor-and-scale-cosmos-db/8-index-exercise

Noticed few discrepancies,

  • What is expected of Indexing and actual behavior.
  • Document explorer in Portal shows different RUs than Command shell in Portal (running .net code to run queries)

Attached is full log of the learning module (monitor-and-scale-cosmos-db).
monitorscaledblog.txt

Main issue: When a document could be found in collection by using "WHERE" in SQL query, RUs are less (expected). But when a document could not be found (WHERE clause does not satisfy any document), RUs are very high.

Issue occurred when...
_1. When there are no indexes on any attribute,_

  • 'a0c621c6-ef32-7c45-0c7a-6f4dda6e0591' is known id that exists in collection (RUs = 2.4).
  • 'a0c621c6-ef32-7c45-0c7a-6f4dda6e0591n' (suffixed by "n") id does not exist in collection (RUs = 251.4).
prashant@Azure:~/mslearn-monitor-azure-cosmos-db/ExerciseCosmosDB$ dotnet run -- -c Orders -o QueryCollection -q "SELECT TOP 1 * FROM c WHERE c.Item.id='a0c621c6-ef32-7c45-0c7a-6f4dda6e0591'"
Starting experiment with 1 tasks @ 6/9/20 4:23:23 PM
Performed 1 Query operations @ 1 operations/s, 2.4 RU/s)

-----------------------------------------------------------------
Performed 1 Query operations @ 1 operations/s, 2.3 RU/s)
Total (consumed 2.4 RUs in 1 seconds)
------------------------------------------------------------------
CosmosDB experiment complete
prashant@Azure:~/mslearn-monitor-azure-cosmos-db/ExerciseCosmosDB$ dotnet run -- -c Orders -o QueryCollection -q "SELECT TOP 1 * FROM c WHERE c.Item.id='a0c621c6-ef32-7c45-0c7a-6f4dda6e0591n'"
Starting experiment with 1 tasks @ 6/9/20 4:24:24 PM
Performed 1 Query operations @ 1 operations/s, 252.2 RU/s)

-----------------------------------------------------------------
** Performed 1 Query operations @ 1 operations/s, 251.4 RU/s) **
Total (consumed 252.2 RUs in 1 seconds)
------------------------------------------------------------------
CosmosDB experiment complete


prashant@Azure:~/mslearn-monitor-azure-cosmos-db/ExerciseCosmosDB$ dotnet run -- -c Orders -o QueryCollection -q "SELECT TOP 1 * FROM c WHERE c.Item.id='a0c621c6-ef32-7c45-0c7a-6f4dda6e0591'"
Starting experiment with 1 tasks @ 6/9/20 4:25:12 PM
Performed 1 Query operations @ 1 operations/s, 2.4 RU/s)

-----------------------------------------------------------------
Performed 1 Query operations @ 1 operations/s, 2.3 RU/s)
Total (consumed 2.4 RUs in 1 seconds)
------------------------------------------------------------------

_2. When there was partial index policy set on the collection, but the attribute used in WHERE clause was not indexed._

  • c.Customer.State = 'WAX' does not exist in collection (RUs = 507.2).
prashant@Azure:~/mslearn-monitor-azure-cosmos-db/ExerciseCosmosDB$ dotnet run -- -c Orders -o QueryCollection -q "SELECT TOP 1 * FROM c WHERE c.Customer.State = 'WA'"
Starting experiment with 1 tasks @ 6/9/20 4:31:54 PM
Performed 0 Query operations @ 0 operations/s, 0 RU/s)
Performed 1 Query operations @ 0 operations/s, 1.4 RU/s)

-----------------------------------------------------------------
Performed 1 Query operations @ 0 operations/s, 1.4 RU/s)
Total (consumed 2.8 RUs in 2 seconds)
------------------------------------------------------------------
CosmosDB experiment complete

prashant@Azure:~/mslearn-monitor-azure-cosmos-db/ExerciseCosmosDB$ dotnet run -- -c Orders -o QueryCollection -q "SELECT TOP 1 * FROM c WHERE c.Customer.State = 'WAX'"
Starting experiment with 1 tasks @ 6/9/20 4:32:56 PM
Performed 1 Query operations @ 1 operations/s, 508.8 RU/s)

-----------------------------------------------------------------
** Performed 1 Query operations @ 1 operations/s, 507.2 RU/s) **
Total (consumed 507.7 RUs in 1 seconds)
------------------------------------------------------------------
CosmosDB experiment complete

_3. "Document explorer" show high (double) RUs compared to Azure command shell._

  • 2.9 RUs v/s 5.82 RUs
dotnet run -- -c Orders -o QueryCollection -q "SELECT TOP 1 * FROM c WHERE c.OrderStatus = 'NEW'"
prashant@Azure:~/mslearn-monitor-azure-cosmos-db/ExerciseCosmosDB$ dotnet run -- -c Orders -o QueryCollection -q "SELECT TOP 1 * FROM c WHERE c.OrderStatus = 'NEW'"
Starting experiment with 1 tasks @ 6/9/20 4:28:29 PM
Performed 1 Query operations @ 1 operations/s, 2.9 RU/s)

-----------------------------------------------------------------
Performed 1 Query operations @ 1 operations/s, 2.9 RU/s)
Total (consumed 2.9 RUs in 1 seconds)
------------------------------------------------------------------
CosmosDB experiment complete


* Document explorer metrics for same query...

METRIC
VALUE
Request Charge
5.82 RUs
Showing Results
1 - 1
Retrieved document count
More information
2
Retrieved document size
More information
1919 bytes
Output document count
More information
2


Document Details

⚠ Do not edit this section. It is required for docs.microsoft.com ➟ GitHub issue linking.

Pri2 cosmos-dsvc cxp product-question triaged

Most helpful comment

Hi @AnuragSharma-MSFT, I'm the PM that owns indexing in Azure Cosmos DB. To address questions 1+2: If you don't index a property in your container and then run a query that filters on that property, it is expected that the RU charge will be high. The query engine had no index available so it had to load a lot of documents even though none of them actually ended up matching the query filter. To address question #3: the query charge will be the same regardless of where you run your query. If the query is a cross-partition query, the query's RU charge will increase as the number of physical partitions in the container increases (minimum 2-3 RUs per physical partition). This is described in more detail here: https://docs.microsoft.com/en-us/azure/cosmos-db/how-to-query-container. You can check the number of physical partitions in your container using these steps: https://docs.microsoft.com/en-us/azure/cosmos-db/how-to-query-container

All 7 comments

@prashantma Thank you for the detailed feedback. We are actively investigating and will get back to you soon.

@prashantma Really appreciate the detailed information provided as part of the issue.

We have tried replicating the scenarios with the inputs provided but could not observe any discrepancies and in all the 3 scenarios, request units consumption match for both cases.

Could you please let us know few more details mentioned below:

  1. Were you using the cosmos instance provisioned through the learn site to exercise this or were you using azure subscription to run through these examples?
  2. Have you tried running the same queries multiple times or was it just once you observed the differences? In case you ran them just once, could you try them again.

You will need to follow the learning tutorial from start to finish.

  1. I was using cosmos instance provisioned through the learn site to exercise this. The test data, code, resource templates are all provided by ms learning site.
  2. I tried it multiple times...
    a. if data exists in the collection and could be found, RU are less.
    b. then I run the query for something that does not exist in collection, in that case RU are high.
    c. then I do the query as in a. again with same result.
    I saw same high RUs in case b, anytime a / b / c were repeated.
    _Expected: Low RU after first/recent query as in b._
    re "Data Explorer" discrepancy, I did try multiple times, RUs were high compared to running same query in .net app.

@AnuragSharma-MSFT I see you tested the steps. Not sure who is the point of contact for this module but doesn't the Learn team has a different form to report the feedback?

I get this link from the bottom of the module, does that create an issue in this repo? please redirect if there is a different way that the Learn team handles the issues.

Hi @AnuragSharma-MSFT, I'm the PM that owns indexing in Azure Cosmos DB. To address questions 1+2: If you don't index a property in your container and then run a query that filters on that property, it is expected that the RU charge will be high. The query engine had no index available so it had to load a lot of documents even though none of them actually ended up matching the query filter. To address question #3: the query charge will be the same regardless of where you run your query. If the query is a cross-partition query, the query's RU charge will increase as the number of physical partitions in the container increases (minimum 2-3 RUs per physical partition). This is described in more detail here: https://docs.microsoft.com/en-us/azure/cosmos-db/how-to-query-container. You can check the number of physical partitions in your container using these steps: https://docs.microsoft.com/en-us/azure/cosmos-db/how-to-query-container

Thanks Tim.
Regarding #1-2, will db engine attempt to use data already read from collection and available in memory, knowing nothing has changed in DB, everything that is in collection is in memory...and then RUs will be less when same query is run again and again, compared to running the query first/later times, which will require read from storage.
Regarding #3, exactly same query on same collection/data was run in DB explorer in Azure portal, and .net app. But DB explorer showed higher RUs.

Regarding #1-#2, the optimizing you are describing does not exist in Azure Cosmos DB. If you run the same query on the same container over and over again, the RU charge will always be the same. In this case, you'd want to create indexes (you can index as many properties as you'd like).

Regarding #3: This is unexpected behavior. Could you confirm that you ran the query one after another? If the number of physical partitions didn't change (could change if provisioned RU amount was increased or more data was loaded), the query RU charge will always be the same. It's all the same query engine on the backend so the method in which you run the query (SDK choice, Portal, Azure command shell has no impact on the RU charge (assuming the query itself is the same).

Was this page helpful?
0 / 5 - 0 ratings

Related issues

JeffLoo-ong picture JeffLoo-ong  Â·  3Comments

Favna picture Favna  Â·  3Comments

bdcoder2 picture bdcoder2  Â·  3Comments

spottedmahn picture spottedmahn  Â·  3Comments

Agazoth picture Agazoth  Â·  3Comments