Azure-docs: Queries are CASE-SENSITIVE...

Created on 25 Apr 2018 · 19Comments · Source: MicrosoftDocs/azure-docs

One of the KEY points that is missing in the documentation (sorry if I missed it), is that SQL queries involving strings are case-sensitive. For example, the following query:

SELECT * c WHERE c.email = "joe.[email protected]"

Will NOT return any records if the email address stored is "joe.[email protected]" (note the capital T).

In order to perform a case-insensitive query, you must either normalize your data (to upper or lower case) or use LOWER() or UPPER() in the WHERE clause, for example:

SELECT * FROM c WHERE LOWER(c.email) = "joe.[email protected]"

However, be warned - using LOWER() or UPPER() will cause your Request Units (RUs) to spike and will likely cause 429 errors !

In my opinion there should be NO RU charge for using LOWER() or UPPER() and this would solve the problem of case-insensitive queries without getting clobbered by RU charges and 429 errors.

Document Details

⚠ Do not edit this section. It is required for docs.microsoft.com ➟ GitHub issue linking.

ID: f02ab33b-616c-7055-2375-43c700d609cd
Version Independent ID: 81d1c3e3-c920-61d2-2e5f-8ed42f91c778
Content: SQL queries for Azure Cosmos DB
Content Source: articles/cosmos-db/sql-api-sql-query.md
Service: cosmos-db
GitHub Login: @LalithaMV
Microsoft Alias: laviswa

Pri2 assigned-to-author cosmos-dsvc doc-enhancement triaged

Source

bdcoder2

Most helpful comment

Is there any news on the index mechanism changes? Having to store an upper cases version of the data to avoid the RUs costs a lot of development time.

simonvane on 4 Jan 2019

👍6

All 19 comments

@bdcoder2 Thanks for the feedback! I have assigned the issue to the content author to evaluate and update as appropriate.

Mike-Ubezzi-MSFT on 25 Apr 2018

@bdcoder2 Thanks for the feedback! Yes you should use Upper and Lower functions to consider case-insensitivity. The stackoverflow page - https://stackoverflow.com/questions/30512806/how-to-do-a-case-insensitive-search-on-azure-documentdb responded by MSFT team has couple of examples on how to use upper and lower.

We have a new version of indexing mechanism coming soon (no timelines yet) to support this functionality and it will address the RU consumption too. Hope this helps, we will update you when the feature is available. Let us know if you have any additional questions!

SnehaGunda on 8 May 2018

@bdcoder2 we are closing this issue now, if you have any further questions, feel free to reopen it or create a new issue.

please-close

SnehaGunda on 9 May 2018

@bdcoder2 We will now proceed to close this thread. If there are further questions regarding this matter, please reopen it and we will gladly continue the discussion.

Mike-Ubezzi-MSFT on 10 May 2018

Is there any news on the index mechanism changes? Having to store an upper cases version of the data to avoid the RUs costs a lot of development time.

simonvane on 4 Jan 2019

👍6

Just tried the "Lower" function on a single field of a document db Cosmos unit. WOrks great with a small number of records but our production env, with over a million docs, immediately returns a 429 error and shuts down. Not what you'd expect for the price. So @SnehaGunda when will this be fixed? As @simonvane said...avoiding RUs is costly. It will double the size of our storage and impacts TCO too. @SnehaGunda please reopen this request. No MODERN db should be case sensitive.

esfitsMack on 7 Nov 2019

👍2 ❤1

Any word on this? It's currently preventing our migration.

somanysteves on 23 Dec 2019

@simonvane @esfitsMack @imjuststeve This is all great feedback and a valid concern. Since the purpose of this channel is to address documentation issues, the best means for providing this feedback is via the Azure Cosmos DB UserVoice where you will be able to engage the product team directly. I did a search and did not find an existing entry. It would be advantageous to have this topic created and as more customers up-vote and comment on this subject, the product group will know it is an item that needs attention. Please let me know if you have any additional questions with this. Thank you!

Mike-Ubezzi-MSFT on 26 Dec 2019

@Mike-Ubezzi-MSFT I have added to this use voice post which seems to cover the feature requirement - https://feedback.azure.com/forums/263030-azure-cosmos-db/suggestions/36119482-support-case-insensitive-indexing

However, it would be very helpful if the documentation covered best practices for handling case insensitive issues in a performance manner rather than everybody having to trawl the internet to find advice and everyone coming up with a different solution.

Thinking about it, where the are deficiencies / feature gaps in general it would save developers so much time if best practice workarounds were documented officially.

simonvane on 27 Dec 2019

👍2

@simonvane Thank you for posting to the UserVoice. I am reopening this Git Issue as a doc-enhancement to address the documentation part of the request.

Mike-Ubezzi-MSFT on 27 Dec 2019

@SnehaGunda and @LalithaMV for awareness.

Mike-Ubezzi-MSFT on 27 Dec 2019

The best practice for this today is to store the data in a separate property all lower case and then use that for queries with values passed through lower().

We are currently working on providing case-insensitive indexing but do not have an ETA to announce at this time.

@Mike-Ubezzi-MSFT , please assign this to @timsander1 Tim, can you please include a note on this article to provide guidance in this scenario?

Thanks.

markjbrown on 3 Jan 2020

@markjbrown Is it safer to store as upper case because of the Turkish I problem? - https://haacked.com/archive/2012/07/05/turkish-i-problem-and-why-you-should-care.aspx/

simonvane on 3 Jan 2020

Wasn't aware of this. Sure if you expect to have Turkish users then yes, use upper case.

markjbrown on 3 Jan 2020

@markjbrown I was just keen to make sure the note that @Mike-Ubezzi-MSFT / @timsander1 were going to do gives good advice and recommends storing as uppercase to avoid issues with text in different languages. It is not restricted to Turkish, that's just an example.

simonvane on 5 Jan 2020

We've recently published an updated article on query performance troubleshooting in which we provide the following guidance.

UPPER/LOWER

Instead of using the system function to normalize data each time for comparisons, instead normalize the casing upon insertion. Then a query such as SELECT * FROM c WHERE UPPER(c.name) = 'BOB' simply becomes SELECT * FROM c WHERE c.name = 'BOB'

Thanks for raising this issue.

please-close

markjbrown on 22 Jan 2020

👍1

@markjbrown Thanks for the tip on the better query performance. While that is very helpful it doesn't really solve the issue as it requires a duplication of data. Does closing this issue mean that it's 'closed and won't fix'?

andy-davies on 28 Feb 2020

The issue from a documentation perspective is closed as we've explained the behavior.

We have a backlog item to provide case-insensitive search and will release this functionality in a future release. You can track that from here. https://feedback.azure.com/forums/263030/suggestions/36119482

Thanks.

markjbrown on 28 Feb 2020

👍1

@markjbrown Thanks for clearing that up!

andy-davies on 2 Mar 2020

Was this page helpful?

0 / 5 - 0 ratings