Azure-docs: How do you perform queries without specifying shard key in mongodbapi and how do you query across partitions?

Created on 24 Jul 2018 · 21 Comments · Source: MicrosoftDocs/azure-docs

Hi,

After I converted my collection to a sharded collection (dropped and then recreated), all queries that do not specify the shard key fail.
There is no documentation whatsoever for partitioning in the mongodb api section of azure cosmos docs beyond how to create a sharded collection.
So how do you perform queries without specifying shard key in mongodbapi and how do you query across partitions?

Document Details

⚠ Do not edit this section. It is required for docs.microsoft.com ➟ GitHub issue linking.

ID: 9da32a62-a60f-1013-bb2f-688c67a446ec
Version Independent ID: 7a2bf261-0a20-30f0-837d-cf02700034aa
Content: Partitioning and horizontal scaling in Azure Cosmos DB
Content Source: articles/cosmos-db/partition-data.md
Service: cosmos-db
GitHub Login: @rimman
Microsoft Alias: rimman

Pri1 assigned-to-author cosmos-dsvc doc-enhancement triaged

Source

Maggern3

👍2

Most helpful comment

@Maggern3 We will now proceed to close this thread. If there are further questions regarding this matter, please reopen it and we will gladly continue the discussion.

@Mike-Ubezzi-MSFT @SnehaGunda Documentation on querying partitioned collections is still missing. There is a lot of good information on how to create partitioned collections and how to choose partition keys, but not on how to actually work with partitioned collections. We have to figure out on our own which queries work without specifying the partition key and which require the key.
For the MongoDB API it seems that Find and Aggregates work across partitions, while Deletes and Updates need the partition key.

@Maggern3 .... If a partition key is chosen, queries should be optimized by using the partition key. And if you convert a collection to sharded collection, you should rewrite the queries to include the partition key.

Of course, adding the partition key to queries is an optimization, but sometimes we want to query all partitions, or we simply don't know the partition of the document.

How I have to query the collection also affects the choice of a partition key, as I have to make sure that I am actually able to add the key to the query.
We should have documentation on when we have to add partition keys to queries and why.

hansmaad on 5 Mar 2019

👍15 ❤1

All 21 comments

@Maggern3 Thanks for the feedback. We are actively investigating and will get back to you soon.

Mike-Ubezzi-MSFT on 24 Jul 2018

@Maggern3 Thanks for the feedback. I see you have also posted this question in the discussion alias. We have a sync meeting to discuss this; we will update the docs and help you with samples.

SnehaGunda on 25 Jul 2018

👍1

@SnehaGunda Is there an update to be made here or did you reach out to the customer through the discussion alias? Thank you!

Mike-Ubezzi-MSFT on 31 Jul 2018

@Mike-Ubezzi-MSFT We don't have an update yet, but this work item is in the backlog for one of our PM's. We will update the docs soon.

SnehaGunda on 1 Aug 2018

@Maggern3 The documentation update(s) are currently being planned. In the meantime, I have assigned the issue to the content author to evaluate and update as appropriate. Thanks, Mike

Mike-Ubezzi-MSFT on 1 Aug 2018

@Maggern3 Hope you are already unblocked, per my discussion with our team, choosing partitioned Vs non-partitioned collection is one of the initial decisions of designing a database. If a partition key is chosen, queries should be optimized by using the partition key. And if you convert a collection to sharded collection, you should rewrite the queries to include the partition key.

SnehaGunda on 9 Aug 2018

please-close

SnehaGunda on 9 Aug 2018

@Maggern3 We will now proceed to close this thread. If there are further questions regarding this matter, please reopen it and we will gladly continue the discussion.

Mike-Ubezzi-MSFT on 10 Aug 2018

Hi, I am getting the error 'query in command must target a single shard key' from the Cosmos Mongo API when using Spring Boot and MongoRepository for CRUD. We are also using '_id' in the entity. The collection is unlimited and has a partition key field. Can you help?

pdodda19 on 14 Feb 2019

"query in command must target a single shard key", cannot really save on a partitioned collection while using mongo syntax. please escalate

... and yes, I do have a shard key. The test is as follows: create a document using the portal, so I'm sure the object is valid and has a partition key. Then, in Java code, I retrieve this object, change its ID to something new, try to save it, and get "query in command must target a single shard key" for the very same object that was in the DB before...

iteration21 on 20 May 2019

@iteration21 Transactions that span multiple partitions do not seem to be supported (Link). Can you please make a product request on Azure Cosmos DB UserVoice and detail the specific functionality you are seeking? Thanks!

Mike-Ubezzi-MSFT on 20 May 2019

@Mike-Ubezzi-MSFT Hi, I think you are talking about a different use case. Correct me if I'm wrong, but I would like to insert without a transaction. Just as it's possible to add a document using the portal (a document that has a partition/shard key), I would like to be able to do the very same thing using mongoRepository. I expect this to work since it works in the portal, but it does not. It seems like a bug not connected to transactions.

iteration21 on 24 May 2019

@Maggern3 Hope you are already unblocked, per my discussion with our team, choosing partitioned Vs non-partitioned collection is one of the initial decisions of designing a database. If a partition key is chosen, queries should be optimized by using the partition key. And if you convert a collection to sharded collection, you should rewrite the queries to include the partition key.

Hi Sneha, according to this documentation link below you have a different behaviour in Mongo Native Server:

"If queries do not include the shard key or the prefix of a compound shard key, mongos performs a broadcast operation, querying all shards in the sharded cluster. These scatter/gather queries can be long running operations."

https://docs.mongodb.com/manual/sharding/#advantages-of-sharding

So this is how native mongo is behaving and this is why people that migrate to CosmosDB Mongo API are resentful, because they need to rewrite their source code which is a hell of a selling challenge even inside their own organization, not talking about microsoft KAMs. The main point to understand here for developers is that CosmosDB Mongo API is either not mature enough or just an adapter for another API.

I still hope this can be handled by Microsoft, even with parallel queries to different shards (enablecrosspartitionquery?), as Mongo does, without the need to rewrite source code. Just wanted to confirm once again that READ operations work across shards without any changes. However, find({ something : something }).count() does not work across shards (preview features enabled). I also confirm that WRITE operations such as delete and update currently do not work across shards; you need to specify the shard property and its value.

rgherta on 21 Aug 2019
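
The behaviour reported in the comment above (reads fan out, updates and deletes need the shard key and its value) can be turned into a client-side pre-check. The following is only a minimal sketch under that assumption: `ShardKeyGuard`, its `Operation` enum, and the field name `tenantId` are hypothetical and not part of the MongoDB driver or Cosmos DB, and filters are modelled as plain maps rather than driver `Bson` objects. Later comments in this thread suggest the 3.6 engine may relax some of these rules, so treat the classification as a snapshot of what was reported, not a specification.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical helper: not part of any MongoDB or Cosmos DB library.
final class ShardKeyGuard {
    enum Operation { FIND, AGGREGATE, UPDATE, DELETE }

    private ShardKeyGuard() {}

    // Assumption based on reports in this thread: only writes need the shard key.
    static boolean needsShardKey(Operation op) {
        return op == Operation.UPDATE || op == Operation.DELETE;
    }

    // True when the filter matches the shard key field by equality,
    // either as a plain value or as a single {"$eq": value} operator document.
    static boolean pinsShardKey(Map<String, ?> filter, String shardKeyField) {
        if (!filter.containsKey(shardKeyField)) {
            return false;
        }
        Object condition = filter.get(shardKeyField);
        if (condition instanceof Map) {
            Map<?, ?> operators = (Map<?, ?>) condition;
            boolean usesOperators = false;
            for (Object key : operators.keySet()) {
                if (key instanceof String && ((String) key).startsWith("$")) {
                    usesOperators = true;
                }
            }
            if (usesOperators) {
                // Range operators such as $gt do not pin a single value.
                return operators.size() == 1 && operators.containsKey("$eq");
            }
        }
        return true;
    }

    // Fails fast on the client instead of waiting for the server-side error.
    static void check(Operation op, Map<String, ?> filter, String shardKeyField) {
        if (needsShardKey(op) && !pinsShardKey(filter, shardKeyField)) {
            throw new IllegalArgumentException(
                op + " filter must match '" + shardKeyField + "' by equality");
        }
    }

    public static void main(String[] args) {
        // "tenantId" is an illustrative shard key field name.
        Map<String, Object> exact = Collections.<String, Object>singletonMap("tenantId", "t1");
        check(Operation.UPDATE, exact, "tenantId");
        System.out.println("update filter accepted: " + exact);
    }
}
```

A check like this does not make cross-partition writes work; it only surfaces the problem before the request is sent.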

Just to add another voice in, but we are facing the same issues using a shared throughput collection with the MongoDB API and Spring Data. We will either have to write a custom repository that attempts to inject shard keys into every operation where possible, or migrate off CosmosDB.

When something is advertised as Mongo compatible, we expected Spring Data to work without heavy lifting.

mgolub2 on 21 Aug 2019

👍6
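
The "custom repository that attempts to inject shard keys" idea from the comment above might look roughly like the sketch below. It is an illustration only: `ShardKeyInjector` and the field name `tenantId` are hypothetical, documents and filters are modelled as plain maps rather than Spring Data or driver types, and only top-level shard key fields are handled. It assumes the repository would otherwise address a document by `_id` alone.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: not part of Spring Data or the MongoDB driver.
final class ShardKeyInjector {
    private ShardKeyInjector() {}

    // Returns a copy of the filter extended with the shard key value taken
    // from the document being written. The input filter is left untouched.
    static Map<String, Object> withShardKey(
            Map<String, ?> filter, Map<String, ?> document, String shardKeyField) {
        Object shardKeyValue = document.get(shardKeyField);
        if (shardKeyValue == null) {
            throw new IllegalStateException(
                "document has no value for shard key field '" + shardKeyField + "'");
        }
        Map<String, Object> targeted = new LinkedHashMap<String, Object>(filter);
        targeted.put(shardKeyField, shardKeyValue);
        return targeted;
    }

    public static void main(String[] args) {
        // "tenantId" is an illustrative shard key field name.
        Map<String, Object> document = new LinkedHashMap<String, Object>();
        document.put("_id", "a1");
        document.put("tenantId", "t1");

        Map<String, Object> byId = new LinkedHashMap<String, Object>();
        byId.put("_id", "a1");

        System.out.println(withShardKey(byId, document, "tenantId"));
    }
}
```

The resulting map would then be translated into whatever filter type the repository layer uses before the update or delete is issued.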

I am also facing an issue with the shard key. I am using the flask mongoengine library to handle the database. When I try to write to the database, it says mongoengine.errors.OperationError: Could not save document (Shared throughput collection should have a partition key)

Can someone help me with this?

anurag-ae on 15 Sep 2019

+1 - this disparity between mongo and cosmos, and especially for collections that are used extensively in an enterprise system, makes it very difficult to simply migrate from mongo to cosmos.

Users are left with a choice of either not sharding their collections, limiting them to 10k RUs & 10GB of storage, or sharding their collections and revising every usage of their migrated collections across the entire application stack.

To accommodate a simple migration scenario, I think what should be considered is either increasing the limits for non-sharded collections or implementing cross-partition support for all Mongo operations that are supported natively.

dany74q on 12 Dec 2019

👍2

Update: It seems that cross-partition queries, counts and updates (that is, w/o specifying the shard key) are now supported for the 3.6 cosmosdb mongo engine - very cool !

dany74q on 7 Jan 2020

@dany74q Any reference?

hansmaad on 8 Jan 2020

Hi, I am using Cosmos MongoDB 3.6 and Java MongoDB driver 3.11.1. I am trying to update (rename a field) in my sharded collection (which has millions of documents), but I get: The full response is {"_t": "OKMongoResponse", "ok": 0, "code": 61, "errmsg": "query in command must target a single shard key", "$err": "query in command must target a single shard key"}

// _id.orderId is not a unique key, it is just the partition key
Bson filter = Filters.gt("_id.orderId", 1);
Bson updatePart = combine(rename("partitionNbr", "vendorNbr"));
UpdateResult result = mongoDatabase.getCollection("lineDetails").updateMany(filter, updatePart, new UpdateOptions().upsert(false));

The update above still works for a single document with Filters.eq(partitionkey). So I am facing the same issue (target a single shard key) for multiple updates across partitions with Filters.gt/gte/lt/lte.

aweseeker on 13 Feb 2020
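
Since the comment above reports that the same update succeeds with an equality filter on the partition key, one possible workaround is to fan the range update out into one equality-filtered update per partition key value. The sketch below shows only that fan-out logic and has not been verified against Cosmos DB. `PartitionFanOut` is a hypothetical helper, filters are modelled as plain maps, the list of key values is assumed to come from a prior read query (reads are reported elsewhere in this thread to work across partitions), and the `ToLongFunction` callback stands in for the real per-partition `updateMany` call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.function.ToLongFunction;

// Hypothetical helper: not part of the MongoDB driver or Cosmos DB.
final class PartitionFanOut {
    private PartitionFanOut() {}

    // Builds one equality filter per distinct shard key value, in first-seen order.
    static List<Map<String, Object>> perPartitionFilters(
            String shardKeyField, Collection<?> shardKeyValues) {
        List<Map<String, Object>> filters = new ArrayList<Map<String, Object>>();
        for (Object value : new LinkedHashSet<Object>(shardKeyValues)) {
            filters.add(Collections.<String, Object>singletonMap(shardKeyField, value));
        }
        return filters;
    }

    // Runs the supplied single-partition update once per filter and sums
    // the modified counts that the callback reports.
    static long updateEachPartition(
            List<Map<String, Object>> filters,
            ToLongFunction<Map<String, Object>> singlePartitionUpdate) {
        long totalModified = 0;
        for (Map<String, Object> filter : filters) {
            totalModified += singlePartitionUpdate.applyAsLong(filter);
        }
        return totalModified;
    }

    public static void main(String[] args) {
        // Key values that a prior read for "_id.orderId > 1" might have returned.
        List<Map<String, Object>> filters =
            perPartitionFilters("_id.orderId", Arrays.asList(2, 3, 3, 7));
        // Placeholder callback: a real one would call updateMany with an
        // equality filter and return the modified count.
        long modified = updateEachPartition(filters, filter -> 1L);
        System.out.println(filters + " -> modified " + modified);
    }
}
```

With millions of documents this issues many round trips and is not atomic across partitions, so it trades convenience for request volume; batching or throttling the per-partition calls would be a separate design decision.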

I came across the same problem using the Mongo API.
Is there really no way to execute an updateMany in a Cosmos instance? Not even with the SQL syntax? I couldn't find anything.

Thank you!