Cosmos recently added bulk API support to the REST API. .NET has already added it to the SDK, and we want to add it to JavaScript as well.
Requests are sent to the normal document endpoint as a POST with the body containing an array of operations.
x-ms-cosmos-is-batch-request - Value: True
x-ms-documentdb-partitionkeyrangeid - Integer. Must be set for all requests
x-ms-cosmos-batch-continue-on-error: True/False
Indicates if the backend should continue processing operations if an operation fails
x-ms-cosmos-batch-atomic: false/true
Indicates if the backend should treat operations as an atomic batch
type Body = Operation[]
export interface Operation {
operationType: 'CREATE' | 'READ' | 'UPDATE' | 'DELETE';
/* String conforming to header partition key value */
partitionKey: string;
id: string;
ifMatch: string;
ifNoneMatch: string;
resourceBody: object;
}
POST
x-ms-documentdb-partitionkeyrangeid: 0
x-ms-cosmos-batch-continue-on-error: True
x-ms-cosmos-is-batch-request: True
x-ms-date: Thu, 20 Feb 2020 09:52:04 GMT
Authorization: * removed *
x-ms-version: 2018-12-31
Accept: application/json
[{"operationType":"Create","partitionKey":"[\"TBD1\"]","id":"","ifMatch":"","ifNoneMatch":"","resourceBody":{"id":"doc12","name":"Microsoft","founder":"Gates","Status":"TBD1"}},
{"operationType":"Create","partitionKey":"[\"TBD1\"]","id":"","ifMatch":"","ifNoneMatch":"","resourceBody":{"id":"doc13","name":"FooBar Inc","founder":"Foo","foundedOn":"2018","Status":"TBD1"}}]
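To make the notes above concrete, here is a minimal sketch of how the headers and body could be assembled. The header names come from the notes; `buildBulkRequest`, `BulkRequestOptions` and `BulkRequest` are hypothetical names, not SDK API. The notes list upper-case operation types while the sample request uses "Create", so the sketch leaves `operationType` as a plain string. Authorization, `x-ms-date` and `x-ms-version` are omitted on purpose.

```typescript
// One operation, following the draft interface above.
interface BulkOperation {
  operationType: string; // casing differs between the notes and the sample request
  partitionKey: string; // serialized partition key value, e.g. '["TBD1"]'
  id: string;
  ifMatch: string;
  ifNoneMatch: string;
  resourceBody: object;
}

// Hypothetical options bag mirroring the two batch headers.
interface BulkRequestOptions {
  continueOnError: boolean;
  atomic: boolean;
}

// Hypothetical description of the HTTP request to send.
interface BulkRequest {
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildBulkRequest(
  operations: BulkOperation[],
  partitionKeyRangeId: number,
  options: BulkRequestOptions
): BulkRequest {
  // The sample request writes booleans as "True"; "False" is assumed by analogy.
  const flag = (value: boolean): string => (value ? "True" : "False");
  return {
    method: "POST",
    headers: {
      "x-ms-cosmos-is-batch-request": "True",
      "x-ms-documentdb-partitionkeyrangeid": String(partitionKeyRangeId),
      "x-ms-cosmos-batch-continue-on-error": flag(options.continueOnError),
      "x-ms-cosmos-batch-atomic": flag(options.atomic),
      Accept: "application/json",
    },
    // The body is simply the JSON array of operations.
    body: JSON.stringify(operations),
  };
}
```

Building the body with `JSON.stringify` also takes care of escaping the quotes inside the `partitionKey` string.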
Ideally, we expose this API in a couple of ways:
This should be a thin wrapper around the REST API that gives a user full control and allows them to mix operation types. Example:
container.items.bulk([createOperation, updateOperation, deleteOperation], { atomic: false, continueOnError: true})
Existing APIs should expand if possible to support bulk operations:
container.items.create([doc1, doc2], { atomic: false, continueOnError: true})
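One way the convenience form could sit on top of the low-level one is to translate an array of documents into create operations first. This is only a sketch: `toCreateOperations` and `partitionKeyField` are hypothetical, and the serialized partition key format is assumed from the sample request above.

```typescript
// Minimal create operation; the other fields from the draft interface are left out.
interface CreateOperation {
  operationType: "Create";
  partitionKey: string;
  resourceBody: Record<string, unknown>;
}

// Turns plain documents into create operations.
// partitionKeyField is the (hypothetical) top-level property holding the partition key.
function toCreateOperations(
  docs: Record<string, unknown>[],
  partitionKeyField: string
): CreateOperation[] {
  return docs.map(
    (doc): CreateOperation => ({
      operationType: "Create",
      // Assumed format: a JSON array holding the single partition key value.
      // A missing field serializes as [null].
      partitionKey: JSON.stringify([doc[partitionKeyField]]),
      resourceBody: doc,
    })
  );
}
```

The resulting array could then be handed to the low-level bulk call, so the convenience method would not need its own request path.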
partitionKeyRangeID
Exposing partitionKeyRangeID isn't strictly required to make bulk APIs, but exposing a low-level API without also exposing partitionKeyRangeID may not work. It also requires more SDK effort to determine the partitionKeyRangeID for a given document. We would likely have to bring in a client-side murmur hash implementation and also implement a new API for retrieving opaque values for the partition key ranges.
Bulk operations that span partitions won't be able to actually execute in a single request. Will users find this confusing? How will we communicate this information to them in the response?
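A rough sketch of what splitting a mixed batch could look like: group the operations by partition key range, send one request per group, and keep each operation's original index so results can be reported back in the caller's order. `groupByPartitionKeyRange` and `resolveRangeId` are hypothetical; the resolver stands in for the client-side hashing and range lookup described above.

```typescript
// An operation paired with its position in the caller's original array.
interface IndexedOperation<T> {
  index: number;
  operation: T;
}

// resolveRangeId is hypothetical: a real SDK would hash the partition key
// and look the result up against the container's partition key ranges.
function groupByPartitionKeyRange<T>(
  operations: T[],
  resolveRangeId: (operation: T) => string
): Map<string, IndexedOperation<T>[]> {
  const groups = new Map<string, IndexedOperation<T>[]>();
  operations.forEach((operation, index) => {
    const rangeId = resolveRangeId(operation);
    const group = groups.get(rangeId) ?? [];
    // Remember the original index so responses can be stitched back together.
    group.push({ index, operation });
    groups.set(rangeId, group);
  });
  return groups;
}
```

Each map entry would become one request with its own `x-ms-documentdb-partitionkeyrangeid` header. Note that an atomic batch cannot be honored across groups, which is part of what would need to be communicated to users.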
cc/ @bterlson Did the arch board decide on guidance for bulk operations?
Yo! Was looking for exactly this. I'm migrating a CosmosDB-based EventStore from .Net Core to TypeScript and stumbled on the lack of this API. The project is a no go now since we can't do batch conditional writes using this SDK.
The only workaround we see is adding a stored procedure that takes the array of events to be written, but I would like to avoid it: stored procedures are very cumbersome to maintain since we can't properly test them, and they don't support Promise/async/await, which makes the code _really_ hard to read with a nightmare of callbacks.
Today this is what I do (simplified version) on the .Net Core version of the product:
TransactionalBatch batch = this._container.CreateTransactionalBatch(pk);
batch.CreateItemStream(headerStream);
batch.CreateItemStream(evtStream);
var batchResult = await batch.ExecuteAsync();
So, do you have plans to have this out or any feasible workaround?
Thanks!
The only current workaround is the one you mentioned, using stored procedures. This is on my plate, but I don't have a firm ETA yet
Ok thanks, I'll keep an eye here. Thanks!
@southpolesteve one question... I've been looking through the REST API docs for the API that supports batch, but I can't find it. I was wondering if, while the SDK isn't updated, I could post the requests there myself. We have a pretty simple scenario here which only requires batch append and, in a few use cases, a replace/upsert request, so maybe we can use the REST API while the SDK isn't out.
Do you have a sample request that we can use here?
The API isn't public yet. The above is just based on my own notes.
Oh! Ok. So I wonder how the .Net SDK is working then...
The .NET SDK actually uses a slightly different API internally. Since .NET has "direct mode" where it makes a direct TCP connection to a backend replica, it can do some things we cannot do yet in the JS SDK.
Hate to sound desperate but :)
Any updates you could share?
I'm targeting to have this done by the end of June
Quick update. I know many of you are waiting on this feature and "end of June" has passed. We are actively working on it but the client-side hashing of partition keys is proving particularly complex and taking extra time. Estimating another 4 weeks. Thank you all for your patience!
If you are curious about the details, you can follow along on one of the initial PRs https://github.com/Azure/azure-sdk-for-js/pull/9821
Hey issue watchers! Bulk support has been merged. We've held off on doing an official release while we get some feedback on the public API shape. Please do tell us what you think! This is designed to be a low level flexible API. We may add some higher level convenience methods later.
Examples are in the PR above, and you can get it by installing the dev release: npm install @azure/cosmos@dev
This has been released in 3.8.0
Hey I'm having some issues implementing this - I am trying to update the following code:
for (i = 0; i < rows.length; i++) {
container.items.create(rows[i]);
}
So far I have tried:
container.items.create(rows);
this returned a 400 BadRequest One of the specified inputs is invalid
I then had a look into the source and have been trying this (also trying to specify partitionKey).
operations = [];
for (i = 0; i < rows.length; i++) {
operations.push({
operationType: "Create",
resourceBody: rows[i]
});
}
container.items.bulk(operations);
This doesn't give me an error but also nothing appears in the cosmos container. Am I doing something wrong? Or are you able to point me in the direction of any available documentation for the bulk API? I appreciate this is all very new :) Thanks!
Hey @liamFerris, it might be best to open a new issue and report a bug as this is more of a feature issue. But I'm happy to help take a look - it would be helpful to see what that resourceBody argument ends up being, if you could paste an example of rows[i] that could help me debug
I am surprised you get no error - are you forgetting to await on the container.items.bulk() call?
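One possible explanation for "no error but nothing written" is that individual operations fail without the promise rejecting. Here is a small sketch for checking that, assuming the awaited result is an array with one entry per operation and a numeric `statusCode` on each entry (please verify against the SDK typings). `findFailedOperations` is a hypothetical helper, not part of the SDK.

```typescript
// Assumed minimal shape of one per-operation result.
interface OperationResult {
  statusCode: number;
}

interface FailedOperation {
  index: number; // position of the operation in the submitted array
  statusCode: number;
}

// Returns every result whose status code is outside the 2xx range.
function findFailedOperations(results: OperationResult[]): FailedOperation[] {
  return results
    .map((result, index) => ({ index, statusCode: result.statusCode }))
    .filter((entry) => entry.statusCode < 200 || entry.statusCode >= 300);
}

// Intended usage, inside an async function:
//   const results = await container.items.bulk(operations);
//   console.log(findFailedOperations(results));
```

If this prints entries with 4xx codes, the operations were rejected individually even though the call itself resolved.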
@zfoster thanks for the help & speedy response! I am making a new bug issue currently :)
I am surprised you get no error - are you forgetting to await on the container.items.bulk() call?
Me too... yeah I have tried with awaits, also .catch() which again logs nothing and .then() which will successfully log a message
an example of rows[i] is a flat json object with string values:
{
"Town": "blah",
"City": "blah",
...
}