Do you want to request a feature or report a bug?
bug
What is the current behavior?
Since the last update of mongoose I experience a lot of "Cursor already in use" errors in long-running queries where a cursor is used in combination with eachAsync().
If the current behavior is a bug, please provide the steps to reproduce.
Run a long-running query and use eachAsync() to iterate over the cursor. It's not 100% reproducible, but it happens significantly often.
What is the expected behavior?
No "Cursor already in use" error.
What are the versions of Node.js, Mongoose and MongoDB you are using? Note that "latest" is not a version.
const stream = DbModel.find({
createdAt: { $lt: new Date() },
})
.lean()
.cursor({ batchSize: 1500 });
await stream.eachAsync(myObject => asyncDataProcessing(myObject), { parallel: 5 });
One more thing: sometimes, even when there is no error, the script simply finishes early without processing all documents, sometimes by a large margin (e.g. it was supposed to iterate over 3500 documents but finished after only around 700).
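For what it's worth, the early exit can be confirmed by comparing a processed counter against the expected count (a minimal sketch reusing DbModel and asyncDataProcessing from above; the counter wrapper is only illustrative):
const filter = { createdAt: { $lt: new Date() } };
// expected number of matching documents before iterating
const expected = await DbModel.countDocuments(filter);

let processed = 0;
await DbModel.find(filter)
  .lean()
  .cursor({ batchSize: 1500 })
  .eachAsync(async myObject => {
    await asyncDataProcessing(myObject);
    ++processed;
  }, { parallel: 5 });

// if eachAsync resolves without an error but processed < expected,
// the cursor was exhausted prematurely
console.log(`processed ${processed} of ${expected}`);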
Backtrace:
MongoError: Cursor already in use (namespace: 'aDatabase.aCollection', id: 5482770079154774044).
File "/var/app/current/node_modules/mongoose/node_modules/mongodb/lib/core/connection/pool.js", line 466, col 61, in Connection.<anonymous>
return handleOperationCallback(self, workItem.cb, new MongoError(responseDoc));
File "events.js", line 198, col 13, in Connection.emit
File "domain.js", line 448, col 20, in Connection.EventEmitter.emit
File "/var/app/current/node_modules/mongoose/node_modules/mongodb/lib/core/connection/connection.js", line 364, col 10, in processMessage
conn.emit(
File "/var/app/current/node_modules/mongoose/node_modules/mongodb/lib/core/connection/connection.js", line 566, col 15, in Socket.<anonymous>
processMessage(conn, emitBuffer);
File "events.js", line 198, col 13, in Socket.emit
File "domain.js", line 448, col 20, in Socket.EventEmitter.emit
File "_stream_readable.js", line 288, col 12, in addChunk
File "_stream_readable.js", line 269, col 11, in readableAddChunk
File "_stream_readable.js", line 224, col 10, in Socket.Readable.push
File "internal/stream_base_commons.js", line 94, col 17, in Pipe.onStreamRead [as onread]
It turns out this issue is even more critical.
I downgraded to 5.6.13 (before 5.7, where the new driver was added) and a lot of things changed: I get more results for a simple find() query, eachAsync works again, and so on.
Details on my setup: I have 2 shards, each with 1 replica and 1 arbiter. I believe the problems only happen on sharded collections, but I couldn't verify this yet.
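As a side note, a quick way to check whether a given collection is actually sharded would be something along these lines (a minimal sketch via the driver connection that mongoose exposes; 'aCollection' is a placeholder):
// collStats reports sharded: true plus a per-shard breakdown for sharded collections
const stats = await mongoose.connection.db.command({ collStats: 'aCollection' });
console.log('sharded:', stats.sharded === true);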
Anyway BEWARE: the current version of mongoose / mongodb driver does not work correctly!
After running some tests I've found that this is definitely an issue with eachAsync
and the parallel option:
- .on('data') and .on('end') always succeeds (tested on 50k docs, batch size 1000; the event-based variant is sketched after the test function below)
- eachAsync({ parallel: 1 }) always succeeds (tested on 50k docs, batch size 1000)
For reference this is my test function:
model
.find()
.read('sp')
.batchSize(1000)
.cursor()
.eachAsync(async item => console.log(item._id), { parallel: 8 })
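For completeness, the event-based variant I compared against looked roughly like this (a sketch with the same model and settings; the handlers are placeholders):
const cursor = model
  .find()
  .read('sp')
  .batchSize(1000)
  .cursor();

cursor
  .on('data', item => console.log(item._id))
  .on('error', err => console.error(err))
  .on('end', () => console.log('done'));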
Perhaps most interesting is that parallel <= 8 always succeeds, whereas parallel >= 9 always fails at some point with the error:
{ MongoError: cursor id 8862375401742707437 not found
at Connection.<anonymous> (/Users/andrewbarba/Code/Barstool/bets-api/node_modules/mongoose/node_modules/mongodb/lib/core/connection/pool.js:466:61)
at Connection.emit (events.js:198:13)
at processMessage (/Users/andrewbarba/Code/Barstool/bets-api/node_modules/mongoose/node_modules/mongodb/lib/core/connection/connection.js:364:10)
at TLSSocket.<anonymous> (/Users/andrewbarba/Code/Barstool/bets-api/node_modules/mongoose/node_modules/mongodb/lib/core/connection/connection.js:533:15)
at TLSSocket.emit (events.js:198:13)
at addChunk (_stream_readable.js:288:12)
at readableAddChunk (_stream_readable.js:269:11)
at TLSSocket.Readable.push (_stream_readable.js:224:10)
at TLSWrap.onStreamRead [as onread] (internal/stream_base_commons.js:94:17)
operationTime:
Timestamp { _bsontype: 'Timestamp', low_: 3, high_: 1571375722 },
ok: 0,
errmsg: 'cursor id 8862375401742707437 not found',
code: 43,
codeName: 'CursorNotFound',
'$clusterTime':
{ clusterTime:
Timestamp { _bsontype: 'Timestamp', low_: 3, high_: 1571375722 },
signature: { hash: [Binary], keyId: [Long] } },
name: 'MongoError',
[Symbol(mongoErrorContextSymbol)]: { isGetMore: true } }
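The threshold itself can be pinned down with a sweep along these lines (a sketch only, reusing the same query as the test function above):
for (let parallel = 1; parallel <= 12; ++parallel) {
  try {
    await model
      .find()
      .read('sp')
      .batchSize(1000)
      .cursor()
      .eachAsync(async item => item._id, { parallel });
    console.log(`parallel=${parallel}: ok`);
  } catch (err) {
    // CursorNotFound starts showing up once parallel crosses the threshold
    console.log(`parallel=${parallel}: ${err.codeName || err.message}`);
  }
}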
I doubt the 8 has much significance, seeing as the original thread was using 5, but it's incredibly consistent for the data set I'm testing on.
Mongoose v5.7.5
Node v10.16.3
@AndrewBarba @simllll thanks for pointing out this issue and providing details on how to repro it. Turns out some recent refactoring to eachAsync()
caused it to break in a non-obvious way. The fix will be in 5.7.6, and I can confirm the below script runs consistently now:
'use strict';
const mongoose = require('mongoose');
mongoose.set('debug', true);
const Schema = mongoose.Schema;
run().catch(err => console.log(err));
async function run() {
await mongoose.connect('mongodb://localhost:27017/test', { useNewUrlParser: true, useUnifiedTopology: true });
const Model = mongoose.model('Test', Schema({ name: String }));
const count = await Model.countDocuments();
const numDocs = 50000;
if (count < numDocs) {
const docs = [];
for (let i = count; i < numDocs; ++i) {
docs.push({ name: `doc${i}` });
}
await Model.create(docs);
}
console.log('Executing query...');
let numProcessed = 0;
await Model.
find().
batchSize(1000).
cursor().
eachAsync(async item => {
console.log(item.name);
++numProcessed;
}, { parallel: 9 });
console.log('NumProcessed:', numProcessed);
}
Thanks for the quick fix, really appreciate it
Thanks also from me, great that you found the issue so quickly +1
Looking forward to 5.7.6 :)
I think the original issue is solved, but somehow my executions now take way longer... e.g. I iterate over a big collection without any conditions ({}) and I "only" get around 250 entries/second. With mongoose 5.6 I got around 2000-3000 per second. I tried playing around with batchSize, but it seems it does not make any difference. I'm unsure how to debug this correctly, but could it be that batchSize does not apply anymore?
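In case it helps to narrow this down, here is a rough sketch of how the batchSize effect could be measured (mongoose debug logging shows every find/getMore, and the loop times the throughput per batch size; it assumes the same connected mongoose instance and DbModel as in my first snippet):
mongoose.set('debug', true); // logs each find/getMore so you can see how often the cursor refills

for (const batchSize of [100, 1000, 5000]) {
  const start = Date.now();
  let n = 0;
  await DbModel.find({}).lean().cursor({ batchSize }).eachAsync(() => { ++n; });
  const secs = (Date.now() - start) / 1000;
  console.log(`batchSize=${batchSize}: ${n} docs in ${secs.toFixed(1)}s (~${Math.round(n / secs)} docs/s)`);
}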