Mongoose: eachAsync: Cursor already in use

Created on 10 Oct 2019 · 9 comments · Source: Automattic/mongoose

Do you want to request a feature or report a bug?
bug

What is the current behavior?
Since the last update of Mongoose I see a lot of "Cursor already in use" errors in long-running queries where a cursor is used in combination with eachAsync().

If the current behavior is a bug, please provide the steps to reproduce.
Run a long-running query and use eachAsync() to iterate over it. It's not 100% reproducible, but it happens significantly often.

What is the expected behavior?
No cursor already in use error.

What are the versions of Node.js, Mongoose and MongoDB you are using? Note that "latest" is not a version.

  • Node.js: tested with v12.6.0 and v10.16.3 (LTS)
  • mongoose: v5.7.3
  • mongodb: 4.0.10
confirmed-bug

All 9 comments

const stream = DbModel.find({ createdAt: { $lt: new Date() } })
  .lean()
  .cursor({ batchSize: 1500 });

await stream.eachAsync(myObject => asyncDataProcessing(), { parallel: 5 });

One more thing: sometimes there is no error at all, but the script simply finishes early without processing all data sets, occasionally by a wide margin (e.g. it was supposed to iterate over 3500 sets but finished after only around 700).
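A minimal sketch for confirming the early finish, reusing DbModel and asyncDataProcessing from the snippet above (passing the doc into asyncDataProcessing is an assumption about that helper):

const filter = { createdAt: { $lt: new Date() } };
const expected = await DbModel.countDocuments(filter);

let processed = 0;
await DbModel.find(filter)
  .lean()
  .cursor({ batchSize: 1500 })
  .eachAsync(async doc => {
    await asyncDataProcessing(doc); // assumed to accept the current document
    ++processed;
  }, { parallel: 5 });

// If processed < expected, the cursor ended before all documents were handled.
console.log(`processed ${processed} of ${expected}`);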

Backtrace:

MongoError: Cursor already in use (namespace: 'aDatabase.aCollection', id: 5482770079154774044).
  File "/var/app/current/node_modules/mongoose/node_modules/mongodb/lib/core/connection/pool.js", line 466, col 61, in Connection.<anonymous>
    return handleOperationCallback(self, workItem.cb, new MongoError(responseDoc));
  File "events.js", line 198, col 13, in Connection.emit
  File "domain.js", line 448, col 20, in Connection.EventEmitter.emit
  File "/var/app/current/node_modules/mongoose/node_modules/mongodb/lib/core/connection/connection.js", line 364, col 10, in processMessage
    conn.emit(
  File "/var/app/current/node_modules/mongoose/node_modules/mongodb/lib/core/connection/connection.js", line 566, col 15, in Socket.<anonymous>
    processMessage(conn, emitBuffer);
  File "events.js", line 198, col 13, in Socket.emit
  File "domain.js", line 448, col 20, in Socket.EventEmitter.emit
  File "_stream_readable.js", line 288, col 12, in addChunk
  File "_stream_readable.js", line 269, col 11, in readableAddChunk
  File "_stream_readable.js", line 224, col 10, in Socket.Readable.push
  File "internal/stream_base_commons.js", line 94, col 17, in Pipe.onStreamRead [as onread]

It turns out this issue is even more critical.
I downgraded to 5.6.13 (before 5.7, where the new driver was added) and a lot changed: I now get more results back for a simple find() query, eachAsync() works again, and so on.

Details on my setup: I have 2 shards with 1 replica and 1 arbiter each. I believe the problems only happen on sharded collections, but I couldn't verify this yet.

Anyway BEWARE: the current version of mongoose / mongodb driver does not work correctly!

After running some tests I've found that this is definitely an issue with eachAsync and the parallel option:

  1. Using .on('data') and .on('end') always succeeds (tested on 50k docs, batch size 1000); see the sketch after this list
  2. Using eachAsync({ parallel: 1 }) always succeeds (tested on 50k docs, batch size 1000)
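
A minimal sketch of the stream-based variant from item 1, assuming the same model, read preference and batch size as the test function below (the Promise wrapper is just to await completion):

// Event-based iteration; in my tests this always drains the full result set.
await new Promise((resolve, reject) => {
  model
    .find()
    .read('sp')
    .batchSize(1000)
    .cursor()
    .on('data', item => console.log(item._id))
    .on('error', reject)
    .on('end', resolve);
});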

For reference this is my test function:

model
  .find()
  .read('sp')
  .batchSize(1000)
  .cursor()
  .eachAsync(async item => console.log(item._id), { parallel: 8 })

Perhaps most interesting is that parallel <= 8 always succeeds, whereas parallel >= 9 always fails at some point with the error:

{ MongoError: cursor id 8862375401742707437 not found
    at Connection.<anonymous> (/Users/andrewbarba/Code/Barstool/bets-api/node_modules/mongoose/node_modules/mongodb/lib/core/connection/pool.js:466:61)
    at Connection.emit (events.js:198:13)
    at processMessage (/Users/andrewbarba/Code/Barstool/bets-api/node_modules/mongoose/node_modules/mongodb/lib/core/connection/connection.js:364:10)
    at TLSSocket.<anonymous> (/Users/andrewbarba/Code/Barstool/bets-api/node_modules/mongoose/node_modules/mongodb/lib/core/connection/connection.js:533:15)
    at TLSSocket.emit (events.js:198:13)
    at addChunk (_stream_readable.js:288:12)
    at readableAddChunk (_stream_readable.js:269:11)
    at TLSSocket.Readable.push (_stream_readable.js:224:10)
    at TLSWrap.onStreamRead [as onread] (internal/stream_base_commons.js:94:17)
  operationTime:
   Timestamp { _bsontype: 'Timestamp', low_: 3, high_: 1571375722 },
  ok: 0,
  errmsg: 'cursor id 8862375401742707437 not found',
  code: 43,
  codeName: 'CursorNotFound',
  '$clusterTime':
   { clusterTime:
      Timestamp { _bsontype: 'Timestamp', low_: 3, high_: 1571375722 },
     signature: { hash: [Binary], keyId: [Long] } },
  name: 'MongoError',
  [Symbol(mongoErrorContextSymbol)]: { isGetMore: true } }

I doubt the 8 has too much significance, seeing as the original report was using 5, but it's incredibly consistent for the data set I'm testing on.
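
A quick way to probe that threshold on a given collection is to loop over a few parallel values (a sketch, reusing the model from the test function above):

// Probe several parallel values against the same query to see which ones fail.
for (const parallel of [1, 4, 8, 9, 12]) {
  try {
    await model
      .find()
      .read('sp')
      .batchSize(1000)
      .cursor()
      .eachAsync(async () => {}, { parallel });
    console.log(`parallel=${parallel}: ok`);
  } catch (err) {
    console.log(`parallel=${parallel}: ${err.message}`);
  }
}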

Mongoose v5.7.5
Node v10.16.3

@AndrewBarba @simllll thanks for pointing out this issue and providing details on how to repro it. Turns out some recent refactoring to eachAsync() caused it to break in a non-obvious way. The fix will be in 5.7.6, and I can confirm the below script runs consistently now:

'use strict';

const mongoose = require('mongoose');
mongoose.set('debug', true);

const Schema = mongoose.Schema;

run().catch(err => console.log(err));

async function run() {
  await mongoose.connect('mongodb://localhost:27017/test', { useNewUrlParser: true, useUnifiedTopology: true });

  const Model = mongoose.model('Test', Schema({ name: String }));
  const count = await Model.countDocuments();

  const numDocs = 50000;
  if (count < numDocs) {
    const docs = [];
    for (let i = count; i < numDocs; ++i) {
      docs.push({ name: `doc${i}` });
    }
    await Model.create(docs);
  }

  console.log('Executing query...');


  let numProcessed = 0;
  await Model.
    find().
    batchSize(1000).
    cursor().
    eachAsync(async item => {
      console.log(item.name);
      ++numProcessed;
    }, { parallel: 9 });

  console.log('NumProcessed:', numProcessed);
}

Thanks for the quick fix, really appreciate it

Thanks also from me, great that you found the issue so quickly +1

Looking forward to 5.7.6 :)

I think the original issue is solved, but somehow my executions now take way longer... e.g. I iterate over a big collection without any conditions ({}) and I "only" get around 250 entries/second. With mongoose 5.6 I got around 2000-3000 per second. I tried playing around with batchSize, but it doesn't seem to make any difference. I'm unsure how to debug this correctly, but could it be that batchSize does not apply anymore?
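
Not sure this is the right way to debug it, but here is a sketch of how throughput could be measured, reusing DbModel from above; whether batchSize shows up in the debug output is something I haven't verified:

mongoose.set('debug', true); // log each operation mongoose sends to the driver

const start = Date.now();
let seen = 0;
await DbModel.find({})
  .lean()
  .cursor({ batchSize: 1500 })
  .eachAsync(() => { ++seen; }, { parallel: 5 });

const seconds = (Date.now() - start) / 1000;
console.log(`${seen} docs in ${seconds.toFixed(1)}s (${(seen / seconds).toFixed(1)} docs/sec)`);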
