Mongoose: Using eachAsync with parallel option doesn't return all data in query

Created on 6 Jul 2017 · 9 comments · Source: Automattic/mongoose

If I use eachAsync with a parallel option greater than 2, it returns only 101 results from my query, which matches more than 101 records.

Simplified steps to reproduce:

const mongoose = require('mongoose');
const NUM_RECORDS = 1000;
const Foo = mongoose.model('Foo', new mongoose.Schema({bar: String}));
let count = 0;
mongoose.connect('mongodb://localhost:27017/test'); // any local test database
Foo.remove().then(() => {
    let foos = [];
    for (let i = 0; i < NUM_RECORDS; i++) {
        foos.push({bar: 'baz'});
    }
    return Foo.create(foos);
}).then(() => {
    return Foo.find() // Get all data from Foo, so 1000 records
        .cursor()
        .eachAsync((foo) => {
            count++;
        }, {parallel: 10});
}).then(() => {
    console.log(`${count} != 1000`); // Expect 1000 records to be iterated
    return mongoose.connection.close();
}).catch(console.log).finally(process.exit);
Label: needs clarification

Most helpful comment

Fixed by running the next()'s in series. The intent of the parallel option is to be able to run multiple functions in parallel, not necessarily get the documents in parallel (that's what batchSize is for).

All 9 comments

This is a problem with you trying to maintain the state of count, I think, more so than it is a bug with mongoose. What exactly is your problem? If you set parallel: 1, you'll see the right count value. Think of parallel as a sort of batching op, so in your script, count will increment numRecords / parallel times.

My problem is quite clear. My query matches 1000 rows, so I expect to iterate through all 1000 of them.
eachAsync has a bug: if we use the parallel option with a value > 2, it iterates over only 101 results instead of 1000 rows.

Think of parallel as a sort of batching op, so in your script, count will increment numRecords / parallel times

I'm sorry, but I don't get your point. Running in parallel doesn't change how many rows I should see.

Here, for example, is code using bluebird and map in parallel that works:

const mongoose = require('mongoose');
mongoose.Promise = require('bluebird');
const NUM_RECORDS = 1000;
const Foo = mongoose.model('Foo', new mongoose.Schema({bar: String}));
let count = 0;
mongoose.connect('mongodb://localhost:27017/test'); // any local test database
Foo.remove().then(() => {
    let foos = [];
    for (let i = 0; i < NUM_RECORDS; i++) {
        foos.push({bar: 'baz'});
    }
    return Foo.create(foos);
}).then(() => {
    return Foo.find() // Get all data from Foo, so 1000 records
        .exec()
        .map((foo) => {
            count++;
        }, {concurrency: 10});
}).then(() => {
    console.log(`${count} == ${NUM_RECORDS}`); // Expect 1000 records to be iterated
    return mongoose.connection.close();
}).catch(console.log).finally(process.exit);

Underlying error thrown in QueryCursor.js

{ MongoError: clientcursor already in use? driver problem?
    at Function.MongoError.create (/mongoose/node_modules/mongodb-core/lib/error.js:31:11)
    at /mongoose/node_modules/mongodb-core/lib/connection/pool.js:497:72
    at authenticateStragglers (/mongoose/node_modules/mongodb-core/lib/connection/pool.js:443:16)
    at Connection.messageHandler (/mongoose/node_modules/mongodb-core/lib/connection/pool.js:477:5)
    at Socket.<anonymous> (/mongoose/node_modules/mongodb-core/lib/connection/connection.js:351:20)
    at emitOne (events.js:96:13)
    at Socket.emit (events.js:188:7)
    at readableAddChunk (_stream_readable.js:176:18)
    at Socket.Readable.push (_stream_readable.js:134:10)
    at TCP.onread (net.js:548:20)
  name: 'MongoError',
  message: 'clientcursor already in use? driver problem?',
  ok: 0,
  errmsg: 'clientcursor already in use? driver problem?',
  code: 12051 }
100
{ MongoError: Cursor not found, cursor id: 4338373587832
    at Function.MongoError.create (/mongoose/node_modules/mongodb-core/lib/error.js:31:11)
    at /mongoose/node_modules/mongodb-core/lib/connection/pool.js:497:72
    at authenticateStragglers (/mongoose/node_modules/mongodb-core/lib/connection/pool.js:443:16)
    at Connection.messageHandler (/mongoose/node_modules/mongodb-core/lib/connection/pool.js:477:5)
    at Socket.<anonymous> (/mongoose/node_modules/mongodb-core/lib/connection/connection.js:321:22)
    at emitOne (events.js:96:13)
    at Socket.emit (events.js:188:7)
    at readableAddChunk (_stream_readable.js:176:18)
    at Socket.Readable.push (_stream_readable.js:134:10)
    at TCP.onread (net.js:548:20)
  name: 'MongoError',
  message: 'Cursor not found, cursor id: 4338373587832',
  ok: 0,
  errmsg: 'Cursor not found, cursor id: 4338373587832',
  code: 43 }

This happens when trying to get the 100th document from the cursor.

Problem found in https://github.com/christkv/mongodb-core/blob/2.0/lib/cursor.js#L614
Calling cursor.next() multiple times while no more documents are currently buffered causes multiple calls to getMore(), which the cursor can't handle.
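
A minimal sketch of what eachAsync did with { parallel: n } before the fix (reusing the Foo model and open connection from the repro above; everything else here is illustrative only):

// Several next() calls in flight on the same cursor: once the buffered
// batch (101 documents by default) is drained, each pending next() issues
// its own getMore(), which the driver cursor cannot handle concurrently,
// hence "clientcursor already in use" / "Cursor not found".
async function reproduceRace() {
    const cursor = Foo.find().cursor();
    const pending = [];
    for (let i = 0; i < 150; i++) { // more calls than fit in the first batch
        pending.push(cursor.next());
    }
    await Promise.all(pending);
}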

Trying to put together a simplified reproduction for mongodb-core.

Ticket opened on mongodb-core: https://github.com/christkv/mongodb-core/issues/199
Nothing to be done on the mongoose side to fix the problem except updating mongodb-core once the issue is fixed.

Fixed by running the next()'s in series. The intent of the parallel option is to be able to run multiple functions in parallel, not necessarily get the documents in parallel (that's what batchSize is for).
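
To illustrate the distinction with a sketch (doSomethingSlow is a hypothetical async function; this assumes an open connection and the Foo model from above): parallel controls how many invocations of the callback run at once, while batchSize controls how many documents each getMore() fetches.

// Inside an async function:
await Foo.find()
    .batchSize(500)                 // how many documents each getMore() returns
    .cursor()
    .eachAsync(async (foo) => {
        await doSomethingSlow(foo); // hypothetical per-document async work
    }, { parallel: 10 });           // how many callback invocations run concurrently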

Hello,
For me it's still not fixed. With mongoose 5.0.16 and a Model.find().exec() query:

MongoError: cursor id 2923793755420 not found
at node_modules\mongodb-core\lib\connection\pool.js:598:61
at authenticateStragglers (node_modules\mongodb-core\lib\connection\pool.js:516:16)
at Connection.messageHandler (node_modules\mongodb-core\lib\connection\pool.js:552:5)
at emitMessageHandler (node_modules\mongodb-core\lib\connection\connection.js:309:10)
at Socket.<anonymous> (node_modules\mongodb-core\lib\connection\connection.js:452:17)
at emitOne (events.js:116:13)
at Socket.emit (events.js:211:7)
at addChunk (_stream_readable.js:263:12)
at readableAddChunk (_stream_readable.js:250:11)
at Socket.Readable.push (_stream_readable.js:208:10)
at TCP.onread (net.js:607:20)

{
"name": "MongoError",
"message": "cursor id 2923793755420 not found",
"ok": 0,
"errmsg": "cursor id 2923793755420 not found",
"code": 43,
"codeName": "CursorNotFound"
}

@cosminn777 please open a new issue with the code needed to reproduce what you are seeing along with the output. Thanks!

I had to loop through 350,000 records (or more in the future), and this problem was emerging every 3-5 minutes, so I solved it like this:

// Inside an async function; Txn stands for whatever model the cursor iterates over.
let index = 0;
let cursor = Txn.find().cursor(); // create a new cursor
let txn = await cursor.next();
while (txn) {
    try {
        // process txn here
        index++;
        txn = await cursor.next(); // advance to the next document
    } catch (error) {
        console.log();
        console.log("Error occurred. Resuming cursor from index - " + index + " - " + new Date());
        // Recreate the cursor; skip(index) assumes we resume after the already-processed docs.
        cursor = Txn.find().skip(index).cursor();
        txn = await cursor.next();
    }
}