Loopback: How can i make "Group By" query on loopback with mongodb connector ?

Created on 3 Dec 2014  Â·  37Comments  Â·  Source: strongloop/loopback

How can i make "Group By" query on loopback module with mongodb connector ?
I can not find any document which say about that problem ?
Please help me. Thanks

doc

Most helpful comment

I was trying to do something like this and came up with something that works, and I though I would contribute a more complete example, that would have saved me some time.

I have a model I want to do "group by" on. The model is in ./common/models/wafer.js. To that file, add:

  Wafer.groupBy = function( filter, cb ) {
    let waferCollection = Wafer.getDataSource().connector.collection( Wafer.modelName );
    // OPTIONAL if ( ! filter.where ) return cb( new Error( 'missing a where clause in the filter' ) );
    if ( ! filter.groupBy ) return cb( new Error( 'missing a groupBy clause in the filter' ) );
    let pipeline = [];

    if ( filter.where )
      pipeline.push({ $match: filter.where });

    pipeline.push({
      $group: {
    _id: '$' + filter.groupBy,
    count: { $sum: 1 }
      }
    });

    // rename the _id field to be the name of the groupBy field
    let p = {
      _id: 0,      
      count: 1
    };
    p[ filter.groupBy ] = "$_id";

    pipeline.push({
      $project: p
    });

    // console.log( 'pipeline:', JSON.stringify( pipeline, null, 2 ) );

    waferCollection.aggregate( pipeline, cb );
  }

  Wafer.remoteMethod('groupBy', {
    isStatic: true,
    description: 'Invoke a group by',
    accessType: 'WRITE',
    accepts: {arg: 'filter', type: 'object', description: '{"where:{...}, "groupBy": "field"}'},
    returns: {arg:'data', type:['object'], root:true}
  });

You will now see a POST /api/wafers/groupBy endpoint in swagger. My wafer model has "product" and "lot" fields. You can input this filter to get the unique product names and the count for each:

{"groupBy": "product"}

The result will look like:

[
  {
    "count": 1926,
    "product": "DB1681_UAD16"
  },
  {
    "count": 8307,
    "product": "B1681_UAD16"
  }
]

Now, given a product, you can get the list of unique lot numbers and their counts:

{"where": {"product":"DB1681_UAD16"}, "groupBy":"lot"}

and the result:

[
  {
    "count": 16,
    "lot": "6184410"
  },
  {
    "count": 25,
    "lot": "6184400"
  },
  (...)
]

One could probably generalize this further.

remember though this will only work when the datasource is MongoDB.

All 37 comments

Are you meaning, is there a way to run MongoDB aggregations via loopback?

@jrschumacher : No, I'm using this way. It used collection of mongodb to make query on mongodb:

var bookCollection = Book.getDataSource().connector.collection(Book.modelName);
bookCollection.aggregate({
  $group: {
    _id: { category: "$category", author: "$author" },
    total: { $sum: 1 }
  }
}, function(err, groupByRecords) {
  if(err) {
    next(err);
  } else {
    next();
  }
});

But i want a better solution, just use loopback lib but don't use mongodb collection directly.

@bluestaralone I looked at the source but there doesn't seem to be any reference to the aggregator. https://github.com/strongloop/loopback-connector-mongodb

@jrschumacher : Please see the codes:

  • node_modules/loopback-connector-mongodb/node_modules/mongodb/lib/mongodb/collection.js
    Line 596: Collection.prototype.aggregate = function() { return aggregation.aggregate; }();
  • node_modules/loopback-connector-mongodb/node_modules/mongodb/lib/mongodb/collection/aggregation.js
    Line: 253: var aggregate = function(pipeline, options, callback) {...
  • node_modules/loopback-connector-mongodb/node_modules/mongodb/public/api/AggregationCursor.html
    Line 783: var cursor = collection.aggregate([...
    In this file you can see how to use it ?

Currently, i used "aggregate" method through loopback-connector-mongodb, but what do you think if we can use it through loopback Model ?

@raymondfeng
@bajtos
I don't know if thĂ­s problem is solved ?

aggregate is a mongodb query. Currently Loopback doesn't expose access to these queries.

edit: fix typo

@BerkeleyTrue :+1:

Sorry: meant to say 'doesn't...'

any update on this?

I don't think this is ever going to be first class. The way @bluestaralone laid out above is actually valid and the way I do things like this.

I am getting socket closed when use your example with async in parallel in a bit more complex method where i also query other models directly.

{ [MongoError: server XXXXX sockets closed]
  name: 'MongoError',
  message: 'server XXXXX sockets closed' }

It's not clear to me why this is labeled as a doc issue (and assigned to me). If I need to add something to the docs, please tell me what. If not, then I'm going to remove the "doc" label... and if it's not actionable, then I presume it should be closed @superkhau .

@crandmck Ask @bajtos, he added the doc + question label for this issue.

/cc @bajtos

I thought it would be a good idea to document how to make an aggregate (Group By) query in mongodb, using the solution described in https://github.com/strongloop/loopback/issues/890#issuecomment-66581628.

@crandmck if you think this does not belong to docs then I am fine with that too. In which case this GH issue can be closed right away.

OK, thanks, I see. I added https://docs.strongloop.com/display/LB/MongoDB+connector#MongoDBconnector-Aggregate(groupby)query
It could probably use a bit more explanation, but I don't know anything about MongoDB, so suggestions welcome.

@crandmck thank you, the example LGTM. I'm closing this issue as done.

I would suggest to update doc since aggregate take other pipeline stages as an individual parameter.

bookCollection.aggregate( {
    $match: {} 
   }, 
   {
    $group: {} }
     ... , 
  function(err, groupByRecords) {
  if(err) {
    next(err);
  } else {
    next();
  }
});

Thx @mike-aungsan, updated.

Why does the aggregation method take the stages of the pipeline as individual parameters? The mongo driver takes them as an array and it is convenient in the case you need to dynamically assemble the pipeline pushing/pulling stages in the array. Using individual parameters makes it less flexible IMHO.

I was trying to do something like this and came up with something that works, and I though I would contribute a more complete example, that would have saved me some time.

I have a model I want to do "group by" on. The model is in ./common/models/wafer.js. To that file, add:

  Wafer.groupBy = function( filter, cb ) {
    let waferCollection = Wafer.getDataSource().connector.collection( Wafer.modelName );
    // OPTIONAL if ( ! filter.where ) return cb( new Error( 'missing a where clause in the filter' ) );
    if ( ! filter.groupBy ) return cb( new Error( 'missing a groupBy clause in the filter' ) );
    let pipeline = [];

    if ( filter.where )
      pipeline.push({ $match: filter.where });

    pipeline.push({
      $group: {
    _id: '$' + filter.groupBy,
    count: { $sum: 1 }
      }
    });

    // rename the _id field to be the name of the groupBy field
    let p = {
      _id: 0,      
      count: 1
    };
    p[ filter.groupBy ] = "$_id";

    pipeline.push({
      $project: p
    });

    // console.log( 'pipeline:', JSON.stringify( pipeline, null, 2 ) );

    waferCollection.aggregate( pipeline, cb );
  }

  Wafer.remoteMethod('groupBy', {
    isStatic: true,
    description: 'Invoke a group by',
    accessType: 'WRITE',
    accepts: {arg: 'filter', type: 'object', description: '{"where:{...}, "groupBy": "field"}'},
    returns: {arg:'data', type:['object'], root:true}
  });

You will now see a POST /api/wafers/groupBy endpoint in swagger. My wafer model has "product" and "lot" fields. You can input this filter to get the unique product names and the count for each:

{"groupBy": "product"}

The result will look like:

[
  {
    "count": 1926,
    "product": "DB1681_UAD16"
  },
  {
    "count": 8307,
    "product": "B1681_UAD16"
  }
]

Now, given a product, you can get the list of unique lot numbers and their counts:

{"where": {"product":"DB1681_UAD16"}, "groupBy":"lot"}

and the result:

[
  {
    "count": 16,
    "lot": "6184410"
  },
  {
    "count": 25,
    "lot": "6184400"
  },
  (...)
]

One could probably generalize this further.

remember though this will only work when the datasource is MongoDB.

@peebles Hi, I have tried your code but it returned the message no response from the server, below is the screenshot of it, the second screenshot is the console.log output, please help, thank you:

image

image

The code:

Message.groupBy = function( filter, cb ) {
let deviceCollection = Message.getDataSource().connector.collection( Message.modelName );
// OPTIONAL if ( ! filter.where ) return cb( new Error( 'missing a where clause in the filter' ) );
if ( ! filter.groupBy ) return cb( new Error( 'missing a groupBy clause in the filter' ) );
let pipeline = [];

if ( filter.where )
  pipeline.push({ $match: filter.where });

pipeline.push({
  $group: {
_id: '$' + filter.groupBy,
count: { $sum: 1 }
  }
});

// rename the _id field to be the name of the groupBy field
let p = {
  _id: 0,      
  count: 1
};
p[ filter.groupBy ] = "$_id";

pipeline.push({
  $project: p
});

console.log( 'pipeline:', JSON.stringify( pipeline, null, 2 ) );

deviceCollection.aggregate( pipeline, cb );}

Message.remoteMethod('groupBy', {
isStatic: true,
description: 'Invoke a group by',
accessType: 'WRITE',
accepts: {arg: 'filter', type: 'object', description: '{"where:{...}, "groupBy": "field"}'},
returns: {arg:'data', type:['object'], root:true}
});

And it also come with this error in the terminal:

TypeError: Converting circular structure to JSON
    at JSON.stringify (<anonymous>)

Is "field" one of the properties of "messages"? What does the schema for "messages" look like?

the field is topic, the schema:

{
"topic": "string",
"temperature": 0,
"timestamp": "2018-05-08T03:46:05.869Z",
"name": "string",
"description": "string",
"tags": "string",
"last_activity": "2018-05-08T03:46:05.869Z",
"last_temperature": 0,
"id": "string",
"messageId": "string"
}

Then please try:

{“groupBy”: “topic”}

A

On May 7, 2018, at 9:12 PM, JohnHour89 notifications@github.com wrote:

the field is topic, the schema:

{
"topic": "string",
"temperature": 0,
"timestamp": "2018-05-08T03:46:05.869Z",
"name": "string",
"description": "string",
"tags": "string",
"last_activity": "2018-05-08T03:46:05.869Z",
"last_temperature": 0,
"id": "string",
"messageId": "string"
}

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub, or mute the thread.

I modified the pipeline to just hard code the field to group however still getting TypeError: Converting circular structure to JSON logging my pipeline:
pipeline: [
{
"$group": {
"_id": "$STAFF",
"count": {
"$sum": 1
}
}
},
{
"$project": {
"_id": 0,
"count": 1,
"STAFF": "$_id"
}
}
]

When you say whats the schema look like I'm assuming the model json properties... it looks like:
"STAFF": {
"type": "string",
"required": false
},

How do I debug correcting the circular structure error?

Try removing your debug line:

console.log( 'pipeline:', JSON.stringify( pipeline, null, 2 ) );

Otherwise, I don't know. Could be we are on different versions of loopback-connector-mongo. When I was doing this work, I was using loopback 2.0.

removed the console.log, modified the aggregate pipeline and 'cb' part to give me some more granularity in debug. Dropping the second section of the aggregate function to -

function (err, records) {
console.log(records);
if (err) return cb(err);
//if(!err) console.log(JSON.stringify(records, null, 2));
if (records.length == 1)
cb(null, true);
else
cb(null, false);
}

at least gave me the connection object details but I think your right in the lb2 to lb3 doing some prioritization/laziness in executing the query and thereby getting results.

FOUND IT!! - The bulk of the issue was coming from the filter parsing. Ultimately If I changed:
Model.groupBy = function (filter, cb) {
to
Model.groupBy = function (cb) {

IT WORKED!! I got results as expected atleast.

Note this will screw up being able to customise the groupBy. But atleast we/I know where the whole issue is coming from.

For others attempting this with LB3 and need some examples:

Model.groupBy = function (filter, cb) {
    let modelCollection = Model.getDataSource().connector.collection(Model.modelName);
    let pipeline = [];
    pipeline.push({
        $group: {
            _id: '$STAFF',
            //_id: '$' + filter.whateverjsonfields,
            count: { $sum: 1 }
        }
    });

    // rename the _id field to be the name of the groupBy field
    let p = {
        _id: 0,
        count: 1
    };
    p["STAFF"] = "$_id";
    //p[ filter.groupBy ] = "$_id";

    pipeline.push({
        $project: p
    });

    //console.log('pipeline:', JSON.stringify(pipeline, null, 2));
    modelCollection.aggregate(pipeline, cb);
}


Sale.remoteMethod(
    'groupBy', {
        accepts: {arg: 'filter', type: 'json'},
        http: {
            path: '/groupBy',
            verb: 'get'
        },
        description: 'Invoke a group by',
        returns: {
            arg: 'data',
            type: 'json'
        }
    });

In your examples above, you show

Model.groupBy = function (filter, cb)

but your explanation of the problem suggested you had to drop the "filter" argument, ie:

Model.groupBy = function (cb)

So, is your example for L3 folks to follow incorrect?

No final post works really well. the 'drop the filter' commentary had something to do with issues in how the 'remotemethod' json section was configured and how the call then receives args.

Just for further help I managed to not only get aggregate grouping working as follows:

    Sale.groupBy = function (filter, cb) {
        //console.log(filter.filterfield);
        let saleCollection = Sale.getDataSource().connector.collection(Sale.modelName);
        // OPTIONAL if ( ! filter.where ) return cb( new Error( 'missing a where clause in the filter' ) );
        //if (!filter.groupBy) return cb(new Error('missing a groupBy clause in the filter'));
        let pipeline = [];
        pipeline.push({
            "$redact": {
                "$cond": {
                    "if": {
                        "$and": [
                            { "$gt": ["$" + filter.filterfield, new Date(filter.start)] },
                            { "$lt": ["$" + filter.filterfield, new Date(filter.end)] },
                        ]
                    },
                    "then": "$$KEEP",
                    "else": "$$PRUNE",
                }
            }

        });
        pipeline.push(
            { $unwind: '$_id' }
        );
        pipeline.push(
            {
                $group: {
                    _id: { ["$" + filter.groupBy]: "$" + filter.filterfield },
                    'stdDeviation': { '$stdDevSamp': "$" + filter.amountfield },
                    'revenueOfSales': { '$sum': "$" + filter.amountfield },
                    'avgSaleSize': { '$avg': "$" + filter.amountfield },
                    'numberOfSales': { '$sum': 1 },
                }
            }
        );
        pipeline.push(
            { '$sort': { ["_id"]: 1 } },
        )

        //console.log('pipeline:', JSON.stringify(pipeline, null, 2));
        saleCollection.aggregate(pipeline, cb);
    }

    Sale.remoteMethod(
        'groupBy', {
            accepts: { arg: 'filter', type: 'json', description: 'Likely in form of {"amountfield": "SALE_AMOUNT", "filterfield": "SALE_DATE", "start": "2014-01-01", "end": "2016-01-01","groupBy": "month"}' },
            http: {
                path: '/groupBy',
                verb: 'get'
            },
            description: 'Invoke a GroupBy for Sales Figures',
            returns: {
                arg: 'data',
                type: 'json'
            }
        });

But Also proper 'across collection' aggregates working as follows:

Model.aggro = function (filter, cb) {
        console.log(JSON.stringify(filter, null, 2));
        console.log(filter.followcollection);
        console.log(filter.leadfield);
        console.log(filter.followfield);
        Model.getDataSource().connector.connect(function (err, db) {
            let pipeline = [
                {
                    '$lookup':
                        {
                            from: filter.followcollection,
                            'localField': filter.leadfield,
                            'foreignField': filter.followfield,
                            as: "sold_staff_details"
                        },
                },
                {
                    "$unwind":
                        {
                            "path": "$sold_staff_details",
                            "preserveNullAndEmptyArrays": true
                        }
                },
                {
                    "$redact":
                        {
                            "$cond":
                                {
                                    "if":
                                        {
                                            "$and":
                                                [{
                                                    "$gt": ["$sold_staff_details.SALE_DATE", new Date(filter.start)
                                                    ]
                                                },
                                                {
                                                    "$lt": ["$sold_staff_details.SALE_DATE", new Date(filter.end)
                                                    ]
                                                },
                                                ]
                                        },
                                    "then": "$$KEEP",
                                    "else": "$$PRUNE"
                                }
                        }
                },
                {
                    '$group': {
                        _id: "$"+filter.leadID,
                        'numberOfModels': { '$sum': 1 },
                        'revenueOfModels': { '$sum': "$sold_staff_details.AMOUNT" },
                        'avgModelSize': { '$avg': "$sold_staff_details.AMOUNT" },
                        'stdDeviation': { '$stdDevSamp': "$sold_staff_details.AMOUNT" },
                        'childModels': { '$push': "$sold_staff_details" },
                    }
                },
                { '$sort': { 'revenueOfModels': -1 } },
                { '$limit': 1000 }
            ];

            let aggroCollection = db.collection(filter.leadcollection);
            //console.log('pipeline:', JSON.stringify(pipeline, null, 2));
            aggroCollection.aggregate(pipeline, cb);

        });
        }

    Model.remoteMethod(
                'aggro', {
                    accepts: { arg: 'filter', type: 'json', description: 'Likely in form of {"leadcollection": "model2", "followcollection": "model1", "leadfield": "STA_INITIALS", "followfield": "STAFF", "leadID": "STA_NAME", "start": "2016-01-01", "end": "2016-01-31"}'
                },
                    http: {
                        path: '/aggro',
                        verb: 'get'
                    },
                    description: 'Can I Invoke an !Aggrogate! for ANY Mongo table!',
                    returns: {
                        arg: 'data',
                        type: 'json'
                    }
                });

Point for anyone just copying pasting....
The filter JSON that the aggregate and group queries use should really have a bunch of default and error handling added to the logic. Specifically things like 'limit' especially for non index'd mongo attrs will bring the server and HW to a severe slow down.

Paying close attention to the arg input 'types' between lb3 and lb2 pays but seems to be a black art. The documentation on 'object' vs 'json' vs null doesn't make any sense on the website for LB3. In the end 'json' worked for me.

Interestingly after upgrading to connector 3.9.2
I get the following error at this line in my code:
var planCollection = PLAN.getDataSource().connector.collection(PLAN.modelName);
Error: MongoDB connection is not established

base on @peebles code, I made some adjustments based on a comment on https://github.com/strongloop/loopback-connector-mongodb/issues/434#issuecomment-398557920 and finally got the result for lb3 and mongo version 3.2.0

Membercheckin.groupBy = function (filter, cb) {
    let memberCheckinCollection = Membercheckin.getDataSource().connector.collection(Membercheckin.modelName);
    if (!filter.groupBy) {
        return cb(new Error('missing a groupBy clause in the filter'));
    }
    let pipeline = [];
    if (filter.where) {
        pipeline.push({
            $match: filter.where
        });
    }
    pipeline.push({
        $group: {
            _id: '$' + filter.groupBy,
            count: {
                $sum: 1
            }
        }
    });
    let p = {
        _id: 0,
        count: 1
    };
    p[filter.groupBy] = "$_id";
    pipeline.push({
        $project: p
    });
    // console.log('pipeline: ', JSON.stringify(pipeline, null ,2));
    memberCheckinCollection.aggregate(pipeline, function (error, cursor) {
        if (error) {
            cb(true);
            return;
        }
        let result = [];
        const callbackFn = function () {
            cb(null, result);
        }
        cursor.on('data', function (doc) {
            // console.log('doc: ', doc);
            result.push(doc);
        });
        cursor.once('end', callbackFn);
        // cb(null, cursor.toArray());
    });
}

Membercheckin.remoteMethod('groupBy', {
    http: {
        path: '/groupBy',
        verb: 'get'
    },
    description: 'Invoke a group by',
    accepts: {
        arg: 'filter',
        type: 'object',
        description: '{"where":{...}, "groupBy": "field"}'
    },
    returns: {
        arg: 'data',
        type: ['object'],
        root: true
    }
});`

filter {"groupBy":"member_id"}
result:

[
  {
    "count": 1,
    "member_id": "234"
  },
  {
    "count": 95,
    "member_id": "00000"
  },
  {
    "count": 1,
    "member_id": "121"
  },
  {
    "count": 4,
    "member_id": "348"
  },
  {
    "count": 1,
    "member_id": "48"
  },
  {
    "count": 2,
    "member_id": "112"
  },
  {
    "count": 1,
    "member_id": "120"
  }
]

Finally, I got resolve the issue with the toArray method

Products.getDataSource().connector.connect(function(err, db) {
var collection = db.collection('Products');
collection.aggregate(
[{ $match: { "productCode" : "WIN10-NoOS" } }] ).toArray(function(err,servicesData){
if(err){ }else{
cb(null,servicesData);
} });
});

Was this page helpful?
0 / 5 - 0 ratings