I've been struggling with what I think should be a simple query, maybe someone can help put me on the right track. Below is a simplified version of my problem:
I have two simple models: User and Surf.
The User model has an array of sub-documents, boards. It also has a friends property, which is an array of references to other users, among other simple properties.
The Surf model has a reference to user_id as the owner of the surf, friends as a subset of the user's friends (i.e. which of your friends you surfed with for that surf), and it _also_ has a reference to a board_id, which is part of the user's sub-document array described above. I'm not registering the BoardSchema with Mongoose because, as far as I can tell from the docs on sub-documents, you shouldn't.
All my CRUD operations on user and surf work well. I'm running a query to get all the surfs for a given user's friends, and when I populate the friends property of the surf as shown in the controller, it works. I assume that's because it's a reference to a model. However, when I try to populate the board_id with name and size, I get a Mongoose error:
MissingSchemaError: Schema hasn't been registered for model "User.boards".
This is the User model:
// user.model.js
var mongoose = require('mongoose');
var Schema = mongoose.Schema;

/**
 * Board Schema (embedded in User, not registered as its own model)
 */
var BoardSchema = new Schema({
  name: String,
  size: String,
  category: String
});

/**
 * User Schema
 */
var UserSchema = new Schema({
  name: String,
  email: { type: String, lowercase: true },
  boards: [BoardSchema],
  friends: [{type: Schema.Types.ObjectId, ref: 'User'}]
});

module.exports = mongoose.model('User', UserSchema);
This is the Surf model:
// surf.model.js
var mongoose = require('mongoose');
var Schema = mongoose.Schema;

/**
 * Surf Schema
 */
var SurfSchema = new Schema({
  user_id: {type: Schema.Types.ObjectId, ref: 'User'},
  friends: [{type: Schema.Types.ObjectId, ref: 'User'}],
  // 'User.boards' is not a registered model; this ref is what triggers the error
  board_id: {type: Schema.Types.ObjectId, ref: 'User.boards', required: true},
  comment: {type: String, required: false},
  sessionDate: {type: Date, required: true}
}, {
  collection: 'surfs' // Mongoose pluralizes 'surf' to 'surves', so define explicitly
});

module.exports = mongoose.model('Surf', SurfSchema);
This is the Surf controller:
// surf.controller.js
var User = require('../user/user.model');
var Surf = require('./surf.model');

exports.feed = function (req, res) {
  var userIds = req.user.friends;
  Surf
    .find({user_id: {$in: userIds}})
    .populate('user_id', 'name email')
    // .populate('board_id', 'name size')
    .populate('friends', 'name email')
    .exec(function (err, surfs) {
      if (err) {
        return handleError(res, err);
      }
      return res.json(200, surfs);
    });
};
The commented-out .populate('board_id', 'name size') in the controller is what causes the issue. I'm not registering the BoardSchema with Mongoose, because it's for a sub-document, not a model. Do I need to? Interestingly, I did try that: the query runs with no error, but the results under the board_id property are simply null. Or is there perhaps another way of writing this query to populate the board from the right user's sub-document array of boards? I think the key here is that I have a reference in one model to another model's sub-document. Hopefully that's not an anti-pattern in mongoose/mongo. I'd greatly appreciate any help, thanks!
Sorry to disappoint but that is an anti-pattern :( Populate can't populate from another collection's subdocs - the reason why you're getting that error is that there's no model for boards. However, why do you need to populate boards? You already get all the user's boards by default when you call .populate('user_id').
However, I see that you're explicitly excluding the list of boards when populating user id. If this is because the list of boards is expected to grow without bound, you've run into the classic mongodb 'array that grows without bound' anti-pattern. Basically, embedding an array is a good idea when you expect the array to have bounded size and don't really want to query for individual elements in that array in isolation. Otherwise it's probably better to have Boards be in their own collection.
This may seem a little counter-intuitive when you're first starting out, but keep in mind mongodb was originally designed primarily to be a backend store for websites and mobile apps; in other words, applications where the client has limited bandwidth and so you need to paginate long lists in order to get your user the most relevant results without having to load everything. If you embed an array that's growing without bound, to paginate you'd have to load the document every single time you load a new page and there's no good way for mongodb to sort the array for you so you'd have to sort on the server.
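To make the pagination point concrete, here is a minimal sketch in plain JavaScript of what paging through an embedded array involves. The `user` document, its `sessions` array, and the `pageOfEmbedded` helper are hypothetical and only illustrate the idea: the whole parent document has to be loaded, and the sort happens in application code.

```javascript
// Sketch: paginating an embedded array in application code.
// The whole parent document (and so the whole array) must already be loaded.
function pageOfEmbedded(items, compare, page, pageSize) {
  var sorted = items.slice().sort(compare); // copy, then sort in the app
  var start = (page - 1) * pageSize;
  return sorted.slice(start, start + pageSize);
}

// Hypothetical user document with an embedded array of sessions:
var user = {
  name: 'Kai',
  sessions: [
    { spot: 'A', date: new Date('2015-03-01') },
    { spot: 'B', date: new Date('2015-01-15') },
    { spot: 'C', date: new Date('2015-02-10') }
  ]
};
var newestFirst = function (a, b) { return b.date - a.date; };
var page2 = pageOfEmbedded(user.sessions, newestFirst, 2, 2);
console.log(page2.map(function (s) { return s.spot; })); // [ 'B' ]
```

With the items in their own collection, the same page could instead be requested from the database through the query's sort, skip, and limit options, so only one page of documents is transferred.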
@vkarpov15 - first off, thank you so much for the quick follow-up and thorough explanation. Your insight is very helpful and it's good to hear from an expert about some of the things I have questioned myself.
With respect to my architecture: your point on the array growing without bounds definitely makes sense. That's why I created the Surf as a separate model: the app is one where a user tracks their surf sessions over time, so that model would grow without bounds. Boards (like friends), on the other hand, would be a very small set of things which would never grow to more than 15 for a user (it's the set of boards you own, in surfing terms we call it a "quiver" :smile: ). So I created that as a sub-document.
Then, in a given surf session you of course surf with just one of your boards, so I thought the right pattern would be to simply store a reference on the Surf model to the user's board id. Is that by itself an anti-pattern? This made sense to me because if they ever update something about the board (e.g. the name), it just needs to get updated in the user's subdoc and not in every surf session that used it.
Now, you correctly observed that when I .populate('user_id') I exclude the list of boards, and that's because this endpoint is a session _feed_ - for the logged in user, I am grabbing all their friends then querying all _their_ sessions. I don't want to return the friend with their entire quiver for each session as part of the data to the client. I just want the user and the one board whose object id is stored on the surf. I could of course return the object id directly from the Surf, but I want to look up the name and another property that is stored in the user's boards sub-document array. Does that make sense? Do you recommend a way to do that in my controller? Even if it requires looping through my results within the callback of my query, that would be fine for now, but I'm struggling to figure out how to do just that. Or if it requires including the boards in the first .populate('user_id') (as opposed to explicitly excluding them as I'm doing now) so that they're available to be looked up in the callback, that would be fine, too. Thoughts? Again, thanks so much for your help!
Perhaps something like this:
exports.feed = function (req, res) {
  var userIds = req.user.friends;
  Surf
    .find({user_id: {$in: userIds}})
    .populate('user_id', 'name boards') // added boards
    .populate('friends', 'name')
    // .populate('board_id', 'name size') // can't do this as discussed
    .exec(function (err, surfs) {
      if (err) {
        return handleError(res, err);
      }
      surfs.forEach(function (surf) {
        surf.boardInfo = surf.user_id.boards.id(surf.board_id);
      });
      // TODO: now remove the surf.user_id.boards.
      return res.json(200, surfs);
    });
};
This also doesn't seem to work - the results return but with no boardInfo property included. Can I not add arbitrary data to the results? I _think_ everything is available in surfs, so this lookup and the return can be written synchronously. Am I wrong?
An update - some quick research indicates I need to use the setter:
Surf
  .find({user_id: {$in: userIds}})
  .populate('user_id', 'name boards') // added boards
  .populate('friends', 'name')
  // .populate('board_id', 'name size') // can't do this as discussed
  .exec(function (err, surfs) {
    if (err) {
      return handleError(res, err);
    }
    surfs.forEach(function (surf) {
      surf.set('boardInfo', surf.user_id.boards.id(surf.board_id), {strict: false});
    });
    // TODO: now remove the surf.user_id.boards.
    return res.json(200, surfs);
  });
Now, aside from removing the entire surf.user_id.boards, this is pretty much what I need. Please let me know if you think there is a better / more efficient way of doing this while querying. Thanks so much!
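One possible way to handle both the lookup and the remaining TODO is to work on plain objects rather than Mongoose documents, for example the results of a query run with .lean() or of calling toObject() on each document. The sketch below is only an illustration under that assumption: `attachBoardInfo` and the sample data are hypothetical names, and the helper uses no Mongoose APIs, so the boards.id() lookup is replaced by a string comparison of ids.

```javascript
// Sketch: resolve each surf's board from its populated owner, then drop the
// rest of the quiver. Assumes plain objects (not Mongoose documents), so ad
// hoc properties such as boardInfo survive JSON serialization.
function attachBoardInfo(surfs) {
  return surfs.map(function (surf) {
    var owner = surf.user_id;
    var boards = (owner && owner.boards) || [];

    // Compare ids as strings so ObjectId instances and hex strings both match.
    var match = boards.find(function (board) {
      return String(board._id) === String(surf.board_id);
    });

    // Copy the owner without `boards` so the full quiver is not sent.
    var trimmedOwner = owner;
    if (owner) {
      trimmedOwner = Object.assign({}, owner);
      delete trimmedOwner.boards;
    }

    return Object.assign({}, surf, {
      user_id: trimmedOwner,
      boardInfo: match ? { name: match.name, size: match.size } : null
    });
  });
}

// Usage with hypothetical sample data:
var feed = attachBoardInfo([{
  _id: 's1',
  board_id: 'b2',
  user_id: {
    _id: 'u1',
    name: 'Kai',
    boards: [
      { _id: 'b1', name: 'Longboard', size: '9\'0"', category: 'long' },
      { _id: 'b2', name: 'Fish', size: '5\'8"', category: 'short' }
    ]
  }
}]);
console.log(JSON.stringify(feed[0]));
```

In the controller, this would be called on the query results just before sending the response. The helper returns new objects instead of mutating its input, which keeps it easy to test in isolation.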
It sounds like you want to embed the board in the Surf model. If you don't care about updating the board in the Surf model when it changes in the User model, and you always want to load the Board with the Surf, it makes more sense to just embed it than to keep it in a separate document.
Interesting suggestion. The problem there is that I guess I _do_ care: if the user updates a board in their "quiver", I'll want the feed to reflect the updated board. I would expect this to happen pretty rarely though... I suppose the mongo way would then be to simply search for all Surfs with that board and update them individually?
I will close this so as to not litter the open issues with what is really a request for help. Thanks for your help and insight @vkarpov15!
If you have Board embedded in Surf, you can update using something like:
Surf.update({ 'board._id': board._id }, { $set: { board: board.toObject() } }, { multi: true }, callback);
If you expect individual boards to be updated infrequently, this could be a better schema design in terms of performance. But it really depends on your usage patterns.
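As a rough sketch of that fan-out update, the helper below wraps the same filter and $set in a function. It assumes the hypothetical schema change discussed here (a `board` path embedded in Surf), assumes `board` is already a plain object, and uses Model.updateMany, which newer Mongoose versions provide as the equivalent of update with multi: true. `syncEmbeddedBoard` and `FakeSurf` are illustrative names only; the stand-in model just records the call so the snippet runs without a database.

```javascript
// Sketch: push an edited board into every surf that embeds a copy of it.
// `SurfModel` is anything exposing updateMany(filter, update), such as a
// Mongoose model. The embedded `board` path is an assumed schema change,
// not part of the original models.
function syncEmbeddedBoard(SurfModel, board) {
  var filter = { 'board._id': board._id };  // every surf that used this board
  var update = { $set: { board: board } };  // overwrite the embedded copy
  return SurfModel.updateMany(filter, update);
}

// Usage with a stand-in model that records the call instead of hitting MongoDB:
var calls = [];
var FakeSurf = {
  updateMany: function (filter, update) {
    calls.push({ filter: filter, update: update });
    return Promise.resolve({ acknowledged: true });
  }
};
var editedBoard = { _id: 'b2', name: 'Fish (repaired)', size: '5\'8"' };
syncEmbeddedBoard(FakeSurf, editedBoard);
console.log(JSON.stringify(calls[0].filter));
```

Passing the model in as an argument is a design choice for testability; in an application the function would typically be called from wherever the user's board is saved.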
No worries, schema design is tricky. Thanks for the surfing insight - I enjoy hitting the waves on occasion but I'm nowhere near good enough to warrant having my own board, let alone a quiver of boards :)
@ramdog @vkarpov15 thank you for this. I have a similar issue (although not for surfing purposes...). Probably a great candidate for a plugin (my skillset is not there yet). I can see now that the response is - do it manually.
@ramdog and @vkarpov15 thanks again.
This conversation cleared up many things for me.
I agree with @Tallyb. Maybe a plugin could do this.
3 Years passed, still here we are. At "Manual". :(