I was running into an issue with both crawler bots (which I would still like to be able to crawl my site for data) and AWS's HealthChecker bots getting session keys created for them, which added unnecessary keys to my storage (Redis).
I found that piggybacking off the ua-parser2 library gave me most of what I needed, except for HealthChecker bots (https://github.com/commenthol/ua-parser2/issues/1). I added the following code to the session(req, res, next) function, just below the self-aware check:
var UA = require('ua-parser2')(); /* included at top of document */

// don't generate sessions for bots
var isBot = false;
var browserDetails = UA.parse(req.headers['user-agent']);
if (browserDetails.ua.hasOwnProperty('type')) {
  if (browserDetails.ua.type == 'bot') {
    isBot = true;
  }
}
if (browserDetails.string.indexOf('HealthChecker') >= 0) {
  isBot = true;
}
if (isBot) return next();
Apologies if there's a super simple alternative for stopping bots from generating sessions, but my googling turned up nothing :(
Happy to submit a pull request, but figured you guys might prefer to look into an alternative user-agent parser/detector.
Hi! So there are of course multiple ways to approach this problem, but I'll admit up front that we wouldn't be adding anything to this module that does user agent sniffing for various reasons.
So let's start with the best way (not sniffing user agents): Set the saveUninitialized option to false and then, in your code, only set things on req.session when an action has occurred and you actually _want_ a session to exist to hold onto that data. This means that simply making a request to your site won't automatically create a session; one only gets created once your code actually decides to put something into it.
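As a rough illustration of that first approach, here is a minimal sketch. `secret`, `resave` and `saveUninitialized` are express-session options; the handler names (`showHomePage`, `rememberUser`), the `userId` field and the placeholder secret are made up for the example. The express wiring is left in comments so the snippet runs on its own.

```javascript
// Options you would pass to express-session, e.g.
//   app.use(expressSession(sessionOptions))
var sessionOptions = {
  secret: 'replace-me', // placeholder, use your own secret
  resave: false,
  saveUninitialized: false // don't store a session that was never modified
}

// Hypothetical read-only route: it never writes to req.session, so with
// saveUninitialized set to false nothing is saved for this request.
function showHomePage(req, res) {
  res.send('Welcome')
}

// Hypothetical route that actually needs a session: the first write to
// req.session is what makes the session worth saving.
function rememberUser(req, res) {
  req.session.userId = req.body.userId
  res.send('Logged in')
}
```

A crawler or health check that only ever hits routes like `showHomePage` would then leave no key behind in the store, without any user-agent checks.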
Otherwise, if you really do want to do user-agent sniffing, you can simply use the "middleware wrapping" pattern: you create the middleware as usual, but instead of passing it to app.use() directly, you app.use() your own mini middleware that decides whether to execute it. Extending your example above, you would end up with the following:
var express = require('express')
var expressSession = require('express-session')
var uaParser2 = require('ua-parser2')()
var app = express()
var sessionMiddleware = expressSession({
  // your configuration
})
app.use(function useSession(req, res, next) {
  // this would have normally been app.use(sessionMiddleware)
  if (!isBot(req)) {
    return sessionMiddleware(req, res, next)
  }
  next()
})
function isBot(req) {
  var userAgent = req.headers['user-agent']
  if (!userAgent) {
    // assume not a bot without a user agent
    return false
  }
  var browserDetails = uaParser2.parse(userAgent)
  return browserDetails.ua.type === 'bot'
    || browserDetails.string.indexOf('HealthChecker') !== -1
}
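The same wrapping idea can be pulled into a small reusable helper. This is only a sketch: `skipWhen` is a hypothetical name, not part of express or express-session, and the stand-in middleware and predicate below are made up so the snippet runs without any dependencies.

```javascript
// Hypothetical helper: returns a middleware that skips `middleware`
// whenever `shouldSkip(req)` returns true, and runs it otherwise.
function skipWhen(shouldSkip, middleware) {
  return function conditionalMiddleware(req, res, next) {
    if (shouldSkip(req)) {
      return next()
    }
    return middleware(req, res, next)
  }
}

// Stand-in for the real session middleware, used only for this demo.
function fakeSessionMiddleware(req, res, next) {
  req.session = {}
  next()
}

// Simplified stand-in for isBot(): only looks for the HealthChecker string.
function looksLikeHealthChecker(req) {
  var userAgent = req.headers['user-agent'] || ''
  return userAgent.indexOf('HealthChecker') !== -1
}

var useSession = skipWhen(looksLikeHealthChecker, fakeSessionMiddleware)
// In an app this would be: app.use(skipWhen(isBot, sessionMiddleware))
```

Keeping the decision in a plain predicate means the bot check can be swapped or tested on its own, without touching the session middleware.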
Hey!
Thanks so much for your response! That's a much better way to do it. Appreciate you taking the time to help me out with that.
Cheers
It is absolutely no problem :)
I just came across this thread and I want to thank you both for the insightful information!