delete logic in errors/index.js
we need a consistent way to log into a file/stdout --> right now we use morgan to log requests and custom logging (chalk and console.log) for errors
- log GET, URL and status code for requests (especially in development mode)
- log as JSON
- delete /errors/index.js, because it does more than it needs to and is way too complicated in my opinion (we export ~15 functions), and we want to separate the errors from the logging unit
- separate the logger from migrations

As already discussed, debug is a nice npm tool to get debug logs per file, which auto-includes the milliseconds between each debug log.
But our big goal is creating a logging unit which can do more than just print debug logs.
I've used winston in the past and we've switched to bunyan, because the bunyan API is easier to use, easier to read, and it has the advantage of nice JSON prints. That's why I would go with bunyan.
server/logging/default.js
Developers should easily add custom loggers, if they prefer a different logger.
It should provide functions like info, error, warn, and debug.
- info
- error
- warn
- debug
var logging = require('./logging');
logging.error({ err: err, req: req, res: res });
logging.info('server starts now...');
logging.info({ req: req, res: res });
Right now our server output looks like this (stdout only):
GET / 200 140.121 ms - -
GET /assets/js/jquery.fitvids.js?v=4d853bd599 200 17.077 ms - -
GET /assets/js/index.js?v=4d853bd599 200 16.385 ms - -
GET /assets/css/screen.css?v=4d853bd599 200 19.621 ms - -
This print is managed by morgan, which is configured in middlewares/index.js. We will delete this.
Instead we will auto log requests/responses/errors.
expressApp1.use(logging.utils.express);
expressApp2.use(logging.utils.express);

function express(req, res, next) {
    res.once('finish', function () {
        if (req.err) {
            logging.error({ req: req, res: res, err: req.err });
        } else {
            logging.info({ req: req, res: res });
        }
    });

    next();
}
Which can look like (transport: stdout)
Which can look like (transport: file)
We could extend our upcoming Ghost CLI to offer: ghost-cli log (which will prettify the .log file)
We should be able to configure the transport (file, stdout)
We can start defining the transport per env.
- production: file only
- staging: file only
- development: stdout only
Every env is by default in level info.
You can force starting it in a different mode (debug) either by changing the config or by using NODE_LEVEL.
Each error will by default be in level "normal".
We can define an error as critical, which gets treated differently in Ghost(Pro).
We can add time diffs between logs, like the npm module debug is offering. Not high prio.
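The time diffs mentioned above work the way the debug module's "+Nms" suffix does: remember the timestamp of the previous log call and report the elapsed milliseconds. A minimal sketch (function name is illustrative):

```javascript
var lastLog = null;

// Report the milliseconds elapsed since the previous log call,
// like the "+12ms" suffix the debug npm module prints.
function timeDiff() {
    var now = Date.now();
    var diff = lastLog === null ? 0 : now - lastLog;
    lastLog = now;
    return diff;
}
```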
Filter attributes.
Right now we pass context and help as different parameters.
We want to achieve having context and help as part of an error.
NODE_LEVEL=debug grunt test or grunt test -d (--debug)
Should show print outs like errors.
debug instead of logging.debug - one true way for now

Maybe you can implement Structured Logs with Gelf?
http://docs.graylog.org/en/2.1/pages/gelf.html
@jalogisch I'm gonna call you out here - these sorts of one line comments are not at all helpful. I'm sure you have a good point to make, so tell us what it is!? Why would we want to use your suggestion? What about it is better than the proposal above?
@kirrg001 has spent a lot of time investigating and invested time in writing up a full spec - if you want to be heard, one line isn't enough
Sorry @ErisDS, you are right, my bad. I just behaved like the people I tend to ignore. Sorry @kirrg001 that I did not honor your work with a valid comment including why I think this is useful.
The spec already says that you will log with JSON, and that is very sweet as it will make the logfile more useful if you transport it over to a central place.
My idea, which I just pushed rudely into the round: why not implement GELF? That would give not only Graylog the ability to parse the information out of the logfile without further work from the user.
Everything is better than the current stdout logging - and again, sorry for being such a dick.
And - even if this is not the right place - thank you for starting Ghost. For me, this is the best blog platform.
@jalogisch, @ErisDS: I use graylog2 and I would appreciate a config file or adapter I could plug Ghost blogs into a lot! For development it wouldn't make my workflow a lot better, but knowing when things go sideways in production is pure gold, also for 404 hits.
Thanks for feedback and interest.
As described, logging should act as an adapter. It's not yet decided how this will work - so people can use a different adapter for logging or extend the default one.
As far as I understood, graylog2 can act as a stream, see https://github.com/mhart/gelf-stream.
Hey all - long time follower of Ghost, first time poster. I felt that I might be able to bring something to the table here cause logging is kinda my jam by day...
For node applications, my team has used log4js-node (https://github.com/nomiddlename/log4js-node). It's been pretty good to us, and for those who are Java developers also, it will be familiar. Its ability to easily allow developers to configure logging and add 'appenders' makes it extremely convenient to log to any output you might think of (it comes with plenty OOB but you can also write your own).
One other suggestion I would make here, and this is critical - try to find/build into a logging framework a request context. That is, on each user request, create a guid which is passed through to each subsequent log call in the system. This is especially useful for developers when you have many concurrent requests, or your system is distributed. (btw, I don't want to open a can of worms, but someone said microservices? why does ghost need a microservice architecture? I know... probably a question for another thread)
Definitely all the usual + request entry/exit i.e. error, warn, info, req entry/exit, debug, trace
As someone mentioned probably a good idea to find a way to obfuscate/eliminate sensitive info from logs.
Don't separate your http request entry/exit from application logs, honestly it makes it that much easier to trace an issue.
A few people have thrown around Graylog 2. Honestly I love Graylog, it's a great OS project, but I don't really see this as being a concern for Ghost, albeit some suggest printing logs in GELF format for easy Graylog ingestion. The main thing is that developers have some _easy_ way in which they can access their logs from whatever tool they feel comfortable with (whether that be tail, ELK, graylog etc.), and that the logs have a clear and consistent structure. JSON was suggested originally and I would back that. GELF has a good base structure to it, and it is in JSON, but Ghost needs to go with what works for it. One pitfall of GELF is that although you can add custom fields to it, they are restricted to string/number. That sucks if for some reason you need the ability to log more complex objects or collections in a single message (and my team has come across that).
One pearl of wisdom here - consider the format of your timestamp carefully (many tools are restricted to millisecond precision and additional processing will have to be done to make the messages ingestable).
One other note on Graylog - its super powerful and will take just about any message that gets to it (through tcp, kafka, logstash etc, and in plaintext, json, gelf) so again, I wouldn't worry so much about the tool as producing quality logs with consistent structure, and easy adaptability.
@twgraham Hey thanks for suggestions.
For node applications, my team has used log4js-node (https://github.com/nomiddlename/log4js-node). It's been pretty good to us, and for those who are Java developers also, it will be familiar. Its ability to easily allow developers to configure logging and add 'appenders' makes it extremely convenient to log to any output you might think of (it comes with plenty OOB but you can also write your own).
Bunyan supports streams, which look like the same thing as "appenders".
One other suggestion I would make here, and this is critical - try to find/build into a logging framework a request context. That is, on each user request, create a guid which is passed through to each subsequent log call in the system. This is especially useful for developers when you have many concurrent requests, or your system is distributed. (btw, I don't want to open a can of worms, but someone said microservices? why does ghost need a microservice architecture? I know... probably a question for another thread)
I assume you are talking about a correlationId?
No worries if bunyan has the same concept. Yep, a correlationId at the core of it - that's what's important. I guess I refer to it as the context because there might be some other pieces of data that might be pertinent to subsequent log calls (I've come across cases).
Now that we have merged some basics for our new logging module, there are still some open questions we would like to get feedback on.
Right now the default GhostLogger (see https://github.com/TryGhost/Ghost/blob/master/core/server/logging/GhostLogger.js) is able to do the following things:
error, warn, info, request

When GhostLogger logs into a file, right now it only logs into one single log file.
We already got the feedback to stream the errors into a separate file. We can do that.
So we could create files named ghost/errors.log and ghost/access.log.
ghost/errors.log would contain HTTP requests with errors and errors which happened outside of an HTTP request.
ghost/access.log would only contain 20X requests.
To give you a background why logging HTTP requests with errors is important:
We log into the file via JSON. If we logged only the pure errors into an errors.log file and all the HTTP requests into a separate file, it would be hard to find the connection between an HTTP request and an error. So if an error occurred within an HTTP request, they strongly belong together.
When always logging the HTTP request and the error together, they both appear as one log entry in the JSON log file. If you then want to send your logs to a centralised place (for example to ES/Kibana) to analyse your errors, it is very easy to read, build statistics, and send mails based on this JSON data.
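For illustration only (the exact field names are not final), such a combined entry could look like the following, using bunyan's numeric levels (50 = error):

```json
{
  "name": "Log",
  "time": "2016-10-07T12:00:00.000Z",
  "level": 50,
  "req": {
    "method": "GET",
    "url": "/ghost/api/v0.1/posts/"
  },
  "res": {
    "statusCode": 500
  },
  "err": {
    "message": "Something went wrong.",
    "stack": "..."
  }
}
```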
Best practice would be /var/log/ghost. But what if somebody runs two processes of Ghost on one machine?
We could add a unique folder prefix. How about domain+port? Would look like /var/log/ghost/my-domain:8080/.
Any concerns about permissions? I don't have any.
I would like to publish the GhostLogger to npm when it's ready, so people can install it and inherit from it.
I think people are interested in adding custom streams (see https://www.npmjs.com/search?q=bunyan), or they might want to override the whole set of existing logging functions - also fine.
setStreams
This function reads the logging config and, based on it, creates a stream for file and a stream for stdout.
People could override this function and set their own streams.
I would suggest having a smaller function like setStream, which they can call.
This takes a couple of required parameters, so they can just call setStream and add another stream or override a default one.
So people could still use the stdout stream, but can override the file stream.
setSerializers
Ghost will send three parameters by default: req, res, err. This can't be changed.
So it makes sense to only have one serialiser for now? I don't see a reason to override the serialisers?
By default we have configured 3 rotated files each 1 week. This was just a default configuration to start with.
I think it's fine to keep the last 10 weeks for example. And people can increase that value per config. What do you think?
Right now we have a default logging config, which is:

production:

"logging": {
    "level": "info",
    "rotation": true,
    "transports": ["file"]
}

development:

"logging": {
    "level": "info",
    "rotation": false,
    "transports": ["stdout"]
}
There is, for example, no level per stream/transport. And I don't see any reason why we would need that?
In a real scenario, you have the file stream disabled and the stdout stream enabled in development mode.
In a real scenario, you have the stdout stream disabled and the file stream enabled in production mode.
You would never enable both. Any opinions?
We log into the file with pure JSON. But that doesn't look pretty when just printing JSON to stdout in development mode.


This is basically a fork of node-bunyan-prettystream; the way we output was just rewritten within 20 minutes.
We can extend this pretty streamer with a short mode, to avoid huge output if you don't want to see it.
We can make this implementation abstract and publish it. Right now it strongly depends on the Ghost structure, which is req, res, err.
I've searched a lot of times and never found a nice npm pretty streamer. But let me introduce the ones I found:
I've tried that one in the past and I was never satisfied. See here:

First of all: not well maintained. There are a lot of forks; people just adapt what they want and that's it.
It's not bad, but has some ugly unhandled cases like: req.params=[Object].
Not good support for req/err/res output.
It doesn't have a good code design. No granular options. Hard to integrate!
Always throws an Error for me. And it always expects a string when calling the loggers, which is weird.
One thing to notice: we can replace the stdout pretty streamer at any time, it's one single line and if people don't like it, they can create a custom logger adapter and only override the stdout streamer.
If you find something you like, please suggest!!! I am happy about it!
Location of the log files
Best practice would be /var/log/ghost. But what if somebody runs two processes of Ghost on one machine?
We could add a unique folder prefix. How about domain+port? Would look like /var/log/ghost/my-domain:8080/.
Any concerns about permissions? I don't have any.
Perhaps change the colon : into an underscore or something? Or maybe just use the domain itself. Having the colon in the log file path could break a few things on different OSes (Windows machines, for example).
Another TODO I wanted to log here (maybe we need to split these into must-haves and nice-to-haves) has to do with grouping errors - it's not really related to logging but to error handling, but this is kind of the catch-all issue for that stuff now.
One of the fiddly bits we have at the moment, is the idea that we might return multiple Validation errors. That's a cool idea and super useful for returning nice API responses to our clients. However, it's a PITA from an implementation perspective.
Also, Bluebird now throws warnings if you try to call Promise.reject() with a non-error, e.g. if you call it with an array of errors.
With themes, we wrap multiple errors inside a ThemeValidationError. For validation errors, I think it might be a good idea to wrap multiple errors inside an AggregateValidationError, or to support multiple errors inside of one ValidationError. In short, let's always pass 1 error around, even if it contains many errors.
I think this will simplify the code, whilst letting us still return super smart JSON error responses via the API.
Note: I'm just recording this here quickly, I know I also need to respond to items in this issue.
to support multiple errors inside of one ValidationError? In short, let's always pass 1 error around, even if it contains many errors.
Yes, I like that. Each error can have a set of errors. Something like:
- message
- errors [{ message: ...}, {message: ...}]
Hi @kirrg001 โ great work on this!
You asked if anybody had concerns about the location of the log files and permissions, and I do. I am in the process of attempting to get Ghost to run on now (an immutable server) so attempting to write to file locally would error. Heroku (an ephemeral server) could lose locally written files. This means a storage adapter solution that allows removal of all local writes is important. See https://gist.github.com/colinmeinke/978812f69bc9ea83390b231f8889318e for more detail about the problem.
I appreciate files have to be written locally by default. I think /content/logs/... would be a natural choice. Keeping the content directory as the only local directory Ghost writes to, and then having a storage adapter system that could abstract away each subdirectory to external servers would be ideal. This also solves the problem of multiple Ghost instances on the same server.
@colinmeinke Hey!
This means a storage adapter solution that allows removal of all local writes is important. See https://gist.github.com/colinmeinke/978812f69bc9ea83390b231f8889318e for more detail about the problem.
I see. That's why we offer the ability to add a custom logging adapter, which inherits from the default logging adapter. The custom logging adapter can define or override a stream where to send logs to. https://www.npmjs.com/search?q=bunyan+stream
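For illustration, a minimal bunyan-compatible "raw" stream is just an object with a write function that receives the structured log record; a custom adapter can forward records anywhere. Here sendToRemote and the in-memory buffer are placeholders for a real transport (e.g. an HTTP/GELF endpoint):

```javascript
var buffered = [];

// Placeholder transport: a real adapter would send this to a log
// collector instead of buffering in memory.
function sendToRemote(line) {
    buffered.push(line);
}

// A bunyan raw stream: write() receives the full log record as an object.
var remoteStream = {
    write: function (record) {
        sendToRemote(JSON.stringify(record));
    }
};
```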
/content/logs/... would be a natural choice
Do you have any references? I am aware of /var/log as the natural choice, but would like to see some references.
@kirrg001 sorry, I meant the local ghost content directory, not at the root of the file system.
sorry, I meant the local ghost content directory, not at the root of the file system.
Ah ok - sounds like a good idea
Tested logging on Windows 10 with gitbash.
Log into file: worked
Log into stdout: worked
I just created one small issue to add the log folder content/logs and add a README.md or .gitkeep, because it's not a nice experience to start Ghost in production and get an error that the logs folder does not exist.
Added an issue to Ignition to extract the log-request middleware: see https://github.com/TryGhost/Ignition/issues/14
Will close this issue now. All tasks are completed. Thanks for all comments.
We are happy about any feedback. Just create a new issue if you find a bug or join our slack channel if you have ideas or improvements to share.