Julia: Basic logging in Base

Created on 9 Oct 2015  Â·  43Comments  Â·  Source: JuliaLang/julia

As @StefanKarpinski noted here, it would be nice to have (very basic) logging as part of Base, which logs messages to output (STDOUT or other). This could then be extended by packages to add more complicated logging features (syslog, etc.).

Some ideas:

  • Use/add standard log level functions (info, warn, error, debug, critical) to Base, perhaps wrapped in a module (Logging.info, Logging.debug, etc.)
  • Define a new interface with a single log function (Logging.log(DEBUG, "some debug message"); Logging.log(INFO, "Here's some info"), etc.)

Thoughts? If we go with the first idea above, I would propose to move Logging.jl to the JuliaLang organization as a start.

For additional reference and ideas, current logging packages include my Logging.jl and Lumberjack.jl.

Most helpful comment

Ok, I've got a WIP julep over at https://github.com/JuliaLang/Juleps/pull/30 :-)

All 43 comments

seems sufficient to me to keep this as a package. no need to make #5155 harder

Forgot to mention: one challenge right now is that, since info and warn (and error) are already defined in Base, overriding them in a package without warnings.

(Previously, only error was exported from Base, and Logging.jl got around this by using err for the error function.)

Which does leave the second idea above as an option for packages.

@vtjnash, to clarify option1 above: why not at least move info, warn, and error to, e.g., Base.Log. These could then be called as Log.info, Log.warn, etc., and would free up those names to be used in packages.

This would possibly be a breaking change (in a good way) by not having error throw an exception (after a suitable deprecation period).

I quite agree, all these dances around extending logging functions in Base are miserable. I like the idea of Log.<level> functions in Base which gives a clear message about logging infrastructure. It would be nice to define some AbstractLogger type and default console logger (current implementation of info and warn would suffice). Certainly, deprecate info and warn in favour of Log interface.

I feel if anything deserves to be "blessed" as official, it's a logging system so that you could eg set up in your main script that logs of a certain severity should save to a certain directory and trust that all packages will respect that. Whether that means this logging module should be in Base or just that community coalesce around one logging module or API that shares configuration data, I don't have thoughts on.

I've run into the same issues in the past when implementing my own
logging. I would love for there to be a decent option in Base, and I would
also love to be able to swap it out for a package implementation without
generating warnings/errors.

On Fri, Oct 9, 2015 at 4:23 PM, Jon Malmaud [email protected]
wrote:

I feel if anything deserves to be blessed as official, it's a logging
system so that you could eg set up in your main script that logs of a
certain severity should save to a certain directory and trust that all
packages will respect that. Whether that means this logging module should
be in Base or just that community coalesce around one logging module or API
that shares configuration data, I don't have thoughts on.

—
Reply to this email directly or view it on GitHub
https://github.com/JuliaLang/julia/issues/13515#issuecomment-146978055.

seems sufficient to me to keep this as a package. no need to make #5155 harder

I emphatically disagree with this. We need to provide the basic user-side logging API in Base because every package should use that API. By default it should just print logging messages, but it should be possible to load different logging backend packages that do different, more advanced things with the log messages.

I like the info, warn, error, debug, critical approach better because it's natural to use even when writing code that isn't "enterprisey", but allows you to collect messages and log them consistently as soon as you decide you want to use a real logging system – and it will work with everyone's code. The alternative seems to be to have to write Log.error to throw an error, which seems terrible.

it seems confusing to me to have so many functions that do the same thing (presumably they should be implemented by dispatching to a single function "log" rather than duplicating code), and yet the meaning of Log.error is different from Base.error === Base.throw.

and it will work with everyone's code

I don't see this as a real argument. Enterprisey logging should be source and domain aware so that appropriate filtering can be done (e.g. log(DEBUG, (Module, :DOMAIN), "some debug message")). However, at that point, I see no intrinsic value in ensuring that there is only one such system.

it also can't address the fact that many C library wrappers and existing packages will never switch, so the "one true solution" is not attainable either.

Certainly, deprecate info and warn in favour of Log interface

sure, it sounds reasonable to stop exporting them from base. it looks like Pkg.jl / libgit2.jl is the only user of warn and info that isn't just interactive messages intended for stdout, so perhaps it should just provide a set_info and set_warn function of it's own so that when the user loads a "real" logging package, Pkg will make use of that.

Are "enterprisey" packages supposed to not use any non-enterprisey packages? What makes something "enterprisey"? Does it have something to do with XML? The point is that when you need to start logging what's happening with your application, you want to log events in packages that you use, so they should all use a common logging infrastructure – which should therefore be provided by Base. We should come up with a lightweight but usable logging API and then encourage people to use it. We can automatically insert things like file name, line number and module, so making that an argument to logging seems silly.

The main argument for warn("something happened") versus log(WARNING, "something happened) is that the former is a lot shorter and less of an eyesore. The "avoid code duplication" argument doesn't make sense since warn("something happened") can just be defined as log(WARNING, "something happened").

I cooked up some general logging interface using a dependency injection and call function, https://gist.github.com/wildart/fe2ea28228f9935db38d.

that the former is a lot shorter and less of an eyesore

what, by 4 characters? we print WARNING in all caps, why shouldn't that be the input form also?

@wildart let's not go crazy with the code generation and number of alternative ways to call into this. an enum (or other equivalent value object) should suffice for the level. it shouldn't allow passing a number instead of a category. and it really doesn't need any of this indirection via type parameters. having log be a value instead of a (const) function circumvents the benefits that stefan claims this would have for adding to base, since every library would still have it's own log object, without gaining any centralization or commonality.

I hate that we print "warning" in all caps.

But are you not warned?

@vtjnash I agree that it was far fetched to have a non-enum level but I would like to see a a user defined levels and enums are not extensible.
My bigger concern is a use of functions for logging, the calling convention breaks when you try to use more then one logger warn(msg...) vs warn(logger, msg...). A suggested object-based interface comes with common calling interface and give a variety of loggers without breaking a calling convention, e.g. log(INFO, msg...) and logger(INFO, msg...).
As for Base default logging, aliases (in a manner of current warn, info & etc.) can be provided to the default system logger object with an ability to inject a custom implementation. However, it is a bad practice to change a default system behavior, no matter how convenient is its usage (which all logging packages try - hijack a error call). So, it is reasonable to deprecate old logging interface in favor of a new one which would not use verb functions.

I hate that we print "warning" in all caps.

yes, i do seem to recall that was part of your commit message that made all WARNINGS uppercase ffe157ed8bbf9f490fd7b71cfea8e424e618ae99 :)

user-defined levels would be possible essentially by using a reimplementation of enums as an open set. however, i suspect this is not a good feature to have. it's more complicated to implement, reduces code commonality if used, makes it harder for a client application to design an interface and filter for all cases, and makes it harder to discover/enumerate the possible error levels.

log(ERROR) and error() should be separate domains. the first indicates an error occurred that the program handled (but which the user may be interested in later), the second is a directive to the program to break execution due to an exceptional condition, which may or may not get handled. (although that handling may be to call log(ERROR) at a higher level).

log(ERROR) and error() should be separate domains. the first indicates an error occurred that the program handled (but which the user may be interested in later), the second is a directive to the program to break execution due to an exceptional condition, which may or may not get handled. (although that handling may be to call log(ERROR) at a higher level).

That's certainly the way it works right now. I had proposed above that we move entirely to using throw(), and deprecate the functionality of error which breaks program throw (single mechanisms for doing things and all that).

(Although, it is true that throw(SomeException("Some message")) is a bit verbose. Perhaps throw("message") could be made equivalent to error("message"), and we deprecate error...)

log(DEBUG, (Module, :DOMAIN), "some debug message")

Can't this be written as @debug("some debug message"), and a suitable macro definition would figure out the module etc.?

what, by 4 characters? we print WARNING in all caps, why shouldn't that be the input form also?

Because 99% of people don't care at all about levels and don't want to spend one microsecond learning about them. To be used by them, your API has to be just like println because it is the de facto standard for the lazy people; only the name of the function/macro is allowed to vary. The user will try debug("hello world") and check that "hello world" is prominently displayed in the output; the rest of the output will be utterly ignored and thus is irrelevant. If that function call fails because of a non-standard calling convention (i.e. "not println"), then they will use println. At least that's how it is when you're starting out, which is when the choice of what to use is typically made and then inertia cements it there. When people start to care about the extra bells and whistles of a logging system, it's already too late.

Python is an example of a failure here: many people use print for debugging. Success cases would be e.g. NSLog() of Apple's flavor of Objective C and console.log() of JavaScript. Both of these are drop-in replacements for print that work even for people who just want to say console.log("hello world"), and one does not need to fiddle with imports or logging settings to make it work. In Python, you have to do from logging import *; debug("hello world"), and even then it will print nothing. That's too complicated for a "Hello World".

The most effective way would be to make debug("foo") easy and then require a file argument for println so you'd have to type println(STDOUT, "foo") if you really aren't using it for debugging :)

@RauliRuohonen gets it.

+1 for deprecating error in its current form (then throw is the new
error), and eventually replacing it with error(x...) = log(ERROR, x...)

+1 for debug, info, warn, error, etc becoming logging functions
with STDOUT as the default. Then if someone wants to swap out those
methods for something with more bells and whistles, they can do that
without breaking key components of control flow.

I actually don't care if the Base versions are anything more than printing
to an IO, but I'd like to be able to re-use the verbs with more complex
loggers.

On Tue, Oct 27, 2015 at 10:16 AM, Stefan Karpinski <[email protected]

wrote:

@RauliRuohonen https://github.com/RauliRuohonen gets it.

—
Reply to this email directly or view it on GitHub
https://github.com/JuliaLang/julia/issues/13515#issuecomment-151509646.

After much though, I see only one solution - move current logging functions into separate module. That is the only way to reuse verbs.

I've started some work on a minimal logging frontend API which I think addresses most of the points raised here: https://github.com/c42f/MicroLogging.jl. We need some good logging verbs in Base fairly soon, there's already at least four different somewhat inconsistent logging libraries :-/

Design overview:

  • Dirt simple logging verbs (@ChrisRackauckas makes some good points)
  • Very efficient early bailout before log message construction
  • Redirect logs from all modules to a custom handler with a line of code
  • Automatic logger hierarchy, mirroring the module hierarchy. Also supports custom logger contexts if necessary.
  • No XML

A tiny taste:

julia> x = 42
julia> @info "My value is $x"
INFO:  [Main:REPL:109]: My value is 42

I think we can call this closed by https://github.com/JuliaLang/julia/pull/16213

Ugh, how did I miss that? Thanks @tkelman.

Regardless though, I don't think #16213 addresses all the problems with the current logging backend in 0.5, and in fact introduces some new problems to the user-visible API. As I see it:

  • The extra arguments to info(), for example prefix, conflate the act of creating a log message with the act of formatting it into a string. Log records are not strings; they're a user-defined message string with a bunch of associated metadata (log level, context(module), source location, backtrace, etc). Memento.jl gets this right.
  • Any function-based frontend can't naturally support zero cost logging for messages which are expensive to produce, but filtered out. This is a big problem if you want to leave the ability to do debug logging in a production system, to turn on only in an emergency. (Yes, you can use a closure to postpone the work of creating the message, but a do based interface is not pretty.)
  • A log stream is not a character stream; it's a stream of log records. Thus, an IO is not a good abstraction for a log handler. Just try parsing a plain text log file where multiline log messages are mixed in with the metadata, and you'll see what I mean. Structure is lost and you end up using a bunch of nasty regexes to rebuild information you threw away.

@colinfang @Rory-Finnegan - you guys have had a recent go at this stuff in packages, and I'd like to hear your thoughts. What I'd really like is a logging API in Base which feels just as easy to the casual user as println(), but allows an industrial strength logging system to be plugged in for formatting and redirecting logs, if so required.

I should add that I'm grateful for the work done in #16213, I just don't think it hits the API sweet spot yet.

What I'd really like is a logging API in Base which feels just as easy to the casual user as println(), but allows an industrial strength logging system to be plugged in for formatting and redirecting logs, if so required.

Would love to have this in 1.0. If you have thoughts on how to accomplish such a design it would be excellent to have. Since effective logging often needs to be spread through libraries that one uses, the ideal is to have a single logging API that all packages can use, and which the user can then connect to whatever logging backend they want (or none if they just don't care).

Log records are not strings; they're a user-defined message string with a bunch of associated metadata (log level, context(module), source location, backtrace, etc)

I strongly agree with this (as you may have guessed), but I'm not sure what the best way to represent those records is. I see log records sitting somewhere between a custom type (for dispatching) and a dict. Maybe folks have a clearer perspective of how records should be represented and stored?

Any function-based frontend can't naturally support zero cost logging for messages which are expensive to produce, but filtered out. This is a big problem if you want to leave the ability to do debug logging in a production system, to turn on only in an emergency. (Yes, you can use a closure to postpone the work of creating the message, but a do based interface is not pretty.)

Assuming I'm understanding what you mean... I don't really feel like this is a big issue as there are several ways you could address this. For example, Memento.jl only creates a log record and logs it if the level is >= to the logger level. Now I need to do that check for each parent logger in the hierarchy, but I don't think that's overly expensive in most cases. Alternatively, we could probably use macros to turn those calls into no-ops, but that may get complicated with the hierarchical logging functionality that some folks want (myself included).

A log stream is not a character stream; it's a stream of log records. Thus, an IO is not a good abstraction for a log handler.

I strongly agree with this. I do like the simplicity of logging to an IO type, but when you get into sending http requests to a logging service that design starts to break down. I'd be fine with some kind of AbstractLogStream type with an IO like interface though. This would also make providing utility methods that operate on streams of logs easier (e.g., log batching or filtering for http requests).

Finally, I'm not sure I agree that the logging API needs to be in base julia by 0.1, but it would be nice if we could consolidate our requirements and preferences into a single package, maintained by multiple members of the julia community (I realize Memento.jl didn't help with that objective).

@StefanKarpinski Great, I'll keep working on MicroLogging.jl with an eye to getting some part of it into Base after 0.6. Hmm, in retrospect that name is confusingly similar to @colinfang's MiniLogging.jl, but it's just a prototyping area at this stage.

Maybe folks have a clearer perspective of how records should be represented and stored?

@Rory-Finnegan What I'm mainly interested in is how the user-visible API should look and behave, so that the package ecosystem is encouraged to use something consistent. It's this API consistency for users who "don't care about logging" which is important for julia-1.0; the actual logger backend details are a secondary concern. I'm not sure, but it's possible that the details of a record type could be left to any logger package which plugs into a minimal frontend provided by Base. A custom record type makes sense so we can at least do things like overload show.

For example, Memento.jl only creates a log record and logs it if the level is >= to the logger level. Now I need to do that check for each parent logger in the hierarchy, but I don't think that's overly expensive in most cases.

The issue here is that the the client side (that is, the module doing the logging) may have expensive work to do before being able to log the message. For example, suppose I've got a largeish matrix A, and I want to know the condition number for debugging. This is a moderately expensive operation and should only be done if debugging is turned on. With a function call interface, this is hard to achieve without postponing the work by passing a closure. With a macro, we can have

@debug "Condition of A is $(cond(A))"

turn into

if shouldlog(module_logger, Debug)
    logmsg(module_logger, "Condition of A is $(cond(A))")
end

which solves the problem. It also seems to solve the problem of the work required to look up (and optionally autocreate) a per-module logger, as this can be done at compile time.

I realize Memento.jl didn't help with that objective

Hah :) and neither did myself creating MicroLogging.jl, though keeping in mind that this is a prototyping area for the API.

FemtoLogging.jl?

I don't want to distract too much from the conversation here, but I will point to my feeble attempts at logging/debugging in the new HTTP.jl package. The logging is way, way trivial, basically just printing a statement to an IO if the user said verbose=true on the function call, but the debug is more useful. It allows the setting of a module-wide compile time constant and if false, the @debug macro expands to nothing. Otherwise, it prints out the filename + line number + a @show-like printing of the debug string or expression.

Anyway, just count me in as interested in seeing what develops, even if I can't provide that much actually useful code. I will say the approach of having Base define an extensible, but universal user-facing interface, w/ packages able to define their own custom LogRecord type to capture more and actually log the info differently sounds appealing. Cheers!

I'd call it ZeptoLogging.jl, but then I'd have to limit the source code to one byte ;-)

To me the really hard part is that you want logging to have negligible cost when you're not using it, yet ideally be able to turn it on dynamically to see what's happening in an already-running production process. Otherwise, it can be done with macros which compile to emit no code when logging is off. The only solution I can think of that might work for zero-cost is something like patch points.

@quinnj I think something along those lines (@verbosedebug? @staticdebug?) is a great tool for package authors when debugging their package implementation. It's not much use for anyone else though - as Stefan says, you want to be able to enable debug logging dynamically when something goes wrong in production.

@StefanKarpinski patch points are super cool, but perhaps somewhat overkill for this use case. I think I've already got a rough solution for per-module log level testing which can get the cost of a non-logged message down to an integer load, comparison, and branch to skip the logging . The idea is that you look up the module logger in lexical scope (via current_module()) at macro expansion time, and hardcode the resulting reference into the AST. The test shouldlog(logger, msglevel) = logger.level <= msglevel then determines whether logger should log at msglevel level. Looking up logger.level might still be a double indirection at the moment, I'll have to check.

To me the really hard part is that you want logging to have negligible cost when you're not using it

I'm pretty sure you also want negligible cost when you are using it, otherwise the logs will probably be too big to be useful.

Would be nice for this to generalize the DEBUG_LOADING pattern here (https://github.com/JuliaLang/julia/blob/021e11e701297c75e0a68d4379b9ac8ebdb7ca47/base/loading.jl#L241-L244), and probably also the depwarn command line flag. For other prior work, I'll also point to Gtk (https://developer.gnome.org/gtk3/stable/gtk-running.html#GTK-Debug-Options) and LLVM/Clang (http://llvm.org/docs/ProgrammersManual.html#the-debug-macro-and-debug-option). Common patterns here I would point out here is that:

  • they aren't structured logs, it's a stream of text notifications
  • can be filtered by sender
  • they aren't really intended for continuous logging, warnings, or errors

For general logging, Glib provides a more structured API for connecting to the system log: https://developer.gnome.org/glib/stable/glib-Message-Logging.html. There's some overlap (they can also be filtered by sender), but seems to be a mostly different use case / different implementation?

If you're logging so often that performance of logging becomes an issue then I think you need to be doing some kind of lightweight statistical logging like statsd clients do.

Nice, the DEBUG_LOADING pattern is exactly the kind of thing I'm trying to cater for with custom logging contexts (ie, log contexts not tied to the encompassing module). Currently this could be handled in the prototype using something like:

module Test

using MicroLogging
import MicroLogging: Logger, get_logger, Info, Debug
# logger setup, and add to the hierarchy with current module as parent.
const load_logger = Logger(:JL_DEBUG_LOADING, get_logger(current_module()))
configure_logging(load_logger, level=haskey(ENV, "JULIA_DEBUG_LOADING") ? Debug : Info)

function usage()
    @debug load_logger "Rejecting cache file due to it containing an invalid cache header"

    @warn load_logger "Module Foo with uuid cc4d520e-f2d7-422d-845f-04fec687968a is missing from the cache."
end

end


Test.usage()

I guess depwarn would be similar, though complicated by the fact that some of the C code is also involved in depwarn.

If you're logging so often that performance of logging becomes an issue then I think you need to be doing some kind of lightweight statistical logging like statsd clients do.

This makes a lot of sense.

There's a separate reason to have low cost on the "invisible logs" path: it's not that you want to turn those logs on, but when using third party packages it would be nice if the logs you're not seeing have negligible cost. So it would be nice to make design choices such that writing the slow version is more work for the user.

Having a requirements list for logging infrastructure would be a great first step. Once we can get agreement on that, we can figure out what the design needs to look like.

Would it make sense to put those requirements and proposed solutions into a Julep?

Yes, absolutely. That's a great start, IMO.

I've started reorganizing my thoughts into a julep which I'll put up as soon as I think it's presentable.

Ok, I've got a WIP julep over at https://github.com/JuliaLang/Juleps/pull/30 :-)

24490

Was this page helpful?
0 / 5 - 0 ratings