On Discourse a workaround for this feature was provided, but it is not future proof, uses type piracy, and cannot be used well in any module. See: https://discourse.julialang.org/t/creating-a-custom-log-message-categorie/29295/11

My feature request: add a function to the `Logging` module:

```julia
register_loglevel(<LogLevel>, <msg>)
```
Example usage:

```julia
using Logging

const Trace = LogLevel(500)
register_loglevel(Trace, "Trace")

@trace "A trace message..."
```
This function should do two things:
a. make a macro available with the name of the new LogLevel
b. extend the printing of log messages to use the new, registered name
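As a rough illustration of how such a `register_loglevel` function and a per-level macro could be built on top of the existing `@logmsg` core macro (the function and the registry here are hypothetical sketches, not an existing API):

```julia
using Logging

# Hypothetical registry mapping custom levels to display names.
const LEVEL_NAMES = Dict{LogLevel,String}()

register_loglevel(level::LogLevel, name::String) = (LEVEL_NAMES[level] = name)

const Trace = LogLevel(500)
register_loglevel(Trace, "Trace")

# A per-level macro is thin sugar over the existing `@logmsg` core macro.
macro trace(msg)
    :(@logmsg Trace $(esc(msg)))
end
```

A real implementation would also need the pretty-printing side, which is what motivates the struct change discussed below.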
Two questions:
I don't think much of a change is needed, actually 🤔
Changing the `LogLevel` struct from

```julia
struct LogLevel
    level::Int32
end
```

to

```julia
struct LogLevel
    level::Int32
    name::String
end
```

would allow the `show` function to print the user-defined name instead of the default `LogLevel(level)`.
As for the macro, `@logmsg Trace msg` seems sufficient to me, as it would also print the correct name with the change to the struct and be compatible with other custom loggers. I'm not sure macro generation is a good idea, since the core macro already exists.

The real question is whether it should be incorporated into a general logging rework. I think there are already a bunch of things planned in that direction?
Would that be a breaking change or could it be added to Julia 1.4?
I'm not a core developer, but I think changing the printing as I proposed, with a fallback constructor `LogLevel(level) = LogLevel(level, "LogLevel($level)")`, shouldn't be breaking: everything visible from the outside should still work as it used to.
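A minimal sketch of what this non-breaking change could look like (standalone here, with the struct renamed `MyLogLevel` so it doesn't clash with the real `Logging.LogLevel`):

```julia
# Sketch of the proposed struct change, with the fallback constructor that
# keeps the existing one-argument `LogLevel(level)` API working unchanged.
struct MyLogLevel
    level::Int32
    name::String
end

MyLogLevel(level) = MyLogLevel(level, "LogLevel($level)")

# `show` can now print the user-defined name instead of `LogLevel(level)`.
Base.show(io::IO, l::MyLogLevel) = print(io, l.name)

const Trace = MyLogLevel(500, "Trace")
```

Existing code that only ever constructs `MyLogLevel(level)` sees the old printed form, so nothing visible from the outside changes.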
Now we just need a pull request...
Good feature request! I agree with @Seelengrab's assessment: there are some relatively minor, fairly backward compatible changes which could make this work better, and having a macro frontend is orthogonal to the internal changes required to `CoreLogging`.

I'm not entirely sure that changing `LogLevel` itself is the right direction; rather, I think we could allow custom user-defined types as log levels.
> The real question is if it should be incorporated into a general logging rework
Yes, I do have some thoughts about this. In particular it would be nice to resolve the following design problems:
- For example, two libraries might each define a level named `Trace`: one might mean "really verbose tracing with importance `Debug < Trace < Info`", another might mean "high level tracing info with importance `Info < Trace < Warn`". In general the logging frontend will never be able to guess the semantics of custom log levels, so we will need some functions defined for the user's custom log levels to tell the frontend what it needs to know.

This is the central design question; here is my current answer. I think log levels conflate two different concepts:

- importance: log levels can be compared with `<` and filtered based on their importance;
- category: users may want to filter on the `LogLevel` type independently of the importance level.

To really solve the design issues above we need to tease apart these concepts in a coherent way.
A possible solution is to model log levels as user-defined types with a minimal interface, possibly a new generic function `Logging.importance`, and to use `Base.show` for printing. For example:
```julia
module Logging # stdlib (possibly imported from CoreLogging if required)

"""
An abstract type for log levels.
"""
abstract type AbstractLogLevel end

"""
    importance(log_level)

Return an `Int` defining the default importance level of `log_level`, as it will be seen by
the current logger. Logging backends may choose to upgrade or downgrade the importance
separately from the log level itself.

User-defined importance levels should be relative to the standard log levels, which are
defined to have `importance.([Debug, Info, Warn, Error]) == [-1000, 0, 1000, 2000]`.
"""
function importance end

end
```
Now, to use this, we'd do something like:

```julia
module A

using Logging

struct Trace <: AbstractLogLevel end
Logging.importance(t::Trace) = 500
Base.show(io::IO, t::Trace) = print(io, "Trace")

end
```
Note that with this design, a user's module `B` can do the exact same thing, and logging backends will be able to distinguish between the labels `A.Trace` and `B.Trace` by virtue of these being different user-defined types. This solves design point 3 above.
To solve design point 2, I think we should modify `CoreLogging` to call `importance` immediately (as a replacement for this call to convert), and to pass this to the logging backend as the "default importance level". In addition, we should also pass along the user-defined log level to the backend.

This allows a user-defined logging backend to decide "Oh, the author of `SomeVerboseLib.jl` has way too much logging; I'm going to downgrade all messages from that library by 2000 importance units", but, while downgrading this importance, not lose the original label of the message. Thus, you may have `Warn` messages coming from `SomeVerboseLib` which end up at importance -1000 and get filtered out.
@oxinabox I'm interested in your thoughts on this too.
For this I think we could supply some convenience functions in the stdlib `Logging` to generate the code for your own logging macro `@trace` (ie, expose a thing like `CoreLogging.logmsg_code` which can be called from user code). This would be exposed as a public API via the `Logging` stdlib. If we do this, it should be a separate PR.
Sounds promising! Having the ability to filter log messages independently of the severity level is sometimes important to me, for example if I am debugging different sections of a service.
I would like to make the argument for log levels being only importance. The categorical part is covered by the `group` argument (which is available at `shouldlog` time, whereas the log level is already available at `min_enabled_level` time, which is earlier; either way, both are available before the log message is computed).
cross-ref: https://github.com/oxinabox/LoggingExtras.jl/issues/13
If we stick to that, then we don't need to worry about changing the importance while also tracking the original label.
I already have an example in the LoggingExtras readme for upgrading log levels, and downgrading would be similar:
https://github.com/oxinabox/LoggingExtras.jl#raising-http-debug-level-errors-to-be-info-level
It could be trivially extended to also store the old level in a kwarg, if one wanted.
Most of the time you only want to filter by importance; that is why we have log-level filters. But one can filter on `group`, `module`, or `file` just fine, and even on content.
oxinabox wrote:

> I would like to make the argument for log levels being only importance. The categorical part is covered by the `group` argument.
I do not agree. LogLevels, from my point of view, should have a meaning AND an importance as distinct attributes. Categories are a way to group log messages; this is a separate feature that can be useful in different situations.
To give an example: You can group messages by function or by module.
At some point in time you are ONLY interested in Trace messages in the function XYZ. You could use a filter from LoggingExtras to filter the Trace message in the group XYZ.
The default level of a Trace message could be between Info and Warn, but it could be moved to between Info and Debug if required. But the level and the identity are distinct. Grouping by "Category" is an additional, useful feature.
Thanks Lyndon! Yes, there's a tension and potential overlap with `group`, and it would be good to resolve that. However, I think the very fact that we refer to log levels by name and not by numerical value is a sign that they carry important categorical information which is not just an ordinal importance level. I didn't call the four standard levels `@log0`, `@log1`, `@log2`, `@log3`; though perhaps in hindsight I should have!
Just thinking out loud here: one possible, speculative and rather breaking way to take your idea seriously would be to make the default value of `group` equal to `loggroup(level)`, and have `loggroup(Info) == :info`, etc. This is certainly in conflict with the current default value of `group`, which is essentially lexical (set to the base name of the originating file). But it would make a certain kind of sense.
Having said that, I feel like there might be multiple types of category simmering below the surface here and if we could only identify what they are we'd have a better idea of what to do.
For example:

- level referring to the intent of the callee (e.g. `@error` => the callee thought this was an error; `@debug` => the callee thought this is strictly optional information for inspecting internal state). I think we lose information if we modify the log record by overwriting the intent of the callee.
- `group` referring to some aspect of the software component emitting the message (like a more granular version of the module?). TBH I've always wished I had pinned down the semantics of `group` better; I felt like it's useful but left it largely unspecified whether it should refer to the software component or to some other aspect which cuts across software components.

> At some point in time you are ONLY interested in Trace messages in the function XYZ. You could use a filter from LoggingExtras to filter the Trace message in the group XYZ.
> The default level of a Trace message could be between Info and Warn, but it could be moved to between Info and Debug if required. But the level and the identity are distinct. Grouping by "Category" is an additional, useful feature.
Just to be clear: I am not saying you shouldn't be able to group by log level. I am saying that if you change the log level, i.e. the importance, and you are grouping by log level, then the grouping should change. The very fact that you chose somewhere to change that level (with a transformer logger) says, to me, that you mean to change that level.

But I am not 100% convinced I am right, so let's think about this closely. The really interesting case is when you do not control the full logging pipeline, and that, I guess, is what we should focus on when discussing this.
So consider: we have a top-level package `MyApp.jl` that we are developing, which has a direct dependency on `GitHub.jl`, which has a direct dependency on `HTTP.jl`. But `MyApp.jl` does not have a direct dependency on `HTTP.jl`.
So we imagine that `GitHub.jl` was doing something like:

```julia
function get_repos(user)
    # lower logging level for HTTP as we can recover from all the faults via standard fallback
    raw = with_logger(downgrade_logger(current_logger())) do
        HTTP.get(api_url; user=user)
    end
    is_valid(raw) || fallback_fetch_repos(raw, user)
    return JSON2.read(raw)
end

downgrade_logger(logger) = TransformerLogger(logger) do log
    new_level = log.level == Error ? Warn :
                log.level == Warn  ? Info : Debug
    return merge(log, (; level=new_level))
end
```
(Important aside: I'm not sure that universally dropping all levels like that is a realistic use case. More reasonable might be to drop just Info down to Debug and leave Warn and Error as is. The question is: would decreasing importance across the board ever be done?)
So now, as the developer of `MyApp`, we are going to say: _"Give me all log messages of Warn level"_, presumably because we think something relevant to us is going to trigger a warning.

_Do we want to see the warnings from HTTP.jl that GitHub.jl suppressed down to Info?_
I suspect that we don't: we do not know about HTTP, since we don't depend on it; we depend on GitHub.jl. And if it has suppressed a warning about something it in fact could not recover from, then that is a bug in that package. To get to that stage, we would be opening an issue on GitHub.jl saying: _"Seem to be getting invalid repos back, no warnings in the logs"_.
I think one gets to a very complicated place if one starts from the assumption that the libraries one is calling have bugs. If we were going to go deeper we would `] dev --local GitHub.jl`, and at that point we can change how it is messing with the logs.

Note that it is a feature of the logging system that any package (e.g. GitHub.jl) can do whatever it likes with the logs for any function it calls (e.g. the functions from HTTP.jl), including completely suppressing them, and there is nothing the top-level caller (e.g. MyApp.jl) can programmatically do about it. (But the developer of MyApp can totally open issues and PRs to GitHub.jl, or fork it, or use a different library, so this is no deal breaker for the logging system's design.)
Let's write some more user stories like this, to try to get more clarity on what we are trying to solve.

@oxinabox I love user stories! But they're only good when they come from the real user (I'm not a real user for this stuff right now), and are (at least initially) available in gory detail in the context of the real user's workflow. When you have a bunch of those real stories there's at least a chance of removing the problems entirely by design. Unfortunately it's actually really hard to gather real stories with enough detail. Users' time is precious and there is a tendency to simplify which often removes the most important parts.
Anyway, philosophy blah blah. I agree that composability of transitive logging (`LibA` calls `LibB` calls `LibC`) is a key difficulty, and we should have a set of standard filters which can be used by `LibB` but which allow `LibA` to retain ultimate control of message visibility. In some sense, intermediate filtering should generally be a "suggestion" rather than a hard filter. Modifying message importance by adding or subtracting an "importance offset" is an example of this.
But this is straying somewhat from the original feature request (albeit related in design space). @ufechner7 perhaps you could describe what you're trying to achieve in a bit more "gory detail"? What do you think "trace" would imply for your application? In what circumstances would you use trace rather than `@info` or `@debug`? How would you like to gather trace records and, more importantly, how would you use them? Would you look at them as a time series? Make a summary report? Or something else?
I have real user stories of running Memento in production, with on-call paging; not quite the same system, but some of the same problems. That is where the story about using groups, rather than severity, for alerts comes from.

Outside of prod: sometimes we do bump up the level of a dependency so its debug messages are not filtered out (not by transforming, but by adjusting filters).

Outside of Memento: sometimes we do, conditionally on an environment variable, change the current logger to TensorBoardLogger, and otherwise suppress all log messages because they are only interesting to TensorBoard. (Thinking about that now, we might want to switch to using a group at the call site.)
> But this is straying somewhat from the original feature request (albeit related in design space).

Yes, we can take this to another thread afterwards. I think the core notion that one should be able to add new log levels at different numbers is a very solid one.
> I have real user stories of running Memento in production, with on-call paging, not quite the same system but some of the same problems

That's excellent (and I'm happy to be distracted by hearing about those, in gory detail if possible :-) ), especially about debugging live distributed systems. I've dealt with a little of this in the past, but not enough that I think I have much insight into the problems faced by users of logging backends. I guess it would help if we had a `JuliaLang/Logging` repo as a place to discuss these things. I've been toying with this for a long while, but the fact that we still need hacks to get Pkg into the stdlib dampens my enthusiasm a bit.
> important aside: I'm not sure that universally dropping all levels, like that is a realistic use case. More reasonable might be just to drop Info down to debug and leave warn and error as is.

Yes! I think the fact that you want to treat `Warn` and `Error` differently from `Info` and `Debug` is a sign that they carry some categorical information which is not merely an importance level.
My current feeling is still that log level should be seen as a category which happens to come with a default ordinal importance.
In contrast, we could then consider `group` to be a category which somewhat aligns with software components (serving as an alternative to `module` when a component-wise grouping is required which doesn't exactly mirror the module structure). This is consistent with the default value for `group`, but maybe people want to use `group` in a different way in practice. Keen to hear about that if so.
> Yes! I think the fact that you want to treat Warn and Error differently from Info and Debug is a sign that they carry some categorical information which is not merely an importance level.
A fair point.
Does that mean one also wouldn't want to do `importance -= 1000`? Instead something like `importance -= importance < 10_000 ? importance : 0`, or maybe `importance -= 0.1importance`?
Good question. All of those do seem rather dissatisfying, TBH. Going with your nonlinear-transformation idea, perhaps it could be argued that it's the current linear spacing of default importance levels which is at fault, and we should have `debug => -100, info => 0, warn => 1000, error => 10_000`, or some such. (That is, bake the nonlinearity in by default, rather than having to invent weird heuristics in the logging backends.)
@ufechner7 I've prototyped my proposed solution on a branch (cjf/logging-custom-levels), but I'd really like to hear more about your use case before making a PR. Knowing more about your use case is really important! In fact, the reason that the system doesn't neatly support this stuff "out of the box" in julia-1.0 is really due to a lack of motivating use cases.
> In what circumstances would you use trace rather than `@info` or `@debug`?

My current use case for trace messages is a database backend.
The core module has about 1000 lines of code, with 68 debug, 22 trace, 21 info and two warn messages. The trace messages are used to verify the correctness of the code execution: I put them in a file and give them to a senior manager to check the correctness of the order of the execution steps. Info messages are more for myself; they are more detailed and I use them to understand what is going on and to compare different versions of the program. If I need to debug a section in even more detail I change debug statements into info messages, and after I have finished debugging that section I change them back. Sometimes I am interested in the execution-time report, then I change one statement from debug to info; sometimes not, then I change it back. In the end the code shall run on a cluster and there will be a central logging server for all services.

Does this answer your question?
Thanks, I've been thinking about this and doing a bit more prototyping.
It strikes me that `Base.depwarn` is also a good example of where a custom level could be considered worthwhile, especially because there seems to be a desire to have "soft depwarns" for Base which are compiled away completely in 1.x, but which could be enabled in the future, or enabled on demand during package development.
Hi! I'm coming from the Oceananigans.jl repo where I'm working on helping out with the project's logging.
I started work on a custom logger that focuses on formatting the way the logging macros print their output. There is a desire to define custom `LogLevel`s and associated macros so that the project can have tailored logging. The first of these desired macros is `@diagnostic`. My understanding is that this would be used to report on the health of a model as it is executed. There is also a desire to report specifically on when the project is performing setup of the environment (which I think would be where a `@setup` macro could come in).
> It strikes me that Base.depwarn is also a good example of where a custom level could be considered worthwhile. Especially because there seems to be a desire to have "soft depwarns" for Base which are compiled away completely in 1.x, but which could be enabled in the future, or enabled on demand during package development.
This can be done basically as well with groups (or some other orthogonal marker); better, I would argue, because making it a level requires an ordered importance. This actually sounds like an excellent example of something that should be a group. If it were a level then, to be disabled by default, it would have to sit somewhere below Info, so that Info could, as is standard, be preserved; but it is actually warning-level information, if you want it at all. So if I had raised my minimum level up to Warn because I didn't want the spam of so many info messages, I might still want depwarns. (Which I absolutely do do with Memento, all the time.)
Remember that depwarn is used by packages as well as Base. More generally, some people might want to ensure their code is free of deprecated functionality even if it is not imminently being removed. And one also might want to copy all depwarns to a file as a CI artifact, so one can generate a report of how a set of packages is doing.

I guess one could do all this by treating a depwarn level purely as a categorical concept. But then to what end are we setting this ordinal level? I feel like we have a nice ordinal log-level property and a nice categorical group property, and a pretty good job was done putting them in.
The fact that something can be said to have a purpose makes it seem more like a group. Then you could have different levels of severity. To use the `@diagnostic` example, which is _"used to report on the health of a model"_: that could easily be a macro where `@diagnostic LEVEL msg` becomes `@logmsg LEVEL msg _group=:diagnostic`. (LEVEL could default to `Info`.)
Then one could have:

```julia
@diagnostic Debug "0.2% Fitting Data Missing, skipping"
@diagnostic Info "5% Fitting Data Missing, Interpolating"
@diagnostic Warn "20% Fitting Data Missing, Interpolated but accuracy may be limited"
@diagnostic Error "Model did not converge to r_tol=0.1. Strongly recommend investigating rather than using the results"
```
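A sketch of how such a `@diagnostic` macro could be written on top of the existing `@logmsg` macro and its documented `_group` keyword (the `@diagnostic` macro itself is hypothetical):

```julia
using Logging

# Hypothetical macro: forwards to `@logmsg` with `_group=:diagnostic`.
# The level defaults to `Info` when only a message is given.
macro diagnostic(args...)
    level, msg = length(args) == 1 ? (Logging.Info, args[1]) : (args[1], args[2])
    :(@logmsg $(esc(level)) $(esc(msg)) _group=:diagnostic)
end
```

Any backend (or `EarlyFilteredLogger`) can then route on `group == :diagnostic` regardless of the severity chosen at the call site.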
I missed this one before:

> At some point in time you are ONLY interested in Trace messages in the function XYZ. You could use a filter from LoggingExtras to filter the Trace message in the group XYZ.

A solution is to make group a tuple, and make sure group-handling code can deal with it. One would basically do:
```julia
EarlyFilteredLogger(current_logger()) do logmsg
    (:trace in logmsg.group) && (:XYZ in logmsg.group)
end
```

or compose two of them:

```julia
xyz_logger(logger) = EarlyFilteredLogger(logger) do logmsg
    :XYZ in logmsg.group
end

trace_logger(logger) = EarlyFilteredLogger(logger) do logmsg
    :trace in logmsg.group
end

trace_logger(xyz_logger(current_logger()))
```
Look at all the logging systems that have been happy with ordinal levels only representing severity, not representing what or who it is for.

Sorry, I think I wandered off topic again. We absolutely should have the ability to add log levels, but any facilities we add for it should be all about being able to add things like a Notice between Info and Warn, a Critical above Error, and a Trace (which has a standard meaning different to @ufechner7's example) below Debug. People should be able to (IMO) abuse it for categories if they want, but we shouldn't spend too much time worrying about making that easy, or getting confused as to whether it represents category or severity.
@oxinabox I get that you would like log level and group to be orthogonal concepts which are ordinal and categorical respectively. But people using other logging libraries just don't describe log level this way! Every link you posted talks about both the categorical and the ordinal components of log levels. The most clearly written of those links is https://reflectoring.io/logging-levels/; they call importance "urgency" (another word I'd considered for the proposal above), but each level is described in terms of some categorical conditions. To quote:
- ERROR: Users are being affected without having a way to work around the issue.
- WARN: Something bad happened, but the application still has the chance to heal itself
- INFO: document state changes in the application or some entity within the application
- DEBUG: log any information that helps us identify what went wrong
- TRACE: all information that helps us to trace the processing of an incoming request through our application
To quote https://stackify.com/logging-levels-101:
Logging levels are just labels
Of course, just because other logging libraries conflate ordinal and categorical concepts doesn't mean we need to! But we need a concrete way forward from the current state of affairs. Currently we have:

- Levels named as categories: `@info` rather than `@logmsg 0 "asdf"`, `@warn` rather than `@logmsg 1000 "qwer"`, etc. If we name and describe levels as categories (with importance/urgency then derived from the level), it's inconsistent to say "People should be able to abuse it for categories if they want". It's not abuse if the library encourages categorical thinking in our naming scheme and documentation ;-)
- `group` set to `splitext(basename(__source__.file))[1]` by default, so it is currently a software-component-based category by convention. In hindsight this was always inconsistent with setting `group=:depwarn`, as is currently done by `@deprecate`, because these are different concepts. (Though your interesting idea of having `group` as a tuple could help there, perhaps.)

Coming from https://github.com/JunoLab/ProgressLogging.jl/issues/9, I'm interested in what kind of "category" is best for choosing a special kind of handler like ProgressLogging.jl and TensorBoardLogger.jl. For those applications, the desirable properties are:

1. certain log events should reach the special handler independently of the usual `LogLevel`-based filtering;
2. the category should unambiguously identify which handler the events are meant for (no collisions between packages);
3. emitting such events should be possible without importing the package that handles them.
(Those points may indicate that using the logging system for this is not the right direction. For example, you probably could build a similar system with a global `Channel` or a hook registry. I didn't create those packages, but my guess is that it's efficient to piggyback on the logging system because it has a special payload in `Task`. Or maybe because it's very handy (= point 3)? Anyway, I assume that we want to use the logging system in the comments below.)
For point 1, using a custom `level` type actually seems like a nice way to declare that certain log events do not interact with the usual `LogLevel` ordering. But ignoring `level` and checking `_group` in `shouldlog` may not be so bad.
For point 2, I don't think the `Symbol`-based `_group` is a good approach: `_group=:progress` may mean different things for ProgressLogging.jl and AnotherProgressLogging.jl. I think it'd work if arbitrary objects (maybe only, say, `isbits` ones?) were allowed for `_group`. A convention of using a tuple to fan out the log events sounds good, but if the intent of `_group` was to describe software components of the program being logged, it doesn't sound like a nice way to do it.
Achieving points 2 and 3 together is somewhat tricky: we do point 2 typically by dispatch, which requires a type to be imported, contradicting point 3. One solution may be to add a group namespaced by the package identity:

```julia
struct NamespacedLogGroup
    pkgname::Symbol
    pkguuid::UUID
    key::Symbol
end
```

This way, you can send an event to a package without importing it. Similar namespacing is possible with log level by bringing the package identity into the type domain:

```julia
struct NamespacedLogLevel{pkgname,pkguuid}
    level::Int32
end
```
Yes, `group` should be an iterable, and should be empty by default (since we already have `file`, no?)
The question of level as ordinal or categorical really boils down to: should you be able to put the "categorical" meaning of a level (i.e. debug, info, warn, etc.) out of sync with the severity (i.e. -1000, 0, 1000)? To which I say you probably should not. I am suggesting that we should not make it easy to have a Warning level deprioritized down to Debug level via numerics. It's fine to do it by transforming the level, but adding the ability to do it numerically seems to add confusion for too little gain.
> Yes, group should be an iterable, and should be empty by default (since we already have file no?)

That would be consistent and probably a helpful clarification. I've always been uncomfortable with how `group` is derived from `file`, which seems redundant and kind of useless in the current system. Further up the thread you argued that `group` should contain the categorical part of the level. There's some sense to that, but do you have a specific proposal for how that would work? It sounds like you've changed tack a bit here by proposing `group = ()` by default?
> Should you be able to put the "categorical" meaning of a level i.e. debug, info, warn etc, out of sync with the severity? ... To which I say you probably should not.

This is a good question. So your point of view is: "in a logging backend, one should transform category, not importance"? That does make sense (I'm not completely sold, but I think I see some merits to this).
Practically, how should we model custom levels in a way in which they can interoperate with arbitrary backends? One possible way would be to declare that custom levels correspond in some way to an `(importance::Int, label::Symbol)` pair; this could be literally true, or the pair could be extracted from a type per user-defined level and wrapped before passing to the backends. How exactly would the wrapping occur?
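One way the extraction could look, under the `importance`-function interface sketched earlier in the thread (`WrappedLevel` and `wrap` are placeholder names, not a proposed API):

```julia
# Sketch: extract an (importance, label) pair from a user-defined level type,
# so backends that only understand that representation can consume it.
abstract type AbstractLogLevel end

struct Trace <: AbstractLogLevel end
importance(::Trace) = 500

struct WrappedLevel
    importance::Int
    label::Symbol
end

# The label falls out of the type name; the importance from the interface function.
wrap(level::AbstractLogLevel) = WrappedLevel(importance(level), nameof(typeof(level)))
```

A backend receiving a `WrappedLevel` could filter on `importance` while still displaying (or grouping by) the original `label`.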
BTW, having a type per level visible to the logging frontend has interesting benefits, at least including the ability to define a level-based `shouldlog(::YourExtraVerboseLevel) = false` to completely remove all code related to certain levels by default. (People have long wanted the ability to completely compile out certain kinds of debug logging.)
> This is a good question. So your point of view is: "in a logging backend, one should transform category, not importance"? That does make sense (I'm not completely sold, but I think I see some merits to this).

To be fair, I am not completely sold either; I waver on it.
> Further up the thread you argued that group should contain the categorical part of the level. There's some sense to that, but do you have a specific proposal for how that would work? It sounds like you've changed tack a bit here with proposing group = () by default?

Yeah, my latest thinking is that it should be multi-categorical. Ideally it would be a frozen set, but we don't have an efficient type for that.
> Practically, how should we model custom levels in a way which they can interoperate with arbitrary backends? One possible way would be to declare that custom levels correspond in some way to an (importance::Int,label::Symbol) pair; this could be literally true, or extracted from a type per user-defined level and wrapped before passing to the backends. How exactly would the wrapping occur?

I think a singleton type per log level makes sense, with a `severity(lvl)` function defined and `isless` overloaded. Do we need the ability to instantiate them from symbols? If so we could define `parse(AbstractLogLevel, sym)` to check against `subtypes(AbstractLogLevel)`, I guess, or just do `eval(:($sym)())`.
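The singleton-type-per-level idea with `severity` and an `isless` overload could be sketched like this (`severity` and the level names here are illustrative placeholders):

```julia
abstract type AbstractLogLevel end

# Singleton types: one per level, carrying the label in the type itself.
struct DebugLevel  <: AbstractLogLevel end
struct NoticeLevel <: AbstractLogLevel end
struct ErrorLevel  <: AbstractLogLevel end

# Ordinal importance via a single interface function...
severity(::DebugLevel)  = -1000
severity(::NoticeLevel) = 500    # a custom level between Info and Warn
severity(::ErrorLevel)  = 2000

# ...and comparison defined once for all levels.
Base.isless(a::AbstractLogLevel, b::AbstractLogLevel) = severity(a) < severity(b)
```

Because each level is its own type, a backend can also dispatch on it directly, e.g. to drop one level entirely.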
> BTW, having a type per level visible to the logging frontend has interesting benefits; at least including the ability to define a new level-based shouldlog(::YourExtraVerboseLevel) = false to completely remove all code related to some given levels by default. (People have long wanted this ability to completely compile out certain kinds of debug logging.)

Constant propagation should get us this also, though I always feel more comfortable relying on type specialization than on constant propagation, and I think it is clearer.
> Practically, how should we model custom levels in a way which they can interoperate with arbitrary backends? One possible way would be to declare that custom levels correspond in some way to an `(importance::Int, label::Symbol)` pair

I can imagine defining `importanceof(::AbstractLogLevel)::Int32` and `groupof(::AbstractLogLevel)::Tuple` functions. The integer returned by `importanceof` would be used as the `level`, and the value returned by `groupof` would be `union`ed with `_group` before being passed to `shouldlog` and `handle_message`.
BTW, having a type per level visible to the logging frontend has interesting benefits; at least including the ability to define a new level-based
shouldlog(::YourExtraVerboseLevel) = false
to completely remove all code related to some given levels by default.
Can this be done out-of-the-box? As `fieldtype(Task, :logstate)` is `Any`, I suppose this won't happen automatically?
One possibility may be to use the trick `@sync` is using. Something like

```julia
const logstatevar = gensym("logstatevar")

# `@logmsg` etc. use `@current_logstate()` instead of `current_logstate()`:
macro current_logstate()
    var = esc(logstatevar)
    quote
        $(Expr(:isdefined, var)) ? $var : current_logstate()
    end
end

macro with_logstate(logstate, ex)
    quote
        let $logstatevar = $logstate
            $ex
        end
    end |> esc
end
```
then users can put `@logmsg` etc. behind a function barrier

```julia
function f()
    g(@current_logstate)
end

function g(logstate)
    @with_logstate logstate begin
        @debug "hello"  # may be compiled away if debug information is in `typeof(logstate)`.
    end
end
```
As this is rather annoying to write, I think it can be automated via a macro `@taskspecialize` used like

```julia
@taskspecialize function f()
    @debug "hello"
end
```

which is expanded to something like the above, where the name of the inner function `g` is `gensym`'ed (like how keyword arguments are handled).
The idea of including the log level in the group had not occurred to me. I am not sure how I feel about that. Why not also include the module in the group on that basis?
I was thinking of the extreme position where all categorical things go into `group`. The reason to prefer _not_ putting `module`, `id`, `file`, and `line` (hmm... is it categorical...?) in `group` may be to distinguish mandatory and optional categorical variables. It may be useful to assume that a level category may not exist, for supporting "raw" levels like `@logmsg -1 "hello"` (although it can be done by introducing a default level category). But I'm not sure if this is the right direction.
> I can imagine defining `importanceof(::AbstractLogLevel)::Int32` and `groupof(::AbstractLogLevel)::Tuple` functions. The integer returned by `importanceof` is used as the `level` and the returned value of `groupof` would be `union`ed with `_group` before being passed to `shouldlog` and `handle_message`.
Yes, this was exactly the kind of thing I had in mind; if we claim that `group` should be for storing an arbitrary set of "category labels", there is some sense to just doing a union of labels. It does, however, mean that log level loses its special status as the primary category. All that's left is the importance level as an integer and a set of labels which may have come from various places. I think this is not so good because the intent of the callee may be obscured.

More concretely, how should we display messages which have had their level deconstructed prior to the backend seeing them? If we just preserve the level that the user supplies, we can simply prefix the message with `show(level)` or some such.
> Can this be done out-of-the-box? As `fieldtype(Task, :logstate)` is `Any`, I suppose this won't happen automatically?
This is exactly the problem with the current `shouldlog` taking the logger. It seems fundamental to the design principle that no user of the frontend macros should ever mention the logger directly. It's possible to hack up local solutions (ie, one function deep) as you've described, but I don't find this terribly satisfying. My current thought is that we could allow `@info "blah" _logger=lg` for users who really need the utmost control and performance and are willing to pass their logger `lg` through a function barrier by hand. XRef https://github.com/JuliaLang/julia/pull/33121#issuecomment-526825404. This does run the "risk" that people will subvert the task-scoped design by passing loggers by hand as you do in other languages. Maybe no big deal, and it could be solved by good documentation.
What I'm proposing with `shouldlog(level)` is a restricted version which first checks whether the level itself is enabled (this is essentially a global setting for all tasks, so no good for making any dynamic decisions). Essentially we already have something rather like this in the current implementation, but it assumes that you first want to convert to `LogLevel` and compares to a global setting. I'm proposing that we generalize it by replacing the following line with `if shouldlog(level)`. This allows for levels which are truly off at compile time.
> More concretely, how should we display messages which have had their level deconstructed prior to the backend seeing them?
Ah, this is the difficulty I wasn't aware of. It makes me think `groupof(::AbstractLogLevel)` was perhaps wrong. If we try to extract a level category from `_group`, it's not clear what to do when `_group = (Info, Error, ...)`. (Maybe pick the one with the highest level?)

This makes me realize that `_group::Set` (or `::Tuple`) is not a good approach for encoding mutually exclusive categorical labels (i.e., a log cannot be of level `Info` and `Error` at the same time). So, maybe putting the categorical label in `level` is the way to go?
> What I'm proposing with `shouldlog(level)` is a restricted version which first checks whether the level itself is enabled (this is essentially a global setting for all tasks, so no good for making any dynamic decisions).
If the output of `shouldlog(level)` can only be determined by `typeof(level)`, a log with this level type can never be logged, right? So shouldn't it at least depend on (the type of) the logger? Then isn't it impossible to compile out `if shouldlog(logger, level) ...`?
Alternatively, I can imagine adding a module-local callable `__shouldlog__(level)` like `__init__()` so that the definition of `__shouldlog__` can be changed at run-time via `@eval`. But maybe that's too hacky and restrictive to be an official API?
> My current thought is that we could allow `@info "blah" _logger=lg` for users who really need the utmost control and performance and are willing to pass their logger `lg` through a function barrier by hand
If someone needs that they can just do:

```julia
with_logger(lg) do
    @info "blah"
end
```
> This makes me realize that `_group::Set` (or `::Tuple`) is not a good approach for encoding mutually exclusive categorical labels (i.e., a log cannot be of level `Info` and `Error` at the same time). So, maybe putting the categorical label in `level` is the way to go?
Which is the status quo. I think we should leave `group` for free-form user-defined categorical information. That is naturally additive, and if they do something weird then it won't break the logging system itself.
> > So, maybe putting the categorical label in `level` is the way to go?
>
> Which is the status quo.
Right. I meant that, if we can agree that the current design has been on the right track all along, we can be confident when we start exposing/documenting public API to make new levels easier. I also didn't realize that implementing a new level could be done already: https://discourse.julialang.org/t/creating-a-custom-log-message-categorie/29295/17
> if they do something weird then it won't break the logging system itself
This sounds like an important property to have.
> I also didn't realize that implementing a new level could be done already: https://discourse.julialang.org/t/creating-a-custom-log-message-categorie/29295/17
Can we summarize what remains to be done in this issue? If the answer is _only stuff relating to groups_, then we can close it and open a new one summarizing the state and proposals, called "Taking Log Groups seriously".
I think we need at least some documentation to mention that you need to implement at least

```julia
Base.isless(a::MyLogLevel, b::LogLevel) = isless(a.level, b.level)
Base.isless(a::LogLevel, b::MyLogLevel) = isless(a.level, b.level)
Base.convert(::Type{LogLevel}, level::MyLogLevel) = LogLevel(level.level)
```

for a custom log level. But they are rather redundant. It would be nice if it could be done by something like

```julia
Base.Integer(x::MyLogLevel) = x.level
```

We can also recommend implementing `Base.print` for adding custom log level names (instead of `show`).
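One way the three redundant definitions could be derived from a single `Integer` overload is via an abstract supertype. This is a hedged sketch; `CustomLogLevel` and `MyLogLevel` are names made up for illustration:

```julia
using Logging: LogLevel, Info, Warn

# Hypothetical abstract type that custom levels would subtype:
abstract type CustomLogLevel end

struct MyLogLevel <: CustomLogLevel
    level::Int32
end

# The one definition a user writes per custom level:
Base.Integer(x::MyLogLevel) = x.level

# The comparisons and conversion can then be provided once, generically:
Base.isless(a::CustomLogLevel, b::LogLevel) = isless(Integer(a), b.level)
Base.isless(a::LogLevel, b::CustomLogLevel) = isless(a.level, Integer(b))
Base.convert(::Type{LogLevel}, level::CustomLogLevel) = LogLevel(Integer(level))
```

With this, `MyLogLevel(Int32(500))` sorts between `Info` (0) and `Warn` (1000) and converts cleanly to `LogLevel(500)`.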
> If someone needs that they can just do:
>
> ```julia
> with_logger(lg) do
>     @info "blah"
> end
> ```
But `with_logger` puts the logger in the current `Task` in a box, so all type information is (almost certainly) erased. The whole purpose of a `_logger` argument is to preserve full type information for those people who need performance. Let's move this part of the discussion over to https://github.com/JuliaLang/julia/pull/33121 though.
I think this issue as a whole could be renamed "Taking log categories seriously" :-)
To move forward here, I think we need to:

- Agree on a clear semantic for log level

The `convert(::LogLevel, mylevel)` here was always a bit confusing.

Regarding compatibility: any changes clearly cannot modify the syntax of `@info` and friends as those are exported from `Base`. However, I think we can and should formalize the meaning of `_group` in a way which is broadly consistent with the way packages already use it. I think we can also tolerate some limited (especially soft) deprecations for logging backend methods if these greatly improve the API.
Looking back at my original proposal (https://github.com/JuliaLang/julia/issues/33418#issuecomment-536823903), I find my thoughts on a concrete way forward for log levels are largely unchanged. The conversation here about `group` has been interesting and useful, but if we can agree that level is indeed categorical (albeit with a mapping to urgency) then group can be handled separately.
For the log level API, we could debate a better name for `importance`: `urgency` or `severity` perhaps?
One point which was mentioned in my original proposal (and may have derailed the conversation) is the idea of passing along the "current severity" through the backends in a way which is decoupled from the level. I still think this might work out but it could probably be explored separately.
> - Agree on a clear semantic for log level

How about specifying that the minimal and mandatory API is that it has a mapping to an integer (= `importance`)? As an optional API, it can overload `print(io, level)` to control how it is printed by the message handlers.
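A sketch of that minimal API (the `AbstractLogLevel` type, the `importance` function, and the `Trace` level are hypothetical names, not current stdlib API):

```julia
abstract type AbstractLogLevel end

struct Trace <: AbstractLogLevel end

# Mandatory API: a mapping to an integer.
importance(::Trace) = -1500

# Optional API: control how message handlers print the level.
Base.print(io::IO, ::Trace) = print(io, "Trace")
```

A handler that prefixes messages with `print(io, level)` would then show "Trace" rather than some default representation, while filtering still goes through `importance`.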
> For the log level API, we could debate a better name for `importance`: `urgency` or `severity` perhaps?
I find `severity` nicer, since `importance` or `urgency` can be used in contexts other than logging (e.g., work queues, hooks).
> the idea of passing along the "current severity" through the backends in a way which is decoupled from the level
Can't this already be done with `AbstractLogLevel`? Something like

```julia
struct LabelTransform <: AbstractLogLevel
    label::String
    original::AbstractLogLevel
end

severity(level::LabelTransform) = severity(level.original)
Base.print(io::IO, level::LabelTransform) = print(io, level.label)
```

We can also change severity while keeping the log label (`print`):

```julia
struct SeverityTransform <: AbstractLogLevel
    level::Int32
    original::AbstractLogLevel
end

changeseverity(f, level::AbstractLogLevel) =
    SeverityTransform(f(severity(level)), level)
Base.print(io::IO, level::SeverityTransform) = print(io, level.original)
```
Then, you can use it like this:

```julia
using LoggingExtras, Setfield

downgrade_logger(sink) = TransformLogger(sink) do log
    return @set log.level = changeseverity(x -> x - 1000, log.level)
end
```
I am in agreement with @tkf on basically everything just said.
How is this for deciding on a semantic for log levels?

Log levels characterize the severity of a log message. They are ordinal: ordered by increasing severity. The standard levels are `Debug() < Info() < Warn() < Error()`.

Each level can also be considered as distinct in purpose:

- `Debug` is for the developer of the program,
- `Info` is for general information to the user (think of it as an alternative to using `println` directly),
- `Warn` means something is wrong and action is likely required, but that for now it is still working,
- and `Error` means something is wrong and it is unlikely to be recovered -- it will normally be used to display information prior to an exception being thrown.

It is possible when working with loggers to select by level (e.g. `log.level == Warn()`) or by severity (`log.level > 2000`).

There is an orthogonal concept of group. Log messages can be assigned to a group. If you are looking at labelling some of your log messages, you can do so either with a new log level or by assigning them to a group. When deciding which, consider:

- If your label would fit nicely into the ordinal structure of the standard levels, such that it would be reasonable to say "silence all log levels of severity equal to that of your label and below", and you can imagine when investigating a problem wanting to see "all log messages of severity equal to your label", then it likely should be a log level. Common examples include `Trace() < Debug()` for excessively detailed log messages, and `Critical() > Error()` for failures that may have consequences beyond the crash of the program.
- Whereas if you can imagine having your label with varying degrees of importance, then it is likely best served with a group. For example, a `:missing_data` group could be tracking all log messages relating to data collection failures, and these may vary in level depending on how important that data was, or how much is missing.

Don't worry too much about getting it wrong; you can always edit your program later. And you can define your own logging macros for both new levels and groups like so: ...
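For illustration, such macros might be sketched on top of today's `@logmsg`/`_group` machinery. This is a hedged sketch: the `Trace` value and the macro names are made up, and the `disable_logging` call is needed because the global minimum level is `Debug` by default:

```julia
using Logging

const Trace = Logging.LogLevel(-1500)  # hypothetical level below Debug (-1000)

# Levels below Debug are filtered by the global minimum level by default;
# lower that floor so Trace records can get through at all.
Logging.disable_logging(Logging.BelowMinLevel)

# A new-level macro, forwarding to the core `@logmsg`:
macro trace(exs...)
    :(@logmsg($Trace, $(map(esc, exs)...)))
end

# A new-group macro, tagging messages with a fixed `_group`:
macro missing_data_warn(exs...)
    :(@warn($(map(esc, exs)...), _group = :missing_data))
end
```

Then `@trace "A trace message..."` logs at the custom level, and `@missing_data_warn "sensor offline"` emits an ordinary `Warn` record carrying `group == :missing_data` for backends to filter on.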
Excellent. Yes, I agree that `severity` is preferable (actually I did change that yesterday on my WIP branch because it "seemed better", but I couldn't have articulated exactly why; I think your reasoning is spot on). Note that `AbstractLogLevel` is not currently a thing, but it also exists on my WIP branch (cjf/logging-custom-levels; hopefully I'll get to publishing a draft PR soon so we can discuss specifics).
@oxinabox I like that discussion of level vs group, and I think your description of the standard levels is sharper and clearer than what we have in the current Logging docs (see https://docs.julialang.org/en/v1/stdlib/Logging/#Log-event-structure-1)
> Log levels characterize the severity of a log message.
I would describe `L <: AbstractLogLevel` as a category with a natural mapping to severity via the `severity` function. I feel like it's consistent and useful to allow user-defined log levels with the same severity as `Info`, but with a different meaning.
> Error means something is wrong and it is unlikely to be recovered -- it will normally be used to display information prior to an exception being thrown.
Yes to the first part, but I do disagree with the second part of this: this advice will generally cause errors to be logged twice (once before the `throw`, and a second time after some high-level part of the program catches the exception and also logs it as an error). It may be more times if the second catch also results in a throw. I've seen this in production systems before and it always annoyed me! I feel like "if only we had a nice way to attach more context to exceptions" we could easily communicate that context to the catch site, and that would be a better place to log the information as a single log event rather than multiple events spread out over various packages.
> I would describe `L <: AbstractLogLevel` as a category with a natural mapping to severity via the `severity` function. I feel like it's consistent and useful to allow user-defined log levels with the same severity as `Info`, but with a different meaning.
Ah, I had been thinking of "categorical with a natural mapping to a subset of the reals" and "ordinal" as equivalent. But it is not one-to-one, since there can be ties. Is "partially ordinal" a term? Ordinal allowing ties. Anyway, purely academic.
> But it is not one-to-one, since there can be ties
Exactly. The technical term seems to be preorder but this refers to a whole grab-bag of ordering relations which are "almost" a total order but not quite due to missing some subset of the total ordering axioms. In this case our ordering relation is reflexive, transitive and total, but not antisymmetric; this last point means that it fails to be a total order.
An (exact?) analogy is ordering the complex numbers by magnitude: this is also reflexive, transitive, and total, but not antisymmetric.
(A partial order is a particular type of preorder which is reflexive, transitive and also antisymmetric but not total; this is not what we have here!)
I've got a PR up for this which is mostly complete in implementation: https://github.com/JuliaLang/julia/pull/33960.

I have ditched any attempt at passing the `severity` through the `shouldlog`/`handle_message` chain because ultimately this idea is an efficiency hack which is intrusive to the API. That could still be worthwhile, but it should go in a separate efficiency-focused PR and be justified via benchmarking.
Please do review, I value all your feedback!
Continuing some thoughts brought on by the mention of `depwarn` at #33960...

I think it's instructive to consider how we would like the log level and group for depwarns to relate to `julia --depwarn=no|yes`.
It seems there are at least three ways to do this:

1. The level is set to `Warn` or `BelowMinLevel`, depending on the value of the `--depwarn` flag. The group is set to `:depwarn`.
2. A dedicated level `DepWarn`, which has `severity(::DepWarnLevel)` depending on the value of the `--depwarn` flag.
3. Always log at `Warn` level with the group set to `:depwarn` and let the logging backend decide what to do with them. `--depwarn=no` could add a log filter by default.

In principle, I feel that a "deprecation warning" does not cease to be a "warning", even though it may have a severity which is lowered by setting `--depwarn=no`. So this would argue against the current option (1).
Option (2) is quite easy to implement _efficiently_ if #33960 is merged in its current form, though it's still a global setting, which clashes with the idea of configurable logger backends.

Option (3) is quite conceptually clean: the level is always `Warn`, but may be filtered out by searching for `:depwarn` in the group. The issue is that it's hard to make this really efficient with the current depwarn stack-walking machinery used to get the module, but maybe we just need to work harder at it.
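As a rough illustration of option (3), a group-based filter can already be sketched with the existing stdlib backend API. The `DepwarnFilter` name is made up here, and this assumes deprecation records would carry `group == :depwarn`:

```julia
using Logging

# A minimal wrapper logger that drops :depwarn-group records in `shouldlog`,
# i.e. before the log record is even constructed.
struct DepwarnFilter{L<:AbstractLogger} <: AbstractLogger
    parent::L
end

Logging.shouldlog(f::DepwarnFilter, level, _module, group, id) =
    group !== :depwarn && Logging.shouldlog(f.parent, level, _module, group, id)
Logging.min_enabled_level(f::DepwarnFilter) = Logging.min_enabled_level(f.parent)
Logging.catch_exceptions(f::DepwarnFilter) = Logging.catch_exceptions(f.parent)
Logging.handle_message(f::DepwarnFilter, args...; kwargs...) =
    Logging.handle_message(f.parent, args...; kwargs...)
```

`--depwarn=no` could then conceptually install such a wrapper around the default logger; anything in the `:depwarn` group is dropped while other `Warn` records pass through unchanged.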
A thought about efficient handling of `group`: as with anything related to `shouldlog`, the best efficiency can only come from devirtualizing part of the calculation in the style of `min_enabled_level`. `group` and `module` handling is generally about set membership (in a small set), so it would be interesting to investigate whether fixed-width Bloom filters could be used in the "fixed pipeline" early filtering (in the style of `JULIA_DEBUG` but without the limitations).
I think another way to propagate categorical information from the logger frontend to the handler is to use a custom type for `_id`. It can already be of any type, so nothing stops people from using it. Arguably this is the most faithful way to propagate the category of the logging information to the log handler. Log record transformation/routing/filtering pipelines may alter the log level (e.g., increment or decrement it), but it is very evil of them to alter the identity of the log record. It is also passed to `shouldlog`, so the log handler can know the category before the full record is constructed.
> I think another way to propagate categorical information from the logger frontend to the handler is to use a custom type for `_id`
This is a good point.
So.. I think the problem with log record categories is that we simply have too many candidates for what might be considered categorical information. We have, at least:
This is somewhat embarrassing complexity!
If not for lazy record construction, I'd suggest that `message` would actually be a good place to put categorical data, as part of the message type. For example, we already do this for progress logging and I think it's working out well.
Some other connections to this suggestion:

- Connecting `@show` to the logging system: it's basically `@info` without the message. What is the message here?! The message may naturally be a singleton `ShownVars()`, with the attached variables going into the usual key-value set.
- `@info` should not imply "log with info level", but rather "log an Info message with info level".

Putting aside compatibility for just a moment in the pursuit of conceptual clarity, what if we had the following correspondences:
| sugared form | desugared form |
| --- | --- |
| `@info "msg"` | `@logmsg InfoSeverity Info("msg")` |
| `@warn "msg"` | `@logmsg WarnSeverity Warning("msg")` |
| `@show a b c` | `@logmsg InfoSeverity ShownVars() a b c` |
| `@logprogress "msg" val` | `@logmsg ProgressLevel ProgressString("msg", #= ... =# progress=val)` |
Does this seem conceptually clean?