Crystal: Proposal: improve stdlib logger

Created on 27 Mar 2018  ·  84 comments  ·  Source: crystal-lang/crystal

This is piggybacking off of some chat discussion by @RX14 and @straight-shoota earlier today.

Currently there is no great way for library authors to log without friction. A library can't just write to STDERR arbitrarily because that's a potentially unwanted side effect. It instead has to take an optional Logger? that is nil by default, which requires boilerplate and is unlikely to actually be used.

If we instead establish a class-based logging hierarchy into the stdlib, library code can log unconditionally, with downstream users selectively enabling logs from the libraries they care about. I propose improving the situation by making the following changes:

Structure loggers hierarchically, with granular level controls

A Logger should be able to take a parent, to whom the actual log writing is delegated. By default, a child Logger's level should be nil, which means it inherits the level of its parent.

main_logger = Logger.new(STDOUT)
child = Logger.new(main_logger)
noisy_child = Logger.new(main_logger)

main_logger.level = Logger::WARN
noisy_child.level = Logger::DEBUG

child.info{ "This logger uses its parent's log level and this message does not get printed" }
noisy_child.info{ "This message will get printed because noisy_child has its own log level set" }

Make logging class-aware by default

It should be trivial to define a logger that is associated with a particular class or module. That logger will inherit from the logger of its parent in the module namespace hierarchy. Parents can be created automagically.

module MyProgram
  class App
    include Loggable # creates a Logger and adds helper methods e.g. `warn`, `info`
    def initialize
      info{ "creating an app" }
    end
  end
end

# Sets the effective log level for every Loggable under the MyProgram namespace.
# Note that the Logger for MyProgram got created behind the scenes at the same time as the Logger for App
Loggable.get_logger(MyProgram).level = Logger::DEBUG

# Logs the following (note the class name is included): 
# I, [2018-03-26 14:43:17 -07:00 #25575] INFO -- MyProgram::App: creating an app
app = MyProgram::App.new

Loggable will have a root level logger that by default does not log anything, so library code will stay quiet. To up the verbosity, just do:

Loggable.get_root_logger.level = Logger::DEBUG  # Enables logging for every class and module in your program, including libraries.

Unrelated to the above points, Loggers currently only operate on IO objects. Modern applications frequently eschew traditional log files in favor of structured logging, so this behavior should be more customizable. Thus, I also propose to

Abstract Logger away from IO

Instead of storing an IO directly, Loggers should store a LogWriter, which is a module with abstract def write on it. The standard library should include an IOLogWriter class that takes on the formatting behavior of the current Logger, but is interchangeable with custom-defined classes like so:

class GelfLogWriter
  include LogWriter

  def initialize(@transport, @host, @port = 12201)
  end

  def write(severity, datetime, progname, message)
    # build the document and send it over the network here...
  end
end

Loggable.get_root_logger.handler = GelfLogWriter.new(:udp, "graylog.example.com")

I'll be happy to work on implementing these changes if the core team thinks they'd be useful.

to-design feature stdlib

Most helpful comment

Logging vNext

The following is the logging module Log we've been designing recently. The implementation is not done yet, but we've been iterating on different scenarios and requirements to reach the current proposal.

Generating logs

There are a couple of severity levels that can be used to emit logs: debug, verbose, info, warn, error, fatal.

Usually you will type Log.[level] { "a message" }

Log.info { "Hi there!" }

Note: Using the result of the block for the message allows the runtime to skip any conversions and string interpolations needed to build the message if it is not going to be emitted.
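As a sketch of the difference (using the proposed Log API; expensive_report is a made-up helper):

```crystal
# Sketch against the proposed API; `expensive_report` is a hypothetical
# helper that is costly to run.
message = "stats: #{expensive_report}"     # eager: string is always built, even if discarded
Log.debug { "stats: #{expensive_report}" } # lazy: block only runs if debug is enabled
```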

The Log is a new top-level constant in the std-lib (Logger will be deprecated and removed).

When generating logs you will use a Log instance. Each Log instance is bound to a source or path. Log.get(source : String) : Log will allow you to get an instance. The top-level Log acts as an instance with the empty string as its source.

When defining a new shard or module you will define a Log constant for that namespace.

module DB
  Log = ::Log.get("db")
end

Thanks to the relative resolution of constants, using Log.info { ... } inside DB types will resolve to the one previously defined and not the top-level one.

If you move code around during development, you will not need to deal with logger/@logger/@@logger substitutions.
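A sketch of how that constant resolution plays out in practice (assuming the proposed Log.get API):

```crystal
module DB
  Log = ::Log.get("db")

  class Connection
    def connect
      # Relative constant lookup finds DB::Log, so this logs to the "db"
      # source without any per-class wiring.
      Log.info { "connecting" }
    end
  end
end
```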

If you need nested sources, you can use the parent namespace Log and avoid hard-coding the source name.

class DB::Pool
  Log = DB::Log.get("pool")
end

Or, if you want to hard-code it.

class DB::Pool
  Log = ::Log.get("db.pool")
end

Log.get(source : String) will return the same object instance for the same source. This allows overriding the severity level of the log programmatically at specific moments.

Log.get("db").level = :verbose # or Log::Severity::Verbose
do_something_with_db
Log.get("db").level = :none # Log::Severity::None disables all levels

# DB::Log can be used instead of Log.get("db")

Note: Setting the level is specific to a source and not its nested ones. The previous example does not affect the db.pool source.

Note: Setting the level of a Log instance will affect all other Fibers that are concurrently using it.

When building a program, it will be possible to strip the debug calls completely at compile time, so there is no penalty even for checking whether those logs should be emitted: the debug methods can be replaced by no-ops during compilation.
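One conceivable way to implement that no-op replacement is a macro guarded by a compile-time flag. This is only a sketch: the flag name and the emit method below are hypothetical, not part of the proposal.

```crystal
class Log
  macro debug(&block)
    {% unless flag?(:strip_debug_logs) %}        # hypothetical flag name
      # `emit` stands in for whatever internal method writes the entry.
      emit(Severity::Debug) { {{ block.body }} }
    {% end %}
  end
end
```

Compiled with `-Dstrip_debug_logs`, every `Log.debug { ... }` call expands to nothing, so neither the block body nor the severity check is ever executed.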

Writing backends

The std-lib will come with a basic backend to write to stdio but you can define your own backend by inheriting Log::Backend.

  • #initialize(Log::Backend::Config) will be used to set up the backend with some given configuration
  • #write(Log::Entry) will be called to write the given log entry.
  • #close to close resources used by the backend
  • The Log::RegisterBackend annotation will define the name to reference it in the configuration.
@[Log::RegisterBackend("custom")]
class MyCustomBackend < Log::Backend
  def initialize(config : Log::Backend::Config)
  end

  def write(e : Log::Entry)
  end

  def close
  end
end

Whether the write operation is sync or async will be up to the implementation. There might be some base modules to make it easy to implement async backends (similar to IO::Buffered).

Configuration

There will be a way to configure a series of backends and use them to write the logs of certain sources.
When requesting the Log for a source, the runtime will load the configuration and bind the Log instance with the corresponding backends.

The std-lib will read a crystal-log.yml from the current directory. Setting CRYSTAL_LOG_CONFIG changes the file location.

A sample configuration file with potential options:

version: 1

backends:
  stdio:
    type: stdio # -> Search for @[Log::RegisterBackend("stdio")]
    stderr_level: warn
    formatter: Log::DefaultFormatter
    format: # pattern/template 
    color: true

  file: # if no type property is given the name will be used as such. So @[Log::RegisterBackend("file")]
    file_pattern: app.%d{yyyy-MM-dd}.log
    location: /var/logs/
    max_file_size: 10mb
    formatter: Log::DefaultFormatter
    format: # pattern/template 

  elastic:
    url: localhost:9200
    index: app-%YYYY-%mm-%dd

bind:
  - sources: "*" # other options: ⊔ (empty string), db, db.*
    level: WARN
    backends: stdio, db

  - sources:
      - db.query.*
      - http.client.*
    level: INFO
    backends: 
      - stdio
      - db
  • ⊔ (empty string) will show only the top-level source
  • * will show all the sources
  • a.b.* will show a.b and every nested source.
  • a.b will show only a.b
  • sources and backends can be array or comma separated values for convenience
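The matching rules above can be expressed as a small predicate. This is only an illustrative sketch of the described semantics; the method name is made up.

```crystal
# Sketch of the source-matching rules; `matches?` is a hypothetical helper.
def matches?(pattern : String, source : String) : Bool
  case pattern
  when ""  then source == "" # empty string: top-level source only
  when "*" then true         # all sources
  else
    if pattern.ends_with?(".*")
      prefix = pattern.rchop(".*")
      # a.b.* matches a.b itself and every nested source under it
      source == prefix || source.starts_with?(prefix + ".")
    else
      source == pattern      # exact match only
    end
  end
end

matches?("a.b.*", "a.b.c") # => true
matches?("a.b", "a.b.c")   # => false
```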

The default configuration when no config file is found is:

version: 1

bind:
  - sources: # (empty)
    level: INFO
    backends: stdio

You can set CRYSTAL_LOG_LEVEL, CRYSTAL_LOG_SOURCES to change the default configuration.
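For instance, overriding the defaults at launch might look like this (the exact accepted values are not specified in the proposal, so these are illustrative):

```shell
# Hypothetical usage of the proposed environment variables
CRYSTAL_LOG_LEVEL=debug CRYSTAL_LOG_SOURCES="db.*" ./app
```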

Implicit Context

The context information will live in the fiber. A new fiber will start with an empty context. Contexts are immutable but can be extended with convenience methods. Because contexts are immutable, a backend can keep a reference to the context that was used when generating the Log::Entry.

The current context can be read from Log.context.

You can change the current context with clear and set methods.

Log.context.clear # => clears current context
Log.context.set request_id: 34, user_id: "[email protected]" # => extends the current context

When spawning Fibers you can inherit the context explicitly:

c = Log.context # current context
spawn do
  Log.context = c
  # ...
end

context.using can be used with a block to restore the previous context afterwards.

Log.context[:request_id] # => 23

Log.context.using do 

  Log.context.set request_id: 57689
  Log.context[:request_id] # => 57689

end # previous context will be restored at this point

Log.context[:request_id] # => 23

Under the hood the context.using version uses .context/.context=. The context.using block example is equivalent to:

Log.context[:request_id] # => 23

previous = Log.context
begin

  Log.context.set request_id: 57689
  Log.context[:request_id] # => 57689

ensure
  Log.context = previous
end


Log.context[:request_id] # => 23

Other types

Disclaimer: this is more the structure description rather than the actual definition.

alias Backend::ConfigValue = Bool | Int32 | Int64 | String | Array(ConfigValue)
alias Backend::Config = Hash(String, ConfigValue)

alias ContextValue = Bool | Int32 | Int64 | Float32 | Float64 | String | Time | Hash(Symbol, ContextValue)
alias Context = Hash(String, ContextValue)

record Entry, source : String, severity : Severity, timestamp : Time, message : String, context : Context

All 84 comments

The biggest selling point has not even been mentioned: class-aware loggers should also be hierarchical, meaning you can configure logger settings for MyProgram which also apply to enclosed namespaces like MyProgram::App unless the setting is overwritten somewhere down the hierarchy. This allows for very fine grained control of the log level for individual components.

It would be great to have this integrated in the stdlib so every shard can and should use it as a single logging solution which integrates with every Crystal library.

Just a note: I don't think a Loggable module makes much sense. Log methods should not pollute the instance namespace but use the logger as explicit recipient. Then it comes down to essentially LOGGER = Logger.new(self.class) or something similar in the class scope, which really doesn't need to be extracted in a module.

  1. How does encapsulating loggers solve anything?

For example, this requires a child logger to know the parent app or library, but that's impossible unless the parent configures the child... why bother with encapsulation, then, instead of directly passing a logger instance?

  2. IMHO a better approach would be to introduce a global Logger actor. It would run on its own fiber, and be configured once for the whole application (level, filters, output adapter such as IO, syslog, ...). Each library could then send it messages to log, with a library identifier to filter.

    That doesn't solve the hierarchical issue, but it does solve the proliferation of having many logger instances.

I think a better solution is to have many loggers, but centralised configuration. Each logger will know what namespace it's at (::HTTP::Client for example), then when you change configuration, the central logging configurator will apply rules based on starts_with?, ordered by priority according to size. For example ::.log_level will configure every logger (every logger's namespace starts with ::) and ::HTTP.log_level will configure just loggers in the stdlib's HTTP library. The ::HTTP configuration will always override the :: configuration because it's a longer string.

For example, this requires a child logger to know the parent app or library, but that's impossible unless the parent configures the child...

If you index loggers based on class name (as given by self.class.name), you can find a parent by splitting on "::" and dropping the last element.
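The parent lookup described here is a one-liner on the class name. As a sketch (the parent_name helper is hypothetical):

```crystal
# Derive the parent logger's name by dropping the last "::" segment.
# `parent_name` is a made-up helper illustrating the lookup.
def parent_name(name : String) : String?
  parts = name.split("::")
  return nil if parts.size <= 1 # already at the top of the hierarchy
  parts[0..-2].join("::")
end

parent_name("MyProgram::App::Worker") # => "MyProgram::App"
parent_name("MyProgram")              # => nil
```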

My reasoning for using Loggable (which needs a better name but I'll go with it for the moment) was that the system really benefits from widespread adoption, and including a module is the most low-friction, crystal-idiomatic way to indicate that a class should do a thing. LOGGER = Logger.new(self.class) is less atomic than include Loggable. Without knowing anything else about the system, the latter says to me "this is the correct way to do loggy things" more than the former.

@RX14, I don't understand what that offers over what I or @ysbaddaden have suggested. Is the configurator setting log levels on each logger one by one? If I do ::Foo::Bar::LOGGER.level = Logger::INFO and then LoggerConfigurator.set_level(::Foo, Logger::WARN), what happens?

@ysbaddaden I agree logging should be global. Local Logger instances would just be thin wrappers and delegate to the main logger, decorated with the current type/logging scope.

The design described by @RX14 would probably work out pretty similar, and it seems to me that it would merely be an implementation detail whether distributing log messages to subscribed handlers happens locally with global config, or globally.

@ezrast Well, it could also be a macro (for example def_logger or whatever) instead of include Loggable. My point is that there is no need for a module that does nothing more than define a single class constant. And IMO it shouldn't need anything else. That's enough for Java's excellent logging system, why would we need anything more?

I think a global / centralized Logger in Crystal makes little sense. Unlike Elixir, Crystal doesn't provide solutions to structure services in an application.

Having scattered logger instances is a pain, and the solutions proposed here are really nice, but they assume a particular way to design and model an application (namespaces, hierarchical class names, central configuration). Again: Crystal's stdlib doesn't provide nor encourages anything more than spawning fibers and communicating through channels. I'm afraid assuming more in Logger will make it a pain to use in different contexts. Assuming namespaces / class names are hierarchical and meaningful out of any context is very weird.

I believe it's a framework's job to come up with solutions for this. See for example Earl::Logger which assumes a lot about the program, but Earl's job is precisely to structure apps.

That being said Logger must evolve. For example we could support and thus allow the configuration of:

  • one to many handlers, so we could log to STDOUT and to a syslog server for example;
  • filter mechanisms;
  • maybe formatters (but that may be strongly related to handlers).

@ysbaddaden You're quite right if we build a big web application, but for smaller ones like a system tool or API server a Logger will be very welcome. A framework for this would be overkill in this case, maybe a shard, but using the stdlib's logger directly would be nice!

It's true that Crystal doesn't rely on strictly hierarchical namespaces for library components. But I don't think that necessarily limits the usage of a hierarchical logging system. The logger namespaces don't even have to correlate to Crystal namespaces. It doesn't have to assume anything. It just provides a simple interface that is capable of logging hierarchically. If and how this feature is adapted depends on how libraries and apps use it.
For shards it is common and encouraged to put library code in a namespace. That would be at least some form of namespacing used in Crystal.

@j8r Application size is irrelevant. Logger should be generic enough to be useful in any environment.

@straight-shoota I'm reading on java.util.logging and I'm revising my judgement. There is a single Logger from which you can request a logger instance, creating a hierarchy; namespaces are any dot separated string; you can configure loggers from anywhere; it doesn't assume much about application design. That looks good.

WDYT about defining just the interface of the logger and leaving configuration / instantiation to the framework?

That way the shards and components could agree on how to send info to the logger.

It would be comfortable to be able to pass around the instance of a logger from a parent to a child so for example an http server can pass the loggers to the handlers.

It would be nice to support some kind of namespace/activity logging configuration. And allow them to be specified while calling the logger so the instance can be shared between parent and child.

So, from the current log(severity, message, progname) we could introduce a context that I imagine will usually match the namespace/class of the caller.

For tracking the activity I am not sure if it should live in the stdlib. Let me show an example to illustrate: let's assume there is an app with an API to register a user. The user registration involves some db statement and some calls to external services. If we want to control the logging of the _user registration_ activity, it is not enough to state that the db namespace or the http api namespace should be logged. The code base needs to say that the activity being performed is user registration. How this information is passed around is either explicit in every call (IMO ugly) or the responsibility of the framework + logging system used. If we take this latter approach, then there is nothing to do regarding the tracking of activity, except not to block the possibility.

So my proposal for the stdlib is to make a module that will define the interface of logging with an additional context, have a basic STDOUT implementation that only obeys severity, and allow logging framework shards to play nice with that idiom (implementing filters by context).

@bcardiff, Have you looked at https://github.com/crystal-lang/crystal/pull/5921? Over there the interaction with IO is split out into an IOAdapter class that can easily be swapped out to provide arbitrary logging behaviors, and the Handler class can be improved to allow filtering above and beyond the standard log level threshold. I think it basically lines up with what you're saying, except that Handler provides a default way of filtering selectively based on context. Are you saying that you'd rather have this functionality omitted from the stdlib, or extracted into a concrete implementation of a more basic/abstract handler class?

@ezrast I checked the PR. I meant to leave the stdlib as plain as possible, without assumptions about the lifecycle / locking of loggers. I would vouch for

  • providing a basic logger implementation that will filter by severity level
  • reducing the interface of Logger to decouple implementors (and their configuration) from consumers
  • providing an idiom to state that a component will use logging, and to forward loggers from parent components to child components.

More or less this is covered in https://gist.github.com/bcardiff/c5126a9c19fe36987819bc6e86ea7d3f so we can check the ideas in actual code.

I would leave more advanced configuration options and loggers architecture out of the stdlib but compliant with the interface exposed in the gist.

@bcardiff regarding your gist, how would you centrally configure all the outputs for every logger? I also don't like the design because it uses macros and introduces methods into the classes which want to use loggers.

My ideal logging interface would be:

class Foo::Bar
  @log = Logger.get(self.class) # (same as Logger.get("::Foo::Bar") - the :: could be replaced by . for easy parsing)

  def do_something
    @log.trace "doing something"
  end
end

I think that loggers should have "one way to do things". Passing around a logger instance, each instance having a namespace which by default inherits its formatter and log level from its parent, sounds flexible enough to me. And if Java's had good success with the same model, why not?

Configuring loggers then becomes easy like:

Logger.get("::").level = Logger::INFO
Logger.get("::").handler = Logger::IOHandler.new(STDERR) do |severity, datetime, progname, message, io|
  label = severity.unknown? ? "ANY" : severity.to_s
  io << label[0] << ", [" << datetime << " #" << Process.pid << "] "
  io << label.rjust(5) << " -- " << progname << ": " << message
end
Logger.get("::HTTP::Server").level = Logger::DEBUG

The first two lines would be the default of course.

I wouldn't expect, by default, a logger factory per consumer class, but rather to pass the instance from parent to child component. Whether that instance is the logger itself or whether it instantiates objects in the middle is up to the logger implementation.

But if the Logger.get(self.class) or other global registry is used, then the user won't be able to override the logger per component instance.

In the gist, the first parameter of Logger.using "app.foo" would definitely have a default value derived by mangling the namespace of @type. So if the component has no dependencies, only a bare Logger.using would be needed.

I like how calling debug / info as protected methods keeps the code clean, but that is a personal opinion. I could go with log.debug, but in order to do that, in my proposed design where logger instances are shared, the instance returned by log (in log.debug) should be a struct that has the context information for the component where it is being called. I will try to make a gist with that option to illustrate, in case polluting the object's protected methods is a strong no.

There are many opinions on how is better to configure the logging. I think we can leave that for the framework / app, and decouple.

the user won't be able to override the logger per component instance

what do you mean?

In Java's scheme you never need to "override a logger", since the Logger class becomes a shell, delegating to the handler, which contains all the logic.

the instance returned by log (in log.debug) should be a struct that has the context information for component where it is been called

It seems we've arrived at essentially the same design - except that your context struct is renamed to Logger, is mutable, and that the instances come from a registry. The instances coming from a registry allow for more flexibility than reaching straight for macros.

The problem I have is with your design is that "passing the logger from parent to child" is boilerplate, and doesn't make sense for logging from class methods.

As a final note, my example above is actually incorrect, @log should be @@log.

Let's say you create two db connections, each of which will create statements. If you don't allow loggers per instance and passing loggers from parent to child, how are you going to allow different logging configurations for the two db connections?

Maybe I am missing something that allows that to work: like registering per-instance overrides in a global logger configuration...

It's true that I was not considering class-level logging. But I will.

I played a bit with how to get rid of the method pollution from the previous proposal, adding a default naming context and binding that context when invoking the logger. The code might seem weird but it achieves a memory footprint of just a reference to a Logger (8 bytes).

@bcardiff there's absolutely nothing stopping you creating arbitrary named loggers whenever you want and passing them wherever you want. In the above example I was just expressing the common idiom. Having a registry with custom string namespaces allows you to implement whatever idiom you want in your code - even the common idiom of 1 logger instance per class becomes a single line. Your proposal requires a macro to implement that.

The difference with your suggested approach, where named loggers are created and passed around, is that once passed, the context will be the one used at the creation of the logger and not the one where the logger is being called.

Maybe that's enough, but I was aiming for using a single logger instance in different code contexts (with potentially different log levels).

I will give it another spin to see what happens...

@bcardiff, I dislike the coupling of modules that comes along with your forward mechanism; adding or removing a child module shouldn't force you to update the implementation of the parent.

Additionally, I'm under the impression that part of the point of centralized configuration is that it makes it easy to deal with logs coming from code you don't control. Under your paradigm, if I require shards that want to log things, then to change logging behavior I have to invoke logger= on the top-level module of every single shard (either manually or through a forwarder of my own), and hope that the shard author set up all their forwarding correctly. It's not a ton of work but it's not the most streamlined.

With centralized configuration I just set behavior once at the root level, and every logger does what I want by default unless explicitly overridden. I can still have different components log at different levels or to different places; the only difference is whether the configuration is all stored in a single structure or scattered throughout all of the program's logger instances.

the context will be the one used at the creation of the logger and not where the logger is been called.

But that's usually what you want. It offers the flexibility of both. If you want to specify the logger namespace at the callsite just call .get(...).log, if you want it to be per-class, put it in a class var, if you want it to be per-instance, put it in an ivar, and if you want the whole logger to be passed into the class just do it.
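In code, those idioms read roughly like this (a sketch using the registry-style Logger.get discussed above; the namespaces are made up):

```crystal
class Worker
  @@log = Logger.get("app.worker") # per-class: shared by all instances

  def initialize(name : String)
    @log = Logger.get("app.worker.#{name}") # per-instance: distinct source per object
  end

  def run
    # callsite-specific: fetch a logger right where it's needed
    Logger.get("app.worker.run").debug { "starting run" }
  end
end
```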

I'd love to hear the usecase you have in mind (in code) which is difficult with this approach.

My previous proposed design is more about how I see the architecture of an app, but I don't want to impose the design, but rather find a balance between flexibility and comfort.

Given the last feedback and the idea of keeping things as decoupled and flexible as possible, I came to https://gist.github.com/bcardiff/55ebbf2e20b1670681e9352fbaa51be0 where Loggers are consumed by the user and have a context determined on instantiation, and LoggersHandlers actually emit the information to a stream configured by set_handler_factory.

The gist is more similar to #5921 and supports the use case @RX14 vouched for, I think. There are some differences though: the logger itself has no level configuration, that is up to the handler factory.

The framework/app is required to configure a handler_factory to create/return, for a given context, a handler that will receive log(severity, message, context) messages. The logger used by the user has a reference to the handler and the context from creation time, so invocations of #log(severity, message) and #{{severity}}(message) can be forwarded to the handler together with the context.

An issue with this last gist is that class variables in the stdlib would be initialized _before_ the framework defines the handler factory. This issue is also present in #5921 I think.

To solve that I see two options. The first, which I don't like, is to assume the configuration of the handler factory is done via monkey patching (really 👎, I don't want to rely on monkey patching for this).
The second is to defer the invocation of the factory until the first log event. This is done in https://gist.github.com/bcardiff/ea21543eff6437c508abb48c060116ef.

I didn't spend time deciding on a reasonable default value and logger for the stdlib. Once we agree on the common interface we can move forward on that and implement nice logger-tweaking configurations outside the stdlib.

@bcardiff I'm fine with that interface, it looks super flexible and has basically the same interface to the logger - although it seems like it's harder to configure on the app's side than just assuming inheritance. I'd personally stick to making assumptions here, but I'll wait for others to weigh in.

Even with lazily instantiated handlers, there's no way to gracefully change log levels after program initialization. If you want to enter "debug mode" after you've been running for a while you have to force all your modules to update their loggers somehow.

#5921 doesn't have quite the same issue in practice. It's true that you can't make libraries use any handler other than the default, but using the default handler is what you're expected to do 99% of the time anyway, and you can change its levels at any time.

That does raise the question of why I even bothered letting other Handlers be instantiable, so I probably have some simplification to do there.

The limitation, which also affects #5921, that I was referring to is that any logger used in the stdlib and initialized as a class variable won't be able to use an overridden configuration invoked after the prelude is required.

Whether a log level change is supported at runtime or only on boot, in the last proposal it will be up to the implementation, since there is actually no implicit behavior in the stdlib part.

@bcardiff In the end, your proposal uses a centralized instance, the logger factory. In #5921 this is called default_handler and it's a handler instance, instead of a proc.

This setup can avoid the initialization issue when Logger#log delegates to Logger.default_handler. You don't get individual handlers for each logging source by default, so this really calls for a central registry based on namespaces to configure log levels etc. This registry might or might not be included in the stdlib implementation.

@bcardiff my solution doesn't have that problem since .get is memoised. There's still the problem that you can log before the logger is configured, but I don't think that's solvable. At least you can configure all loggers regardless of when they're initialized, since you can always access any logger instance through a centralised interface.

How about this? https://gist.github.com/ezrast/98bb9ed3341c0b22c660549eae049b5b

It's similar in approach to #5921, but I think simpler and more streamlined. Adapters aren't separate from Handlers anymore, SeverityFilter does what Handler used to do, and if you want a completely generic interface just pretend that SeverityFilter and IOHandler don't exist and assign a custom Handler instance using Logger.base_handler=.

@ezrast While I support the general idea, this example has too much clutter. SeverityFilter is IMHO completely useless. A centralized logging system based on hierarchical namespaces needs a capable registry that allows more detailed configuration per namespace than just log level filtering. Instead of having SeverityFilter in stdlib, we should just go the full way.

It's debatable if such a registry should be included in the stdlib and used by default. If no, there shouldn't be a half-baked SeverityFilter either, but just a plain handler.

Hmm, so we need to decide 1) what is the minimum viable feature set necessary for a logging system, and 2) do those features belong in the stdlib. I'd like to hear more about what you have in mind for (1), but my initial stance on (2) is that this is a nice battery to have included, and I don't like defining an interface without a reasonable reference implementation.

@ezrast @straight-shoota
TBH what I really want is to automagically get the ability for the logger to be

  1. set via MyClass.log (output, severity)
  2. output: MyClass::MySubclass message

Yeah, sorry for wandering off, I just wasn't sure where to go next since a lot of the feedback I was getting was calling for removing things that I see as table stakes.

Instantiating a new logger should be one line of code. Changing the log level should be one line of code. These are fundamental operations and most of the time are the only things users actually want to do. If my first experience with a library as a user is that I have to instantiate a factory and pass it a callback just to flip a switch from "medium" to "medium-high", I'm going to back away slowly and use puts instead.

Individual components need their own log levels. This is the only low-friction way for libraries to be able to log things without getting in the way of users. Without this built into the logging system, any library that wants to log things has to have its own interface for controlling verbosity - they can't just pollute STDOUT/STDERR without giving users some way of turning it off. There's no reason to make every shard author reinvent this wheel when we could just put it in the standard library once and be done with it.

I don't think an implementation that doesn't have these things is worth considering. This means that some logging configuration will be "global". This is 100% okay. STDOUT is global already; the abstraction we put in front of it can be global too. There's no business logic lurking in there to ruin all your encapsulation when you're not looking.

I don't want to bikeshed this too much so I'm going to hack up another PR (might be a few days before I have time). I'll keep the filtering semantics of the default handler as isolated as I can from the interface so y'all can add to or remove from them as you please. If there are any other features or use cases that haven't been addressed by the proposals so far, please mention them.

https://forum.crystal-lang.org/t/introducing-crylog/797

I ended up making a logging shard before noticing this issue. It's based on the popular PHP logging framework Monolog, and seems sorta similar to https://github.com/crystal-lang/crystal/issues/5874#issuecomment-381690608.

While it might be a bit much for the standard library, my pipe dream for this would be for the stdlib to define the core concepts and implement the basics. Then other shards/people can implement extra third-party handlers. This way, eventually every project would be using a common framework and the handlers/formatters/processors would be plug and play as they are developed.

Just my 2 cents.

I may have missed some points, but I think the default Logger is quite flexible and can do a lot of this stuff with class_properties. This is flexible, easy to understand (no new methods or DSLs), and works right now. It is also type-safe: no symbol or string lookup of Loggers. You're calling methods all the way down.

Tons of flexibility with class properties and regular methods

class MyProgram
  class_property logger : Logger = Logger.new(nil)
  class_property log_level : Logger::Severity = Logger::Severity::INFO
end

class MyProgram::SomeClass
  def do_something
     MyProgram.logger.log(MyProgram.log_level, "some_message")
  end
end

# or if you want to allow a particular class to have a custom logger
class MyProgram::OtherClass
  class_property logger : Logger = MyProgram.logger

  def do_something
     self.class.logger.info("some_message")
  end
end

# Customize like so
MyProgram::OtherClass.logger = Logger.new(STDOUT)

You can also configure log levels with class properties. This is what we do on Avram (we use Habitat for settings instead of class_properties, but the same ideas apply). It works quite well. You can configure different log levels, and loggers as desired. https://github.com/luckyframework/avram/blob/c4f0358c14efb7f7ee178180db24040298613481/src/avram.cr#L21-L23

Using IO::MultiWriter

You can also use IO::MultiWriter to write to multiple IO objects. https://crystal-lang.org/api/0.30.1/IO/MultiWriter.html so you could write to STDOUT and a tmpfile if you want. It is pretty awesome.

One thing I'd change. Use data not just string

A lot of log data can/should be treated as data and not just a raw string, and should also be formatted differently depending on the context. This is why I wrote Dexter: https://github.com/luckyframework/dexter

It is nearly identical to the built-in logger with one difference: it accepts a NamedTuple so that it can be formatted in just about any way you want. For example in development we have this PrettyLogFormatter that prints something like this:

(screenshot: pretty-printed development log output)

And in production, using the exact same log data, print JSON!

I highly suggest taking the approach of Dexter and using data and not just a string, e.g. logger({request_id: "123", user_id: "123"}) so that it can be formatted depending on environment, and so that libraries can write data and the end-user can decide how it should be formatted. I would be happy to help with this. It should be pretty easy to do as Dexter is quite small and is nearly 100% Logger-compatible already. The source is tiny and most of it is reimplementing methods in the Logger with just tiny changes to support key/value data instead of strings.

@paulcsmith This plain approach definitely has a nice ring. But it misses an IMO essential feature, and that is implicit logger aggregation. This is especially important when multiple components work together. You could surely rig class-var logger instances together, but that's going to add a lot of complexity to the individual application.
And most importantly, it doesn't offer much in the way of runtime configuration. That's what I'd expect from a more sophisticated logging system.

(My thoughts):

I think the main problem is that ATM you have to do mostly everything yourself. There isn't a clean/good way to customize how things work in different environments without doing it yourself like you did.

Which, by the time you got all that setup, you basically implemented your own logging framework; which happens to probably not be compatible with any other person's implementation. Again, having something in the stdlib that external shards can easily build upon would be the :100: solution, i.e. a framework. The main question is what that should look like.

I agree on the including data idea, however I would make it optional, i.e. def log(message : String, context : Hash). Then you are able to define a message like user logged in, then provide context that relates to that event, like their name/id. How do you handle runtime generated logs if it accepts a named tuple? I'm assuming there is also a Hash overload or something?

From what I can gather there are now three logging implementations.

  1. https://github.com/luckyframework/dexter
  2. https://github.com/blacksmoke16/crylog
  3. https://github.com/hydecr/strange

I tend to like the approach of mine and @watzon's more.

Having the concepts of multiple named loggers that can individually be configured for their purpose is a key point. In the end it comes down to how we want to implement this.

Have the stdlib be more of the framework and common features, and allow others to make PRs/shards to add fancier handlers/formatters/processors OR take a simpler approach and just add more features to the current logger system.

@straight-shoota Could you expound upon implicit logger aggregation? I'm not sure I understand that.

For example in Lucky we do this

# config/logger.cr
logger = Dexter::Logger.new(STDOUT)

Avram.logger = logger
Lucky.logger = logger

Is that what you mean? Maybe I am misunderstanding.

And most importantly, it doesn't offer much in a way of runtime configuration. That's what I'd expect from a more sophisticated logging system.

What do you mean by this? Could you provide an example. I'm not sure I understand.

I think the main problem is ATM you have to do mostly everything yourself. There isn't a clean/good way to custom how things work in diff envs without doing it yourself like you did.

Learning a new logging DSL is also work, and often means you are locked into a certain way of handling logging. I feel I must be missing use cases because class_property is pretty easy to use and I wouldn't say it's any more work than calling Crylog.configure do |registry| and retrieving it from the registry. But maybe I am missing something. I think I don't understand the use cases that provides.

The other downside of such an approach is that you lose type-safety. Using strings will mean you can get runtime errors when trying to access loggers, which IMO is against the Crystal ethos of catching bugs at compile time.

I think that if there was some kind of "framework" for logging that it may have some kind of macros, but under-the-hood it would use class_property for type-safety. For example:

class MyProgram
  # defines a `logger` class_property with a default Logger.new(nil) (or something)
  # defines a `log_level` class_property
  Logger.loggable default_log_level: :warn, default_logger: Logger.new(nil)
end

MyProgram.logger  = my_wonderful_logger

This way it's all type-safe for both setting and getting loggers. This is of course not that fleshed out, but I hope it shows the rough idea.

Runtime logging of dynamic data

I agree on the including data idea, however I would make it optional, i.e. def log(message : String, context : Hash). Then you are able to define a message like user logged in, then provide context that relates to that event, like their name/id. How do you handle runtime generated logs if it accepts a named tuple? I'm assuming there is also a Hash overload or something?

In general I think enforcing that it is key/value makes it much easier to use in prod. So Dexter allows passing a string but converts it to {message: the_string}. In your example above I'd do something like: event: "user_login", user_id: 123. That way it is fully machine-readable.

You could definitely do a Hash of String => String or something, but so far I (and others using Dexter) have not needed that. Usually we know the keys ahead of time but if we don't, then if we need to log a hash or something then I think the best bet is to to_json it and log it as a value, e.g. {event: "github_webhook", payload: payload.to_json}

It's basically explained in the OP under Structure loggers hierarchically, with granular level controls. Your example would be a kind of aggregation, every logger context feeds into a single logger instance. But with different components having different logger needs, the goal is to have separate rules for individual logger contexts and be able to feed them upwards in the hierarchy.
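The upward-feeding hierarchy could be sketched roughly like this (a hypothetical class, not a proposed API): a child filters against its own effective level but delegates the actual write to its parent.

```crystal
class HierLogger
  getter parent : HierLogger?
  property level : Int32? # nil means "inherit the parent's level"

  def initialize(@parent = nil, @io : IO = STDOUT)
  end

  # Walk up until a logger with an explicit level is found.
  def effective_level : Int32
    level || parent.try(&.effective_level) || 1
  end

  def log(severity : Int32, message : String)
    return if severity < effective_level # filtering per context
    write(message)                       # writing is delegated upward
  end

  protected def write(message : String)
    if p = parent
      p.write(message)
    else
      @io.puts(message)
    end
  end
end
```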

As an example of runtime configuration, this totally made-up config format specifies (at runtime) that all log messages above warn should be printed to stdout, and only from the namespaces Foo and Bar::Foo all log messages above debug.

logging:
  basic:
    handler: stdout
    level: warn
  debug_foo:
    handler: stdout
    filter:
    - Foo
    - Bar::Foo
    level: debug

I feel like this should be a shard, or at least proliferate as one. I like having a KISS logger in stdlib rather than something complex I don't need 90% of the time.

@jhass I'm kind of with you. I would not want most of this (but I may be convinced otherwise!), but I would love if the regular logger accepted data as I think that is a much better way to go. And can still work with strings as I mentioned above :D

@straight-shoota Oh I see. I was trying to avoid runtime config like that since it is so easy to have typos and bugs when it is using regular methods. But I can also see the use-case for something like that! Even if it could be made type-safe, I don't think it is necessarily true that things use namespaces in a way that makes them "automatically" configurable like that. I feel that the lib authors would need to provide hooks for specifying what log level is used for certain messages, especially as sometimes different methods in the same class log totally different messages. For example, in Avram we have a log level for logging queries as well as logging invalid saves. Those both behave differently and have different log levels, but configuring by class wouldn't make sense. You kind of have to expose that option as a library author.

One thing I think might make things easier is to have a top-level logger on Crystal that shards can use by default, and can then allow customizing things as necessary. For example:

Crystal.logger # By default is Logger.new(nil)

# But we'll set it to log to STDOUT
Crystal.logger = Logger.new(STDOUT, level: Logger::DEBUG)

if dev?
  Crystal.logger.level = Logger::DEBUG
end

# Libs can then rely on Crystal.logger without requiring people to pass the logger in!
class MyProgram
  def do_something
    Crystal.logger.info("Did something wonderful")
  end
end

# This class does something special. We may want to make it so people can configure it differently. 
class SomethingSpecial
  class_property logger : Logger = Crystal.logger
  class_property log_level : Logger::Severity = Logger::Severity::INFO

  def something_dangerous
    self.class.logger.log(self.class.log_level, "Something crazy happened")
  end
end

# I don't care about this message so I'll set it to info so I don't see it in prod, but want to know about it in dev
SomethingSpecial.log_level = Logger::Severity::DEBUG

What if we really don't want to use class_property?

Maybe we have a very simple module that makes it easy to edit how a class logs (similar to what OP posted)

class SomethingSpecial
  # Would create
  # class_property logger : Logger = Crystal.logger
  # class_property log_level : Logger::Severity = Logger::Severity::INFO
  # Also creates `logger` and `log_level` instance methods for easier access
  Logger.make_loggable default_log_level: Logger::Severity::INFO

  def something_dangerous
    logger.log(log_level, "Something crazy happened")
  end
end
  • Type-safe!
  • Removes some boilerplate
  • Can configure the logger and log-level by default
  • Can of course still use class_properties if you want further customization

I kind of like this approach but would be curious to hear thoughts on it

But also I wonder if there are too many little things here so the discussion is going a bit all over. Maybe we can split this into a few discussions

  • How to handle formatting?
  • How to handle log messages (should it be key/value data?)
  • Should we expose a global logger that shards can rely on by default (but can of course open things up for customization if they want via class_property or whatever other mechanism we create)
  • How do we configure log_level/logger (registry? class_property? YAML?)

I think this may be beneficial to split as these can all be handled separately (for the most part).

For example I think a Crystal.logger that shards can use by default would be awesome. That way we don't need to pass that in to every single shard. That's also something that could be added before all that other stuff is decided, if we can agree upon it

@jhass Totally! It has already been established in the course of this discussion that the implementation should not be in stdlib.
But the idea is to define an interface that every shard can use. Stdlib provides the framework and a basic logger implementation which should suffice for most use cases. But when needed, it can be backed by a more complex logging system that hooks into the framework.

So for stdlib it means little more than the current implementation, just a bit of API refactoring to enable more advanced features. The key property is that the framework is defined in stdlib, and for a shard using it, it should be completely transparent.

@paulcsmith

For example I think a Crystal.logger that shards can use by default would be awesome. That way we don't need to pass that in to every single shard. That's also something that could be added before all that other stuff is decided, if we can agree upon it

That's generally the idea behind this issue. But: a centralized logger instance needs some kind of context-awareness. Consider two components/classes/shards that happen to send log messages with the same content. When you use it like Crystal.logger.info("Did something wonderful"), there is no way to tell where the log message originated. This way it's clearer: Crystal.logger.info("Did something wonderful", namespace: "MyProgram"). But calling it like this would add more verbosity.

So this discussion has been about ideas for how to improve this. For example, calling it as Crystal.logger("MyProgram").info("Did something wonderful"). The call to Crystal.logger can be cached in local scope so you can just call logger.info("Did something wonderful") and it will assign the appropriate context information.
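In code, that caching idea might look like this (a `Crystal.logger` that takes a source argument is hypothetical here):

```crystal
module MyProgram
  def self.run
    # Hypothetical: look the context-aware logger up once per scope...
    logger = Crystal.logger("MyProgram")
    # ...then log without repeating the namespace; the source is
    # attached to every message automatically.
    logger.info("Did something wonderful")
  end
end
```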

Some more of my thoughts; with some examples from Crylog on how I see this best working, also see the core concepts.

Learning a new logging DSL is also work, and often means you are locked into a certain way of handling logging

This is true for anything. The goal would be to have it well documented and easy enough that it's not really an issue. Assuming whichever framework shard is "chosen", other libs/frameworks would be updated to use it. Having a singular flexible logging framework that can be used everywhere would be the best-case scenario. Mainly since then other handlers/processors/formatters could be implemented, either in the actual lib, or as external shards.

But maybe I am missing something. I think I don't understand the use cases that provides.

It mainly allows for all the logger information to be defined in a single place. You can also control what processors/handlers/formatters each _named_ logger has conditionally.

One feature that is kinda nice is how each logger is named and independent. You have total control over how each aspect of each logger works; thus you can do stuff like [2019-05-19T00:45:05.558218000Z] worker.DEBUG: Hello from the worker!. In this case the async worker would have a different logger with different requirements, which automatically includes where that message came from.

I feel that the lib authors would need to provide hooks for specifying what log level is used for certain messages

what log level is used for certain messages, especially as sometimes different methods in the same class log totally different messages

That would be one of the features of the framework. Each "handler" would be responsible for controlling if it should log a given message. I can give some examples from Crylog. See handles? and bubble.

  • How to handle formatting?

Provide a sane default but allow each handler to define a custom formatter.

  • How to handle log messages (should it be key/value data?)

It should be up to the user. Have them able to log only key/value pairs, a string and key/value pairs, or only a string. It should also be the responsibility of the handler/formatter to transform the message object into the format the handler expects. I.e. you could have an Elasticsearch (ES) handler with an ES formatter to transform the message into the format ES requires.

  • How do we configure log_level/logger (registry? class_property? YAML?)

I would love to have the option of a YAML based configuration. However I haven't thought a lot on how that would work.

It mainly allows for all the logger information to be defined in a single place. You can also control what processors/handlers/formatters each named logger has conditionally.

I think where I'm confused is that you don't need a framework to have named loggers or conditional logging. You can use an instance variable and create a logger, then assign it

file_logger = Logger.new(io: File.new("tmp/test.log", mode: "w"))
standard_logger = Logger.new(io: STDOUT)

This is also type-safe (no accessing registries with strings). No new DSL, and infinitely customizable because you can use conditionals or whatever else.

A real life example is here: https://github.com/luckyframework/lucky_cli/blob/master/src/web_app_skeleton/config/logger.cr

Lots of conditionals, different loggers, super customizable, type-safe, etc and no new code. I wonder if part of this is we just need guides and examples on how to do it?

Creating multiple named loggers can be done right now as shown in the example so I must be missing something else.

That would be one of the features of the framework. Each "handler" would be responsible for controlling if it should log a given message. I can give some examples from Crylog. See handles? and bubble.

Could you explain this more? How is this different than this:

quiet_logger = Logger.new(io: STDOUT, level: :info)
loud_logger = Logger.new(io: STDOUT, level: :debug)

WonkyLibrary.logger = loud_logger # tell me everything!
TrustworthyLibrary.logger = quiet_logger # just the info and errors please

As a separate issue, the log_level that the library uses could be customized using some kind of lib like yours (or, as I suggested, class_property, which is already there and documented). Though it could be wrapped in a simple macro like:

https://github.com/crystal-lang/crystal/issues/5874#issuecomment-530138093

Overall I think the goal should be as minimal code/DSL necessary to learn and use. If we can use regular methods and local vars, why not!

I think where I'm confused is that you don't need a framework to have named loggers or conditional logging.

No you don't, but it does help in wiring everything together. Especially when having additional features other than a simple logger class.

Could you explain this more? How is this different than this:

Sure. In your example each named logger is mapped 1:1 to a logger instance. In Crylog a named logger has an array of "handlers", which is simply something that sends the logged message to somewhere/something.

When a message is logged the first handler "handles" that message. If bubble is true, the same message flows to the next handler, which can either handle or ignore the message. This allows a single named logger to send each logged message to multiple locations.
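The handles?/bubble flow could be sketched like this; the names are loosely modelled on Crylog but this is not its exact API:

```crystal
abstract class Handler
  # When false, a handled message stops here instead of flowing on
  # to the next handler.
  property bubble = true

  abstract def handles?(severity : Int32) : Bool
  abstract def write(message : String) : Nil
end

class NamedLogger
  def initialize(@handlers : Array(Handler))
  end

  def log(severity : Int32, message : String)
    @handlers.each do |handler|
      next unless handler.handles?(severity) # skip, but keep flowing
      handler.write(message)
      break unless handler.bubble            # handled; stop if no bubbling
    end
  end
end
```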

For example you could have a handler that logs the same message to STDOUT, a log file, and optionally sends an email if the severity is high enough. IO::MultiWriter could be used for most IO related things, but alas, sending an email, or pushing a message into ES/sentry is not IO.

Since each named logger instance consists of independent handlers, they can each be "built" from all your "blocks" (i.e. handlers/processors/formatters) to best handle the use cases that specific named logger will be used in. For example, doing HTTP requests, or handling async messages. This inherently gives you context on where a message came from.

Similarly, since each Crylog logger is not directly a Logger instance, it can have additional features, such as processors. A processor is something you can define that adds "metadata" to every logged message. Take this snippet from Athena for example.

# user_processor.cr
struct UserProcessor < Crylog::Processors::LogProcessor
  def call(message : Crylog::Message) : Nil
    user_storage = Athena::DI.get_container.get("user_storage").as(UserStorage)

    # Return early; if a message was logged in a public endpoint there won't be a user in storage
    return unless user = user_storage.user?

    # Add the current user's id to all log messages
    message.extra["user_id"] = user.id
  end
end

# other_file.cr
Crylog.configure do |registry|
  registry.register "main" do |logger|
    ...
    logger.processors = [UserProcessor.new]
    ...
  end
end

Now imagine that the UserStorage class contains the current "logged in" user in your API/app. This processor would automatically add extra context to every logged message that passes through the named logger.

Crylog.logger.info "User logged in"

[2019-05-27T22:40:03.445515000Z] main.INFO: User logged in {"user_id":1}

Another scenario would be having a processor specific to your HTTP request logger that can add e.g. the body/headers/host of every failed request.

Logging vNext

The following is the logging module Log we've been designing recently. The implementation is not done yet, but we've been iterating on different scenarios and requirements to reach the current proposal.

Generating logs

There are a couple of severity levels that can be used to emit logs: debug, verbose, info, warn, error, fatal.

Usually you will type Log.[level] { "a message" }

Log.info { "Hi there!" }

Note: Using the result of the block for the message allows the runtime to skip possible conversions and string interpolations needed to build the message if it is not going to be emitted.
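To illustrate the point of the note above: the block body, including any expensive interpolation, is only evaluated when the message will actually be emitted (`expensive_dump` is a made-up costly call):

```crystal
# With the block form, `expensive_dump` never runs if the debug
# level is filtered out:
Log.debug { "state: #{expensive_dump}" }

# An eager-argument form would always build the string first and
# pay the cost even when the message is discarded:
# Log.debug("state: #{expensive_dump}")
```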

The Log is a new top-level constant in the std-lib (Logger will be deprecated and removed).

When generating logs you will use a Log instance. Each Log instance is bound to a source or path. Log.get(source : String) : Log will allow you to get a new instance. The top-level Log acts as the instance for the empty-string source.

When defining a new shard or module you will define a Log constant for that namespace.

module DB
  Log = ::Log.get("db")
end

Thanks to the relative resolution of constants, using Log.info { ... } inside DB types will resolve to the one previously defined and not the top-level one.

If you move code around during development, you will not need to deal with logger/@logger/@@logger substitutions.

If you need nested sources, you can use the parent namespace Log and avoid hard-coding the source name.

class DB::Pool
  Log = DB::Log.get("pool")
end

Or, if you want to hard-code it.

class DB::Pool
  Log = ::Log.get("db.pool")
end

The result of Log.get(source : String) will return the same object instance for the same source. This will allow overriding the severity level of the log during specific moments programmatically.

Log.get("db").level = :verbose # or Log::Severity::Verbose
do_something_with_db
Log.get("db").level = :none # Log::Severity::None disables all levels

# DB::Log can be used instead of Log.get("db")

Note: Setting the level is specific to a source and not its nested ones. The previous example does not affect the db.pool source.

Note: Setting the level of a Log instance will affect all other Fibers that are concurrently using it.

When building a program, it will be possible to completely strip debug calls as part of compilation. Then there is no penalty, not even the check of whether those logs should be emitted: the debug methods can be replaced by a noop at compile time.
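One conceivable way to do this is a compile-time flag check in a macro. This is purely a sketch of the idea, not the planned implementation (`emit`, `level_includes?`, and the flag name are made up):

```crystal
class Log
  def debug(&)
    {% if flag?(:skip_debug_logs) %}
      # The entire method body is removed at compile time: no level
      # check, and the message block is never even invoked.
    {% else %}
      emit(:debug, yield) if level_includes?(:debug)
    {% end %}
  end
end
```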

Writing backends

The std-lib will come with a basic backend to write to stdio but you can define your own backend by inheriting Log::Backend.

  • #initialize(Log::Backend::Config) will be used to set up the backend with some given configuration
  • #write(Log::Entry) will be called to write the given log entry.
  • #close to close resources used by the backend
  • The Log::RegisterBackend annotation will define the name to reference it in the configuration.
@[Log::RegisterBackend("custom")]
class MyCustomBackend < Log::Backend
  def initialize(config : Log::Backend::Config)
  end

  def write(e : Log::Entry)
  end

  def close
  end
end

Whether the write operation is sync or async will be up to the implementation. There might be some base modules to easily implement async backends (similar to IO::Buffered).

Configuration

There will be a way to configure a series of backends and use them to write the logs of certain sources.
When requesting the Log for a source, the runtime will load the configuration and bind the Log instance with the corresponding backends.

The std-lib will read a crystal-log.yml from the current directory. Setting CRYSTAL_LOG_CONFIG changes the file location.

A sample of the configuration file with potential options:

version: 1

backends:
  stdio:
    type: stdio # -> Search for @[Log::RegisterBackend("stdio")]
    stderr_level: warn
    formatter: Log::DefaultFormatter
    format: # pattern/template 
    color: true

  file: # if no type property is given the name will be used as such. So @[Log::RegisterBackend("file")]
    file_pattern: app.%d{yyyy-MM-dd}.log
    location: /var/logs/
    max_file_size: 10mb
    formatter: Log::DefaultFormatter
    format: # pattern/template 

  elastic:
    url: localhost:9200
    index: app-%YYYY-%mm-%dd

bind:
  - sources: * # other options: ⊔ (empty string), db, db.*
    level: WARN
    backends: stdio, db

  - sources:
      - db.query.*
      - http.client.*
    level: INFO
    backends: 
      - stdio
      - db
  • ⊔ (empty string) will show only the top-level source
  • * will show all the sources
  • a.b.* will show a.b and every nested source.
  • a.b will show only a.b
  • sources and backends can be array or comma separated values for convenience

The default configuration when no config file is found is:

version: 1

bind:
  - sources: # (empty)
    level: INFO
    backends: stdio

You can set CRYSTAL_LOG_LEVEL, CRYSTAL_LOG_SOURCES to change the default configuration.

Implicit Context

The context information will live in the fiber. A new fiber starts with an empty context. Contexts are immutable but can be extended with convenience methods. Because contexts are immutable, a backend can keep a reference to the context that was used when generating the Log::Entry.

The current context can be read from Log.context.

You can change the current context with clear and set methods.

Log.context.clear # => clears current context
Log.context.set request_id: 34, user_id: "[email protected]" # => extends the current context

When spawning Fibers you can inherit the context explicitly:

c = Log.context # current context
spawn do
  Log.context = c
  # ...
end

The context.using method can be used with a block to restore the previous context afterwards.

Log.context[:request_id] # => 23

Log.context.using do 

  Log.context.set request_id: 57689
  Log.context[:request_id] # => 57689

end # previous context will be restored at this point

Log.context[:request_id] # => 23

Under the hood the context.using version uses .context/.context=. The context.using block example is equivalent to:

Log.context[:request_id] # => 23

previous = Log.context
begin

  Log.context.set request_id: 57689
  Log.context[:request_id] # => 57689

ensure
  Log.context = previous
end


Log.context[:request_id] # => 23

Other types

Disclaimer: this is more the structure description rather than the actual definition.

alias Backend::ConfigValue = Bool | Int32 | Int64 | String | Array(ConfigValue)
alias Backend::Config = Hash(String, ConfigValue)

alias ContextValue = Bool | Int32 | Int64 | Float32 | Float64 | String | Time | Hash(Symbol, ContextValue)
alias Context = Hash(String, ContextValue)

record Entry, source : String, severity : Severity, timestamp : Time, message : String, context : Context

I think this'll definitely be an improvement. However, I have some questions.

  1. Will interfaces (modules) be defined for the types, like Log::Backend and formatters, etc? This would make it easier to expand/offer custom implementations while still being compatible everywhere.
  2. Has any thought gone into including processors as a third concept? It can be helpful, especially when paired with DI, to include context about the current user/request etc. with every logged entry.
  3. How would Log.get work when working with multiple fibers, such as HTTP::Server? Would each request get its own instance, or would they be shared? It sounds like the context would be fiber-specific, which is good.
  4. Related to 3., are logger instances instantiated once at program start or?

@Blacksmoke16:

  1. I don't see why not. We have in mind some base classes that could be used to create new backends, but a module to start from scratch might be defined as well
  2. What advantage do processors have over just setting the injected context? Processors would be called everywhere, even for contexts where there are no values to set.
  3. Log instances will be shared between threads. Context belongs to the current fiber.
  4. There is no implementation yet, but with the suggested idioms to create Log instances, they will be created on program start.

What advantage do processors have over just setting the injected context? Processors would be called everywhere, even for contexts where there are no values to be set.

Actually let me back up and ask another question. Since the context belongs to the current fiber, that would mean the context is fiber wide and not tied to a specific logged message.

For example, say I did Log.context.set user_id: user.id within my authorization class, then everything logged within the context of that fiber would have a reference to that user_id context correct?

If yes, then yea, a processor isn't really necessary.

For example, say I did Log.context.set user_id: user.id within my authorization class, then everything logged within the context of that fiber would have a reference to that user_id context correct?

Exactly! The context is actually immutable and a new instance is created every time you set a different one. This is to allow backends to flush log entries asynchronously without having to copy the context every time. So at the end, the log entries have just a reference to the current context at the moment of calling the log method.
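A small sketch of what that immutability implies (hypothetical code, based on the description above): setting the context replaces the instance rather than mutating it, so a backend can hold a cheap reference.

```crystal
snapshot = Log.context           # an entry/backend can keep this reference
Log.context.set request_id: 99   # builds a *new* immutable context instance
# `snapshot` is unchanged, so asynchronous flushing needs no defensive copy
```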

I love this! Fixes a lot of issues, and it's a pretty unique take on logging in a standard library.

Formatters will be something each backend handles, but we don't rule out some helper modules.

We are also iterating on some API to extend the log entry with Exception objects.

Thanks for the feedback.

I like the proposed design :+1:
My main concerns are about default behaviour and initialization, like how we can avoid having every crystal program try to read a crystal-log.yml file on startup. But such details can be sorted out when the base functionality is implemented.

Having Log instances as constants in Crystal namespaces sounds good. I figure you can still use local instances like log or @log, which would be handy when you want to separate log messages into different log namespaces even when the code lives in the same Crystal namespace.
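For example (hypothetical names and source string), a type could carve out its own log source, independent of the Crystal namespace, with an instance variable:

```crystal
class MyProgram::Importer
  # a dedicated source name that doesn't mirror the Crystal namespace
  @log : Log = Log.get("myprogram.importer.jobs")

  def run
    @log.info { "import started" }
  end
end
```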

Another question.

  def initialize(config : Log::Backend::Config)
  end

I'm assuming the Config object would be something similar to YAML::Any?

Ideally I think you could use https://github.com/crystal-lang/crystal/pull/8406 along with https://github.com/crystal-lang/crystal/pull/7648 to essentially build out the discriminator automatically; and also gain the ability to directly new up the backends.

Thanks @bcardiff for sharing the new design! this definitely solves a lot of the shortcomings of the current logger.

Has there been any thought about whether some of this should live in a first-party shard, instead of directly in the standard library?

I personally think that the standard library should only include the core logging functionality (writing logs, an interface for outputting them, and an implementation of that interface for writing to stdio). Including a parser for dynamically loading a config file seems like something that would be better left in a shard.

Perhaps the stdlib should only include the ability to configure loggers in code, and then a shard can add functionality to read a config file and map that to the configuration exposed by the stdlib.

This would avoid having all Crystal programs read a crystal-log.yml file, and give application developers the option to embed the config in the binary (or load if from another source).

Even if it does all end up in the stdlib, I would really like to see the ability to define all this config in code instead of relying on a file or environment variable.

I agree, I really really love this design, it's everything I ever wanted. Thank you so much to @bcardiff and @waj. My only comment was to request that the yaml configuration reader live in a separate (blessed) shard, which @willhbr already covered :)

@bcardiff Thanks for the thorough outline. I'm really digging a lot of this stuff. I have a few things I love and a few concerns/suggestions.

Implicit Context

This is awesome. I think the way you've outlined it sounds perfect. Shared using a fiber, easy to use, easy to copy to other fibers if needed. Awesome!

Rename to Log

I think it reads nicely and is a bit shorter. The perfect name IMO 👍

Consistent interface

I'm loving that you just always use Log everywhere. That is a very awesome innovation :)

Using yaml config, and string based log and backend getters

I think this will make it much harder to detect issues with yaml config, code, and documentation since it is not as well type-checked as real Crystal code.

Here are the types of problems I think people will run into

Typos and unused/incorrect arguments

# Typos, especially for Log in shards since it is 
# harder to see what loggers are available unless it is well documented by the shard
Log.get("db.queyr") # Whoops, runtime error

This also applies to the yaml config:

backends:
  file: 
    # Typo in arg name. I'm guessing this would be ignored or need special code to catch stuff like this
    file_pattrn: app.%d{yyyy-MM-dd}.log 
    location: /var/logs/
    max_file_size: 10mb
    # Another spot where typos and inconsistencies can come up
    formatter: Log::DefaultFormatter 

One more thing to learn

There are now multiple ways to get/configure things. You can reference the class, or the string name. Not a big deal at all, but a bit of a downside.

Documentation

It will be harder to document what loggers are available and what they are called since the string name of the logger is not exposed in docs with the current implementation. You'd have to manually document it. Maybe a macro/annotation could be added to make it easier, but this is still extra work to get something that could "just work" if done with pure Crystal.

Another issue is with documentation of backend options. Since the options are just generic types the shard author will have to manually document all the options and expected types of those options. They will also have to make sure to keep them updated if the options change.

Proposed solution

I think we can make a lot of these things available "for free" by taking a slightly different approach.

Using concrete constants and methods

I would propose configuration with code by using an actual constant (module or class, no preference) and configuring those in code rather than in YAML. I think this will make for a program that has much better compile-time guarantees, documentation, and flexible configuration.

Here's roughly what I think that would look like:

module DB
  Log = ::Log # no need to give it a name, instead reference it directly
end
# Only allow changing via constant, not string lookup
DB::Log.level = :verbose # or Log::Severity::Verbose
# Configure logger in code
Log.level = :info
Log.backends = [MyCustomBackend, StdioBackend]

# The cool part is you can easily use conditionals, ENV, and all other Crystal features
if ENV["CRYSTAL_ENV"]? == "production"
  Log.level = :info
else
  Log.level = :debug
end
Log.backends = [MyCustomBackend.new, StdioBackend.new]

How to share log defaults

Let's say you want a base db logger and a few loggers (like db.pool in the examples) that live "under" that logger.

We could use constants here, much like proposed, just without the string names

module DD
  Log = Log

  class Pool
    Log = ::DB::Log
  end
end

Then configure the defaults for DB and optionally set overrides for the Pool logging:

DB::Log.level = :debug
# And override specific ones if you want
DB::Pool::Log.level = :info

Admittedly I have not done a deep dive on how it would inherit the config. I assume that's been figured out already, but if not I think it is definitely possible. It may need a bit of macro work to inherit the settings, but I am unsure.

The main advantages with this approach are:

  • Enhanced compile-time guarantees. Compiler errors for spelling, incorrect options, etc.
  • No need to document what to use to configure a logger. You just use the constant every time
  • Much more powerful config since you can use all of Crystal, ENV, etc.. Also config wouldn't require reading YAML which I think is nice as @willhbr mentioned

Backends

I think that the backends are basically an IO but need a few extras, like allowing config through YAML and formatting data before writing. By using some built-in Crystal awesomeness I believe we can get better compile-time guarantees for arguments, better documentation of allowed arguments, and (maybe if we're lucky) the ability to use any IO (like file IO, etc.) as a starting point. This may already be possible with the existing implementation but I'm not sure... and it is ok if that's still not possible.

module Log::Backend
  macro included
    class_property formatter : ::Log::Formatter = Log::DefaultFormatter
  end

  # some_method_for_formatting_input_before_writing
  # abstract def write, etc.
end

# Maybe you can inherit from any other IO by using a module that you include
# instead of a class you inherit from? Not sure, but either way module is more flexible IMO
class FileBackend < File
  include Log::Backend # This would set some class properties like `formatter`, and other methods needed by Log
  # Add extra config with class properties
  class_property? skip : Bool = false

  def write
    # In real life you'd do something actually useful, but this shows the concept
    unless skip?
      # write
    end
  end
end

# Configure it
FileBackend.formatter = MySuperSpecialFormatter
FileBackend.skip = true

This is all a bit of pseudo code but the important parts are:

  • Using a module instead of a class. This allows a bit more flexibility, similar to when HTTP::Handler was changed from a class to a module you include
  • Configuration with code. Whether with class_properties, an initializer, whatever

    • Much more flexible configuration, automatic documentation, better compile time guarantees

New feature suggestion: key/value data

One of the main reasons I wrote Dexter was to accept key/value data so that you can format things in hugely different ways. So you can do logger.info(user_id: "foo") and format it any number of ways: JSON, Splunk style, etc.

Maybe this is already a part of the proposal and I missed it, but I think this is hugely important to making formatters and logging far more flexible. For example Dexter has a JSON formatter, but we also have a Lucky formatter that makes logs much more digestible by humans. It would not be possible without key/value pairs.

What we do when passed just a string is we put it under a message key. That way you can still pass strings. Formatters can omit the key if they want, or ignore keys completely, but if you do want to print keys differently then you can.

Would you be open to adding that? It was fairly simple to add to Dexter and opens up a lot of possibilities.

I'm super stoked about this initiative!!

module DD
  Log = Log

  class Pool
    Log = ::DB::Log
  end
end

Maybe this is a typo, but here Log always refers to the same constant, so I can't understand how this results in different log instances, nested instances and so on.

What I hope is that Log.get can accept any object and just call to_s on it. Or maybe just String and Class, so you would be able to do:

module DB
  Log = ::Log.get(self)
end

Then DB::Log's name is "DB", automatically inferred from the type name.
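Spelled out with nesting (hypothetical; the exact name format for nested types is an open question here):

```crystal
module DB
  Log = ::Log.get(self) # name "DB", inferred from the type

  class Pool
    Log = ::Log.get(self) # name "DB::Pool" (or "db.pool"; exact format open)
  end
end
```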

@asterite you are right. Good point. I like that idea a lot. That way you don't need to set it yourself. You could also still configure it with Crystal's type-safety goodness 👍 I dig it. I imagine the interface can stay largely as planned that way

I'd suggest changing Log.get to Log.for(self). Since it isn't really getting a new one, and that way it discourages trying to use it as a getter in other places. But this is a minor thing so totally fine if no one likes that 🤣

module DB
  Log = ::Log.for(self)
end

DB::Log.level = :info

I do agree that using a yaml config might not be the best idea. May as well just have the log config happen in code, no?

Should the yaml config be a separate shard?

We aim for a design that will allow injecting different ways of providing the configuration. But without one default proposal, the user experience will be chaotic. How do I configure logging? It depends on the shard. And there is no default. That will cause a lot of stress because you will need to learn how each app or framework decides to do things. While custom configuration will still be an option, it should not be required for all apps.

Typos and unused/incorrect arguments

How to detect a typo vs. a not-yet-bound source? I'm not sure there is a way to solve that.

The only thing I can imagine is something that will emit the paths requested by the user, maybe a Log.verbose in the log source when creating a log :-P, and leave the user to notice all the paths that are potentially used. Of course it won't work for dynamic sources.

Issues with wrongly configured backends

It will be up to the backend, in its initialization, to complain if something is wrongly configured.

The config type will support a narrow set of types similar to JSON::Any, but without schema validation.

Using concrete constants and methods

It does not work since Log = ::Log will not create a new instance; it just creates an alias. It's hard to allow users to create their own loggers while also constraining all the available options.

(ok, @asterite covered)

Backends

For IO-based backends there might be some code reuse available, but not all backends are IO- and formatter-oriented. Elasticsearch, for example, could store the whole entry object.

New feature suggestion: key/value data

That would be the use case for context. We thought of adding some overloads to extend the context at the same time the message is sent, but we didn't reach an option that made us happy enough.

We also want to enforce there is always a message per log entry. Not just context data.

Does this mean I won't need to do require "yaml" to work with YAML because Crystal's runtime will already require it? (which also means the YAML module would be parsed and compiled for every app)

Oh, the issue is the libyaml dependency.

Ok, by default it will only read the CRYSTAL_LOG_LEVEL/CRYSTAL_LOG_SOURCES env variables and if require "log/config" is included in the app the yaml configuration will be available.
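Under that default, enabling debug output for db sources would just be a matter of environment variables at launch (a sketch assuming the variable semantics described above; `./my_app` is a placeholder binary):

```sh
CRYSTAL_LOG_LEVEL=DEBUG CRYSTAL_LOG_SOURCES="db.*" ./my_app
```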

I think it would be nice for YAML not to be as big a part of the Crystal project as it is now, though YAML has kind of won out. Here is a breakdown of some config languages, and I like languages written in Crystal like Con. It would be nice to not have libyaml be required by a Crystal app, but it might happen.

@paulcsmith It is a necessary requirement that logging can be configured at runtime. Sysadmins need to be able to configure complex logging setups without having to rebuild the binary. That also means it's simply not possible to ensure type safety.

That being said, we can obviously expose an internal interface to the configuration. This would be useful for example for setting up default config. Or if your app doesn't need runtime configuration.

@all Please let's not get into detail about choice of configuration format and other specifics. The primary goal right now should be to discuss and implement the general architecture.
Details of automatic configuration are two steps ahead. Contemplating that only keeps us from discussing what's on the table right now.

Thanks for the response @bcardiff

Should the yaml config be a separate shard?

We aim for a design that will allow injecting different ways of providing the configuration. But without one default proposal, the user experience will be chaotic. How do I configure logging? It depends on the shard. And there is no default. That will cause a lot of stress because you will need to learn how each app or framework decides to do things. While custom configuration will still be an option, it should not be required for all apps.

I definitely agree it should not be separate. There should be one sanctioned way to do things so config is not different for every shard. My suggestion is to not use YAML at all and use code instead. You can inject configuration using Crystal code, and if you really want to you can read YAML or ENV vars or whatever using Crystal, but that is up to the end user.

Typos and unused/incorrect arguments

How to detect a typo vs. a not-yet-bound source? I'm not sure there is a way to solve that.

I think this could be solved by only allowing config via the constant. Then you will get compile time guarantees from Crystal, but maybe I am misunderstanding. I think the name binding is only helpful for YAML, but if that is not used then it is no longer an issue.

The only thing I can imagine is something that will emit the paths requested by the user, maybe a Log.verbose in the log source when creating a log :-P, and leave the user to notice all the paths that are potentially used. Of course it won't work for dynamic sources.

Issues with wrongly configured backends

It will be up to the backend, in its initialization, to complain if something is wrongly configured.

The config type will support a narrow set of types similar to JSON::Any, but without schema validation.

I think this is not making the best of Crystal's awesome type-system and compile time guarantees. If instead it is configured in code (not from YAML) you get nice documentation on what args and types are used, and you get nice compile time errors pointing to the exact problem and where the code is that caused it.

Using concrete constants and methods

It does not work since Log = ::Log will not create a new instance; it just creates an alias. It's hard to allow users to create their own loggers while also constraining all the available options.

(ok, @asterite covered)

Backends

For IO-based backends there might be some code reuse available, but not all backends are IO- and formatter-oriented. Elasticsearch, for example, could store the whole entry object.

That makes sense 👍 But I still think that using a module instead of a class is better, so that if you do have a base class you want to use, you can.

New feature suggestion: key/value data

That would be the use case for context. We thought of adding some overloads to extend the context at the same time the message is sent, but we didn't reach an option that made us happy enough.

We also want to enforce there is always a message per log entry. Not just context data.

Can you explain a bit more about why you want to enforce a message per log entry? At Heroku and in my own apps we almost always use key/value data for everything: Log.info(operation_name: "SaveUser", author_id: 123) with no particular "message". We just want to log some info about how things went. Would that be possible in the proposed solution?

Under the hood I imagine it could use existing constructs:

def log(data)
  Log.context.using do
    Log.context.set data   # set the key/value pairs as context
    # ...emit the entry while the enriched context is active...
  end
end

About YAML

I think I'm confused because I don't see the advantage of using YAML over Crystal code. Crystal code is safer at compile time and way more flexible (can use ENV, can use conditionals, etc.). But maybe I'm missing why it is useful. Could someone elaborate?

@paulcsmith It is a necessary requirement that logging can be configured _at runtime_. Sysadmins need to be able to configure complex logging setups without having to rebuild the binary. That also means it's simply not possible to ensure type safety.

All my examples showed configuring logging at Runtime. I was never suggesting anything else

# At runtime
Log.level = :info
MyBackend.formatter = MyFormatter

You can also use ENV variables if needed to allow even more flexibility. But maybe I'm misunderstanding what you mean. What does YAML give us that Crystal code does not? If we add a less compile safe and less flexible way of configuring logging I think it would be good to know what the advantages are.

That being said, we can obviously expose an internal interface to the configuration. This would be useful for example for setting up default config. Or if your app doesn't need runtime configuration.

@ALL Please let's not get into detail about choice of configuration format and other specifics. The primary goal right now should be to discuss and implement the general architecture.
Details of automatic configuration are two steps ahead. Contemplating that only keeps us from discussing what's on the table _right now_.

I don't believe this is true. The choice to use YAML affects the interfaces. If we use config in Crystal then we can use concrete types and arguments, and don't need to "register" backends, set a name for pools, or do validation of backend args at runtime. Almost all of it can be done by the compiler. It also means configuration of backends can happen without a hash type and can include a more concrete set of types and type arguments. If we can do everything in code that we can in YAML, then I don't think we should use YAML. But if YAML gives us something awesome, that's great! I'm curious what it gives us though

@paulcsmith Log.level = :info is not runtime configuration because it is defined at compile time. Configuration written in Crystal can't be changed without rebuilding. You could set up some flags to select between different config sets. But configuration for an entire logging setup can be pretty complex with maybe dozens of backends and rules. No in-code configuration can provide the necessary flexibility for that.

Which data format to use for that is really not the point right now. The important part is, there needs to be some kind of configuration that's not done in code while compiling an application. Instead, a user/sysadmin/devop should be able to edit a simple text format to configure the logging behaviour before executing the app. Or even change it without restarting the application.

Strictly, this all wouldn't even need to be in the default stdlib implementation (although there are good reasons for that, as per @bcardiff). But the stdlib configuration mechanism must be flexible enough to support this. It can't just work with compile time safety, because there is no such guarantee when configuration values come from outside the compilation process (i.e. at runtime).

@straight-shoota Ah I think I am starting to see what you mean. However even with YAML you would have to restart the app. Unless we also add some kind of file tracking that reloads configuration automatically when the YAML changes, but that seems a bit much.

Personally I've never worked with anyone making such significant changes to a running system as you are proposing. If there are changes, it is done very purposefully and an ENV var is introduced to configure particular parts without requiring a redeploy (just a restart).

But also, I don't think we should shut down this conversation until we have a clear understanding on both sides, because I think that will help us come to a really great solution and a better understanding of how people are using this 👍

After some discussion with @straight-shoota I understand the argument for YAML now, and also get where I did a bad job with my examples. Thanks for helping me work through some of that.

I think we can do some kind of approach that blends the best of both worlds and good to have concrete examples tomorrow that we can talk about and see if people like it/hate it

I'm fine with the yaml config being in the stdlib if it requires an explicit require "log/config". require "log" shouldn't introduce any runtime behavior by itself. This means that shards can require log safely, always.

I'm glad we mostly seem to be agreed on the core proposal anyway.

@bcardiff After a few discussions with others I realized why people may need YAML config, and I also realized I might have misunderstood what effect YAML config would have on the Logger proposal.

My original thinking was that the YAML config would affect the backends (and possibly the Log classes). I still think that may be the case, but would love clarification.

I thought based on the examples that backends would be required to have an initializer that accepts Log::Backend::Config. Is that the case?

For example, I believe I could not do this because it does not know how to accept the YAML config:

@[Log::RegisterBackend("custom")]
class MyCustomBackend < Log::Backend
  def initialize(path : String)
  end
end

MyCustomBackend.new(path: "/foo/bar")

Can you confirm if that is correct or not? If it is I've got some ideas. If you all are set on the current approach that is ok. LMK and I'll back off :) But if you're still open to ideas I think I have some that may be pretty cool!

I thought based on the examples that backends would be required to have an initializer that accepts Log::Backend::Config. Is that the case?

Yes

The yaml config would be

backends:
  custom:
    path: /foo/bar

And the code something similar to:

@[Log::RegisterBackend("custom")]
class MyCustomBackend < Log::Backend
  @path : String

  def initialize(config : Log::Backend::Config)
    @path = config["path"].as_s? || DEFAULT_PATH
  end
end

@bcardiff IIUC, the initializer accepting Backend::Config is only required when it's supposed to be used with dynamic configuration. Also, it doesn't mean that there may not be a specialized initializer to expose a dedicated API for programmatic configuration. I suppose that's what Paul's interested in.

I'm assuming there's a reason this couldn't be done with YAML::Serializable? That would remove the need for an "Any" style object.

Something like I mentioned https://github.com/crystal-lang/crystal/issues/5874#issuecomment-587167789.

@bcardiff Thank you for clarifying, and yes @straight-shoota that is partially true. I could add an initializer that is better suited to code, but the default initializer for the Backend::Config will be there. So if I write a shard, I would need to implement both, otherwise the backend would not work for people using YAML. I suppose I could just leave the YAML initializer as a no-op and tell people to use the code version, but then that seems unfair to those using YAML and very unexpected, since the method is there but just wouldn't do anything. @straight-shoota here's kind of what I mean:

# Leaving out boilerplate
class MyBackend
  def initialize(@foo : String)
  end

  # If someone tries to set `foo: "bar"` in YAML it will not be set
  # unless I also add an initializer that handles the YAML config:
  def initialize(config : Config)
    @foo = config["foo"].as_s
  end
end

I've got some ideas, but will try to write up actual samples since I'm bad at explaining in words 🤣. Would also be happy to hop on a call sometime if I'm still talking nonsense! The rough idea is that I think YAML config should be talked about separately, and Crystal should have a way to configure modules in the stdlib or with a shard. So all shards can use it, not just the logger. And because of that, I think the Log implementation would be better off as code-first, and the config idea I have would kind of work "automatically". This would lead to more type-safety when configuring with code, less duplication, and better docs (since a code-first approach means you can set the exact named args and types you want)

Another issue, that hasn't been touched yet: It would be great to be able to modify configuration settings (especially source/severity routing) during the runtime of a program. That doesn't need to be part of the initial implementation, but it should be possible to add later.
Right now Log is immutable, but I figure it would technically allow modifications. This should probably work through the builder revisiting its logger list. Not a public API to modify an individual instance.

The use case would be long-running applications where you want to change for example the severity level or log destination while the application is running. It shouldn't be necessary to restart the application just to update the logging configuration.

Ok I gave a shot at writing it down 🤞

My goals for this comment

  • See if we can remove the YAML config from Logger completely for this first pass
  • Propose that we discuss how to make Config first-class in Crystal for all libs not just Log

Proposal for code-first approach and no YAML (at first)

First I'd like to show why I think a code-first approach makes for safer, better documented, and more flexible code. I will address my thoughts on the YAML config later in the document.

# Leaving out boilerplate for now
class MyFileBackend < Backend
  # The advantage of this is:
  #
  #   * Great and automatic documentation of the types and args!
  #   * Awesome compile time messages if you typo, give it wrong type,
  #     forget an argument
  #   * Doesn't require YAML
  def initialize(@file_path : String)
  end

  def write(message)
    # pseudo code
    @file_path.write_to_it(message)
  end
end

# Of course if YAML is kept as proposed we *could* do this in code.
# The problem is:
#
#   * Little type-safety. `{"fil_path" => "foo"}` would not fail at compile time
#   * Validation of args must be hand rolled by the backend instead of the compiler
#   * The author must manually write documentation on what args it accepts
#   * Usage in code is not as smooth
MyFileBackend.new({"file_path" => "some-path"})

# As opposed to this:
#
#   * Compile time guarantees
#   * Can accept procs if needed
MyFileBackend.new(file_path: "some-path")

An author could implement a code approach and a YAML approach in the Backend, but that means:

  • You need to duplicate code
  • Still need to hand-roll validations in the YAML config (for example, unknown args)
  • If third-party shards just implement the YAML config, you have no choice but to use YAML, or use the not-so-type-safe Hash-like config object (or monkey patch their backend 😜)

Ok, but what about people that do need YAML config?

  • YAML is helpful for pre-built binaries so you can configure without modifying the source code, so it'd be nice to have something for those people
  • Some people just like YAML config :)
  • YAML config should not make code worse for people not using YAML
  • If people want YAML config for Log, I'm sure they want to use it in other places too!

My proposal: a totally separate issue/PR for Crystal configuration for all libs (not just Log)

  • Shards/libs can use it for all their code :D
  • Standardized config for all projects to use so there aren't a bunch of ways to handle config (I love this about Elixir)
  • Log can use a code first approach to start. Easier to code (not so many types, no need to implement or maintain the Config alias).
  • Will make Log improvements easier to ship since we can split discussions and won't have too much stuff in one PR

How would it work

I think this can all be discussed in a separate PR and Log can move forward with a code-first approach, but I want to show a potential approach just so people can see how this might work.

This approach has been tested/used extensively in Lucky with Habitat and works wonderfully. I think we can extend it so people can use JSON/YAML with it too!

Something like this will allow:

  • Automatic documentation since we're using crystal methods and types for each arg
  • All the power of code in configuration (env vars, conditionals, procs, etc.)
  • Really nice errors and tooling to help people debug config super easily (will show examples later :D)
  • All crystal libs to share common conventions for config

The Settings module (or whatever we want to call it)

First I'll start with non-Logger code to show some examples:

class MyStripeProcessor
  Config.setup do
    # These set up class methods and instance methods (that call the class methods)
    setting private_key : String
    setting callback_url : String?, example: "https://myapp.com/strip_webhook".dump # Dump wraps it in quotes
    setting some_object : SomeObject?, example: "SomeObject.new"
  end

  def charge(amount)
    # Settings are usable in instances of an object
    Stripe.charge(amount, private_key, callback_url)
  end

  def self.charge(amount)
    # And as class methods too
    Stripe.charge(amount, private_key, callback_url)
  end
end

And to configure it in code:

MyStripeProcessor.configure do |settings|
  settings.private_key = "Something-secret" # or ENV["SO_SECRET"], etc.
  settings.some_object = SomeObject.new
end

I'll get into later how we can make the types non-nilable when not using an initializer, but suffice it to say there is a pretty good solution.
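One way this can work (a minimal sketch, not Habitat's actual implementation; the class and setting names are reused from the example above purely for illustration): back each setting with a nilable class variable, but expose a getter typed with the declared non-nilable type that raises a helpful error when the setting was never configured.

```crystal
# Sketch: a `setting` exposing a non-nilable type even though the value is
# assigned after the class is defined. The backing class variable is nilable;
# the public getter raises if the setting was never set.
class MyStripeProcessor
  @@private_key : String?

  # Writer accepts the declared (non-nilable) type
  def self.private_key=(value : String)
    @@private_key = value
  end

  # Reader returns the non-nilable type, raising when unset
  def self.private_key : String
    @@private_key || raise "Missing setting 'private_key' for MyStripeProcessor"
  end
end

MyStripeProcessor.private_key = "Something-secret"
puts MyStripeProcessor.private_key # => Something-secret
```

A `setting` macro would just generate this pair of methods from the type declaration.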

Great, but what about YAML/JSON/non-code approaches

I have not done this with Habitat (what my proposal is based on), but I think we could totally make it work in some really cool ways.

require "yaml_config"

# Call this instead of (or in addition to) the code approach
YAMLConfig.load("config.yml") # Or ENV var, etc.

# Can still modify with code for some libs
MyShard.configure do |settings|
 settings.a_proc = -> {} # Procs can't be created from YAML so do it in code
end

# We'll talk about this later
Config.raise_if_missing_settings!

The YAML in "config.yml":

# Inferred from class name. Maybe we can allow specifying a name: `Config.create("stripe")`
my_stripe_processor:
  private_key: "123-abc"
  callback_url: "myapp.com/foo"
lucky_server:
  secret_key_base: "123-abc"
my_backend:
  file_path: /foo/bar/log
# This may require tweaking as it is not tested, but I believe we can do something like this
# Maybe by adding optional processors to YAMLConfig.load
# Or having a `Log::YAMLConfig` that just looks at `bind_logs` and loads it up. These are implementation details that I'm sure we can figure out
bind_logs:
  sources:
    - db.*
  level: WARN
  # etc.

I think this is pretty cool. You get powerful YAML config automatically, but you also get type-safe code configuration that you can use without or in addition to YAML. You could also see a JSONConfig.load, etc.

I think the Config.create macro could have some hooks for loading in YAML/JSON config or whatever. This extensibility means people could do all kinds of cool things, like loading from HashiCorp Vault, or whatever they want. But they can all be separate shards.

Some other nice stuff with this approach

  • Would allow cool tooling, like crystal config that would show all the configurable code, what the settings are, what the types are, etc. or crystal config Lucky to only return stuff for a given constant
  • Allowing YAML + Code means people can use Procs or other objects in code pretty easily in addition to YAML
  • Since we have a nice Setting DSL we can make some nice validation messages for YAML automatically, similar to the runtime error I showed earlier.
  • The code is actually pretty minimal (257 LOC, 60% of which are comments!). See what we've got so far: https://github.com/luckyframework/habitat/blob/master/src/habitat.cr
  • Config.raise_if_missing_settings! If you forget to set a required setting, this error is raised with a helpful example:

(Screenshot: example error output from Config.raise_if_missing_settings!)

  • Really easy/cool way to temporarily override config. Super helpful in tests:
MyStripeProcessor.temp_config(callback_url: "fake_url.com") do
  # callback_url will be fake_url.com in this block and revert afterward.
end
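A `temp_config`-style helper can be implemented by saving the current value, swapping in the override, yielding, and restoring the original in an `ensure` block so it reverts even when the block raises. A minimal sketch (not Habitat's actual code; the class and setting names are illustrative):

```crystal
# Sketch: temporarily override a class-level setting for the duration of a
# block, restoring the previous value afterwards (even on exceptions).
class MyStripeProcessor
  @@callback_url : String = "real_url.com"

  def self.callback_url : String
    @@callback_url
  end

  def self.temp_config(callback_url : String)
    original = @@callback_url
    @@callback_url = callback_url
    yield
  ensure
    @@callback_url = original
  end
end

MyStripeProcessor.temp_config(callback_url: "fake_url.com") do
  puts MyStripeProcessor.callback_url # => fake_url.com
end
puts MyStripeProcessor.callback_url # => real_url.com
```

A macro-generated version would accept any subset of settings as named arguments and restore all of them.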

Concerns

  • ~We'd have to think about how to make the config really flexible to support some of the config in @bcardiff's proposal. I think it is totally doable though~ Updated with example!
  • Might take a bit longer to get YAML config out, but I think the end product will be awesome, and it would be cool to have a nice standard way of configuring libs
  • Some people may want to set config at an instance level. In those cases I'd say they should use a regular initializer or extend the class and configure it. It can inherit the parent config and allow changing just the parts you want

Example with Backend and new setting class

class MyBackend
  Config.create do
    # I've got ideas for how to modify processors per-logger, but this is just a simple example for now
    setting file_path : String
  end
end

In conclusion

I propose:

  • We modify the Log PR to use a code-first approach (no YAML config or Config alias) and focus on getting that out.
  • Open a separate PR discussion for Config in Crystal! I would be happy to start this and even work on the PR as I am very excited about it and already have a lot of code ready to use.

@bcardiff, @RX14, @straight-shoota and other core team: If this sounds good, LMK and I'll get started on the Config PR/issue. I'm confident we can make something incredibly cool and more nice to use/flexible than anything out there for config! :)

.. the initializer accepting Backend::Config is only required when it's supposed to be used with dynamic configuration.

There is nothing preventing you from initializing a Log and Log::Backend manually, but they will be out of the loop of the Log.for factory method (?)

.. So if I write a shard, I would need to implement both otherwise the backend would not work for people with YAML.

My idea of the Log::Backend::Config is to be independent of the format used for configuring it. Maybe someone prefers a JSON or TOML based configuration, but we don't want to make them reinvent the mechanism of instantiating and wiring logs and backends, hence the builder.

Another issue, that hasn't been touched yet: It would be great to be able to modify configuration settings (especially source/severity routing) during the runtime of a program. That doesn't need to be part of the initial implementation, but it should be possible to add later.

waj and I discarded that requirement. We usually restart the app on a config change, even for the logger. But yeah, something like keeping track of the logs in order to reconfigure them could be done without changing the API. The only clash is when, during development, you have manually changed the log level: upon reconfiguration, which one should be the final one?

I think this will be the responsibility of the builder, and signaling a reload might need some new public api to do so. Doable but discarded for now based on personal experience.
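Keeping track of created logs so a later reload can re-apply severity routing could look roughly like this (purely illustrative; `LogRegistry`, `SimpleLog`, and the symbol-based levels are assumptions, not the actual Log PR API):

```crystal
# Sketch: a factory that remembers the Log instances it created so a config
# reload can update their levels in place. All names here are hypothetical.
class SimpleLog
  property level : Symbol

  def initialize(@source : String, @level : Symbol)
  end
end

class LogRegistry
  @logs = {} of String => SimpleLog

  # Factory method: returns the existing log for a source, or creates one
  def for(source : String) : SimpleLog
    @logs[source] ||= SimpleLog.new(source, :info)
  end

  # On reload, walk the existing logs and apply new levels from config
  def reload(levels : Hash(String, Symbol))
    @logs.each do |source, log|
      if level = levels[source]?
        log.level = level
      end
    end
  end
end

registry = LogRegistry.new
db_log = registry.for("db")
registry.reload({"db" => :warn})
puts db_log.level # => warn
```

Signaling the reload (e.g. on SIGHUP or a config-file change) would be the builder's job, as noted above.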

We modify the Log PR to use a code-first approach (no YAML config or Config alias) and focus on getting that out.

Even if macros are used to introduce a config notion, I don't see how you will get rid of Log::Backend::Config. You might be able to hide it, but removing it completely, I am not sure you can.

Having a Log::Backend::Config with nothing else allows you to code validation that can't be expressed by only typing rules.

I like the story of having a configuration mechanism, but it requires some design cycles still.

The current proposal does not require yaml at all, but there is an implementation that can use that format. In the same spirit, having a config macro in the backends could wire that mechanism on the loggers.

(I need to keep thinking on this topic)

There is nothing preventing you from initializing a Log and Log::Backend manually, but they will be out of the loop of the Log.for factory method (?)

Yes you can, I just think it is less than ideal because you have to give it a hash, which is not particularly type-safe, has worse documentation, etc., as mentioned in the example

My idea of the Log::Backend::Config is to be independent of the format used for configuring it. Maybe someone prefers a JSON or TOML based configuration, but we don't want to make them reinvent the mechanism of instantiating and wiring logs and backends, hence the builder.

I love this idea! I just fear that by doing this we make things worse for documentation and for a code-first approach. If you do a code-first approach you can still do more validation outside the type system. The settings stuff can map YAML/JSON to code without a Backend::Config. At least, I'm fairly certain it can. Can you share why you think that is necessary?

My main thought is that it can be done without Backend::Config and would still be able to work with YAML or JSON. Would be happy to chat or pair through it! I'll also take a look at the PR to see if maybe that sparks an idea

I might be missing something here, but what do we need Backend::Config for exactly? It's basically just a facilitator between config format and implementation. Functionally it seems to not be really necessary. Instead we could just pass the YAML parser directly to the backend implementation. This direct dependency might not be ideal, but I would prefer it over a hardly useful interface in between.
Supporting other config formats would be more challenging, because they would need to build custom mappings to the existing backends. But I'm not sure whether that would be such a big issue at all.
And maybe this would be another incentive to develop a cross-format serialization framework like serde =)
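Passing the parsed YAML directly to the backend could be as simple as a `from_yaml`-style constructor on each backend. A sketch (the `FileBackend` class and its keys are hypothetical examples, not the actual Log PR API):

```crystal
require "yaml"

# Sketch: map a YAML config node straight onto a backend implementation,
# skipping any intermediate Backend::Config type.
class FileBackend
  getter file_path : String
  getter retries : Int32

  def initialize(@file_path : String, @retries : Int32 = 3)
  end

  # Build a backend directly from a parsed YAML node
  def self.from_yaml(node : YAML::Any) : FileBackend
    new(
      file_path: node["file_path"].as_s,
      retries: node["retries"]?.try(&.as_i) || 3
    )
  end
end

yaml = YAML.parse <<-CONFIG
  file_path: /var/log/app.log
  retries: 5
  CONFIG

backend = FileBackend.from_yaml(yaml)
puts backend.file_path # => /var/log/app.log
puts backend.retries   # => 5
```

The trade-off is exactly the one described above: each backend couples itself to YAML, and other formats need their own mappings.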

@straight-shoota That makes sense. I think this could be further improved, however, using the example above: a config DSL that would allow code/YAML/JSON or whatever. That way you do not really need to worry about it.

I have a proof of concept here that does not need YAML or the intermediary Backend::Config type. This would greatly simplify the implementation of the Log PR because you could split the Log PR from config almost completely.

Roughly what I think it could start out as:

# First set of Log PR changes. Consider using regular Crystal:
class OriginalLogBackend
  def initialize(@file_path : String)
  end

  # or
  class_property! file_path : String?
end

# my_backend = OriginalLogBackend.new(file_path: "/foo/bar")
# Log.for(DB::Pool).backends << my_backend

# Though this doesn't have YAML log config, it is still much better than what we have today! 
# We get context, nicer 'Log' name, etc.
# We miss the fancy config (for now), but we can build that in as a separate step.

What it could be later on once we've nailed a nice config PR

Note this is UGLY and not even close to production ready but it shows how a simple config DSL could be used to handle YAML/JSON/whatever without a Config hash (just code) and instead using class properties (or could be an initializer or whatever). Also shows you can hook into it to build YamlConfig, JsonConfig, YamlLogConfig (for cool stuff like bind) etc.

This is not to get feedback on how exactly it works. I just wanted something to work as soon as possible to show that it is possible to do config without the Backend::Config type.

require "yaml"

# Proof of concept for how we can use an approach for code/yaml/json
# Similar to Lucky's Habitat, but with coolness for handling YAML config
module Config
  TYPES_WITH_CONFIG = [] of Config

  macro included
    {% TYPES_WITH_CONFIG << @type %}
    CONFIG_KEY_NAME = {{ @type.name.stringify.underscore }}

    def self.load_runtime_config(config)
    end

    # This is bad and inflexible, but was easy to do. Don't judge based on this :P
    def self.format_config_value(config_type : String.class, value) : String
      value.as_s
    end

    def self.format_config_value(config_type : Int32.class, value) : Int32
      value.as_i
    end

    def self.format_config_value(config_type, value)
      raise "Don't know how to handle #{config_type} with value #{value}"
    end
  end

  macro setting(type_declaration)
    class_property {{ type_declaration }}

    def self.load_runtime_config(config)
      previous_def(config)

      # Load known config keys and cast to the correct type automatically
      if value = config[{{ type_declaration.var.stringify }}]?
        @@{{ type_declaration.var }} = format_config_value({{ type_declaration.type }}, value).as({{ type_declaration.type }})
      end 
    end
  end
end

# This could also be a JSON loader, a VaultLoader, or whatever!
module YamlConfigLoader 
  macro load(yaml_string)
    yaml = YAML.parse({{ yaml_string }})
    {% for type_with_config in Config::TYPES_WITH_CONFIG %}
      if values = yaml[{{ type_with_config }}::CONFIG_KEY_NAME]?
        {{ type_with_config }}.load_runtime_config(values)
      end
    {% end %}
  end
end

class LogBackend
  include Config

  setting retries : Int32 = 5
  setting file_path : String = "foo"
end

puts LogBackend.file_path # foo
puts LogBackend.retries # 5

YamlConfigLoader.load <<-YAML
log_backend:
  file_path: "set/with/yaml"
  retries: 10
YAML

puts "---after yaml config is loaded"

puts LogBackend.file_path # set/with/yaml
puts LogBackend.retries # 10

https://play.crystal-lang.org/#/r/8mhp

So I'm thinking we could do a log implementation without the Config type at all at first and instead use regular initializer or class_property. Then tack on config later that works for logger as well as all Crystal code.

Another alternative is to move forward with this YAML approach for now with the Backend::Config type, and then later on clean it up once we have Crystal config code figured out. That could also work, but it seems like duplicated work if we build one version now and then have to redo parts of it later. Thoughts? I think either way could work, but IMO going without the YAML config for the first pass would be easier/simpler. But up to you of course! @bcardiff

I think either way I'll take a crack at some config code that will allow us to do this kind of thing but far more flexibly and with the other goodies mentioned before like nice error messages and such

That whole general config thing is a bit overwhelming, unfortunately. I'm not sure what to make of it, because I see a lot of potential issues with such an approach that tries to solve everything for everyone. It's definitely a nice idea, though. And maybe it's not that different from a general serialization framework... Anyway, it's really out of scope here.
It might be a good idea though to focus on Log first, but without the dynamic configuration part. Divide and conquer.

It might be a good idea though to focus on Log first, but without the dynamic configuration part. Divide and conquer.

Totally agree! Just wanted to show it is possible so that we can remove the dynamic configuration/YAML stuff in this PR (if you all want) and be confident it can be handled separately
