Citra: Rework logging in citra

Created on 31 Mar 2017 · 8Comments · Source: citra-emu/citra

Rationale

Citra has a large number of warnings caused by improper casts when printf formatting.
Logging only writes to stderr, which makes copying logs on windows more painful than it should be (because the windows console is still inadequate) and requires a work around in macOS to even display the logs. (For the mac workaround, see https://github.com/citra-emu/citra/blob/master/.travis-upload.sh#L104)
Logging to disk isn't a hard change to make, but it is still work to roll your own sinks, meaning any future sinks you want to add will take even more work to fit it into the existing framework (not a big issue, but still one as it could add problems for cross platform sinks).

With these issues in mind, I wrote a proof of concept rework of logging to use the excellent spdlog library. spdlog uses fmtlib for log formatting which, in addition to being almost as fast as printf formatting, has a similar interface to python formatting. This allows developers to use {} as a flexible, typesafe format specifier, with a host of other options for fine tuned control of the output. See http://fmtlib.net/latest/syntax.html for more on what can be done with fmtlib. spdlog also has many other benefits such as several easy-to-configure logging sinks, support for async logging, and safe multithread logging. The proof of concept can be found here. Its not ready for PR yet, but it gives a little insight into what changes will be in phase 1 https://github.com/citra-emu/citra/compare/master...jroweboy:great-logging-rewrite-of-2017

Design

Number one goal is to keep the logging framework as close as possible to the function of the current logging system. This way the contributors don't need to learn anything new to add more logs; they will still be able to use the familiar LOG_LEVEL("") macros. Additionally the log filters should continue to work as they currently do. Luckily enough, spdlog supports all of the same logging levels that citra uses, and filtering is also supported, so migrating is not too complicated

After talking with @wwylele about the changes, he suggested to make this issue for some feedback, and also to split the rework into 3 separate phases to help it get merged faster.

Add a spdlog backend that co-exists with the current logging system, and a new macro SPD_LEVEL that uses the new logger.
After phase 1 is merged, start porting each of the existing logging statements to use the new SPD_LEVEL macros and fmtlib formatting. This can be done on a module by module basis so its easier to review and faster to merge than a full application find/replace
After all the logs are now using the SPD_LEVEL macro, we scrap the old (now unused) logging framework and find/rename the SPD_LEVEL macro to the original LOG_LEVEL name. This way any tasks can be merged in the interim without causing issues, but after phase 3, development returns to normal and everyone can just continue using the LOG_LEVEL macro.

Another benefit of the three phase style, is it prevents the task from being the sole responsibility of a single developer. If it was a single PR, it would take a long time to merge, and I would probably grow weary of rebasing it every time someone added a new logging statement. With phase two being split into modules, other developers can take responsibility for a module if they choose (and as we all know, smaller changes are much easier to merge.)

Implementation & Feedback

Actual implementation details with regards to integrating spdlog can be discussed further when I open the pull request for phase one. In this issue, I'm really looking for a seal of approval on the whole idea of changing to spdlog and asking for any feedback on the three phase idea.

proposal

Source

jroweboy

👍11

Most helpful comment

@rz5 compile time disabling logging entirely is not a goal. Citra shouldn't try to squeeze performance out of something like logging, when the benefits of proper logs out weighs allowing users to disable them.

Additionally, as part of the rewrite, the actual logging will be done on a separate thread, so it shouldn't affect performance.

jroweboy on 16 Jun 2017

👍2

All 8 comments

Could you provide some example of logging code and log output? Specifically I want to know the difference between spdlog output and current one (if any).

wwylele on 31 Mar 2017

👍2

@wwylele Currently there are a few hurdles that need to be over come. With spdlog you can define a custom format string that all messages will have, but its kinda limited and doesn't contain things like filename, line number, and so on. The full list of spdlog formatting strings that are available can be found here. https://github.com/gabime/spdlog/wiki/3.-Custom-formatting

Below is a list of what citra logs has, and what analogs spdlog has available to use in its common string formatting. If any of these work arounds are deemed unacceptable, then we can move to adding them to the list of things that need concatted at runtime.

Time since citra has started in [Seconds.Nanosecond] -- This is not possible since there is no "diff time" format specifier. I think it would be pretty easy to add to upstream spdlog. In the meantime, I'm logging time in HH:MM:SS.nn
Log class -- Exists in spdlog as I create an individual logger for each log class and name it the friendly string for the log class.
Log Level -- Exists in spdlog, but just lower cased. I doubt that'll be an issue
ShortFilePath:FunctionName:LineNumber -- Not available in spdlog as they come from macros that must be added at the LOG_* macro call site. All of these are currently concatted at runtime in citra and are added to the logs by way of the LOG_* macro. There is no specific way to add them to a spdlog log entry directly, so just prepending them to the format string should be enough for our usecase.

As a test, I tried to see if I could use constexpr functions to concat them at compile time (and see how much bloat that'd add to the binary haha). Generating the short filename is not too hard with constexpr functions, but it required cpp14 constexpr changes that were only in MSVC2017 (so we would need to update to get that). Another struggle was trying to write a function to concat them and the complexity was just getting to be too much to be worth the minor perfomance benefits it could have given. Because of this, these are concatted at runtime. I personally think this acceptable as citra currently concats them at runtime, so it can't hurt performance :)

In conclusion, I believe its entirely possible to keep the exact same format string as before with two exceptions. The Log level is now lower cased (eg: "Warning" => "warning") and the time format specifier is in system time. And if other contributors think its a good idea, I can try to write a patch to spdlog to add "Seconds since start" as a formatting string.

jroweboy on 3 Apr 2017

👍1

Assume TAG is just a wildcard for a correct identifier for a logger

The main difference with formatting is that instead of

LOG_WARN(TAG, "Some stuff: %d, %f, %s", 1, 2.5, "Wow");

it would be (assuming no scaffolding code is introduced):

spd::get(TAG)->warn("Some stuff: {}, {}, {}", 1, 2.5, "Wow");

or, if positional formatting is wanted:

// Contrived example: results in "hey, bud" being logged.
spd::get(TAG)->warn("{1}{0}", "bud", "hey, ");

or if you want to support a custom type, just overload operator<< for an ostream.

struct thing
{
  int i;
  std::string str;
};

std::ostream& operator<<(std::ostream& o, const thing& t)
{
  o << "thing{i ==" << t.i << " | str == " << t.str << "}";
  return o;
}

// Somewhere else, where logging is important.

spd::get(TAG)->warn("{}", instance_of_thing)

// Prints out:
// "thing{i == <whatever value i is> | str == <whatever str contains>}"

lioncash on 3 Apr 2017

👍2

Huge +1 on switching to something fmt-based. I'm not familiar with spdlog but it seems reasonable too.

A few comments on the implementation and migration strategy:

iirc, fmt has separate functions for printf-style and {}-style formatting. I'm guessing the SPD_ versions will use the new format exclusively then, and the format strings will be converted when each individual module is migrated?
It seems like making the timestamp relative to startup is something that could be implemented in the formatter (using a custom formatter instead of the default format-string-based one). This feature is incredibly valuable for me because it makes comparing/diffing logs between two Citra runs much easier than if they use absolute timestamps.
Trimming the path at runtime seems fine for now. It would be great if that could be done only if the log level is active. This can probably also be done inside the formatter and then moved outside of it if we can do it using constexpr later.
It looks like spd::get locks a mutex and does a hash table lookup, so we probably don't want to actually just stick that into the logging macros. I think a good idea would be pre-creating an array with all the loggers and then use the constants to index into that, which should be faster.
Related to the previous point, it would be awesome if we could actually get rid of the global logging categories altogether. Rather, expose a function that creates a properly configured logger, and then each module can store its own logger in a module-level variable instead. We can do this instead of the macro-based migration and end up with a cleaner result.

yuriks on 17 Apr 2017

👍1

Sorry, but I have some criticism on @yuriks' last bullet point. Although the concept is cool, it becomes not that good when actually implementing it.

What is "module-level" exactly? it is not file scope because a module can have multiple files, it is not class scope because a module can have multiple ones or have no one. Defines the logger as a file scope static variable or global variable? what about a header only module like core/frontend/input.h? Now it turns out that each module needs to find its own way to manage a logger, and always stays careful to not break it when refactoring, which adds unnecessary complexity to the code, while logging should meant to be easy and quick to use.

I also thought about "declare logger everywhere but not define it" or "define it everywhere using it", like the c++17 inline variable. But it came to a dead end when considering that, one may want to log in a header file, and log using the same logger in a cpp file that (deeply) includes that header, and it would end up with a multiple definition in one translation unit if one doesn't take careful of the inclusion tree.

Based on the two points above, such distributed logger registration system will actually _reduce_ the flexibility. One need to think too many thing to implement/use logging system, which is not the nature of logging.

And what do you actually gain by doing all of this? Most modules will still get the logger at initialization time (= same effect as a global logging categories initializing it self at the beginning), and when a dev want to make a new module, they still need to add some code to register the logger (= same effort to add a new category to the global logging system)

To be fair, a logger registration system is still cool and we can make it for special usage. But I would prefer keeping a centralized global logging categories list (which can be a list of preregistered loggers now) for general use, to reduce complexity and be easy to maintain.

wwylele on 17 May 2017

@wwylele I agree with some of your concerns. Namely, C++ header files throw a wrench into things. The idea I had was a single static file-level variable for the logger. Multiple files could pass in the same identifier and thus get the same logger instance at runtime, solving the "multiple files in a module" concern. In the absence of header files that would work, but with inlined and templated code I don't know how one would do that. Maybe a namespace-level global could work, but then you get the situation of having to maintain the declaration and definition across 2 different files.

One of the big motivations for this was to get around the need to retrieve the logger from the global spdlog registery on every log statement. Instead you'd have your global

static spdlog::logger logger = GetLoggerForModule("Service.APT");

And logging sites would just do logger->log(...); or LOG_DEBUG(logger, "..."); Logging across modules is something we do, but not very often, and it's honestly a sign that you should probably move that code to the module where it belongs. Mostly, I feel that misused log modules (where you accidentally pick the wrong module) are more common than intentional usage of that.

Another smaller motivation was to reduce the global dependency on the logging module, which on any modification requires a full recompile of the codebase, and also does not scale well with codebase size. (You end up with an interminable list of logging classes, which are away from the places where they're used, and which probably no-one is going to notice if they go unused, etc.). I also disagree that it's as easy to define a new log class as it'd be to just copy the one line from another file. While they're both small changes one is somewhere far removed from the code you're actually editing.

In the end, the idea seems to have gotten a lot of push-back, so I'm not really going to push on it, especially without a good solution to the header issue. I'll leave it up to everyone else to decide. (I haven't looked at @jroweboy's actual implementation yet either.)

yuriks on 20 May 2017

❤1

Assuming that logging in a 'Release' build impacts performance significantly, if logging is planned to be reworked, could you throw in a compile-time switch to disable it entirely?

Like a CMake switch, -DDISABLE_LOG=0. When set, it would make all the macros defined in https://github.com/citra-emu/citra/blob/master/src/common/logging/log.h do nothing, for example.