I haven't been able to pin down exactly what is causing these crashes. But I suspect buggy debugging output statements. (Ironic, right?) I think it's a good idea to make each output statement as safe as reasonably possible. This will probably help reduce the frequency of crashes.
To achieve this, I propose using C++-style stream I/O, rather than C-style printf type statements. This will automatically improve type safety, memory safety, etc.
Calls to VSFileSystem::vs_fprintf() can be refactored relatively easily. Calls to VSFileSystem::vs_dprintf() will be a little trickier, because of the different debug verbosity levels, but I have an idea for this too.
I am in the process of replacing vs_dprintf() with a function called vs_dbg():
static boost::iostreams::stream< boost::iostreams::null_sink > nullOstream( ( boost::iostreams::null_sink() ) );
std::ostream& vs_dbg( char level )
{
if (use_volumes || (level > g_game.vsdebug)) {
return nullOstream;
} else {
return std::cerr;
}
}
You can write to vs_dbg using the << operator, just like cout or cerr. If the current user settings mean that a given output line should be suppressed, vs_dbg() will return a boost null_sink stream, which just throws away any data sent to it. (See https://stackoverflow.com/a/8244321/5067822.)
Example usage:
VSFileSystem::vs_dbg(1) << "Not deleting mission twice: " << this->mission_name << std::endl;
instead of
VSFileSystem::vs_dprintf( 1, "Not deleting mission twice: %s\n", this->mission_name.c_str() );
Already, my changes seem to be improving Vega Strike's stability somewhat.
Let me know what you guys think.
In the process of fixing this issue, I am taking care of #3 as well. A missing std::endl stands out more than a missing "\n".
Hmm. I think I'm getting ahead of myself. Using simple C++-style output is actually less localizable. Perhaps we should use something like the Boost Format Library instead?
Open for discussion.
Further complication some screen output is handled by Python scripts there is some discussion of this on the forum ages ago ;)
In my experience, printf-style statements are best for both localization and logging, for other reasons. For instance, in multithreaded contexts, logging from multiple threads tend to get all mixed up together if you use std::cout, but using printf it's less likely (except for very long lines, printf is atomic-ish).
I think python has the best implementation here, with its logging module and keyword-based formatting options (that really help localization).
IIRC, vs_dprintf is a no-op when the verbosity level silences the output. That means you can recover some stability by decreasing log verbosity level, as long as the instability arises from bad format strings.
For that case of format strings, yes, boost format would help. I wonder whether we could perhaps refactor vs_dprintf to use boost format.
But if the instability comes from vs_dprintf argument expressions, something that's quite common, there's only a solution, which is to fix all the call sites and make them safe. Either by fixing the issues, or by wrapping them in a conditional so not even the arguments get evaluated when verbosity is set low enough:
if (VSFileSystem::isDebugLogging(1)) VSFileSystem::vs_dprintf(...);
A macro can make that easier to do at scale.
Thanks for the feedback, guys.
Perhaps the reason that std::cout output gets mixed up between threads is that each << is treated as a separate output operation?
From what I've been reading, boost::format() doesn't work that way. It uses an internal stream, and doesn't produce any output until it has the exact, full number of arguments that it should have, based on the format string provided.
Also, boost::format() will be easier to convert to, and yes, it will be easier to localize later, if we decide to localize the debug output strings.
So I propose converting all call sites to use the vs_dbg() function I mentioned earlier, together with boost::format(). It will be a lot of work, but it's the only way to make sure everything is both correct and consistent. Furthermore, I have already started the process, and I'm learning a lot about the code as I go.
How does this sound?
If you guys want to do it differently, I don't mind redoing my changes. (Well, to a certain point, anyway. LOL)
Oh, I forgot to give example code. Here's an example:
VSFileSystem::vs_dprintf(1, "Ranking item %s/%s at %s/%s\n", item.category.get().c_str(),
item.content.get().c_str(), systemName.c_str(), baseName.c_str());
becomes
VSFileSystem::vs_dbg(1) << boost::format("Ranking item %1%/%2% at %3%/%4%\n") %
item.category.get() % item.content.get() % systemName % baseName
<< std::endl;
VSFileSystem::vs_dbg(1) << boost::format("Ranking item %1%/%2% at %3%/%4%\n") % item.category.get() % item.content.get() % systemName % baseName << std::endl;
The only problem with that, is that it won't skip argument evaluation when debug output is disabled.
We want to do that so that you may get the stability improvements at runtime, without recompiling.
So it would be more like:
#define VS_DBG(level, msg) do { \
if (VSFileSystem::vs_dbg_enabled(level)) { \
VSFileSystem::vs_dgb(level) << msg << std::endl; \
} \
} while(0)
...
VS_DBG(1, boost::format("Ranking item %1%/%2% at %3%/%4%\n") %
item.category.get() % item.content.get() % systemName % baseName);
Might even find a way to hide boost::format inside the macro to streamline calls even further.
@klaussfreire Fair enough. I'll try the macro approach.
@klaussfreire I've started down the macro path. But it seems to me, thinking about it some more, that the boost::format part (with its arguments) will still be evaluated, unless they are compiled out at compile time.
What is more, I started having to #include "vs_globals.h" in some files, such as csv.cpp, that didn't include it before. This was in order to check g_game.vsdebug within the macro.
I believe the improved type safety inherent in boost::format() will already help substantially with stability. Where there are other ugly things happening, such as objects being referenced from another thread after they have already been destroyed, we'll need to tackle those using other approaches. E.g., boost::shared_ptr.
In short, I think the macro approach doesn't really give us anything. At least, not as it stands currently.
I just discovered something that might be just the ticket: Boost.Log. It comes with Boost, version 1.54 and up, which we already have access to. The documentation for Boost.Log specifically says that it's safe to log simultaneously from multiple threads; the output won't get corrupted.
We could have Boost.Log log to a text file, as well as to std::cerr. We could even log to the Windows Event Logs on Windows, or to syslog on Unix-like operating systems.
Sample code:
#include <boost/log/trivial.hpp>
int main(int, char*[])
{
BOOST_LOG_TRIVIAL(trace) << "A trace severity message";
BOOST_LOG_TRIVIAL(debug) << "A debug severity message";
BOOST_LOG_TRIVIAL(info) << "An informational severity message";
BOOST_LOG_TRIVIAL(warning) << "A warning severity message";
BOOST_LOG_TRIVIAL(error) << "An error severity message";
BOOST_LOG_TRIVIAL(fatal) << "A fatal severity message";
return 0;
}
Looks like you will not be reinventing the wheel ;)
UPDATE: I was mistaken. The version of Boost that the code used before was 1.53. Boost.Log was not introduced until one version later, 1.54.
Any objections to migrating to Boost 1.54 or later? I have successfully built and run Vega Strike on Ubuntu 18.04 LTS with both Boost 1.65 (the system version) and Boost 1.71 (the latest version).
Also note that I have confirmed that, after switching to Boost.Log for debugging output, Vega Strike uses significantly less CPU at lower verbosity levels. :grinning:
Honestly, sounds good - if you can successfully update a dependency then why not? Can you put up a specific PR for doing so?
Namely would be nice to see the delta just for updating it if possible. If not possible - too much work; then ok.
Most helpful comment
Looks like you will not be reinventing the wheel ;)