STL: Consider new conventions for preprocessor comments

Created on 4 Dec 2019  ·  10Comments  ·  Source: microsoft/STL

We have a long-standing convention of commenting all preprocessor directives. This is extremely useful when preprocessor-controlled code is large and/or nested, because the comments clearly identify region boundaries. (This is even more important now that clang-format indents all preprocessor directives at 0.) Here's an example with large regions:

https://github.com/microsoft/STL/blob/1d39dfac9929c0d3c405edf6bc7eff26bbe2a26b/stl/inc/functional#L1228-L1250

And nested regions:

https://github.com/microsoft/STL/blob/1d39dfac9929c0d3c405edf6bc7eff26bbe2a26b/stl/inc/xstring#L295-L306

But when preprocessor-controlled code is small (down to a single line) and non-nested, it starts looking silly, as @SuperWig observed in https://github.com/microsoft/STL/issues/340#issuecomment-559236933 :

https://github.com/microsoft/STL/blob/1d39dfac9929c0d3c405edf6bc7eff26bbe2a26b/stl/inc/functional#L866-L870

In newer code, we sometimes use "arrow comments" to make regions even more visible, which is an improvement for large and/or nested scenarios as usual:

https://github.com/microsoft/STL/blob/1d39dfac9929c0d3c405edf6bc7eff26bbe2a26b/stl/inc/tuple#L259-L284

But similarly, it is generally silly for small non-nested scenarios (at least with simple conditions; the comments can be useful for complicated conditions even if the controlled code is small):

https://github.com/microsoft/STL/blob/1d39dfac9929c0d3c405edf6bc7eff26bbe2a26b/stl/inc/vector#L691-L695

We should consider developing a simple set of rules for contributors to follow. Example suggestion:

  • All #else and #endif directives should have arrow comments,
  • Except when the condition is simple, all of the controlled regions are a single line, and there is no nearby nesting.

That would require a bit of human judgement (what is "simple", what is "nearby"), but not very much.

decision needed documentation

Most helpful comment

This may be tangential to the design decision asked here but wouldn’t something like this be enforceable with a clangtidy check? I see that clangtidy has a redundant preprocessor check so it looks like it has knowledge of preprocessor directives in the ast at some level.

I have only ever consumed clangtidy’s built in checks so I have no idea if this is even feasible to implement but that leads to the idea that the design decided here should be something that can be determined programmatically and remove the person from the loop.

All 10 comments

Except when the condition is simple

This doesn't seem like a useful criterion: complicated conditions result in even noisier comments than do simple conditions.

all of the controlled regions are a single line

I might revise "single line" to "single declaration or non-compound statement", i.e., what we would put on a single line if we had infinitely long lines. This makes the rule more dependent on the structure of the code than the lengths of identifiers and such.

This may be tangential to the design decision asked here but wouldn’t something like this be enforceable with a clangtidy check? I see that clangtidy has a redundant preprocessor check so it looks like it has knowledge of preprocessor directives in the ast at some level.

I have only ever consumed clangtidy’s built in checks so I have no idea if this is even feasible to implement but that leads to the idea that the design decided here should be something that can be determined programmatically and remove the person from the loop.

I agree, programmatic would be ideal. Currently we don’t use clang-tidy (only clang-format) but we could in the future.

I might revise "single line" to "single declaration or non-compound statement", i.e., what we would put on a single line if we had infinitely long lines. This makes the rule more dependent on the structure of the code than the lengths of identifiers and such.

Not that this affects me, but this is one of the cases, where actual line count is more important than code complexity. The question is if someone can quickly see what #if a given #else belongs to and that is not dependent on the the code complexity in between, but simply by distance (and to some degree formatting).

It might be a "simple" ifdef today, but 10 years from now that might not be true. Additionally, I find it's easier to enforce discipline with this sort of thing if the rule is "always add/update the comment" rather than "add/update the comment, except under the following circumstances"

Personally, I like the status quo. I think the arrow comments are a bit noisy, and are more likely to get out of sync with the code, but I'm not strongly opposed to them. But I'm just some guy so...

I would like to add that the current convention regarding #else is quite confusing.

#if foo

#else // ^^^ foo / !foo vvv

#endif // foo

Labeling the #endif with foo rather than !foo is highly confusing especially if the regions get larger

I also dislike “hybrid” arrow comments; I want to see // ^^^ !meow ^^^ at the end.

Strict reviewing is pretty good about keeping comments in sync, if we can just decide on a consistent convention 😸

Fellow in the same circle😸

Just copying a note from #457: when this is audited, we should probably exempt the "exactly 3 lines on off" and "exactly 5 lines on off" cases. There, there's no chance of accidentally nesting incorrectly, and the on/off condition often gets extremely noisy to repeat.

Specifically, I think we should not comment the else and/or endif when:

  • Only 1 line is in each controlled block
  • The 1 line in the controlled block just sets another macro to something else and/or emits an error message
  • There is no chance of confusion due to being nested within surrounding code

Examples where I would argue the comments clutter rather than clarify (and I note I put a lot of these here once upon a time!):

https://github.com/microsoft/STL/blob/b3504262fe51b28ca270aa2e05146984ef758428/stl/inc/yvals.h#L56-L60

https://github.com/microsoft/STL/blob/b3504262fe51b28ca270aa2e05146984ef758428/stl/inc/yvals_core.h#L466-L507

I don't see how it can be easily automated for every case, there ate too many of them.
Consider the below, how to comment #endif in an non-silly way?

#if defined(_M_IX86) || defined(_M_X64)
   while (__intrinsics_with_full_fences()) {
       __mm_pause();
   }
#elif defined(_M_ARM) || defined(_M_ARM64)
   while (__intrinsics_with_precise_fences()) {
        __yield();
   }
#else // ^^^ defined(_M_ARM) || defined(_M_ARM64) ^^^
#error Unsupported hardware
#endif

Or this, I wanted to put TRANSITION word in arrow, but it then wraps...
https://github.com/microsoft/STL/blob/12c684bba78f9b032050526abdebf14f58ca26a3/stl/inc/atomic#L1003-L1020

Was this page helpful?
0 / 5 - 0 ratings