Bazel: Absolute paths are embedded in produced binaries

Created on 2 Mar 2016 · 27 Comments · Source: bazelbuild/bazel

It would be useful to allow -fdebug-prefix-map to be used to create debug info that references a workspace's source files (e.g. /home/<user>/workspace/foo/bar.cc) rather than symlinked working tree sources (e.g. /private/var/tmp/_bazel_<user>/4dfa01e59a69e8a99b4743b0270c4ad8/workspace/foo/bar.cc). This may require a CROSSTOOL substitution that references the current working tree, such that one could write:

compiler_flag: "-fdebug-prefix-map=%working_tree%=%workspace%"

Alternatively, Bazel could unconditionally add -fdebug-prefix-map to the list of compile options via its own special magic, since it seems a good idea to always reference the original sources in debug info.

Rationale

Even when invoked with relative paths, Clang's debug information will contain absolute paths to sources deep within the working tree, as above, even though those sources are, in fact, just symlinks back into the workspace. This confuses some tools, such as CLion, which attempt, and fail, to set breakpoints in lldb using _workspace_ paths, not _working_ _tree_ paths. lldb only knows about the working tree sources, and doesn't recognize that these paths actually reference the same files, and thus ignores the breakpoint.

Newer versions of clang appear to support GCC's -fdebug-prefix-map, which will presumably solve this problem.

P3 team-Rules-CPP bug

All 27 comments

That's non-hermetic, and prevents any cross-user caching of the results. All the paths embedded into outputs should be relative; if that's not happening, that's a bug.

Certainly for genfile sources, at least, absolute paths into the working tree are clearly visible by running strings or symbols -fullSourcePath on an OS X clang debug binary. It's possible there are multiple issues here.
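For readers without `strings` at hand, the check described above can be approximated in a few lines of Python. This is only a rough sketch: the function name `find_embedded_paths` is made up for illustration, and it scans raw bytes rather than parsing the debug sections, so it will truncate paths that contain spaces.

```python
import re
from typing import List


def find_embedded_paths(data: bytes, prefix: str) -> List[str]:
    """Return the distinct printable runs in `data` that begin with `prefix`.

    Hypothetical helper: this mimics `strings <binary> | grep <prefix>` and
    knows nothing about DWARF or Mach-O structure.
    """
    # Match the prefix followed by any printable, non-space ASCII bytes.
    pattern = re.compile(re.escape(prefix.encode("utf-8")) + rb"[\x21-\x7e]*")
    found = {m.group().decode("utf-8", "replace") for m in pattern.finditer(data)}
    return sorted(found)
```

Reading a binary with `open(path, "rb").read()` and passing a prefix such as `/private/var/tmp/` should list any working-tree paths that leaked into the output.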

Any update here? This is breaking build caching for us.

Hi James,

Do you mean the absolute paths in the binaries? Could you open a new issue for that, ideally with concrete repro instructions, so we can give it the attention it deserves?

Thanks!

I've renamed this bug, as relative paths would fix both the original issue and any extant caching issues.

How does caching work at all without this? I guess all the machines that things are being built on have the same directory structure? This is definitely not the case for most users :).

I have a prototype of this working now, it did require a small change to bazel though:

diff --git a/src/main/java/com/google/devtools/build/lib/rules/cpp/CppModel.java b/src/main/java/com/google/devtools/build/lib/rules/cpp/CppModel.java
index 2304de3..bb8a721 100644
--- a/src/main/java/com/google/devtools/build/lib/rules/cpp/CppModel.java
+++ b/src/main/java/com/google/devtools/build/lib/rules/cpp/CppModel.java
@@ -369,6 +369,7 @@ public final class CppModel {
     CppCompilationContext builderContext = builder.getContext();
     CppModuleMap cppModuleMap = builderContext.getCppModuleMap();
     buildVariables.addVariable("output_file", builder.getOutputFile().getExecPathString());
+    buildVariables.addVariable("execution_root", this.configuration.getOutputDirectory().getPath().getParentDirectory().getParentDirectory().getPathString());

     if (featureConfiguration.isEnabled(CppRuleClasses.MODULE_MAPS) && cppModuleMap != null) {
       // If the feature is enabled and cppModuleMap is null, we are about to fail during analysis

Then, I added this into my CROSSTOOL:

  # TODO(gary): WORKAROUND
  # Because we don't know how to apply a feature by default
  # other than the bazel source, replicate one that gets turned on by default
  # (the random_seed option), and then use the implies feature to ensure our
  # remove_debug_prefix is set.
  feature {
    name: 'random_seed'
    flag_set {
      action: 'preprocess-assemble'
      action: 'assemble'
      action: 'c-compile'
      action: 'c++-compile'
      action: 'c++-module-compile'
      flag_group {
        flag: '-frandom-seed=%{output_file}'
      }
    }
    implies: 'remove_debug_prefix'
  }

  feature {
    name: 'remove_debug_prefix'
    flag_set {
      action: 'preprocess-assemble'
      action: 'assemble'
      action: 'c-compile'
      action: 'c++-compile'
      action: 'c++-module-compile'
      flag_group {
        flag: '-fdebug-prefix-map=%{execution_root}=.'
        flag: '-gno-record-gcc-switches'
      }
    }
  }

I concur with @ulfjack: %{execution_root} is not a good idea, because it is only meaningful on the system where Bazel is running. If any actions are executed remotely, it breaks horribly: it will be different for every user, and the remote paths are probably not the same as the paths on the box where Bazel is running.

Unfortunately, the trivial fix, -fdebug-prefix-map=. doesn't work (I checked). So I don't really know how this could be fixed without adding a wrapper script around the compiler invocation :(

I believe the wrapper script wouldn't be sufficient, as it would still need something similar to the execution root to be able to send into the debug-prefix-map flag. I think we'd need something similar to execution root that would work for remote workers. Is there any such concept?

Why do you need something like the execution root? If we map whatever the workspace root is to, say /workspace using -fdebug-prefix-map, that will produce identical binaries regardless of the location of the workspace root on the worker.
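To illustrate why mapping to a fixed name removes the dependence on the worker's directory layout, here is a simplified model of the prefix substitution. It assumes the mapping is a plain string-prefix replacement, and `remap_debug_path` is a hypothetical name, not a compiler or Bazel API.

```python
def remap_debug_path(path: str, old: str, new: str) -> str:
    """Model -fdebug-prefix-map=old=new as a plain prefix substitution.

    Simplified assumption: paths that do not start with `old` are recorded
    unchanged.
    """
    if path.startswith(old):
        return new + path[len(old):]
    return path


# Two workers with different workspace roots record the same path.
local = remap_debug_path("/home/alice/ws/foo/bar.cc", "/home/alice/ws", "/workspace")
remote = remap_debug_path("/build/worker7/ws/foo/bar.cc", "/build/worker7/ws", "/workspace")
```

Both `local` and `remote` come out as `/workspace/foo/bar.cc`, which is what makes the outputs comparable across machines. The open question in the thread is who supplies the `old` value on each machine.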

Indeed, it might be best to have Bazel add the appropriate flag value automatically, thus avoiding the need to expose the variable. If there is a need to disable this functionality, a field can be added to the CROSSTOOL proto.

@lberki ah, okay. I think I misunderstood. Agreed, if we have bazel remap the workspace root automatically, that would be ideal.

I did some research to find out how we can fix this, and it does not look good. I checked ccache and distcc, and apparently they are rewriting the output files after compilation to remove the absolute paths:
http://jlebar.com/2010/3/21/ccache_3.0.html
https://lists.samba.org/archive/ccache/2011q1/000735.html
https://ccache.samba.org/manual.html

The tricky part about -fdebug-prefix-map is that we don't know the absolute path ahead of time in the case of remote execution. We can't use the local path, because then remote caching doesn't work. We could conceivably make it so remote execution runs in a fixed path, but then we can't cache between local and remote execution, unless we can (somehow) make local execution also use a fixed path, which isn't possible in all cases. Sandboxing locally helps, because we control the file system, but we can't rely on that at this point in time.

A wrapper script could rewrite the command line to pass the correct absolute path to -fdebug-prefix-map. Not nice, but I don't really see any other option. Am I missing something?
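A wrapper along those lines might look roughly like the sketch below. It is written in Python only for illustration (the wrapper discussed here would more likely be a shell script), and the names `add_prefix_map`, `run_compiler` and the `/workspace` target are assumptions, not anything Bazel ships. The point is that the absolute path is discovered at execution time, on whichever machine runs the action.

```python
import os
import subprocess
from typing import List

FIXED_ROOT = "/workspace"  # hypothetical stable name recorded in debug info


def add_prefix_map(args: List[str], cwd: str, fixed_root: str = FIXED_ROOT) -> List[str]:
    """Append -fdebug-prefix-map=<cwd>=<fixed_root> unless a mapping is already given."""
    if any(a.startswith("-fdebug-prefix-map=") for a in args):
        return list(args)
    return list(args) + ["-fdebug-prefix-map=%s=%s" % (cwd, fixed_root)]


def run_compiler(compiler: str, args: List[str]) -> int:
    """Run the real compiler with the flag derived from the current directory."""
    return subprocess.call([compiler] + add_prefix_map(args, os.getcwd()))
```

Because the command line Bazel hashes never contains the absolute path, the action key stays the same for every user; only the wrapper sees the local directory.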

An alternative would be post-processing object files instead in a separate action?

I just told @mhlopko that I'd prefer post-processing because that doesn't assume that there is a bash shell wherever the compiler runs, but then again, post-processing has to know the format of the .o files, which is a bigger deal.

I guess a wrapper script is okay as long as we can come up with a way not to mandate it e.g. for Windows.

Note that this mainly affects MacOS right now, where we already use a wrapper script. Extending that to also handle -fdebug-prefix-map doesn't sound too bad. On Linux, we set PWD=/proc/self/cwd, which makes the outputs deterministic. On Windows, we need to check whether MSVC is even writing absolute paths, and whether it has options to suppress that. Or are you concerned about gcc / clang on Windows?

We could also ask whether upstream Clang could provide an option for this. It seems weird that multiple projects are working around it instead of getting a fix into upstream.

I'm recategorizing this as a bug since it's preventing cache hits with remote caching / execution. We may need to fix this soon-ish-er.

lberki: Action-based post-processing would require changes to Bazel, whereas a wrapper script is hidden behind the crosstool, which I think is a better place for this arguably toolchain-specific logic. Or we could move forward with crosstool-triggered extra actions, which would give us the best of both solutions. That is not as stupid an idea as it may sound, as we could then move lto and fdo to the crosstool, too. And who knows what our polished .so story will look like; maybe we'll want to call chrpath or patchelf (I don't think we will but...). But it's definitely not something that can happen "soon-ish-er" :)

ulfjack: +1 to at least trying to make it into Clang. Wrapper script is the simplest solution. Anyway, I'll gladly make this all happen. Does current milestone and priority correspond to "soon-ish-er"? :)

It's preventing cross-user caching between Mac machines when debugging is enabled. The current implementation is experimental, but we'll need to improve the situation in the coming months, as we get more and more users who want remote caching or execution. We should target 0.7 at the latest.

Understood. Thank you!

If we implement it in CROSSTOOL, everyone who writes a CROSSTOOL file will have to add a wrapper script (I'm not saying it's a bad thing, just noting it). I agree in principle that CROSSTOOL-triggered extra actions would work, but they aren't happening soon enough, are they?

I think as long as we are fine with a naive CROSSTOOL leaking absolute paths, a wrapper script is fine, at least for the time being.

Note that fixing gcc doesn't help people who have to live with older versions of it.

So, after some spelunking and trial and error, I realized that this works, at least with gcc 4.8.4 and Clang 3.8:

PWD=/proc/self/cwd gcc a.cc -o a -g -fdebug-prefix-map=/proc/self/cwd=/kitten

This will report a.cc under /kitten. Clang in addition understands -fdebug-prefix-map=.=/kitten.

That said, this won't help OS X much because it doesn't have procfs.
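For anyone wanting to script the Linux variant, the command above can be assembled as follows. This is a minimal sketch: `proc_cwd_invocation` is a hypothetical helper, and it relies on procfs, so it carries the same OS X limitation.

```python
import os
from typing import Dict, List, Optional, Tuple


def proc_cwd_invocation(
    compiler: str,
    source: str,
    output: str,
    mapped_root: str,
    base_env: Optional[Dict[str, str]] = None,
) -> Tuple[List[str], Dict[str, str]]:
    """Build the command line and environment for the PWD=/proc/self/cwd trick."""
    env = dict(os.environ if base_env is None else base_env)
    # The compiler then records /proc/self/cwd as the compilation directory...
    env["PWD"] = "/proc/self/cwd"
    # ...which the prefix map rewrites to the chosen stable name.
    cmd = [compiler, source, "-o", output, "-g",
           "-fdebug-prefix-map=/proc/self/cwd=" + mapped_root]
    return cmd, env
```

The returned pair can be passed to `subprocess.call(cmd, env=env)`; with `("gcc", "a.cc", "a", "/kitten")` it reproduces the invocation shown above.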

On OS X we don't care about gcc though, right?

That's correct, I think.

I've opened a similar issue at https://github.com/bazelbuild/bazel/issues/5031. We have a patch incoming for OSX binaries built on OSX (we patched wrapped_clang).

That said, this doesn't fix Android builds since this doesn't call wrapped_clang, so I believe the best approach may be to patch CppCompileAction and CppLinkAction.

While /proc/self/cwd does work when Bazel is run on Linux, when run on MacOS it doesn't work. See the new issue.

This is still a problem for Android builds on OSX.
With a simple hello world compiled with the Android NDK:

$ bazel build //:main --crosstool_top=@androidndk//:toolchain-libcpp --cpu=arm64-v8a --host_crosstool_top=@bazel_tools//tools/cpp:toolchain -c dbg
$ dwarfdump bazel-bin/main
----------------------------------------------------------------------
 File: bazel-bin/main (183-1162626592)
----------------------------------------------------------------------
.debug_info contents:

0x00000000: Compile Unit: length = 0x00000047  version = 0x0004  abbr_offset = 0x00000000  addr_size = 0x08  (next CU at 0x0000004b)

0x0000000b: TAG_compile_unit [1] *
             AT_producer( "Android (4691093 based on r316199) clang version 6.0.2 (https://android.googlesource.com/toolchain/clang 183abd29fc496f55536e7d904e0abae47888fc7f) (https://android.googlesource.com/toolchain/llvm 34361f192e41ed6e4e8f9aca80a4ea7e9856f327) (based on LLVM 6.0.2svn)" )
             AT_language( DW_LANG_C99 )
             AT_name( "main.c" )
             AT_stmt_list( 0x00000000 )
    -------> AT_comp_dir( "/private/var/tmp/_bazel_steeve/b78b3b8a63bf1f81b6cb340ce2f2b555/sandbox/darwin-sandbox/1/execroot/__main__" )
            Unknown DW_AT constant: 0x2134( true )
             AT_low_pc( 0x00000000004005d8 )
             AT_high_pc( 0x0000000c )

0x0000002a:     TAG_subprogram [2]
                 AT_low_pc( 0x00000000004005d8 )
                 AT_high_pc( 0x0000000c )
                 AT_frame_base( reg31 )
                 AT_name( "main" )
        -------> AT_decl_file( "/Users/steeve/go/src/github.com/znly/bzl/main.c" )
                 AT_decl_line( 5 )
                 AT_type( {0x00000043} ( int ) )
                 AT_external( true )

0x00000043:     TAG_base_type [3]
                 AT_name( "int" )
                 AT_encoding( DW_ATE_signed )
                 AT_byte_size( 0x04 )

0x0000004a:     NULL

The first one is fixed by adding -fdebug-prefix-map, but somehow the second one isn't.
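Spotting these attributes by eye gets tedious on larger binaries. A small filter over the dwarfdump text output could flag them; the sketch below assumes the output format shown above (with an optional `DW_` prefix, as some dwarfdump versions print), and `absolute_path_attrs` is a hypothetical name.

```python
import re
from typing import List, Tuple

# Matches e.g.  AT_comp_dir( "/abs/path" )  or  DW_AT_decl_file ("/abs/path")
_ATTR = re.compile(r'(?:DW_)?AT_(comp_dir|decl_file|name)\s*\(\s*"(/[^"]*)"\s*\)')


def absolute_path_attrs(dump: str) -> List[Tuple[str, str]]:
    """Return (attribute, path) pairs whose value is an absolute path."""
    return [("AT_" + m.group(1), m.group(2)) for m in _ATTR.finditer(dump)]
```

Feeding it the output above would report both the `AT_comp_dir` and the `AT_decl_file` entries, while relative values such as `AT_name( "main.c" )` are ignored.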

Also, it's preventing debugging from working on Android Studio + OSX.
