Bazel: Control linking of dependencies for cc_library

Created on 2 Oct 2015 · 19 Comments · Source: bazelbuild/bazel

cc_library currently does not link its dependencies, but cc_binary does. It would be nice if I could control this behaviour for each dependency, to create cc_libraries that do contain the symbols from their dependencies. For example, by using a flag like linkwithlibrary=true on each dependency, or by giving a list of "must-link" dependencies inside the final cc_library definition.

See also: http://stackoverflow.com/questions/32845940/symbols-from-static-cc-library-dependency-in-so-missing

P4 team-Rules-CPP feature request

All 19 comments

Why not use cc_binary if you want to build a dynamic library?

On Fri, Oct 2, 2015 at 9:33 AM, jcremer wrote:

cc_library currently does not link its dependencies, but cc_binary does.
It would be nice if I could control this behaviour for each dependency, to
create cc_libraries that do contain the symbols from their dependencies.
For example, by using a flag like linkwithlibrary=true on each dependency, or
by giving a list of "must-link" dependencies inside the final cc_library
definition.

See also:
http://stackoverflow.com/questions/32845940/symbols-from-static-cc-library-dependency-in-so-missing

This seems to work for me, but I am not sure that generating a cc_binary is always the wanted behaviour. For example, you may want more control over the linked libraries, or want to generate intermediates that can be used both as a build result and as inputs to further dependent libraries or binaries.
Additionally, cc_binary seems to already bind a specific glibc version.

The relevant documentation is here: http://bazel.io/docs/build-encyclopedia.html#cc_binary.linkstatic
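A minimal sketch of the cc_binary approach suggested above, assuming hypothetical target and file names that do not come from the thread. A cc_binary with linkshared = 1 produces a shared library instead of an executable, and in the default static linking mode its cc_library deps are linked into that library:

```starlark
# BUILD (hypothetical names, for illustration only)

cc_library(
    name = "helper",
    srcs = ["helper.cc"],
    hdrs = ["helper.h"],
    # Keep all of helper's object files in whatever links it, even those
    # that module.cc does not reference directly.
    alwayslink = 1,
)

# The output file takes the target's name, so the target is named like
# the shared library it produces.
cc_binary(
    name = "libmodule.so",
    srcs = ["module.cc"],
    linkshared = 1,
    deps = [":helper"],  # linked into libmodule.so
)
```

Whether alwayslink is wanted depends on the case: without it, the linker may drop object files from helper that nothing in module.cc references.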

Intermediates for what? If you are building an intermediate for Bazel, you should be using cc_library, and everything should work. If you are building an intermediate for use elsewhere, it would help if you further specified what you want.

I have a main application that loads some other .so files as modules. These modules use symbols already available inside the main application, and I don't want to have these symbols duplicated inside every module. But I have some other dependencies inside the module whose symbols I do want inside the module binary. So I need to control which symbols may remain undefined inside my module lib at link time and which may not.

I'm pretty sure this is possible, I think it would work like this:

cc_binary(
    name = "main",
    srcs = ["main.cc"],
    deps = [":lib"],
)

cc_binary(
    name = "module",
    srcs = ["mod.cc"],
    deps = [":lib"],
)

# Wraps the prebuilt shared library so that both targets link against it.
cc_library(
    name = "lib",
    srcs = ["lib.so"],
    hdrs = ["lib.h"],
)

adding @ulfjack in case I'm totally off.

Going through .so files may work, but seems a bit roundabout and increases number of pieces to deploy. That said, we don't have a really good mechanism for this; I think the most obvious way to do it is to have a header-only library that doesn't have implementation .cc files, and compile against that in the modules. (And of course, make sure that the implementation is in the binary that wants to load the modules.)
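The header-only split described above could look roughly like the following. This is a sketch under assumptions: all target and file names are hypothetical, and the -rdynamic link option assumes a GNU-style toolchain on Linux, where it exports the executable's symbols so that dlopen()ed modules can resolve against them.

```starlark
# BUILD (hypothetical names, for illustration only)

# Declarations only. Modules compile against this target and leave the
# corresponding symbols undefined at link time.
cc_library(
    name = "core_hdrs",
    hdrs = ["core.h"],
)

# Implementation, linked only into the main application.
cc_library(
    name = "core",
    srcs = ["core.cc"],
    deps = [":core_hdrs"],
    # Keep symbols that only the modules (not main.cc) reference.
    alwayslink = 1,
)

cc_binary(
    name = "main",
    srcs = ["main.cc"],
    # Assumption: GNU-style linker flags on Linux.
    linkopts = ["-rdynamic", "-ldl"],
    deps = [":core"],
)

# A dependency whose symbols should live inside the module itself.
cc_library(
    name = "module_private",
    srcs = ["module_private.cc"],
    hdrs = ["module_private.h"],
)

cc_binary(
    name = "libmodule.so",
    srcs = ["mod.cc"],
    linkshared = 1,
    deps = [
        ":core_hdrs",       # headers only, symbols come from main
        ":module_private",  # linked into libmodule.so
    ],
)
```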

Is there anything left for addressing this issue?

As ulfjack already said, the current options are suboptimal, but it is possible to get the wanted behaviour by defining some cc_libraries twice: once complete and once header-only.
It would be nice if I were somehow able to specifically exclude or include symbols from each dependency in my artifact. Right now the behaviour seems to be: link everything for cc_binary and leave everything unlinked for cc_library, which often makes sense but not always.

I hit the same problem when I added linker_flag: "-Wl,--no-undefined" to my CROSSTOOL and then tried to build a cc_library alone. I'm quite surprised that cc_library doesn't link its deps. Removing that flag works around this issue (like we did in #67 (efd5d31) before).

IMHO, linking with the dependent libraries (specified in deps) for a cc_library target would be the desired behavior. This can also pull those dependencies into the DT_NEEDED entries of the output .so, so that the output .so can be loaded with dlopen(). Currently, this is achieved by using cc_binary with linkshared (this flag imposes a limitation on the binary name). What do you think about changing cc_library's behavior and deprecating the latter approach? Using cc_binary to build a library (more specifically, to complement the missing behaviors in cc_library) seems like a hack to me.

Use a cc_binary to link dependencies (for now; we may introduce more / better rules to do that).

For larger code bases, linking the transitive closure at every cc_library would be prohibitively expensive (I haven't actually collected data on this, but I'm very confident). cc_library isn't intended as a top-level rule for external consumption - it's not a 'library' in the same sense as, say, openssl is a library. We generally expect cc_library rules to be much more fine-grained, in the limit down to having a single .cc file per cc_library.

I agree with you that linking takes more time and more memory for more dependencies. This becomes a serious problem when dealing with large code bases. That's why incremental linking is included in many people's most-wanted feature list.

Thanks for reiterating the semantics and design of cc_library. IIRC, cc_library is used to produce an intermediate result for a collection of files that share the same build configuration (e.g., compiler flags). However, the intermediate result comes in 2 forms on Linux:

  1. When building a cc_binary that depends on the cc_library: the cc_library is built to a .a which by its nature doesn't make sense to link against the dependencies.
  2. When building a cc_test that depends on the cc_library without setting linkstatic: the cc_library is built to a .so. In this case, I hope that all dependencies could be pulled in to produce an intact dynamic library (i.e., an exhaustive list of DT_NEEDED entries such that there are no undefined symbols).

IMHO, whether cc_library links the dependencies depends on what it is producing.

Besides, given such design of cc_library and the following project,

  app_a/
    BUILD
    main.c
  app_b/
    BUILD
    main.c
  libfoo/
    BUILD
    foo.c

Both app_a and app_b depend on libfoo. Is it possible to build libfoo.so such that it can be shared by both app_a and app_b in Bazel? I.e., libfoo.so is no longer an intermediate result. It should be included in the output image and be linked by the executables built from app_a and app_b.

We tried incremental linking, and it was a small performance win, but not significant enough to warrant the complexity.

We don't actually build .a files on Linux, where we instead pass the original .o files to the final link (or at least, we have code to do that, and I think it's enabled by default). This is faster than building intermediate .a files and uses less disk space.

In the given example, it's not really possible right now to build libfoo.so such that you can install all three independently. One of the problems with dynamic linking is that you have to carefully make sure to never statically link common dependencies into multiple outputs.

For example, consider two binaries A and B, depending on a library LIB, which depends on a library LIB2. Let's say A also directly depends on LIB2. If you link LIB as a .so, what do you do with LIB2? You could link it into LIB.so, but then you have to make sure to not link it into A. Or you could require linking it into its own LIB2.so, in which case you have to make sure that both LIB and A link it dynamically, and the downside is that you then can't use multiple rules for LIB2: you have to force a single rule to contain everything that you want linked into LIB2.so.
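To make the hazard concrete, here is a sketch of the first option (LIB2 linked into LIB.so) using only cc_library and cc_binary. The target and file names are hypothetical; only the shape of the graph follows the A / LIB / LIB2 example above.

```starlark
# BUILD (hypothetical names, for illustration only)

cc_library(
    name = "lib2",
    srcs = ["lib2.cc"],
    hdrs = ["lib2.h"],
)

cc_library(
    name = "lib",
    srcs = ["lib.cc"],
    hdrs = ["lib.h"],
    deps = [":lib2"],
    alwayslink = 1,  # keep lib's objects in liblib.so
)

# LIB as a shared object. The parts of lib2 that lib needs are linked
# statically into it.
cc_binary(
    name = "liblib.so",
    linkshared = 1,
    deps = [":lib"],
)

# Headers-only view of LIB for consumers of liblib.so.
cc_library(
    name = "lib_hdrs",
    hdrs = ["lib.h"],
)

# A links liblib.so and also depends on lib2 directly, so lib2's code can
# end up both inside liblib.so and inside the executable. Nothing in this
# BUILD file prevents that duplication.
cc_binary(
    name = "a",
    srcs = [
        "a.cc",
        ":liblib.so",
    ],
    deps = [
        ":lib_hdrs",
        ":lib2",
    ],
)
```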

A simple, and usually workable policy is to never link dependencies into libraries. At Google, we don't ship pieces of applications independently of each other, we always ship the entire app. This works great for production systems, but doesn't match how Linux distributions work today. Though note that there are efforts to move closer to that, e.g., Ubuntu snappy packages.

For Bazel, we'd like to offer a mode that covered this case better, but note that it's inherently brittle under changes to the structure of the dependency graph. We haven't yet been able to come up with a good model: in order to avoid it being brittle, we need to enforce global properties on the build graph, which turns it into a scalability issue.

We briefly discussed allowing users to determine the subdivision between linked units manually, and Bazel enforcing correctness at the binary level. It's still brittle, but at least the build will fail if you break it, rather than discovering issues at runtime.
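Later Bazel releases ship a cc_shared_library rule that works roughly along these lines: the user decides which cc_library targets go into which shared library, and Bazel reports an error at analysis time if a library would be linked statically into more than one shared library used by the same binary. The sketch below assumes a Bazel version that provides cc_shared_library and the dynamic_deps attribute (newer versions may require loading the rules from rules_cc); it was not available when this comment was written, and all target names are hypothetical.

```starlark
# BUILD (hypothetical names; assumes cc_shared_library is available)

cc_library(
    name = "lib2",
    srcs = ["lib2.cc"],
    hdrs = ["lib2.h"],
)

cc_library(
    name = "lib",
    srcs = ["lib.cc"],
    hdrs = ["lib.h"],
    deps = [":lib2"],
)

# lib2 gets its own shared library.
cc_shared_library(
    name = "lib2_shared",
    deps = [":lib2"],
)

# lib is linked into lib_shared; lib2 is taken from lib2_shared instead
# of being linked in statically a second time.
cc_shared_library(
    name = "lib_shared",
    deps = [":lib"],
    dynamic_deps = [":lib2_shared"],
)

cc_binary(
    name = "a",
    srcs = ["a.cc"],
    deps = [
        ":lib",
        ":lib2",
    ],
    # Libraries exported by these shared libraries are linked dynamically.
    dynamic_deps = [
        ":lib_shared",
        ":lib2_shared",
    ],
)
```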

If you have multiple dependent shared libs, is there a way to maintain those dependencies? It seems like I would need to list all needed shared libs in every rule that uses them; for example, look at the test target below. Is there a workaround for this?

cc_test(
    name="test",
    srcs=glob(["tests/*.cc"]) + [":lib2.so", ":lib1.so"],
    deps=[":lib_hdrs"]
)
cc_binary(
    name="lib1.so",
    srcs=glob(["lib1.cc", "lib1.hh"]),
    visibility=["//visibility:public"],
    linkshared=1
)

cc_binary(
    name="lib2.so",
    srcs=glob(["lib2.cc", "lib2.hh", "lib1.hh"]) + [":lib1.so"],
    visibility=["//visibility:public"],
    linkshared=1
)
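One possible workaround, offered as an untested sketch: wrap each shared library in a cc_library that carries the .so in srcs, its headers in hdrs, and its own dependencies in deps. Consumers then depend on the wrapper only, and the transitive shared libraries follow through deps. The wrapper target names (lib1, lib2) are hypothetical additions to the example above.

```starlark
# BUILD (wrapper targets "lib1" and "lib2" are hypothetical additions)

cc_binary(
    name = "lib1.so",
    srcs = ["lib1.cc", "lib1.hh"],
    linkshared = 1,
)

# Wrapper: exposes lib1.so together with its header.
cc_library(
    name = "lib1",
    srcs = [":lib1.so"],
    hdrs = ["lib1.hh"],
)

cc_binary(
    name = "lib2.so",
    srcs = ["lib2.cc", "lib2.hh"],
    linkshared = 1,
    deps = [":lib1"],  # links against lib1.so and sees lib1.hh
)

# Wrapper: anything depending on lib2 also gets lib1 through deps.
cc_library(
    name = "lib2",
    srcs = [":lib2.so"],
    hdrs = ["lib2.hh"],
    deps = [":lib1"],
)

cc_test(
    name = "test",
    srcs = glob(["tests/*.cc"]),
    deps = [":lib2"],  # no need to list lib1.so explicitly
)
```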

After 2 years we're going to make this happen! Please take a look and comment on the transitive libraries design doc: https://docs.google.com/document/d/1-tv0_79zGyBoDmaP_pYWaBVUwHUteLpAs90_rUl-VY8/edit?usp=sharing

See also https://stackoverflow.com/questions/44723821/why-does-bazel-under-link-and-how-do-i-fix-it. This is just broken; Bazel should be able to produce one .so that can be used by both internal (i.e. within Bazel) and external (i.e. not using Bazel) consumers. (Bonus points for not forcing the Bazel target to have a stupid name.)

Status update, our awesome @iirina started working on transitive libraries. First commit is in! https://github.com/bazelbuild/bazel/commit/fb20d20580456280b0e64162fdbe86dafca75388

My takeaway from this is that for development, Bazel-style internal library linking is awesome because of the granularity it yields; however, for deployment, if you have all your Bazel internal libraries / binaries use linkstatic = 0 / linkshared = 1, and if you install these items, you'll have the weird dependencies like @mwoehlke-kitware mentioned, such as _solib_k8/_U@vtk _S_S_Cvtkglew___Uexternal_Svtk_Slib/libvtkglew-8.0.so.1, which probably would not have been listed in some install rule / manifest (and you probably wouldn't want it to be).

Since patchelf supports --remove-needed and --add-needed, and it's available in Homebrew on Mac, could the following be done for shared library deployment, keeping Bazel's granular targets for development, but linking against deployment libraries at install time?

  • Ensure that all internal libraries which might have diamond dependency problems are linkstatic = 0, to avoid ODR violations via the *.sos.
  • Define common deployment libraries which consume these internal libraries, and link them into one *.so.
    • Whitelisting seems like it'd be the easiest for static dependencies into the deployment library; perhaps "reexport_deps" would be the solution here?
  • In Bazel development land, when you run the binary, it will run everything as normal; it will link against the internal library solibs without a hitch. This still enables granular development (you don't have to build a whole submodule when testing just a small portion; keep it Bazel-style rather than CMake-style).
  • In a custom / built-in deployment step (like @drake//:install), when you install an artifact, you check the upstream solibs. For each internal version, you check the other deployed libraries (much less granular), remove the solibs from the binary (be it an executable or another solib) via patchelf --remove-needed, and add the containing deployed libraries via patchelf --add-needed.

My naive understanding is that this is similar to 5.d. in the Attack Plan, but instead of distributing the top-level library with its upstream components (possibly with or without odd names) in a solib directory, this would use the merged solib with the upstream solibs, and any components downstream of the original solibs would now use the merged deployment solib.

Can I also ask if this understanding is correct? (Specifically, would this be a valid usage of reexport_deps for shared libraries?)

Quick example (leveraging Matt's install setup in Drake as an example deployment mechanism):

# In //common:
cc_library(name = "math_utils", ...)
cc_library(name = "io_utils", ...)
# Deployment library. This would include transitive headers too.
cc_binary(name = "common", output = "libmy_project_common.so", reexport_deps = [":math_utils", ":io_utils"])
# - Deployment mechanism
install(name = "install", targets = [":common"])

# In //apps/example
cc_binary(name = "main", deps = ["//common:math_utils", "//common:io_utils"])
# ^ Running this would link directly against the internal Bazel libraries
# - Deployment mechanism. When "main" is installed, it is now linked directly against "libmy_project_common.so"; no other artifacts are needed for deployment.
install(name = "install", targets = [":main"], deps = ["//common:install"])
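The install-time relinking step itself could be approximated with a genrule that rewrites DT_NEEDED entries, as in the sketch below. Several things here are assumptions: patchelf must be available on the PATH of the build machine (so the rule is not hermetic), the target name main_deploy and the output name are hypothetical, and the two removed library names are placeholders, since the real names of Bazel's internal solibs are mangled and would have to be read from the binary.

```starlark
# BUILD (hypothetical sketch; assumes patchelf is installed on the host)

genrule(
    name = "main_deploy",
    srcs = ["//apps/example:main"],
    outs = ["main_deploy.bin"],
    cmd = " && ".join([
        # Work on a writable copy of the development binary.
        "cp $(location //apps/example:main) $@",
        "chmod u+w $@",
        # Placeholder names for the internal solibs to drop.
        "patchelf --remove-needed libmath_utils.so $@",
        "patchelf --remove-needed libio_utils.so $@",
        # Point the binary at the merged deployment library instead.
        "patchelf --add-needed libmy_project_common.so $@",
    ]),
)
```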

Hello! Two years have passed since the last message and I was wondering if there is any plan to add the cc_static_library rule to Bazel.
Otherwise, does anyone have a custom implementation of the rule that they can share?
Thanks!

Hi @filippobrizzi, it shouldn't be too hard to tweak https://github.com/bazelbuild/rules_cc/blob/master/examples/my_c_archive/my_c_archive.bzl to collect transitive static libraries from CcInfo and create an archiving action for all of them.
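As a starting point for that tweak, here is a sketch of only the collection half: a rule that walks CcInfo and gathers the transitive static libraries of its deps. It deliberately stops short of the archiving action, and it is not the code from my_c_archive.bzl. The rule name transitive_static_libs is hypothetical, and the sketch assumes a Bazel version where CcInfo is available as a global (newer versions may require loading it from rules_cc).

```starlark
# transitive_static_libs.bzl (hypothetical file and rule name)

def _transitive_static_libs_impl(ctx):
    archives = []
    for dep in ctx.attr.deps:
        linking_context = dep[CcInfo].linking_context

        # linker_inputs is a depset covering the dep and its transitive deps.
        for linker_input in linking_context.linker_inputs.to_list():
            for library in linker_input.libraries:
                # Prefer the PIC archive when both variants exist.
                archive = library.pic_static_library or library.static_library
                if archive:
                    archives.append(archive)

    # Expose the archives as the rule's default outputs. A real
    # cc_static_library would instead feed them to an archiving action.
    return [DefaultInfo(files = depset(archives))]

transitive_static_libs = rule(
    implementation = _transitive_static_libs_impl,
    attrs = {
        "deps": attr.label_list(providers = [CcInfo]),
    },
)
```

Building a target of this rule should list the collected .a files. Libraries that Bazel links from object files only, with no archive, would be skipped by this sketch.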
