Project author: grailbio

Project description: LLVM toolchain for bazel
Primary language: Starlark
Repository: git://github.com/grailbio/bazel-toolchain.git
Created: 2018-05-12T00:44:44Z
Project page: https://github.com/grailbio/bazel-toolchain

License: Apache License 2.0


LLVM toolchain for Bazel

Quickstart

See notes on the release
for how to get started.

NOTE: For releases prior to 0.10.1, please also see these notes.

Basic Usage

The toolchain can automatically detect your OS and arch type, and use the right
pre-built binary LLVM distribution. See the section on “Bring Your Own LLVM”
below for more options.

See the in-code documentation in rules.bzl for the attributes available on
llvm_toolchain.
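
As a minimal sketch of what a basic declaration might look like, assuming the rule is loaded from @toolchains_llvm//toolchain:rules.bzl (the load path is an assumption here; follow the release notes for the exact setup, including toolchain registration):

```starlark
# Assumed load path; check the release notes for the one matching your version.
load("@toolchains_llvm//toolchain:rules.bzl", "llvm_toolchain")

llvm_toolchain(
    name = "llvm_toolchain",
    # The pre-built distribution for the detected host OS and arch is
    # chosen for this LLVM version.
    llvm_version = "15.0.6",
)
```

The name chosen here is what later labels such as @llvm_toolchain//:all refer to.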

Advanced Usage

Per host architecture LLVM version

LLVM does not publish pre-built distributions for every host architecture in
every version. Patch versions in particular often ship with only a few
pre-built packages, so a single version is often not enough to cover all the
hosts one wants to support.

This can be solved by providing a target-to-version map with a default version.
The example below selects 15.0.6 as the default version for all targets not
specified explicitly (the empty key), which is equivalent to setting
llvm_version = "15.0.6". Two more entries then map their respective targets to
a different version:

    llvm_toolchain(
        name = "llvm_toolchain",
        llvm_versions = {
            "": "15.0.6",
            "darwin-aarch64": "15.0.7",
            "darwin-x86_64": "15.0.7",
        },
    )

Customizations

We currently offer limited customizability through attributes of the
llvm_toolchain_* rules. You can send us a PR to add more configuration
attributes.

The following shows how to add a specific version for a specific target before
the version was added to llvm_distributions.bzl:

    llvm_toolchain(
        name = "llvm_toolchain",
        llvm_version = "19.1.6",
        sha256 = {"linux-x86_64": "d55dcbb309de7ade4e3073ec3ac3fac4d3ff236d54df3c4de04464fe68bec531"},
        strip_prefix = {
            "linux-x86_64": "LLVM-19.1.6-Linux-X64",
        },
        urls = {
            "linux-x86_64": [
                "https://github.com/llvm/llvm-project/releases/download/llvmorg-19.1.6/LLVM-19.1.6-Linux-X64.tar.xz",
            ],
        },
    )

Most of the complexity of this project comes from making it generic enough for
multiple use cases. For one-off experiments with new architectures,
cross-compilation, new compiler features, etc., my advice would be to look at
the toolchain configurations generated by this repo, then copy and edit them to
make your own in any package in your own workspace.

    bazel query --output=build @llvm_toolchain//:all | grep -v -e '^#' -e '^ generator'

Besides defining your toolchain in your package BUILD file, and until this
issue is resolved, you also need a way for bazel to access the tools in the
LLVM distribution through paths relative to your package, without using ..
up-references. To do this, create a symlink that uses up-references to point
to the LLVM distribution directory, and create a wrapper script for clang so
that the actual clang invocation does not go through the symlinked path. See
the files in the @llvm_toolchain//: package as a reference.

    # See generated files for reference.
    ls -lR "$(bazel info output_base)/external/llvm_toolchain"
    # Create symlink to LLVM distribution.
    cd _your_package_directory_
    ln -s ../....../external/llvm_toolchain_llvm llvm
    # Create CC wrapper script.
    mkdir bin
    cp "$(bazel info output_base)/external/llvm_toolchain/bin/cc_wrapper.sh" bin/cc_wrapper.sh
    vim bin/cc_wrapper.sh # Review to ensure relative paths, etc. are good.

See the bazel tutorial for how CC toolchains work in general.

Selecting Toolchains

If toolchains are registered (see Quickstart section above), you do not need to
do anything special for bazel to find the toolchain. You may want to check once
with the --toolchain_resolution_debug flag to see which toolchains were
selected by bazel for your target platform.

For specifying unregistered toolchains on the command line, please use the
--extra_toolchains flag. For example,
--extra_toolchains=@llvm_toolchain//:cc-toolchain-x86_64-linux.

Bring Your Own LLVM

The following mechanisms are available for using an LLVM toolchain:

  1. Host OS information is used to find the right pre-built binary distribution
    from llvm.org, given the llvm_version or llvm_versions attribute. The
    LLVM toolchain archive is downloaded and extracted as a separate repository
    with the suffix _llvm. The detection logic for llvm_version is not
    perfect, so you may have to use llvm_versions for some host OS type and
    versions. We expect the detection logic to grow through community
    contributions. We welcome PRs.
  2. You can use the urls attribute to specify your own URLs for each OS type,
    version and architecture. For example, you can specify a different URL for
    Arch Linux and a different one for Ubuntu. Just as with the option above,
    the archive is downloaded and extracted as a separate repository with the
    suffix _llvm.
  3. You can also specify your own bazel package paths or local absolute paths
    for each host os-arch pair through the toolchain_roots attribute (without
    bzlmod) or the toolchain_root module extension tags (with bzlmod). Note
    that the keys here are different and less granular than the keys in the urls
    attribute. When using a bazel package path, each of the values is typically
    a package in the user’s workspace or configured through local_repository or
    http_archive; the BUILD file of the package should be similar to
    @toolchains_llvm//toolchain:BUILD.llvm_repo. If you are using only
    http_archive, consider using the urls attribute instead, which offers more
    flexibility.
  4. All the above options rely on host OS information, and are not suited for
    docker based sandboxed builds or remote execution builds. Such builds will
    need a single distribution version specified through the distribution
    attribute, or URLs specified through the urls attribute with an empty key, or
    a toolchain root specified through the toolchain_roots attribute with an
    empty key.
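
As a rough illustration of the last option, a toolchain root with an empty key might look like the sketch below. It assumes llvm_toolchain is already loaded, that llvm_version is still expected alongside toolchain_roots, and that the path shown (a hypothetical one) exists wherever the build actions run:

```starlark
llvm_toolchain(
    name = "llvm_toolchain",
    llvm_version = "15.0.6",
    # The empty key applies to every host, so nothing depends on host OS
    # detection. The path below is hypothetical.
    toolchain_roots = {
        "": "/opt/llvm-15.0.6",
    },
)
```

An absolute path like this only works if every machine or container image that runs the build has LLVM installed at the same location.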

Sysroots

A sysroot can be specified through the sysroot attribute (without bzlmod) or
the sysroot module extension tag (with bzlmod). This can be either a path on
the user’s system, or a label for a bazel filegroup-like target. One way to
create a sysroot is to use docker export to get a single archive of the entire
filesystem for the image you want. Another way is to use the build scripts
provided by the Chromium project.

Cross-compilation

The toolchain supports cross-compilation if you bring your own sysroot. When
cross-compiling, we link against the libstdc++ from the sysroot
(single-platform build behavior is to link against libc++ bundled with LLVM).
The following pairs have been tested to work for some hello-world binaries:

  • {linux, x86_64} -> {linux, aarch64}
  • {linux, aarch64} -> {linux, x86_64}
  • {darwin, x86_64} -> {linux, x86_64}
  • {darwin, x86_64} -> {linux, aarch64}

A recommended approach would be to define two toolchains, one without sysroot
for single-platform builds, and one with sysroot for cross-compilation builds.
Then, when cross-compiling, explicitly specify the toolchain with the sysroot
and the target platform. For example, see the MODULE.bazel file for
llvm_toolchain_with_sysroot and the test script for cross-compilation.

    bazel build \
      --platforms=@toolchains_llvm//platforms:linux-x86_64 \
      --extra_toolchains=@llvm_toolchain_with_sysroot//:cc-toolchain-x86_64-linux \
      //...
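
The two-toolchain arrangement recommended above might be declared roughly as follows. This is a sketch only: it assumes llvm_toolchain is already loaded and that the sysroot attribute takes a dictionary keyed by target os-arch, and the sysroot label is a hypothetical placeholder for your own filegroup or path.

```starlark
# Single-platform builds: no sysroot, links against the bundled libc++.
llvm_toolchain(
    name = "llvm_toolchain",
    llvm_version = "15.0.6",
)

# Cross-compilation builds: same LLVM version, plus a sysroot for the target.
llvm_toolchain(
    name = "llvm_toolchain_with_sysroot",
    llvm_version = "15.0.6",
    sysroot = {
        # Hypothetical label; point this at your own sysroot.
        "linux-x86_64": "@my_linux_sysroot//:sysroot",
    },
)
```

Keeping the sysroot toolchain under its own name is what lets the --extra_toolchains flag select it only for cross-compilation builds.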

Multi-platform builds

The toolchain supports multi-platform builds through the combination of the
exec_os and exec_arch attribute pair and either the distribution attribute
or the urls attribute. This lets you invoke a build on one platform
(e.g. macOS) while the build actions run on another (e.g. Linux), which enables
remote build execution (RBE). For example, see the MODULE.bazel file for
llvm_toolchain_linux_exec and the test script for running the build actions on
Linux even if the build is being run from macOS.

    bazel build \
      --platforms=@toolchains_llvm//platforms:linux-x86_64 \
      --extra_execution_platforms=@toolchains_llvm//platforms:linux-x86_64 \
      --extra_toolchains=@llvm_toolchain_linux_exec//:cc-toolchain-x86_64-linux \
      //...

Supporting New Target Platforms

The following is a rough (untested) list of steps:

  1. To help us detect if you are cross-compiling or not, note the arch string as
    given by python3 -c 'import platform; print(platform.machine())'.
  2. Edit SUPPORTED_TARGETS in
    toolchain/internal/common.bzl with the os
    and the arch string from above.
  3. Add target_system_name, etc. in
    toolchain/cc_toolchain_config.bzl.
  4. For cross-compiling, add a platform bazel type for your target platform in
    platforms/BUILD.bazel, and add an appropriate
    sysroot entry to your llvm_toolchain repository definition.
  5. If not cross-compiling, bring your own LLVM (see section above) through the
    toolchain_roots or urls attribute.
  6. Test your build.
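
For the platform step in the list above, a platform definition in platforms/BUILD.bazel could look something like this. The target name and CPU constraint are purely illustrative (riscv64 is used only as an example of an architecture you might be adding); the os-arch naming follows the existing linux-x86_64 platform referenced elsewhere in this document.

```starlark
# Illustrative platform for a new target; adjust the name and constraints
# to the os and arch string you noted in step 1.
platform(
    name = "linux-riscv64",
    constraint_values = [
        "@platforms//os:linux",
        "@platforms//cpu:riscv64",
    ],
)
```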

Sandbox

Sandboxing the toolchain introduces a significant overhead (100ms per action,
as of mid 2018). To overcome this, one can use
--experimental_sandbox_base=/dev/shm. However, not all environments have
enough shared memory available to hold all the files in memory. If this is
a concern, you may set the attribute for using absolute paths, which
replaces the templated paths to the toolchain with absolute paths. When running
bazel actions, these paths are available from inside the sandbox as part of
the / read-only mount. Note that this makes your builds non-hermetic.
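
A sketch of opting in to absolute paths, assuming the attribute is named absolute_paths (the text above does not name it, so confirm against rules.bzl) and that llvm_toolchain is already loaded:

```starlark
llvm_toolchain(
    name = "llvm_toolchain",
    llvm_version = "15.0.6",
    # Assumed attribute name. Trades hermeticity for lower sandbox overhead.
    absolute_paths = True,
)
```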

Compatibility

The toolchain is tested to work with rules_go, rules_rust, and
rules_foreign_cc.

Accessing tools

The LLVM distribution also provides several tools like clang-format. You can
depend on these tools directly in the bin directory of the distribution. When
not using the toolchain_roots attribute, the distribution is available in the
repo with the suffix _llvm appended to the name you used for the
llvm_toolchain rule. For example, @llvm_toolchain_llvm//:bin/clang-format
is a valid and visible target in the quickstart example above.

When using the toolchain_roots attribute, there is currently no single target
that you can reference, and you may have to alias the tools you want with a
select clause in your workspace.
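
Such an alias might be sketched as below, in a BUILD file of your own workspace. The repository names @my_llvm_linux and @my_llvm_macos are hypothetical stand-ins for whatever your toolchain_roots entries point at, and the sketch assumes each of them exposes bin/clang-format as a visible file target.

```starlark
# Picks the clang-format binary matching the OS of the configured platform.
alias(
    name = "clang-format",
    actual = select({
        "@platforms//os:linux": "@my_llvm_linux//:bin/clang-format",
        "@platforms//os:macos": "@my_llvm_macos//:bin/clang-format",
    }),
)
```

Other targets can then depend on this alias without knowing which distribution backs it.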

As a convenience, some targets are aliased appropriately in the configuration
repo (as opposed to the LLVM distribution repo) for you to use, and they work
even when using toolchain_roots. The complete list is in the file
aliases.bzl. If your repo is named llvm_toolchain, they can be referenced
as targets in the @llvm_toolchain//: package.

Strict header deps (Linux only)

The toolchain supports Bazel’s layering_check feature, which relies on
Clang modules to implement strict deps (also known as “depend on what you
use”) for cc_* rules. Enable it by turning on the layering_check feature on a
per-target, per-package or global basis.
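
A short sketch of the per-package and per-target forms, using Bazel's standard features attribute; the target and file names are hypothetical:

```starlark
# Per-package: applies to every target in this BUILD file.
package(features = ["layering_check"])

# Per-target: applies to this library only.
cc_library(
    name = "greeter",  # hypothetical target
    srcs = ["greeter.cc"],
    hdrs = ["greeter.h"],
    features = ["layering_check"],
)
```

For the global form, passing --features=layering_check on the command line or in a .bazelrc file should have the same effect for all targets.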

Prior Art

Other examples of toolchain configuration:

https://bazel.build/tutorials/ccp-toolchain-config

https://github.com/vsco/bazel-toolchains