Utilize PGO for rustc linux dist builds
This implements support for applying PGO to the rustc compilation step (not
standard library or any tooling, including rustdoc). Expanding PGO to more tools
is not terribly difficult but will involve more work and greater CI time
commitment.
For the same reason of avoiding greater time commitment, this currently avoids
implementing for platforms outside of x86_64-unknown-linux-gnu, though in
practice it should be quite simple to extend over time to more platforms. The
initial implementation is intentionally minimal here to avoid too much work
investment before we start seeing wins for a subset of Rust users.
The choice of workloads to profile here is somewhat arbitrary, but the general
rationale was to aim for a small set that largely avoided time regressions on
perf.rust-lang.org's full suite of crates. The set chosen is libcore, cargo (and
its dependencies), and a few ad-hoc stress tests from perf.rlo. The stress tests
are arguably the most controversial, but they benefit those cases (avoiding
regressions) and do not really remove wins from other benchmarks.
The primary next step after this PR lands is to implement support for PGO in
LLVM. It is unclear whether we can afford a full LLVM rebuild in CI, though, so
the approach taken there may need to be more staggered. rustc-only PGO seems
well affordable on linux at least, giving us up to 20% wall time wins on some
crates for 15 minutes of extra CI time (1 hour up from 45 minutes).
The PGO data is uploaded to allow others to reuse it if attempting to reproduce
the CI build or potentially, in the future, on other platforms where an
off-by-one strategy is used for dist builds at minimal performance cost.
2020-12-14 19:50:59 +01:00
#!/bin/bash
set -euxo pipefail
rm -rf /tmp/rustc-pgo
python2.7 ../x.py build --target=$PGO_HOST --host=$PGO_HOST \
    --stage 2 library/std --rust-profile-generate=/tmp/rustc-pgo
./build/$PGO_HOST/stage2/bin/rustc --edition=2018 \
    --crate-type=lib ../library/core/src/lib.rs
# Download and build a single-file stress test benchmark on perf.rust-lang.org.
function pgo_perf_benchmark {
    local PERF=e095f5021bf01cf3800f50b3a9f14a9683eb3e4e
    local github_prefix=https://raw.githubusercontent.com/rust-lang/rustc-perf/$PERF
    local name=$1
    curl -o /tmp/$name.rs $github_prefix/collector/benchmarks/$name/src/lib.rs
    ./build/$PGO_HOST/stage2/bin/rustc --edition=2018 --crate-type=lib /tmp/$name.rs
}
pgo_perf_benchmark externs
pgo_perf_benchmark ctfe-stress-4
cp -pri ../src/tools/cargo /tmp/cargo
# The Cargo repository does not have a Cargo.lock in it, as it relies on the
# lockfile already present in the rust-lang/rust monorepo. This decision breaks
# down when Cargo is built outside the monorepo though (like in this case),
# resulting in a build without any dependency locking.
#
# To ensure Cargo is built with locked dependencies even during PGO profiling
# the following command copies the monorepo's lockfile into the Cargo temporary
# directory. Cargo will *not* keep that lockfile intact, as it will remove all
# the dependencies Cargo itself doesn't rely on. Still, it will prevent
# building Cargo with arbitrary dependency versions.
#
# See #81378 for the bug that prompted adding this.
cp -p ../Cargo.lock /tmp/cargo
# Build cargo (with some flags)
function pgo_cargo {
    RUSTC=./build/$PGO_HOST/stage2/bin/rustc \
        ./build/$PGO_HOST/stage0/bin/cargo "$@" \
        --manifest-path /tmp/cargo/Cargo.toml
}
# Build a couple of different variants of Cargo: incremental checks (initial,
# after appending a function, after a touch), then a release build.
CARGO_INCREMENTAL=1 pgo_cargo check
echo 'pub fn barbarbar() {}' >> /tmp/cargo/src/cargo/lib.rs
CARGO_INCREMENTAL=1 pgo_cargo check
touch /tmp/cargo/src/cargo/lib.rs
CARGO_INCREMENTAL=1 pgo_cargo check
pgo_cargo build --release
# Merge the profile data we gathered
./build/$PGO_HOST/llvm/bin/llvm-profdata \
    merge -o /tmp/rustc-pgo.profdata /tmp/rustc-pgo
# This produces the actual final set of artifacts.
"$@" --rust-profile-use=/tmp/rustc-pgo.profdata
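The final line runs whatever command was passed to this script as arguments, with the merged profile appended as one extra flag. A minimal dry-run sketch of that wrapping pattern follows. `with_profile_use` is a hypothetical helper, not part of the script: it only prints the command it would run. The `x.py dist` arguments are an assumed example, not the actual CI invocation.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical helper (not part of the script above): prints the command that
# the final line would run, one argument per line, instead of executing it.
with_profile_use() {
    local profdata=$1
    shift
    # Quoting "$@" keeps each caller-supplied argument intact, even if it
    # contains spaces.
    printf '%s\n' "$@" "--rust-profile-use=$profdata"
}

# Assumed example invocation; the real dist command is supplied by the caller.
with_profile_use /tmp/rustc-pgo.profdata \
    python2.7 ../x.py dist --host=x86_64-unknown-linux-gnu
```

Printing one argument per line makes it easy to see that the profile flag always lands after the caller's own arguments and that none of them are split.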