rust/src/ci/pgo.sh


Utilize PGO for rustc linux dist builds This implements support for applying PGO to the rustc compilation step (not standard library or any tooling, including rustdoc). Expanding PGO to more tools is not terribly difficult but will involve more work and greater CI time commitment. For the same reason of avoiding greater time commitment, this currently avoids implementing for platforms outside of x86_64-unknown-linux-gnu, though in practice it should be quite simple to extend over time to more platforms. The initial implementation is intentionally minimal here to avoid too much work investment before we start seeing wins for a subset of Rust users. The choice of workloads to profile here is somewhat arbitrary, but the general rationale was to aim for a small set that largely avoided time regressions on perf.rust-lang.org's full suite of crates. The set chosen is libcore, cargo (and its dependencies), and a few ad-hoc stress tests from perf.rlo. The stress tests are arguably the most controversial, but they benefit those cases (avoiding regressions) and do not really remove wins from other benchmarks. The primary next step after this PR lands is to implement support for PGO in LLVM. It is unclear whether we can afford a full LLVM rebuild in CI, though, so the approach taken there may need to be more staggered. rustc-only PGO seems well affordable on linux at least, giving us up to 20% wall time wins on some crates for 15 minutes of extra CI time (1 hour up from 45 minutes). The PGO data is uploaded to allow others to reuse it if attempting to reproduce the CI build or potentially, in the future, on other platforms where an off-by-one strategy is used for dist builds at minimal performance cost.
2020-12-14 19:50:59 +01:00
#!/bin/bash
set -euxo pipefail
rm -rf /tmp/rustc-pgo
python2.7 ../x.py build --target=$PGO_HOST --host=$PGO_HOST \
    --stage 2 library/std --rust-profile-generate=/tmp/rustc-pgo
./build/$PGO_HOST/stage2/bin/rustc --edition=2018 \
    --crate-type=lib ../library/core/src/lib.rs
# Download and build a single-file stress test benchmark from perf.rust-lang.org.
function pgo_perf_benchmark {
    local PERF=e095f5021bf01cf3800f50b3a9f14a9683eb3e4e
    local github_prefix=https://raw.githubusercontent.com/rust-lang/rustc-perf/$PERF
    local name=$1
    # -f makes curl fail on an HTTP error instead of saving the error page.
    curl -f -o "/tmp/$name.rs" "$github_prefix/collector/benchmarks/$name/src/lib.rs"
    "./build/$PGO_HOST/stage2/bin/rustc" --edition=2018 --crate-type=lib "/tmp/$name.rs"
}
pgo_perf_benchmark externs
pgo_perf_benchmark ctfe-stress-4
cp -pri ../src/tools/cargo /tmp/cargo
# The Cargo repository does not have a Cargo.lock in it, as it relies on the
# lockfile already present in the rust-lang/rust monorepo. This decision breaks
# down when Cargo is built outside the monorepo though (like in this case),
# resulting in a build without any dependency locking.
#
# To ensure Cargo is built with locked dependencies even during PGO profiling
# the following command copies the monorepo's lockfile into the Cargo temporary
# directory. Cargo will *not* keep that lockfile intact, as it will remove all
# the dependencies Cargo itself doesn't rely on. Still, it will prevent
# building Cargo with arbitrary dependency versions.
#
# See #81378 for the bug that prompted adding this.
cp -p ../Cargo.lock /tmp/cargo
# Build cargo (with some flags)
function pgo_cargo {
    RUSTC=./build/$PGO_HOST/stage2/bin/rustc \
        ./build/$PGO_HOST/stage0/bin/cargo "$@" \
        --manifest-path /tmp/cargo/Cargo.toml
}
# Build a couple of different variants of Cargo
CARGO_INCREMENTAL=1 pgo_cargo check
echo 'pub fn barbarbar() {}' >> /tmp/cargo/src/cargo/lib.rs
CARGO_INCREMENTAL=1 pgo_cargo check
touch /tmp/cargo/src/cargo/lib.rs
CARGO_INCREMENTAL=1 pgo_cargo check
pgo_cargo build --release
# Merge the profile data we gathered
./build/$PGO_HOST/llvm/bin/llvm-profdata \
    merge -o /tmp/rustc-pgo.profdata /tmp/rustc-pgo
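# Optional sanity check, added for illustration and not part of the original
# CI script: a minimal sketch, assuming an empty or missing merged profile
# should abort the build before the final (slow) dist step starts.
# `pgo_require_profile` is a hypothetical helper name.
function pgo_require_profile {
    local profile=$1
    # `-s` is true only if the file exists and has a size greater than zero.
    if [ ! -s "$profile" ]; then
        echo "error: merged PGO profile $profile is missing or empty" >&2
        return 1
    fi
}
pgo_require_profile /tmp/rustc-pgo.profdata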
# This produces the actual final set of artifacts.
"$@" --rust-profile-use=/tmp/rustc-pgo.profdata