rustc-book: Add documentation on how to use PGO.
This commit is contained in:
parent
dbec74ffa7
commit
314194ef17
@ -13,5 +13,6 @@
|
||||
- [Targets](targets/index.md)
|
||||
- [Built-in Targets](targets/built-in.md)
|
||||
- [Custom Targets](targets/custom.md)
|
||||
- [Profile-guided Optimization](profile-guided-optimization.md)
|
||||
- [Linker-plugin based LTO](linker-plugin-lto.md)
|
||||
- [Contributing to `rustc`](contributing.md)
|
||||
|
@ -214,3 +214,20 @@ This option lets you control what happens when the code panics.
|
||||
## incremental
|
||||
|
||||
This flag allows you to enable incremental compilation.
|
||||
|
||||
## profile-generate
|
||||
|
||||
This flag allows for creating instrumented binaries that will collect
|
||||
profiling data for use with profile-guided optimization (PGO). The flag takes
|
||||
an optional argument which is the path to a directory into which the
|
||||
instrumented binary will emit the collected data. See the chapter on
|
||||
[profile-guided optimization](profile-guided-optimization.html) for more
|
||||
information.
|
||||
|
||||
## profile-use
|
||||
|
||||
This flag specifies the profiling data file to be used for profile-guided
|
||||
optimization (PGO). The flag takes a mandatory argument which is the path
|
||||
to a valid `.profdata` file. See the chapter on
|
||||
[profile-guided optimization](profile-guided-optimization.html) for more
|
||||
information.
|
||||
|
136
src/doc/rustc/src/profile-guided-optimization.md
Normal file
136
src/doc/rustc/src/profile-guided-optimization.md
Normal file
@ -0,0 +1,136 @@
|
||||
# Profile Guided Optimization
|
||||
|
||||
`rustc` supports doing profile-guided optimization (PGO).
|
||||
This chapter describes what PGO is, what it is good for, and how it can be used.
|
||||
|
||||
## What Is Profiled-Guided Optimization?
|
||||
|
||||
The basic concept of PGO is to collect data about the typical execution of
|
||||
a program (e.g. which branches it is likely to take) and then use this data
|
||||
to inform optimizations such as inlining, machine-code layout,
|
||||
register allocation, etc.
|
||||
|
||||
There are different ways of collecting data about a program's execution.
|
||||
One is to run the program inside a profiler (such as `perf`) and another
|
||||
is to create an instrumented binary, that is, a binary that has data
|
||||
collection built into it, and run that.
|
||||
The latter usually provides more accurate data and it is also what is
|
||||
supported by `rustc`.
|
||||
|
||||
## Usage
|
||||
|
||||
Generating a PGO-optimized program involves following a workflow with four steps:
|
||||
|
||||
1. Compile the program with instrumentation enabled
|
||||
(e.g. `rustc -Cprofile-generate=/tmp/pgo-data main.rs`)
|
||||
2. Run the instrumented program (e.g. `./main`) which generates a
|
||||
`default_<id>.profraw` file
|
||||
3. Convert the `.profraw` file into a `.profdata` file using
|
||||
LLVM's `llvm-profdata` tool
|
||||
4. Compile the program again, this time making use of the profiling data
|
||||
(for example `rustc -Cprofile-use=merged.profdata main.rs`)
|
||||
|
||||
An instrumented program will create one or more `.profraw` files, one for each
|
||||
instrumented binary. E.g. an instrumented executable that loads two instrumented
|
||||
dynamic libraries at runtime will generate three `.profraw` files. Running an
|
||||
instrumented binary multiple times, on the other hand, will re-use the
|
||||
respective `.profraw` files, updating them in place.
|
||||
|
||||
These `.profraw` files have to be post-processed before they can be fed back
|
||||
into the compiler. This is done by the `llvm-profdata` tool. This tool
|
||||
is most easily installed via
|
||||
|
||||
```bash
|
||||
rustup component add llvm-tools-preview
|
||||
```
|
||||
|
||||
Note that installing the `llvm-tools-preview` component won't add
|
||||
`llvm-profdata` to the `PATH`. Rather, the tool can be found in:
|
||||
|
||||
```bash
|
||||
~/.rustup/toolchains/<toolchain>/lib/rustlib/<target-triple>/bin/
|
||||
```
|
||||
|
||||
Alternatively, an `llvm-profdata` coming with a recent LLVM or Clang
|
||||
version usually works too.
|
||||
|
||||
The `llvm-profdata` tool merges multiple `.profraw` files into a single
|
||||
`.profdata` file that can then be fed back into the compiler via
|
||||
`-Cprofile-use`:
|
||||
|
||||
```bash
|
||||
# STEP 1: Compile the binary with instrumentation
|
||||
rustc -Cprofile-generate=/tmp/pgo-data -O ./main.rs
|
||||
|
||||
# STEP 2: Run the binary a few times, maybe with common sets of args.
|
||||
# Each run will create or update `.profraw` files in /tmp/pgo-data
|
||||
./main mydata1.csv
|
||||
./main mydata2.csv
|
||||
./main mydata3.csv
|
||||
|
||||
# STEP 3: Merge and post-process all the `.profraw` files in /tmp/pgo-data
|
||||
llvm-profdata merge -o ./merged.profdata /tmp/pgo-data
|
||||
|
||||
# STEP 4: Use the merged `.profdata` file during optimization. All `rustc`
|
||||
# flags have to be the same.
|
||||
rustc -Cprofile-use=./merged.profdata -O ./main.rs
|
||||
```
|
||||
|
||||
### A Complete Cargo Workflow
|
||||
|
||||
Using this feature with Cargo works very similar to using it with `rustc`
|
||||
directly. Again, we generate an instrumented binary, run it to produce data,
|
||||
merge the data, and feed it back into the compiler. Some things of note:
|
||||
|
||||
- We use the `RUSTFLAGS` environment variable in order to pass the PGO compiler
|
||||
flags to the compilation of all crates in the program.
|
||||
|
||||
- We pass the `--target` flag to Cargo, which prevents the `RUSTFLAGS`
|
||||
arguments to be passed to Cargo build scripts. We don't want the build
|
||||
scripts to generate a bunch of `.profraw` files.
|
||||
|
||||
- We pass `--release` to Cargo because that's where PGO makes the most sense.
|
||||
In theory, PGO can also be done on debug builds but there is little reason
|
||||
to do so.
|
||||
|
||||
- It is recommended to use *absolute paths* for the argument of
|
||||
`-Cprofile-generate` and `-Cprofile-use`. Cargo can invoke `rustc` with
|
||||
varying working directories, meaning that `rustc` will not be able to find
|
||||
the supplied `.profdata` file. With absolute paths this is not an issue.
|
||||
|
||||
- It is good practice to make sure that there is no left-over profiling data
|
||||
from previous compilation sessions. Just deleting the directory is a simple
|
||||
way of doing so (see `STEP 0` below).
|
||||
|
||||
This is what the entire workflow looks like:
|
||||
|
||||
```bash
|
||||
# STEP 0: Make sure there is no left-over profiling data from previous runs
|
||||
rm -rf /tmp/pgo-data
|
||||
|
||||
# STEP 1: Build the instrumented binaries
|
||||
RUSTFLAGS="-Cprofile-generate=/tmp/pgo-data" \
|
||||
cargo build --release --target=x86_64-unknown-linux-gnu
|
||||
|
||||
# STEP 2: Run the instrumented binaries with some typical data
|
||||
./target/x86_64-unknown-linux-gnu/release/myprogram mydata1.csv
|
||||
./target/x86_64-unknown-linux-gnu/release/myprogram mydata2.csv
|
||||
./target/x86_64-unknown-linux-gnu/release/myprogram mydata3.csv
|
||||
|
||||
# STEP 3: Merge the `.profraw` files into a `.profdata` file
|
||||
llvm-profdata merge -o /tmp/pgo-data/merged.profdata /tmp/pgo-data
|
||||
|
||||
# STEP 4: Use the `.profdata` file for guiding optimizations
|
||||
RUSTFLAGS="-Cprofile-use=/tmp/pgo-data/merged.profdata" \
|
||||
cargo build --release --target=x86_64-unknown-linux-gnu
|
||||
```
|
||||
|
||||
## Further Reading
|
||||
|
||||
`rustc`'s PGO support relies entirely on LLVM's implementation of the feature
|
||||
and is equivalent to what Clang offers via the `-fprofile-generate` /
|
||||
`-fprofile-use` flags. The [Profile Guided Optimization][clang-pgo] section
|
||||
in Clang's documentation is therefore an interesting read for anyone who wants
|
||||
to use PGO with Rust.
|
||||
|
||||
[clang-pgo]: https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization
|
Loading…
Reference in New Issue
Block a user