From f0322371dda75d66d006a5c5ec4ffebb8af030c5 Mon Sep 17 00:00:00 2001
From: Julia Evans
Date: Wed, 18 Dec 2013 12:38:14 -0500
Subject: [PATCH] Add testing tutorial to docs

---
 doc/tutorial-testing.md | 263 ++++++++++++++++++++++++++++++++++++++++
 doc/tutorial.md         |   2 +
 mk/docs.mk              |   7 ++
 3 files changed, 272 insertions(+)
 create mode 100644 doc/tutorial-testing.md

diff --git a/doc/tutorial-testing.md b/doc/tutorial-testing.md
new file mode 100644
index 00000000000..b09ba605d2c
--- /dev/null
+++ b/doc/tutorial-testing.md
@@ -0,0 +1,263 @@
+% Rust Testing Tutorial
+
+# Quick start
+
+To create test functions, add a `#[test]` attribute like this:
+
+```rust
+fn return_two() -> int {
+    2
+}
+
+#[test]
+fn return_two_test() {
+    let x = return_two();
+    assert!(x == 2);
+}
+```
+
+To run these tests, use `rustc --test`:
+
+```
+$ rustc --test foo.rs; ./foo
+running 1 test
+test return_two_test ... ok
+
+test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured
+```
+
+`rustc foo.rs` will *not* compile the tests, since `#[test]` implies
+`#[cfg(test)]`, and the `--test` flag to `rustc` implies `--cfg test`.
+
+
+# Unit testing in Rust
+
+Rust has built-in support for simple unit testing. Functions can be
+marked as unit tests using the `#[test]` attribute.
+
+```rust
+#[test]
+fn return_none_if_empty() {
+    // ... test code ...
+}
+```
+
+A test function's signature must have no arguments and no return
+value. To run the tests in a crate, it must be compiled with the
+`--test` flag: `rustc myprogram.rs --test -o myprogram-tests`. Running
+the resulting executable will run all the tests in the crate. A test
+is considered successful if its function returns; if the task running
+the test fails, whether through a call to `fail!`, a failed assertion
+(`assert!`, `assert_eq!`, `assert_approx_eq!`, ...), or some other
+means, then the test fails.
+
+When compiling a crate with the `--test` flag, `--cfg test` is also
+implied, so that tests can be conditionally compiled.
+
+```rust
+#[cfg(test)]
+mod tests {
+    #[test]
+    fn return_none_if_empty() {
+        // ... test code ...
+    }
+}
+```
+
+Additionally, `#[test]` items behave as if they also have the
+`#[cfg(test)]` attribute, and will not be compiled when the `--test`
+flag is not used.
+
+Tests that should not be run can be annotated with the `ignore`
+attribute. The existence of these tests will be noted in the test
+runner output, but the tests will not be run. Tests can also be
+ignored by configuration: for example, to ignore a test on Windows
+you can write `#[ignore(cfg(target_os = "win32"))]`.
+
+Tests that are intended to fail can be annotated with the
+`should_fail` attribute. The test will be run, and if it causes its
+task to fail then the test will be counted as successful; otherwise it
+will be counted as a failure. For example:
+
+```rust
+#[test]
+#[should_fail]
+fn test_out_of_bounds_failure() {
+    let v: ~[int] = ~[];
+    v[0]; // indexing an empty vector fails the task
+}
+```
+
+A test runner built with the `--test` flag supports a limited set of
+arguments to control which tests are run: the first free argument
+passed to the test runner is used as a filter to narrow down the set
+of tests being run, and the `--ignored` flag tells the test runner to
+run only tests with the `ignore` attribute.
+
+## Parallelism
+
+By default, tests are run in parallel, which can make interpreting
+failure output difficult. In these cases you can set the
+`RUST_TEST_TASKS` environment variable to 1 to make the tests run
+sequentially.
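+
+For example, assuming a test runner compiled to a binary named
+`mytests` (as in the sample runs later in this tutorial), the
+filtering, ignore, and parallelism controls described above might be
+exercised like this:
+
+```
+$ ./mytests mytest1            # run only tests whose names match 'mytest1'
+$ ./mytests --ignored          # run only tests marked with #[ignore]
+$ RUST_TEST_TASKS=1 ./mytests  # run the whole suite sequentially
+```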
+
+## Benchmarking
+
+The test runner also understands a simple form of benchmark execution.
+Benchmark functions are marked with the `#[bench]` attribute, rather
+than `#[test]`, and have a different form and meaning. They are
+compiled along with `#[test]` functions when a crate is compiled with
+`--test`, but they are not run by default. To run the benchmark
+component of your testsuite, pass `--bench` to the compiled test
+runner.
+
+The type signature of a benchmark function differs from that of a unit
+test: it takes a mutable reference to type `test::BenchHarness`.
+Inside the benchmark function, any setup code whose running time
+should not be measured should execute first, followed by a call to
+`iter` on the benchmark harness, passing a closure that contains the
+portion of the benchmark you wish to measure the per-iteration speed
+of.
+
+For benchmarks relating to processing or generating data, you can set
+the `bytes` field to the number of bytes consumed or produced in each
+iteration; this will be used to show the throughput of the benchmark.
+This must be the amount used in each iteration, *not* the total
+amount.
+
+For example:
+
+```rust
+extern mod extra;
+use std::vec;
+
+#[bench]
+fn bench_sum_1024_ints(b: &mut extra::test::BenchHarness) {
+    let v = vec::from_fn(1024, |n| n);
+    b.iter(|| {v.iter().fold(0, |old, new| old + *new);} );
+}
+
+#[bench]
+fn initialise_a_vector(b: &mut extra::test::BenchHarness) {
+    b.iter(|| {vec::from_elem(1024, 0u64);} );
+    b.bytes = 1024 * 8;
+}
+```
+
+The benchmark runner will calibrate measurement of the benchmark
+function, running the `iter` block "enough" times to get a reliable
+measure of the per-iteration speed.
+
+Advice on writing benchmarks:
+
+ - Move setup code outside the `iter` loop; only put the part you
+   want to measure inside it
+ - Make the code do "the same thing" on each iteration; do not
+   accumulate or change state
+ - Make the outer function idempotent too; the benchmark runner is
+   likely to run it many times
+ - Make the inner `iter` loop short and fast so benchmark runs are
+   fast and the calibrator can adjust the run-length at fine
+   resolution
+ - Make the code in the `iter` loop do something simple, to assist in
+   pinpointing performance improvements (or regressions)
+
+To run benchmarks, pass the `--bench` flag to the compiled
+test runner. Benchmarks are compiled in but not executed by default.
+
+## Examples
+
+### Typical test run
+
+```
+> mytests
+
+running 30 tests
+running driver::tests::mytest1 ... ok
+running driver::tests::mytest2 ... ignored
+... snip ...
+running driver::tests::mytest30 ... ok
+
+result: ok. 28 passed; 0 failed; 2 ignored
+```
+
+### Test run with failures
+
+```
+> mytests
+
+running 30 tests
+running driver::tests::mytest1 ... ok
+running driver::tests::mytest2 ... ignored
+... snip ...
+running driver::tests::mytest30 ... FAILED
+
+result: FAILED. 27 passed; 1 failed; 2 ignored
+```
+
+### Running ignored tests
+
+```
+> mytests --ignored
+
+running 2 tests
+running driver::tests::mytest2 ... failed
+running driver::tests::mytest10 ... ok
+
+result: FAILED. 1 passed; 1 failed; 0 ignored
+```
+
+### Running a subset of tests
+
+```
+> mytests mytest1
+
+running 11 tests
+running driver::tests::mytest1 ... ok
+running driver::tests::mytest10 ... ignored
+... snip ...
+running driver::tests::mytest19 ... ok
+
+result: ok. 10 passed; 0 failed; 1 ignored
+```
+
+### Running benchmarks
+
+```
+> mytests --bench
+
+running 2 tests
+test bench_sum_1024_ints ... bench: 709 ns/iter (+/- 82)
+test initialise_a_vector ... bench: 424 ns/iter (+/- 99) = 19320 MB/s
+
+test result: ok. 0 passed; 0 failed; 0 ignored; 2 measured
+```
+
+## Saving and ratcheting metrics
+
+When running benchmarks or other tests, the test runner can record
+per-test "metrics". Each metric is a scalar `f64` value, plus a noise
+value which represents uncertainty in the measurement. By default, all
+`#[bench]` benchmarks are recorded as metrics, which can be saved as
+JSON in an external file for further reporting.
+
+In addition, the test runner supports _ratcheting_ against a metrics
+file. Ratcheting is like saving metrics, except that after each run,
+if the output file already exists, the results of the current run are
+compared against the contents of the existing file, and any regression
+_causes the testsuite to fail_. If the comparison passes -- if all
+metrics stayed the same (within noise) or improved -- then the metrics
+file is overwritten with the new values. In this way, a metrics file
+in your workspace can be used to ensure your work does not regress
+performance.
+
+Test runners take three options that are relevant to metrics (example
+invocations are shown after the list):
+
+ - `--save-metrics=<file.json>` will save the metrics from a test run
+   to `file.json`
+ - `--ratchet-metrics=<file.json>` will ratchet the metrics against
+   `file.json`
+ - `--ratchet-noise-percent=N` will override the noise measurements
+   in `file.json`, and consider a metric change of less than `N%` to
+   be noise. This can be helpful if you are testing in a noisy
+   environment where the benchmark calibration loop cannot acquire a
+   clear enough signal.
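+
+For instance, a workflow that records benchmark metrics and then
+guards later runs against regressions might look like this (again
+assuming a test runner named `mytests`; the file name `metrics.json`
+is only an example):
+
+```
+$ ./mytests --bench --save-metrics=metrics.json
+$ ./mytests --bench --ratchet-metrics=metrics.json
+```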
diff --git a/doc/tutorial.md b/doc/tutorial.md
index ccc7525b8d5..a19d8c3e820 100644
--- a/doc/tutorial.md
+++ b/doc/tutorial.md
@@ -3220,6 +3220,7 @@ tutorials on individual topics.
 * [Error-handling and Conditions][conditions]
 * [Packaging up Rust code][rustpkg]
 * [Documenting Rust code][rustdoc]
+* [Testing Rust code][testing]
 
 There is further documentation on the [wiki], however those tend to be
 even more out of date than this document.
@@ -3231,6 +3232,7 @@ more out of date than this document.
 [container]: tutorial-container.html
 [conditions]: tutorial-conditions.html
 [rustpkg]: tutorial-rustpkg.html
+[testing]: tutorial-testing.html
 [rustdoc]: rustdoc.html
 [wiki]: https://github.com/mozilla/rust/wiki/Docs
 
diff --git a/mk/docs.mk b/mk/docs.mk
index add5f8c07ed..f211af700b9 100644
--- a/mk/docs.mk
+++ b/mk/docs.mk
@@ -133,6 +133,13 @@ doc/tutorial-ffi.html: tutorial-ffi.md doc/version_info.html doc/rust.css \
 	doc/favicon.inc
 	@$(call E, pandoc: $@)
 	$(Q)$(CFG_NODE) $(S)doc/prep.js --highlight $< | \
 	  $(CFG_PANDOC) $(HTML_OPTS) --output=$@
+DOCS += doc/tutorial-testing.html
+doc/tutorial-testing.html: tutorial-testing.md doc/version_info.html doc/rust.css \
+	doc/favicon.inc
+	@$(call E, pandoc: $@)
+	$(Q)$(CFG_NODE) $(S)doc/prep.js --highlight $< | \
+	  $(CFG_PANDOC) $(HTML_OPTS) --output=$@
+
 DOCS += doc/tutorial-borrowed-ptr.html
 doc/tutorial-borrowed-ptr.html: tutorial-borrowed-ptr.md doc/version_info.html doc/rust.css \
 	doc/favicon.inc