Speed up `SparseBitMatrix` use in `RegionValues`.
In practice, these matrices range from 10% to 90%+ full once they are
filled in, so the dense representation is better.
This reduces the runtime of Check Nll builds of `inflate` by 32%, and
several other benchmarks by 1--5%.
It also increases max-rss of `clap-rs` by 30% and a couple of others by
up to 5%, while decreasing max-rss of `coercions` by 14%. I think the
speed-ups justify the max-rss increases.
r? @nikomatsakis
Turn implied_outlives_bounds into a query
Right now all this does is remove the error reporting in `implied_outlives_bounds`, which seems to work. Farming out full tests to Travis.
For #51649. That issue is deferred so not sure what's next.
r? @nikomatsakis
Const propagate casts
fixes#49760
So... This fixes the original issue about the missing warnings.
But our test suite contains fun things like
```rust
fn foo() {}
assert_eq!(foo as i16, foo as usize as i16);
```
Which, will result in
> a raw memory access tried to access part of a pointer value as raw bytes
on both sides of the assertion. Because well... that's exactly what's going on! We're ripping out 16 bits of a pointer.
Using a `BTreeMap` to represent rows in the bit matrix is really slow.
This patch changes things so that each row is represented by a
`BitVector`. This is a less sparse representation, but a much faster
one.
As a result, `SparseBitSet` and `SparseChunk` can be removed.
Other minor changes in this patch.
- It renames `BitVector::insert()` as `merge()`, which matches the
terminology in the other classes in bitvec.rs.
- It removes `SparseBitMatrix::is_subset()`, which is unused.
- It reinstates `RegionValueElements::num_elements()`, which #52190 had
removed.
- It removes a low-value `debug!` call in `SparseBitMatrix::add()`.
We used to hardcode that we wanted the liveness of *all* variables.
This can now be configured by selecting an alternative index type
V and providing a (partial) map from locals to that new type V.
rustc: Use link_section, not wasm_custom_section
This commit transitions definitions of custom sections on the wasm target from
the unstable `#[wasm_custom_section]` attribute to the
already-stable-for-other-targets `#[link_section]` attribute. Mostly the same
restrictions apply as before, except that this now applies only to statics.
Closes#51088
`BitSlice` fixes
`propagate_bits_into_entry_set_for` and `BitSlice::bitwise` are hot for some benchmarks under NLL. I tried and failed to speed them up. (Increasing the size of `bit_slice::Word` from `usize` to `u128` caused a slowdown, even though decreasing the size of `bitvec::Word` from `u128` to `u64` also caused a slowdown. Weird.)
Anyway, along the way I fixed up several problems in and around the `BitSlice` code.
r? @nikomatsakis
This commit transitions definitions of custom sections on the wasm target from
the unstable `#[wasm_custom_section]` attribute to the
already-stable-for-other-targets `#[link_section]` attribute. Mostly the same
restrictions apply as before, except that this now applies only to statics.
Closes#51088
Preliminary work for incremental ThinLTO.
Since implementing incremental ThinLTO is a bit more involved than I initially thought, I'm splitting out some of the things that already work. This PR (1) adds a way accessing some ThinLTO information in `rustc` and (2) does some cleanup around CGU/object file naming (which makes things quite a bit nicer).
This is probably best reviewed one commit at a time.
Deny bare trait objects in the rest of rust
Add `#![deny(bare_trait_objects)]` to all the modules not covered before (those did not require code changes) that I consider applicable (I left out shims) in order to futureproof them.
NLL: Suggest `ref mut` and `&mut self`
Fixes#51244. Supersedes #51249, I think.
Under the old lexical lifetimes, the compiler provided helpful suggestions about adding `mut` when you tried to mutate a variable bound as `&self` or (explicit) `ref`. NLL doesn't have those suggestions yet. This pull request adds them.
I didn't bother making the help text exactly the same as without NLL, but I can if that's important.
(Originally this was supposed to be part of #51612, but I got bogged down trying to fit everything in one PR.)
nll experiment: compute SCCs instead of iterative region solving
This is an attempt to speed up region solving by replacing the current iterative dataflow with a SCC computation. The idea is to detect cycles (SCCs) amongst region constraints and then compute just one value per cycle. The graph with all cycles removed is of course a DAG, so we can then solve constraints "bottom up" once the liveness values are known.
I kinda ran out of time this morning so the last commit is a bit sloppy but I wanted to get this posted, let travis run on it, and maybe do a perf run, before I clean it up.
The strategy is this:
- we compute SCCs once all outlives constraints are known
- we allocate a set of values **per region** for storing liveness
- we allocate a set of values **per SCC** for storing the final values
- when we add a liveness constraint to the region R, we also add it
to the final value of the SCC to which R belongs
- then we can apply the constraints by just walking the DAG for the
SCCs and union'ing the children (which have their liveness
constraints within)
There are a few intermediate refactorings that I really ought to have
broken out into their own commits:
- reverse the constraint graph so that `R1: R2` means `R1 -> R2` and
not `R2 -> R1`. This fits better with the SCC computation and new
style of inference (`->` now means "take value from" and not "push
value into")
- this does affect some of the UI tests, since they traverse the
graph, but mostly the artificial ones and they don't necessarily
seem worse
- put some things (constraint set, etc) into `Rc`. This lets us root
them to permit mutation and iteration. It also guarantees they don't
change, which is critical to the correctness of the algorithm.
- Generalize various helpers that previously operated only on points
to work on any sort of region element.
Currently `Word` is `usize`, and there are various places in the code
that assume this.
This patch mostly just changes `usize` occurrences to `Word`. Most of
the changes were found as compile errors when I changed `Word` to a type
other than `usize`, but there was one non-obvious case in
librustc_mir/dataflow/mod.rs that caused bounds check failures before I
fixed it.
improve error message shown for unsafe operations
Add a short explanation saying why undefined behavior could arise. In particular, the error many people got for "creating a pointer to a packed field requires unsafe block" was not worded great -- it lead to people just adding the unsafe block without considering if what they are doing follows the rules.
I am not sure if a "note" is the right thing, but that was the easiest thing to add...
Inspired by @gnzlbg at https://github.com/rust-lang/rust/issues/46043#issuecomment-381544673
Infinite loop detection for const evaluation
Resolves#50637.
An `EvalContext` stores the transient state (stack, heap, etc.) of the MIRI virtual machine while it executing code. As long as MIRI only executes pure functions, we can detect if a program is in a state where it will never terminate by periodically taking a "snapshot" of this transient state and comparing it to previous ones. If any two states are exactly equal, the machine must be in an infinite loop.
Instead of fully cloning a snapshot every time the detector is run, we store a snapshot's hash. Only when a hash collision occurs do we fully clone the interpreter state. Future snapshots which cause a collision will be compared against this clone, causing the interpreter to abort if they are equal.
At the moment, snapshots are not taken until MIRI has progressed a certain amount. After this threshold, snapshots are taken every `DETECTOR_SNAPSHOT_PERIOD` steps. This means that an infinite loop with period `P` will be detected after a maximum of `2 * P * DETECTOR_SNAPSHOT_PERIOD` interpreter steps. The factor of 2 arises because we only clone a snapshot after it causes a hash collision.
Upgrade to LLVM's master branch (LLVM 7)
### Current status
~~Blocked on a [performance regression](https://github.com/rust-lang/rust/pull/51966#issuecomment-402320576). The performance regression has an [upstream LLVM issue](https://bugs.llvm.org/show_bug.cgi?id=38047) and has also [been bisected](https://reviews.llvm.org/D44282) to an LLVM revision.~~
Ready to merge!
---
This commit upgrades the main LLVM submodule to LLVM's current master branch.
The LLD submodule is updated in tandem as well as compiler-builtins.
Along the way support was also added for LLVM 7's new features. This primarily
includes the support for custom section concatenation natively in LLD so we now
add wasm custom sections in LLVM IR rather than having custom support in rustc
itself for doing so.
Some other miscellaneous changes are:
* We now pass `--gc-sections` to `wasm-ld`
* The optimization level is now passed to `wasm-ld`
* A `--stack-first` option is passed to LLD to have stack overflow always cause
a trap instead of corrupting static data
* The wasm target for LLVM switched to `wasm32-unknown-unknown`.
* The syntax for aligned pointers has changed in LLVM IR and tests are updated
to reflect this.
* ~~The `thumbv6m-none-eabi` target is disabled due to an [LLVM bug][llbug]~~
Nowadays we've been mostly only upgrading whenever there's a major release of
LLVM but enough changes have been happening on the wasm target that there's been
growing motivation for quite some time now to upgrade out version of LLD. To
upgrade LLD, however, we need to upgrade LLVM to avoid needing to build yet
another version of LLVM on the builders.
The revision of LLVM in use here is arbitrarily chosen. We will likely need to
continue to update it over time if and when we discover bugs. Once LLVM 7 is
fully released we can switch to that channel as well.
[llbug]: https://bugs.llvm.org/show_bug.cgi?id=37382
cc #50543
This commit upgrades the main LLVM submodule to LLVM's current master branch.
The LLD submodule is updated in tandem as well as compiler-builtins.
Along the way support was also added for LLVM 7's new features. This primarily
includes the support for custom section concatenation natively in LLD so we now
add wasm custom sections in LLVM IR rather than having custom support in rustc
itself for doing so.
Some other miscellaneous changes are:
* We now pass `--gc-sections` to `wasm-ld`
* The optimization level is now passed to `wasm-ld`
* A `--stack-first` option is passed to LLD to have stack overflow always cause
a trap instead of corrupting static data
* The wasm target for LLVM switched to `wasm32-unknown-unknown`.
* The syntax for aligned pointers has changed in LLVM IR and tests are updated
to reflect this.
* The `thumbv6m-none-eabi` target is disabled due to an [LLVM bug][llbug]
Nowadays we've been mostly only upgrading whenever there's a major release of
LLVM but enough changes have been happening on the wasm target that there's been
growing motivation for quite some time now to upgrade out version of LLD. To
upgrade LLD, however, we need to upgrade LLVM to avoid needing to build yet
another version of LLVM on the builders.
The revision of LLVM in use here is arbitrarily chosen. We will likely need to
continue to update it over time if and when we discover bugs. Once LLVM 7 is
fully released we can switch to that channel as well.
[llbug]: https://bugs.llvm.org/show_bug.cgi?id=37382
NLL: fix E0594 "change to mutable ref" suggestion
Fix#51515.
Fix#51879.
Questions:
- [x] Is this the right place to fix this? It feels brittle, being so close to the frontend. **It's probably fine.**
- [ ] Have I missed any other cases that trigger this behavior?
- [x] Is it okay to use HELP and SUGGESTION in the UI test? **Yes.**
- [x] Do I need more tests for this? **No.**
refactor and cleanup region errors for NLL
This is a WIP commit. It simplifies some of the code from https://github.com/rust-lang/rust/pull/51536 and extends a few more steps towards the errors that @davidtwco and I were shooting for. These are intended as a replacement for the general "unable to infer lifetime" messages -- one that is actually actionable. We're certainly not there yet, but the overall shape hopefully gets a bit clearer.
I'm thinking about trying to open up an internals thread to sketch out the overall plan and perhaps discuss how to get the wording right, which special cases to handle, etc.
r? @estebank
cc @davidtwco
Fix various issues with control-flow statements inside anonymous constants
Fixes#51761.
Fixes#51963 (and the host of other reported issues there).
(Might be easiest to review per commit, as they should be standalone.)
r? @estebank
[NLL] Fix various unused mut errors
Closes#51801Closes#50897Closes#51830Closes#51904
cc #51918 - keeping this one open in case there are any more issues
This PR contains multiple changes. List of changes with examples of what they fix:
* Change mir generation so that the parameter variable doesn't get a name when a `ref` pattern is used as an argument
```rust
fn f(ref y: i32) {} // doesn't trigger lint
```
* Change mir generation so that by-move closure captures don't get first moved into a temporary.
```rust
let mut x = 0; // doesn't trigger lint
move || {
x = 1;
};
```
* Treat generator upvars the same as closure upvars
```rust
let mut x = 0; // This mut is now necessary and is not linted against.
move || {
x = 1;
yield;
};
```
r? @nikomatsakis
This removes the `usize` argument to `inc_step_counter`. Now, the step
counter increments by exactly one for every terminator evaluated. After
`STEPS_UNTIL_DETECTOR_ENABLED` steps elapse, the detector is run every
`DETECTOR_SNAPSHOT_PERIOD` steps. The step counter is only kept modulo
this period.
The detector runs every `DETECTOR_SNAPSHOT_PERIOD` steps. Since the
number of steps can increase by more than 1 (I'd like to remove this),
the detector may fail if the step counter is incremented past the
scheduled detection point during the loop.
This reverts commit 59d21c526c036d7097d05edd6dffdad9c5b1cb62, and uses
tuple to store the mutable parts of an EvalContext (which now includes
`Machine`). This requires that `Machine` be `Clone`.
Use the approach suggested by @oli-obk, a table holding `EvalState`
hashes and a table holding full `EvalState` objects. When a hash
collision is observed, the state is cloned and put into the full
table. If the collision was not spurious, it will be detected during the
next iteration of the infinite loop.
I use a pattern binding in each custom impl, so that adding fields to
`Memory` or `Frame` will cause a compiler error instead of causing e.g.
`PartialEq` to become invalid. This may be too cute.
This adds several requirements to `Machine::MemoryData`. These can be
removed if we don't want this associated type to be part of the equality
of `Memory`.
Reuse the `DefsUsesVisitor` in `simulate_block()`.
This avoids a bunch of allocations for the bitsets within it,
speeding up a number of NLL benchmarks, the best by 1%.
r? @nikomatsakis
[NLL] Use better span for initializing a variable twice
Closes#51217
When assigning to a (projection from a) local immutable local which starts initialised (everything except `let PATTERN;`):
* Point to the declaration of that local
* Make the error message refer to the local, rather than the projection.
r? @nikomatsakis
introduce dirty list to dataflow
@nikomatsakis my naive implementation never worked, So, I decided to implement using `work_queue` data structure. This PR also includes your commits from `nll-liveness-dirty-list` branch. Those commits should not visible once your branch is merged.
r? @nikomatsakis
introduce dirty list to liveness, eliminate `ins` vector
At least in my measurements, this seems to knock much of the liveness computation off the profile.
r? @Zoxc
cc @nnethercote
Loosened rules involving statics mentioning other statics
Before this PR, trying to mention a static in any way other than taking a reference to it caused a compile-time error. So, while
```rust
static A: u32 = 42;
static B: &u32 = &A;
```
compiles successfully,
```rust
static A: u32 = 42;
static B: u32 = A; // error
```
and
```rust
static A: u32 = 42;
static B: u32 = *&A; // error
```
are not possible to express in Rust. On the other hand, introducing an intermediate `const fn` can presently allow one to do just that:
```rust
static A: u32 = 42;
static B: u32 = foo(&A); // success!
const fn foo(a: &u32) -> u32 {
*a
}
```
Preventing `const fn` from allowing to work around the ban on reading from statics would cripple `const fn` almost into uselessness.
Additionally, the limitation for reading from statics comes from the old const evaluator(s) and is not shared by `miri`.
This PR loosens the rules around use of statics to allow statics to evaluate other statics by value, allowing all of the above examples to compile and run successfully.
Reads from extern (foreign) statics are however still disallowed by miri, because there is no compile-time value to be read.
```rust
extern static A: u32;
static B: u32 = A; // error
```
This opens up a new avenue of potential issues, as a static can now not just refer to other statics or read from other statics, but even contain references that point into itself.
While it might seem like this could cause subtle bugs like allowing a static to be initialized by its own value, this is inherently impossible in miri.
Reading from a static causes the `const_eval` query for that static to be invoked. Calling the `const_eval` query for a static while already inside the `const_eval` query of said static will cause cycle errors.
It is not possible to accidentally create a bug in miri that would enable initializing a static with itself, because the memory of the static *does not exist* while being initialized.
The memory is not uninitialized, it is not there. Thus any change that would accidentally allow reading from a not yet initialized static would cause ICEs.
Tests have been modified according to the new rules, and new tests have been added for writing to `static mut`s within definitions of statics (which needs to fail), and incremental compilation with complex/interlinking static definitions.
Note that incremental compilation did not need to be adjusted, because all of this was already possible before with workarounds (like intermediate `const fn`s) and the encoding/decoding already supports all the possible cases.
r? @eddyb
Speed up compilation of large constant arrays
This is a different approach to #51672 as suggested by @oli-obk. Rather
than write each repeated value one-by-one, we write the first one and
then copy its value directly into the remaining memory.
With this change, the [toy program](c2f4744d2d/src/test/run-pass/mir_heavy_promoted.rs) goes from 63 seconds to 19 seconds on my machine.
Edit: Inlining `Size::bytes()` saves an additional 6 seconds dropping the total time to 13 seconds on my machine.
Edit2: Now down to 2.8 seconds.
r? @oli-obk
cc @nnethercote @eddyb