Enable default inlining in platform intrinsics
Since [#28273](https://github.com/rust-lang/rust/issues/28273) has been fixed for quite some time, it might be a good idea to return to default inlining in platform intrinsics.
This commit changes the exit status of rustc to 1 in the presence of
compilation errors. In the event of an unexpected panic (ICE) the
standard panic error exit status of 101 remains.
A run-make test is added to ensure that the exit code does not regress,
and compiletest is updated to check for an exit status of 1 or 101,
depending on the mode and suite.
This is a breaking change for custom drivers.
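For custom drivers and scripts that shell out to rustc, the distinction now matters. Below is a minimal sketch (the input file name and messages are purely illustrative) of inspecting the new exit statuses with `std::process::Command`:

```rust
use std::process::Command;

fn main() {
    // Illustrative only: invoke rustc on some file and inspect the exit status.
    let status = Command::new("rustc")
        .arg("does_not_compile.rs") // hypothetical input file
        .status()
        .expect("failed to spawn rustc");

    match status.code() {
        Some(0) => println!("compiled successfully"),
        Some(1) => println!("compilation errors (new behavior)"),
        Some(101) => println!("rustc panicked (ICE)"),
        other => println!("unexpected exit status: {:?}", other),
    }
}
```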
Fixes #51971.
rustc: Use link_section, not wasm_custom_section
This commit transitions definitions of custom sections on the wasm target from
the unstable `#[wasm_custom_section]` attribute to the
already-stable-for-other-targets `#[link_section]` attribute. Mostly the same
restrictions apply as before, except that this now applies only to statics.
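As a rough sketch of what this looks like with the stable attribute (the section name and payload here are made up for illustration):

```rust
// Place the bytes of this static into a custom section of the wasm binary.
// Section name and contents are illustrative.
#[link_section = "my_custom_section"]
pub static CUSTOM_SECTION: [u8; 4] = *b"data";
```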
Closes #51088
Avoid most allocations in `Canonicalizer`.
Extra allocations are a significant cost of NLL, and the most common
ones come from within `Canonicalizer`. In particular, `canonical_var()`
contains this code:
```rust
indices
    .entry(kind)
    .or_insert_with(|| {
        let cvar1 = variables.push(info);
        let cvar2 = var_values.push(kind);
        assert_eq!(cvar1, cvar2);
        cvar1
    })
    .clone()
```
`variables` and `var_values` are `Vec`s. `indices` is a `HashMap` used
to track what elements have been inserted into `var_values`. If `kind`
hasn't been seen before, `indices`, `variables` and `var_values` all get
a new element. (The number of elements in each container is always the
same.) This results in lots of allocations.
In practice, most of the time these containers only end up holding a few
elements. This PR changes them to avoid heap allocations in the common
case, by changing the `Vec`s to `SmallVec`s and only using `indices`
once enough elements are present. (When the number of elements is small,
a direct linear search of `var_values` is as good or better than a
hashmap lookup.)
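To illustrate the lookup strategy, here is a standalone sketch of a "linear search while small, hashmap once large" interner. This is not the actual `Canonicalizer` code, and it assumes the `smallvec` crate with an arbitrary inline capacity of 8:

```rust
use std::collections::HashMap;
use std::hash::Hash;

use smallvec::SmallVec;

/// A de-duplicating push: linear search while small, hashmap once large.
struct SmallInterner<K> {
    values: SmallVec<[K; 8]>,           // inline storage for the common case
    indices: Option<HashMap<K, usize>>, // built lazily, only when needed
}

impl<K: Clone + Eq + Hash> SmallInterner<K> {
    fn new() -> Self {
        SmallInterner { values: SmallVec::new(), indices: None }
    }

    fn intern(&mut self, key: K) -> usize {
        if let Some(indices) = &self.indices {
            // Large case: the hashmap has been built, so use it.
            if let Some(&i) = indices.get(&key) {
                return i;
            }
        } else if let Some(i) = self.values.iter().position(|k| *k == key) {
            // Small case: a linear scan of a handful of inline elements is
            // as good as or better than hashing.
            return i;
        }
        let index = self.values.len();
        self.values.push(key.clone());
        if let Some(indices) = &mut self.indices {
            indices.insert(key, index);
        } else if self.values.len() > 8 {
            // Outgrew the inline capacity: build the index once, up front.
            self.indices = Some(
                self.values.iter().cloned().enumerate().map(|(i, k)| (k, i)).collect(),
            );
        }
        index
    }
}
```

The PR's containers and threshold differ, but the shape of the optimization is the same: no heap allocation and no hashing until the element count outgrows the inline buffer.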
The changes to `variables` are straightforward and contained within
`Canonicalizer`. The changes to `indices` are more complex but also
contained within `Canonicalizer`. The changes to `var_values` are more
intrusive because they require defining a new type
`SmallCanonicalVarValues` -- which is to `CanonicalVarValues` as
`SmallVec` is to `Vec` -- and passing stack-allocated values of that type
in from outside.
All this speeds up a number of NLL "check" builds, the best by 2%.
r? @nikomatsakis
This was previously enabled via `proc_macro`, but since `proc_macro` is now
stable this is no longer the case. Explicitly include it in the 2018 edition
here.
Fix macro parser quadratic complexity in small repeating groups
Observed in #51754, and more easily demonstrated with the following:
```rust
macro_rules! stress {
    ($($t:tt)+) => { };
}

fn main() {
    stress!{
        a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a
        a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a
        a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a
        a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a
        // ... 65536 copies of "a" total ...
        a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a
        a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a
        a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a
        a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a
    }
}
```
which takes 50 seconds to compile prior to the fix and <1s after.
I hope this has a visible impact on the compile times for real code. (I think it is most likely to affect incremental TT munchers that deal with large inputs, though it depends on how they are written.)
For a fuller description of the performance issue: https://github.com/rust-lang/rust/issues/51754#issuecomment-403242159
---
There is no test (yet) because I'm not sure how easy it would be to measure this for regressions.
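For reference, a throwaway generator for the input shown above (not part of the PR; the file name is arbitrary) makes it easy to time `rustc` by hand before and after the fix, e.g. with `time rustc stress.rs`:

```rust
use std::fs::File;
use std::io::{BufWriter, Write};

// Illustrative helper: write out the stress test with the full 65536 `a` tokens.
fn main() -> std::io::Result<()> {
    let mut out = BufWriter::new(File::create("stress.rs")?);
    writeln!(out, "macro_rules! stress {{")?;
    writeln!(out, "    ($($t:tt)+) => {{ }};")?;
    writeln!(out, "}}")?;
    writeln!(out, "fn main() {{")?;
    writeln!(out, "    stress!{{")?;
    // 2048 lines of 32 tokens each = 65536 tokens total.
    for _ in 0..(65536 / 32) {
        writeln!(out, "        {}", "a ".repeat(32).trim_end())?;
    }
    writeln!(out, "    }}")?;
    writeln!(out, "}}")?;
    Ok(())
}
```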
Enable incremental independent of stage
Previously we'd only do so for stage 0, but with the keep-stage
improvements it seems likely that more developers will be working in
stage 1, so we should allow enabling incremental for them as well.
`BitSlice` fixes
`propagate_bits_into_entry_set_for` and `BitSlice::bitwise` are hot for some benchmarks under NLL. I tried and failed to speed them up. (Increasing the size of `bit_slice::Word` from `usize` to `u128` caused a slowdown, even though decreasing the size of `bitvec::Word` from `u128` to `u64` also caused a slowdown. Weird.)
Anyway, along the way I fixed up several problems in and around the `BitSlice` code.
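For context, the hot operation is essentially a word-at-a-time combine-and-detect-change loop. Here is a minimal sketch of the union case as used when propagating bits into a block's entry set; it is illustrative, not the actual rustc code:

```rust
/// OR the bits of `source` into `target`, word by word, and report whether
/// anything changed. Dataflow propagation keeps iterating while this
/// returns `true`.
fn union_into(target: &mut [usize], source: &[usize]) -> bool {
    assert_eq!(target.len(), source.len());
    let mut changed = false;
    for (t, &s) in target.iter_mut().zip(source) {
        let new = *t | s;
        changed |= new != *t;
        *t = new;
    }
    changed
}
```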
r? @nikomatsakis