The optimized MIR for closures is being encoded unconditionally, while
being unnecessary for cargo check. This turns out to be especially
costly with MIR inlining enabled, since it triggers computation of
optimized MIR for all callees that are being examined for inlining
purposes.
Skip encoding of optimized MIR for closures, enum constructors, struct
constructors, and trait fns when not doing codegen, like it is already
done for other items since 49433.
Turn type inhabitedness into a query to fix `exhaustive_patterns` perf
We measured in https://github.com/rust-lang/rust/pull/79394 that enabling the [`exhaustive_patterns` feature](https://github.com/rust-lang/rust/issues/51085) causes significant perf degradation. It was conjectured that the culprit is type inhabitedness checking, and [I hypothesized](https://github.com/rust-lang/rust/pull/79394#issuecomment-733861149) that turning this computation into a query would solve most of the problem.
This PR turns `tcx.is_ty_uninhabited_from` into a query, and I measured a 25% perf gain on the benchmark that stress-tests `exhaustiveness_patterns`. This more than compensates for the 30% perf hit I measured [when creating it](https://github.com/rust-lang/rustc-perf/pull/801). We'll have to measure enabling the feature again, but I suspect this fixes the perf regression entirely.
I'd like a perf run on this PR obviously.
I made small atomic commits to help reviewing. The first one is just me discovering the "revisions" feature of the testing framework.
I believe there's a push to move things out of `rustc_middle` because it's huge. I guess `inhabitedness/mod.rs` could be moved out, but it's quite small. `DefIdForest` might be movable somewhere too. I don't know what the policy is for that.
Ping `@camelid` since you were interested in following along
`@rustbot` modify labels: +A-exhaustiveness-checking
Since `DefIdForest` contains 0 or 1 elements the large majority of the
time, by allocating only in the >1 case we avoid almost all allocations,
compared to `Arc<SmallVec<[DefId;1]>>`. This shaves off 0.2% on the
benchmark that stresses uninhabitedness checking.
Remove much of the special-case handling around crate metadata
dependency tracking by replacing `DepKind::CrateMetadata` and the
pre-allocation of corresponding `DepNodes` with on-demand invocation
of the `crate_hash` query.
Make CTFE able to check for UB...
... by not doing any optimizations on the `const fn` MIR used in CTFE. This means we duplicate all `const fn`'s MIR now, once for CTFE, once for runtime. This PR is for checking the perf effect, so we have some data when talking about https://github.com/rust-lang/const-eval/blob/master/rfcs/0000-const-ub.md
To do this, we now have two queries for obtaining mir: `optimized_mir` and `mir_for_ctfe`. It is now illegal to invoke `optimized_mir` to obtain the MIR of a const/static item's initializer, an array length, an inline const expression or an enum discriminant initializer. For `const fn`, both `optimized_mir` and `mir_for_ctfe` work, the former returning the MIR that LLVM should use if the function is called at runtime. Similarly it is illegal to invoke `mir_for_ctfe` on regular functions.
This is all checked via appropriate assertions and I don't think it is easy to get wrong, as there should be no `mir_for_ctfe` calls outside the const evaluator or metadata encoding. Almost all rustc devs should keep using `optimized_mir` (or `instance_mir` for that matter).
Enhance type inference errors involving the `?` operator
This patch adds a special-cased note on type inference errors when the error span points to a `?` return. It also makes the primary label for such errors "cannot infer type of `?` error" in cases where before we would have only said "cannot infer type".
One beneficiary of this change is async blocks, where we can't explicitly annotate the return type and so may not generate any other help (#77880); this lets us at least print the error type we're converting from and anything we know about the type we can't fully infer. More generally, it signposts that an implicit conversion is happening that may have impeded type inference the user was expecting. We already do something similar for [mismatched type errors](2987785df3/src/test/ui/try-block/try-block-bad-type.stderr (L7)).
The check for a relevant `?` operator is built into the existing HIR traversal which looks for places that could be annotated to resolve the error. That means we could identify `?` uses anywhere in the function that output the type we can't infer, but this patch just sticks to adding the note if the primary span given for the error has the operator; if there are other expressions where the type occurs and one of them is selected for the error instead, it's more likely that the `?` operator's implicit conversion isn't the sole cause of the inference failure and that adding an additional diagnostic would just be noise. I added a ui test for one such case.
The data about the `?` conversion is passed around in a `UseDiagnostic` enum that in theory could be used to add more of this kind of note in the future. It was also just easier to pass around than something with a more specific name. There are some follow-up refactoring commits for the code that generates the error label, which was already pretty involved and made a bit more complicated by this change.
Rollup of 8 pull requests
Successful merges:
- #79757 (Replace tabs earlier in diagnostics)
- #80600 (Add `MaybeUninit` method `array_assume_init`)
- #80880 (Move some tests to more reasonable directories)
- #80897 (driver: Use `atty` instead of rolling our own)
- #80898 (Add another test case for #79808)
- #80917 (core/slice: remove doc comment about scoped borrow)
- #80927 (Replace a simple `if let` with the `matches` macro)
- #80930 (fix typo in trait method mutability mismatch help)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
driver: Use `atty` instead of rolling our own
Fixes#80888.
Rationale:
- `atty` is widely used in the Rust ecosystem
- We already use it (in `rustc_errors` and other places)
- We shouldn't be rolling our own TTY detector when there's a
widely-used, well-tested package that we can use
Add `MaybeUninit` method `array_assume_init`
When initialising an array element-by-element, the conversion to the initialised array is done through `mem::transmute`, which is both ugly and does not work with const generics (see #61956). This PR proposes the associated method `array_assume_init`, matching the style of `slice_assume_init_*`:
```rust
unsafe fn array_assume_init<T, const N: usize>(array: [MaybeUninit<T>; N]) -> [T; N];
```
Example:
```rust
let mut array: [MaybeUninit<i32>; 3] = MaybeUninit::uninit_array();
array[0].write(0);
array[1].write(1);
array[2].write(2);
// SAFETY: Now safe as we initialised all elements
let array: [i32; 3] = unsafe {
MaybeUninit::array_assume_init(array)
};
```
Things I'm unsure about:
* Should this be a method of array instead?
* Should the function be const?
Replace tabs earlier in diagnostics
This replaces tabs earlier in the diagnostics emitting process, which allows various margin calculations to ignore the existence of tabs. It does add a string copy for the source lines that are emitted.
Fixes https://github.com/rust-lang/rust/issues/78438
r? `@estebank`
Serialize incr comp structures to file via fixed-size buffer
Reduce a large memory spike that happens during serialization by writing
the incr comp structures to file by way of a fixed-size buffer, rather
than an unbounded vector.
Effort was made to keep the instruction count close to that of the
previous implementation. However, buffered writing to a file inherently
has more overhead than writing to a vector, because each write may
result in a handleable error. To reduce this overhead, arrangements are
made so that each LEB128-encoded integer can be written to the buffer
with only one capacity and error check. Higher-level optimizations in
which entire composite structures can be written with one capacity and
error check are possible, but would require much more work.
The performance is mostly on par with the previous implementation, with
small to moderate instruction count regressions. The memory reduction is
significant, however, so it seems like a worth-while trade-off.
Rationale:
- `atty` is widely used in the Rust ecosystem
- We already use it (in `rustc_errors` and other places)
- We shouldn't be rolling our own TTY detector when there's a
widely-used, well-tested package that we can use
Suggest async {} for async || {}
Fixes#76011
This adds support for adding help diagnostics to the feature gating checks and
then uses it for the async_closure gate to add the extra bit of help
information as described in the issue.
As we did with `Box`, we can allocate an uninitialized `Rc` or `Arc`
beforehand, giving the optimizer a chance to skip the local value for
regular clones, or avoid any local altogether for `T: Copy`.
For generic `T: Clone`, we can allocate an uninitialized box beforehand,
which gives the optimizer a chance to create the clone directly in the
heap. For `T: Copy`, we can go further and do a simple memory copy,
regardless of optimization level.
rustdoc: Resolve `&str` as `str`
People almost always are referring to `&str`, not `str`, so this will
save a manual link resolve in many cases.
Note that we already accept `&` (resolves to `reference`) in intra-doc
links, so this shouldn't cause breakage.
r? `@jyn514`
resolve: Simplify built-in macro table
We don't use full `SyntaxExtension`s from the table, only `SyntaxExtensionKind`s, and `Ident` in `register_builtin_macro` always had dummy span. This PR removes unnecessary data from the table and related function signatures.
Noticed when reviewing #80850.
std/core docs: fix wrong link in PartialEq
PartialEq doc was attempting to link to ``[`Eq`]`` but instead we got a link to `` `eq` ``. Disambiguate with `trait@Eq`.
You can see the bad link [here](https://doc.rust-lang.org/std/cmp/trait.PartialEq.html) (Second sentence, "floating point types implement PartialEq but not Eq").
Explain method-call move errors in loops
PR #73708 added a more detailed explanation of move errors that occur
due to a call to a method that takes `self`. This PR extends that logic
to work when a move error occurs due to a method call in the previous
iteration of a loop.