Don't dynamically link LLVM tools unless rustc is too
This PR initially tried to support link-shared on all of our target platforms (other than Windows), but ran into a number of difficulties:
* LLVM doesn't really support a shared link on macOS (llvm-config runs into problems with the version suffix)
* LLVM doesn't seem to support a shared link when cross-compiling (the libLLVM.so ends up empty and symbols are not found)
So, this PR has now been revised such that we don't attempt to dynamically link LLVM tools (even if that would, otherwise, be supported) on targets where LLVM is statically linked to rustc. Currently that's basically everything except for x86_64-unknown-linux-gnu (where we dynamically link to avoid rerunning ThinLTO in each stage).
Follow-up to #76708.
Fixes#76698.
Function to convert OpenOptions to c_int
Fixes: #74943
The creation_mode and access_mode function were already available in the OpenOptions struct, but currently private. I've added a new free functions to unix/fs.rs which takes the OpenOptions, and returns the c_int to be used as parameter for the `open` call.
Add a changelog for x.py and nag contributors until they read it
Add a changelog for x.py
- Add a changelog and instructions for updating it
- Use `changelog-seen` in `config.toml` and `VERSION` in bootstrap to determine whether the changelog has been read. There's no way to tie reading the changelog to updating the version, so unfortunately they still have to update `config.toml` manually. Actually reading the changelog is optional, anyone can set `changelog-seen = N` without reading (although it's not recommended).
- Nag people if they haven't read the x.py changelog
+ Print message twice to make sure it's seen
- Give different error messages depending on whether the version needs to be updated or added
Closes https://github.com/rust-lang/rust/issues/76617
r? `@Mark-Simulacrum`
Fix cross compiling dist/build invocations
I am uncertain why the first commit is not affecting CI. I suspect it's because we pass --disable-docs on most of our cross-compilation builders. The second commit doesn't affect CI because CI runs x.py dist, not x.py build.
Both commits are standalone; together they should resolve#76733. The first commit doesn't really fix that issue but rather just fixes cross-compiled x.py dist, resolving a bug introduced in #76549.
Rollup of 13 pull requests
Successful merges:
- #72734 (Reduce duplicate in liballoc reserve error handling)
- #76131 (Don't use `zip` to compare iterators during pretty-print hack)
- #76150 (Don't recommend ManuallyDrop to customize drop order)
- #76275 (Implementation of Write for some immutable ref structs)
- #76489 (Add explanation for E0756)
- #76581 (do not ICE on bound variables, return `TooGeneric` instead)
- #76655 (Make some methods of `Pin` unstable const)
- #76783 (Only get ImplKind::Impl once)
- #76807 (Use const-checking to forbid use of unstable features in const-stable functions)
- #76888 (use if let instead of single match arm expressions)
- #76914 (extend `Ty` and `TyCtxt` lints to self types)
- #77022 (Reduce boilerplate for BytePos and CharPos)
- #77032 (lint missing docs for extern items)
Failed merges:
r? `@ghost`
use if let instead of single match arm expressions
use if let instead of single match arm expressions to compact code and reduce nesting (clippy::single_match)
Use const-checking to forbid use of unstable features in const-stable functions
First step towards #76618.
Currently this code isn't ever hit because `qualify_min_const_fn` runs first and catches pretty much everything. One exception is `const_precise_live_drops`, which does not use the newly added code since it runs as part of a separate pass.
Also contains some unrelated refactoring, which is split into separate commits.
r? @oli-obk
Make some methods of `Pin` unstable const
Make the following methods unstable const under the `const_pin` feature:
- `new`
- `new_unchecked`
- `into_inner`
- `into_inner_unchecked`
- `get_ref`
- `into_ref`
- `get_mut`
- `get_unchecked_mut`
Of these, `into_inner` and `into_inner_unchecked` require the unstable `const_precise_live_drops`.
Also adds tests for these methods in a const context.
Tracking issue: #76654
r? @ecstatic-morse
Don't recommend ManuallyDrop to customize drop order
See
https://internals.rust-lang.org/t/need-for-controlling-drop-order-of-fields/12914/21
for the discussion.
TL;DR: ManuallyDrop is unsafe and footguny, but you can just ask the compiler to do all the work for you by re-ordering declarations.
Specifically, the original example from the docs is much better written as
```rust
struct Peach;
struct Banana;
struct Melon;
struct FruitBox {
melon: Melon,
// XXX: mind the relative drop order of the fields below
peach: Peach,
banana: Banana,
}
```
Don't use `zip` to compare iterators during pretty-print hack
If the right-hand iterator has exactly one more element than the
left-hand iterator, then both iterators will be fully consumed, but
the extra element will never be compared.
Split out from https://github.com/rust-lang/rust/pull/76130
Add optimization to avoid load of address
Look for the sequence
```rust
_2 = &_1;
...
_5 = (*_2)
```
in which we can replace the last statement with `_5 = _1` to avoid the load of _2
Remove duplicated library links between std and libc
The libc crate is already responsible for linking in the appropriate
libraries, and std doing the same thing results in duplicated library
names on the linker command line. Removing this duplication slightly
reduces linker time, and makes it simpler to adjust the set or order of
linked libraries in one place (such as to add static linking support).
If the right-hand iterator has exactly one more element than the
left-hand iterator, then both iterators will be fully consumed, but
the extra element will never be compared.
Make `ensure_sufficient_stack()` non-generic, using cargo-llvm-lines
Inspired by [this blog post](https://blog.mozilla.org/nnethercote/2020/08/05/how-to-speed-up-the-rust-compiler-some-more-in-2020/) from `@nnethercote,` I used [cargo-llvm-lines](https://github.com/dtolnay/cargo-llvm-lines/) on the rust compiler itself, to improve it's compile time. This PR contains only one low-hanging fruit, but I also want to share some measurements.
The function `ensure_sufficient_stack()` was monomorphized 1500 times, and with it the `stacker` and `psm` crates, for a total of 1.5% of all llvm IR lines. With some trickery I convert the generic closure into a dynamic one, and thus all that code is only monomorphized once.
# Measurements
Getting these numbers took some fiddling with CLI flags and I [modified](https://github.com/Julian-Wollersberger/cargo-llvm-lines/blob/master/src/main.rs#L115) cargo-llvm-lines to read from a folder instead of invoking cargo. Commands I used:
```
./x.py clean
RUSTFLAGS="--emit=llvm-ir -C link-args=-fuse-ld=lld -Z self-profile=profile" CARGOFLAGS_BOOTSTRAP="-Ztimings" RUSTC_BOOTSTRAP=1 ./x.py build -i --stage 1 library/std
# Then manually copy all .ll files into a folder I hardcoded in cargo-llvm-lines in main.rs#L115
cd ../cargo-llvm-lines
cargo run llvm-lines
```
The result is this list (see [first 500 lines](https://github.com/Julian-Wollersberger/cargo-llvm-lines/blob/master/llvm-lines-rustc-before.txt) ), before the change:
```
Lines Copies Function name
----- ------ -------------
16894211 (100%) 58417 (100%) (TOTAL)
2223855 (13.2%) 502 (0.9%) rustc_query_system::query::plumbing::get_query_impl::{{closure}}
1331918 (7.9%) 1287 (2.2%) hashbrown::raw::RawTable<T>::reserve_rehash
774434 (4.6%) 12043 (20.6%) core::ptr::drop_in_place
294170 (1.7%) 499 (0.9%) rustc_query_system::dep_graph::graph::DepGraph<K>::with_task_impl
245410 (1.5%) 1552 (2.7%) psm::on_stack::with_on_stack
210311 (1.2%) 1 (0.0%) rustc_target::spec::load_specific
200962 (1.2%) 513 (0.9%) rustc_query_system::query::plumbing::get_query_impl
190704 (1.1%) 1 (0.0%) rustc_middle::ty::query::<impl rustc_middle::ty::context::TyCtxt>::alloc_self_profile_query_strings
180272 (1.1%) 468 (0.8%) rustc_query_system::query::plumbing::load_from_disk_and_cache_in_memory
177396 (1.1%) 114 (0.2%) rustc_query_system::query::plumbing::force_query_impl
161134 (1.0%) 445 (0.8%) rustc_query_system::dep_graph::graph::DepGraph<K>::with_anon_task
141551 (0.8%) 186 (0.3%) rustc_query_system::query::plumbing::incremental_verify_ich
110191 (0.7%) 7 (0.0%) rustc_middle::ty::context::_DERIVE_rustc_serialize_Decodable_D_FOR_TypeckResults::<impl rustc_serialize::serialize::Decodable<__D> for rustc_middle::ty::context::TypeckResults>::decode::{{closure}}
108590 (0.6%) 420 (0.7%) core::ops::function::FnOnce::call_once
88488 (0.5%) 21 (0.0%) rustc_query_system::dep_graph::graph::DepGraph<K>::try_mark_previous_green
86368 (0.5%) 1 (0.0%) rustc_middle::ty::query::stats::query_stats
85654 (0.5%) 3973 (6.8%) <&T as core::fmt::Debug>::fmt
84475 (0.5%) 1 (0.0%) rustc_middle::ty::query::Queries::try_collect_active_jobs
81220 (0.5%) 862 (1.5%) <hashbrown::raw::RawIterHash<T> as core::iter::traits::iterator::Iterator>::next
77636 (0.5%) 54 (0.1%) core::slice::sort::recurse
66484 (0.4%) 461 (0.8%) <hashbrown::raw::RawIter<T> as core::iter::traits::iterator::Iterator>::next
```
All `.ll` files together had 4.4GB. After my change they had 4.2GB. So a few percent less code LLVM has to process. Hurray!
Sadly, I couldn't measure an actual wall-time improvement. Watching YouTube while compiling added to much noise...
Here is the top of the list after the change:
```
16460866 (100%) 58341 (100%) (TOTAL)
1903085 (11.6%) 504 (0.9%) rustc_query_system::query::plumbing::get_query_impl::{{closure}}
1331918 (8.1%) 1287 (2.2%) hashbrown::raw::RawTable<T>::reserve_rehash
777796 (4.7%) 12031 (20.6%) core::ptr::drop_in_place
551462 (3.4%) 1519 (2.6%) rustc_data_structures::stack::ensure_sufficient_stack::{{closure}}
```
Note that the total was reduced by 430 000 lines and `psm::on_stack::with_on_stack` has disappeared. Instead `rustc_data_structures::stack::ensure_sufficient_stack::{{closure}}` appeared. I'm confused about that one, but it seems to consist of inlined calls to `rustc_query_system::*` stuff.
Further note the other two big culprits in this list: `rustc_query_system` and `hashbrown`. These two are monomorphized many times, the query system summing to more than 20% of all lines, not even counting code that's probably inlined elsewhere.
Assuming compile times scale linearly with llvm-lines, that means a possible 20% compile time reduction.
Reducing eg. `get_query_impl` would probably need a major refactoring of the qery system though. _Everything_ in there is generic over multiple types, has associated types and passes generic Self arguments by value. Which means you can't simply make things `dyn`.
---------------------------------------
This PR is a small step to make rustc compile faster and thus make contributing to rustc less painful. Nonetheless I love Rust and I find the work around rustc fascinating :)
Revert adding Atomic::from_mut.
This reverts #74532, which made too many assumptions about platforms, breaking some things.
Will need to be added later with a better way of gating on proper alignment, without hardcoding cfg(target_arch)s.
---
To be merged if fixing from_mut (#76965) takes too long.
r? @ghost
- Add a changelog and instructions for updating it
- Use `changelog-seen` in `config.toml` and `VERSION` in bootstrap to determine whether the changelog has been read
- Nag people if they haven't read the x.py changelog
+ Print message twice to make sure it's seen
- Give different error messages depending on whether the version needs to be updated or added