Give `impl Trait` in a `const fn` its own feature gate
...previously it was gated under `#![feature(const_fn)]`.
I think we actually want to do this in all const-contexts? If so, this should be `#![feature(const_impl_trait)]` instead. I don't think there's any way to make use of `impl Trait` within a `const` initializer.
cc #77463
r? `@oli-obk`
Eliminate bounds checking in slice::Windows
This is how `<core::slice::Windows as Iterator>::next` looks right now:
```rust
fn next(&mut self) -> Option<&'a [T]> {
if self.size > self.v.len() {
None
} else {
let ret = Some(&self.v[..self.size]);
self.v = &self.v[1..];
ret
}
}
```
The line with `self.v = &self.v[1..];` relies on assumption that `self.v` is definitely not empty at this point. Else branch is taken when `self.size <= self.v.len()`, so `self.v` can be empty if `self.size` is zero. In practice, since `Windows` is never created directly but rather trough `[T]::windows` which panics when `size` is zero, `self.size` is never zero. However, the compiler doesn't know about this check, so it keeps the code which checks bounds and panics.
Using `NonZeroUsize` lets the compiler know about this invariant and reliably eliminate bounds checking without `unsafe` on `-O2`. Here is assembly of `Windows<'a, u32>::next` before and after this change ([goldbolt](https://godbolt.org/z/xrefzx)):
<details>
<summary>Before</summary>
```
example::next:
push rax
mov rcx, qword ptr [rdi + 8]
mov rdx, qword ptr [rdi + 16]
cmp rdx, rcx
jbe .LBB0_2
xor eax, eax
pop rcx
ret
.LBB0_2:
test rcx, rcx
je .LBB0_5
mov rax, qword ptr [rdi]
mov rsi, rax
add rsi, 4
add rcx, -1
mov qword ptr [rdi], rsi
mov qword ptr [rdi + 8], rcx
pop rcx
ret
.LBB0_5:
lea rdx, [rip + .L__unnamed_1]
mov edi, 1
xor esi, esi
call qword ptr [rip + core::slice::slice_index_order_fail@GOTPCREL]
ud2
.L__unnamed_2:
.ascii "./example.rs"
.L__unnamed_1:
.quad .L__unnamed_2
.asciz "\f\000\000\000\000\000\000\000\016\000\000\000\027\000\000"
```
</details>
<details>
<summary>After</summary>
```
example::next:
mov rcx, qword ptr [rdi + 8]
mov rdx, qword ptr [rdi + 16]
cmp rdx, rcx
jbe .LBB0_2
xor eax, eax
ret
.LBB0_2:
mov rax, qword ptr [rdi]
lea rsi, [rax + 4]
add rcx, -1
mov qword ptr [rdi], rsi
mov qword ptr [rdi + 8], rcx
ret
```
</details>
Note the lack of call to `core::slice::slice_index_order_fail` in second snippet.
#### Possible reasons _not_ to merge this PR:
* this changes the error message on panic in `[T]::windows`. However, AFAIK this messages are not covered by backwards compatibility policy.
Implement advance_by, advance_back_by for iter::Chain
Part of #77404.
This PR does two things:
- implement `Chain::advance[_back]_by` in terms of `advance[_back]_by` on `self.a` and `advance[_back]_by` on `self.b`
- change `Chain::nth[_back]` to use `advance[_back]_by` on `self.a` and `nth[_back]` on `self.b`
This ensures that `Chain::nth` can take advantage of an efficient `nth` implementation on the second iterator, in case it doesn't implement `advance_by`.
cc `@scottmcm` in case you want to review this
Use more intra-doc-links in `core::fmt`
This is a follow-up to #75819, which encountered some broken links due to #75176, so this PR contains the links that are blocked on #75176.
r? @jyn514
Hint the maximum length permitted by invariant of slices
One of the safety invariants of references, and in particular of references to slices, is that they may not cover more than `isize::MAX` bytes. The unsafe `from_raw_parts` constructors of slices explicitly requires the caller to guarantee this fact. Violating it would also be UB with regards to the semantics of generated llvm code.
This effectively bounds the length of a (non-ZST) slice from above by a compile time constant. But when the length is loaded from a function argument it appears llvm is not aware of this requirement. The additional value range assertions allow some further elision of code branches, including overflow checks, especially in the presence of artithmetic on the indices.
This may have a performance impact, adding more code to a common method but allowing more optimization. I'm not quite sure, is the Rust side of const-prop strong enough to elide the irrelevant match branches?
Fixes: #67186
Uses assume to check the length against a constant upper bound. The
inlined result then informs the optimizer of the sound value range.
This was tried with unreachable_unchecked before which introduces a
branch. This has the advantage of not being executed in sound code but
complicates basic blocks. It resulted in ~2% increased compile time in
some worst cases.
Add a codegen test for the assumption, testing the issue from #67186
Implement as_ne_bytes() for integers and floats
This is related to issue #64464.
I am pretty sure that these functions are actually const-ify-able, and technically as_bits() can also be implemented for floats, but I might need some comments on both.
Use less divisions in display u128/i128
This PR is an absolute mess, and I need to test if it improves the speed of fmt::Display for u128/i128, but I think it's correct.
It hopefully is more efficient by cutting u128 into at most 2 u64s, and also chunks by 1e16 instead of just 1e4.
Also I specialized the implementations for uints to always be non-false because it bothered me that it was checked at all
Do not merge until I benchmark it and also clean up the god awful mess of spaghetti.
Based on prior work in #44583
cc: `@Dylan-DPC`
Due to work on `itoa` and suggestion in original issue:
r? `@dtolnay`
Stabilize slice_ptr_range.
This has been unstable for almost a year now. Time to stabilize?
Closes#65807.
@rustbot modify labels: +T-libs +A-raw-pointers +A-slice +needs-fcp
Refactor memchr to allow optimization
Closes#75659
The implementation already uses naive search if the slice if short enough, but the case is complicated enough to not be optimized away. This PR refactors memchr so that it exists early when the slice is short enough.
Codegen-wise, as shown in #75659, memchr was not inlined previously so the only way I could find to test this is to check if there is no memchr call. Let me know if there is a more robust solution here.
Add Iterator::advance_by and DoubleEndedIterator::advance_back_by
This PR adds the iterator method
```rust
fn advance_by(&mut self, n: usize) -> Result<(), usize>
```
that advances the iterator by `n` elements, returning `Ok(())` if this succeeds or `Err(len)` if the length of the iterator was less than `n`.
Currently `Iterator::nth` is the method to override for efficiently advancing an iterator by multiple elements at once. `advance_by` is superior for this purpose because
- it's simpler to implement: instead of advancing the iterator and producing the next element you only need to advance the iterator
- it composes better: iterators like `Chain` and `FlatMap` can implement `advance_by` in terms of `advance_by` on their inner iterators, but they cannot implement `nth` in terms of `nth` on their inner iterators (see #60395)
- the default implementation of `nth` can trivially be implemented in terms of `advance_by` and `next`, which this PR also does
This PR also adds `DoubleEndedIterator::advance_back_by` for all the same reasons.
I'll make a tracking issue if it's decided this is worth merging. Also let me know if anything can be improved, this went through several iterations so there might very well still be room for improvement (especially in the doc comments). I've written overrides of these methods for most iterators that already override `nth`/`nth_back`, but those still need tests so I'll add them in a later PR.
cc @cuviper @scottmcm @Amanieu
Split core/str/mod.rs to smaller files
Note for reviewer:
* I split to multiple commits for easier reviewing, but I could git squash them all to one if requested.
* Recommend pulling this change locally and using advanced git diff viewer or this command:
```bash
git show --reverse --color-moved=dimmed-zebra --color-moved-ws=ignore-all-space master..
```
---
I split `core/str/mod.rs` to these modules:
* `converts`: Contains helper functions to convert from bytes to str.
* `error`: For error structs like Utf8Error.
* `iter`: For iterators of many str methods.
* `traits`: For indexing operations and build in traits on str.
* `validations`: For functions validating utf8 --- This name is awkward, maybe utf8.rs is better.
Add zero padding
Add benchmarks for fmt u128
This tests both when there is the max amount of work(all characters used)
And least amount of work(1 character used)