Remove some unsound specializations
This removes the unsound and exploitable specializations in the standard library
* The `PartialEq` and `Hash` implementations for `RangeInclusive` are changed to avoid specialization.
* The `PartialOrd` specialization for slices now specializes on a limited set of concrete types.
* Added some tests for the soundness problems.
Override `StepBy::{try_fold, try_rfold}`
Previous PR: https://github.com/rust-lang/rust/pull/51435
The previous PR was closed in favor of https://github.com/rust-lang/rust/pull/51601, which was later reverted. I don't think these implementations will make it harder to specialize `StepBy<Range<_>>` later, so we should be able to land this without any consequences.
This should fix https://github.com/rust-lang/rust/issues/57517 – in my benchmarks `iter` and `iter.step_by(1)` now perform equally well, provided internal iteration is used.
Because of a compiler bug that adding `Self: ExactSizeIterator` makes
the compiler forget `Self::Item` is `<I as Iterator>::Item`, we remove
this specialization for now.
Override Cycle::try_fold
It's not very pretty, but I believe this is the simplest way to correctly implement `Cycle::try_fold`. The following may seem correct:
```rust
loop {
acc = self.iter.try_fold(acc, &mut f)?;
self.iter = self.orig.clone();
}
```
...but this loops infinitely in case `self.orig` is empty, as opposed to returning `acc`. So we first have to fully iterate `self.orig` to check whether it is empty or not, and before _that_, we have to iterate the remaining elements of `self.iter`.
This should always call `self.orig.clone()` the same amount of times as repeated `next()` calls would.
r? @scottmcm
Use internal iteration in the Sum and Product impls of Result and Option
This PR adds internal iteration to the `ResultShunt` iterator type underlying the `Sum` and `Product` impls of `Result`. I had to change `ResultShunt` to hold a mutable reference to an error instead, similar to `itertools::ProcessResults`, in order to be able to pass the `ResultShunt` itself by value (which is necessary for internal iteration).
`ResultShunt::process` can unfortunately no longer be an associated function because that would make it generic over the lifetime of the error reference, which wouldn't work, so I turned it into the free function `process_results`.
I removed the `OptionShunt` type and forwarded the `Sum` and `Product` impls of `Option` to their respective impls of `Result` instead, to avoid having to repeat the internal iteration logic.
Implement DoubleEndedIterator for iter::{StepBy, Peekable, Take}
Now that `DoubleEndedIterator::nth_back` has landed, `StepBy` and `Take` can have an efficient `DoubleEndedIterator` implementation. I don't know if there was any particular reason for `Peekable` not having a `DoubleEndedIterator` implementation, but it's quite trivial and I don't see any drawbacks to having it.
I'm not very happy about the implementation of `Peekable::try_rfold`, but I didn't see another way to only take the value out of `self.peeked` in case `self.iter.try_rfold` didn't exit early.
I only added `Peekable::rfold` (in addition to `try_rfold`) because its `Iterator` implementation has both `fold` and `try_fold` (and for similar reasons I added `Take::try_rfold` but not `Take::rfold`). Do we have any guidelines on whether we want both? If we do want both, maybe we should investigate which iterator adaptors override `try_fold` but not `fold` and add the missing implementations. At the moment I think that it's better to always have iterator adaptors implement both, because some iterators have a simpler `fold` implementation than their `try_fold` implementation.
The tests that I added may not be sufficient because they're all just existing tests where `next`/`nth`/`fold`/`try_fold` are replaced by their DEI counterparts, but I do think all paths are covered. Is there anything in particular that I should probably also test?
Implement `iter::Sum` and `iter::Product` for `Option`
This is similar to the existing implementation for `Result`. It will take each item into the accumulator unless a `None` is returned.
I based a lot of this on #38580. From that discussion it didn't seem like this addition would be too controversial or difficult. One thing I still don't understand is picking the values for the `stable` attribute. This is my first non-documentation PR for rust so I am open to any feedback on improvements.
We can use `usize::try_from` to convert steps from any size of integer.
This enables a meaningful `size_hint()` for larger ranges, rather than
always just `(0, None)`. Now they return the true `(len, Some(len))`
when it fits, otherwise `(usize::MAX, None)` for overflow.
RangeInclusive internal iteration performance improvement.
Specialize `Iterator::try_fold` and `DoubleEndedIterator::try_rfold` to improve code generation in all internal iteration scenarios.
This changes brings the performance of internal iteration with `RangeInclusive` on par with the performance of iteration with `Range`:
- Single conditional jump in hot loop,
- Unrolling and vectorization,
- And even Closed Form substitution.
Unfortunately, it only applies to internal iteration. Despite various attempts at stream-lining the implementation of `next` and `next_back`, LLVM has stubbornly refused to optimize external iteration appropriately, leaving me with a choice between:
- The current implementation, for which Closed Form substitution is performed, but which uses 2 conditional jumps in the hot loop when optimization fail.
- An implementation using a `is_done` boolean, which uses 1 conditional jump in the hot loop when optimization fail, allowing unrolling and vectorization, but for which Closed Form substitution fails.
In the absence of any conclusive evidence as to which usecase matters most, and with no assurance that the lack of Closed Form substitution is not indicative of other optimizations being foiled, there is no way
to pick one implementation over the other, and thus I defer to the statu quo as far as `next` and `next_back` are concerned.
Add unstable Iterator::copied()
Initially suggested at https://github.com/bluss/rust-itertools/pull/289, however the maintainers of itertools suggested this may be better of in a standard library.
The intent of `copied` is to avoid accidentally cloning iterator elements after doing a code refactoring which causes a structure to be no longer `Copy`. This is a relatively common pattern, as it can be seen by calling `rg --pcre2 '[.]map[(][|](?:(\w+)[|] [*]\1|&(\w+)[|] \2)[)]'` on Rust main repository. Additionally, many uses of `cloned` actually want to simply `Copy`, and changing something to be no longer copyable may introduce unnoticeable performance penalty.
Also, this makes sense because the standard library includes `[T].copy_from_slice` to pair with `[T].clone_from_slice`.
This also adds `Option::copied`, because it makes sense to pair it with `Iterator::copied`. I don't think this feature is particularly important, but it makes sense to update `Option` along with `Iterator` for consistency.