Make the iterator protocol more explicit

Document the fact that the iterator protocol only defines behavior up
until the first None is returned. After this point, iterators are free
to behave how they wish.

Add a new iterator adaptor Fuse<T> that modifies iterators to return
None forever if they returned None once.
Kevin Ballard 2013-08-03 13:51:49 -07:00
parent 7c6c7519a7
commit fb0b388804
2 changed files with 145 additions and 2 deletions


@ -105,6 +105,10 @@ impl Iterator<int> for ZeroStream {
}
~~~
In general, you cannot rely on the behavior of the `next()` method after it has
returned `None`. Some iterators may return `None` forever. Others may behave
differently.
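As a rough illustration in present-day Rust syntax (not the 2013 dialect used elsewhere in this diff), the sketch below defines a hypothetical `Alternate` iterator that keeps yielding values after it has returned `None`. The names `Alternate` and `first_four` are invented for this example and are not part of the change.

```rust
/// Hypothetical iterator: yields Some(0), None, Some(2), None, ...
/// It is legal under the protocol, but it does not stay finished.
struct Alternate {
    state: u32,
}

impl Iterator for Alternate {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        let val = self.state;
        self.state += 1;
        // Even values are yielded, odd values produce None.
        if val % 2 == 0 { Some(val) } else { None }
    }
}

/// Calls next() four times and records every result, including None.
fn first_four<I: Iterator<Item = u32>>(mut it: I) -> Vec<Option<u32>> {
    (0..4).map(|_| it.next()).collect()
}
```

Calling `first_four(Alternate { state: 0 })` records a `Some` value after the first `None`, which is why code that keeps polling an arbitrary iterator cannot assume it stays exhausted.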
## Container iterators
Containers implement iteration over the contained elements by returning an
@ -112,7 +116,7 @@ iterator object. For example, vector slices have several iterators available:
* `iter()` and `rev_iter()`, for immutable references to the elements
* `mut_iter()` and `mut_rev_iter()`, for mutable references to the elements
* `move_iter()` and `move_rev_iter`, to move the elements out by-value
* `move_iter()` and `move_rev_iter()`, to move the elements out by-value
A typical mutable container will implement at least `iter()`, `mut_iter()` and
`move_iter()` along with the reverse variants if it maintains an order.
@ -149,7 +153,7 @@ let result = xs.iter().fold(0, |accumulator, item| accumulator - *item);
assert_eq!(result, -41);
~~~
Some adaptors return an adaptor object implementing the `Iterator` trait itself:
Most adaptors return an adaptor object implementing the `Iterator` trait itself:
~~~
let xs = [1, 9, 2, 3, 14, 12];
@ -158,6 +162,35 @@ let sum = xs.iter().chain(ys.iter()).fold(0, |a, b| a + *b);
assert_eq!(sum, 57);
~~~
Some iterator adaptors may return `None` before exhausting the underlying
iterator. Additionally, if these iterator adaptors are called again after
returning `None`, they may call their underlying iterator again even if the
adaptor will continue to return `None` forever. This may not be desired if the
underlying iterator has side-effects.
In order to provide a guarantee about behavior once `None` has been returned, an
iterator adaptor named `fuse()` is provided. This returns an iterator that will
never call its underlying iterator again once `None` has been returned:
~~~
let xs = [1,2,3,4,5];
let mut calls = 0;
let it = xs.iter().scan((), |_, x| {
calls += 1;
if *x < 3 { Some(x) } else { None }});
// The iterator yields only 1 and 2 before returning None.
// Without fuse(), calling next() 5 times would invoke the closure 5 times,
// even though only the first 3 calls are needed (2 values, then None).
// The fuse() adaptor prevents the extra calls.
let mut it = it.fuse();
it.next();
it.next();
it.next();
it.next();
it.next();
assert_eq!(calls, 3);
~~~
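For readers on a current toolchain, the same idea can be sketched in present-day Rust. This is an illustrative rewrite rather than the code from this commit: the helper `count_calls` is hypothetical, and a `Cell` is used for the counter (an assumption made here so the count can be read after the iterator is dropped).

```rust
use std::cell::Cell;

/// Calls next() five times on a scan() adaptor, optionally fused,
/// and returns how many times the closure was invoked.
fn count_calls(fused: bool) -> u32 {
    let xs = [1, 2, 3, 4, 5];
    let calls = Cell::new(0u32);
    // The closure yields 1 and 2, then returns None for every later element.
    let it = xs.iter().scan((), |_, x| {
        calls.set(calls.get() + 1);
        if *x < 3 { Some(*x) } else { None }
    });
    if fused {
        // After the first None, fuse() stops polling the inner iterator.
        let mut it = it.fuse();
        for _ in 0..5 {
            it.next();
        }
    } else {
        // Without fuse(), every next() reaches the closure again.
        let mut it = it;
        for _ in 0..5 {
            it.next();
        }
    }
    calls.get()
}
```

Under these assumptions, `count_calls(false)` reports 5 closure invocations and `count_calls(true)` reports 3, matching the count asserted in the example above.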
## For loops
The function `range` (or `range_inclusive`) lets you iterate through a given range:


@ -41,6 +41,13 @@ pub trait Extendable<A>: FromIterator<A> {
/// An interface for dealing with "external iterators". These types of iterators
/// can be resumed at any time as all state is stored internally as opposed to
/// being located on the call stack.
///
/// The Iterator protocol states that an iterator yields a (potentially-empty,
/// potentially-infinite) sequence of values, and returns `None` to signal that
/// it's finished. The Iterator protocol does not define behavior after `None`
/// is returned. A concrete Iterator implementation may choose to behave however
/// it wishes, either by returning `None` infinitely, or by doing something
/// else.
pub trait Iterator<A> {
/// Advance the iterator and return the next value. Return `None` when the end is reached.
fn next(&mut self) -> Option<A>;
@ -300,6 +307,36 @@ pub trait Iterator<A> {
FlatMap{iter: self, f: f, frontiter: None, backiter: None }
}
/// Creates an iterator that yields `None` forever after the underlying
/// iterator yields `None`. Random-access iterator behavior is not
/// affected, only single and double-ended iterator behavior.
///
/// # Example
///
/// ~~~ {.rust}
/// fn process<U: Iterator<int>>(it: U) -> int {
/// let mut it = it.fuse();
/// let mut sum = 0;
/// for x in it {
/// if x > 5 {
/// break;
/// }
/// sum += x;
/// }
/// // did we exhaust the iterator?
/// if it.next().is_none() {
/// sum += 1000;
/// }
/// sum
/// }
/// let x = ~[1,2,3,7,8,9];
/// assert_eq!(process(x.move_iter()), 6);
/// let x = ~[1,2,3];
/// assert_eq!(process(x.move_iter()), 1006);
/// ~~~
#[inline]
fn fuse(self) -> Fuse<Self> {
Fuse{iter: self, done: false}
}
/// Creates an iterator that calls a function with a reference to each
/// element before yielding it. This is often useful for debugging an
/// iterator pipeline.
@ -1421,6 +1458,79 @@ impl<'self,
}
}
/// An iterator that yields `None` forever after the underlying iterator
/// yields `None` once.
#[deriving(Clone, DeepClone)]
pub struct Fuse<T> {
priv iter: T,
priv done: bool
}
impl<A, T: Iterator<A>> Iterator<A> for Fuse<T> {
#[inline]
fn next(&mut self) -> Option<A> {
if self.done {
None
} else {
match self.iter.next() {
None => {
self.done = true;
None
}
x => x
}
}
}
#[inline]
fn size_hint(&self) -> (uint, Option<uint>) {
if self.done {
(0, Some(0))
} else {
self.iter.size_hint()
}
}
}
impl<A, T: DoubleEndedIterator<A>> DoubleEndedIterator<A> for Fuse<T> {
#[inline]
fn next_back(&mut self) -> Option<A> {
if self.done {
None
} else {
match self.iter.next_back() {
None => {
self.done = true;
None
}
x => x
}
}
}
}
// Allow RandomAccessIterators to be fused without affecting random-access behavior
impl<A, T: RandomAccessIterator<A>> RandomAccessIterator<A> for Fuse<T> {
#[inline]
fn indexable(&self) -> uint {
self.iter.indexable()
}
#[inline]
fn idx(&self, index: uint) -> Option<A> {
self.iter.idx(index)
}
}
impl<T> Fuse<T> {
/// Resets the fuse such that the next call to .next() or .next_back() will
/// call the underlying iterator again even if it previously returned None.
#[inline]
fn reset_fuse(&mut self) {
self.done = false
}
}
/// An iterator that calls a function with a reference to each
/// element before yielding it.
pub struct Inspect<'self, A, T> {