Accept 0 as a valid str char boundary

Index 0 is always a valid char boundary: str is guaranteed to contain
valid UTF-8 data, so it can never begin with a continuation byte.

Checking explicitly for index == 0 removes the need to read the byte at
index 0, which avoids a trip to the string's memory and lets the
compiler optimize out the slicing index's boundary check whenever the
index is zero.
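
As a rough illustration, the check can be sketched as a free function.
The name `is_char_boundary_sketch` is hypothetical and exists only for
this example; the real method lives on str. The byte test relies on
UTF-8 continuation bytes having the form 0b10xxxxxx (128..=191).

```rust
/// Hypothetical free-function version of the check, for illustration only.
pub fn is_char_boundary_sketch(s: &str, index: usize) -> bool {
    // 0 and len are always boundaries. Answering here means the string's
    // bytes are never read for those two cases.
    if index == 0 || index == s.len() {
        return true;
    }
    match s.as_bytes().get(index) {
        // Past the end of the string: not a boundary.
        None => false,
        // Any byte that is not a continuation byte starts a char.
        Some(&b) => b < 128 || b >= 192,
    }
}
```

For "héllo", where 'é' occupies bytes 1 and 2, this should report
indices 0, 1, 3 and 6 as boundaries and index 2 as not a boundary.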

With this change, the following examples all go from reading the byte
at index 0 and branching to a possible panic, to having the bounds
check optimized away.

```rust
pub fn split(s: &str) -> (&str, &str) {
    s.split_at(0)
}

pub fn both(s: &str) -> &str {
    &s[0..s.len()]
}

pub fn first(s: &str) -> &str {
    &s[..0]
}

pub fn last(s: &str) -> &str {
    &s[0..]
}
```
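
Since index 0 is always a boundary, none of these forms can panic,
whatever the string contains. A small sketch that collects them in one
place (the helper name `zero_index_forms` is hypothetical, not part of
this change):

```rust
/// Hypothetical helper: applies every zero-based slicing form from the
/// examples above and returns the results in order.
pub fn zero_index_forms(s: &str) -> (&str, &str, &str, &str, &str) {
    // split_at(0) yields an empty head and the whole string as the tail.
    let (head, tail) = s.split_at(0);
    (head, tail, &s[0..s.len()], &s[..0], &s[0..])
}
```

The same holds for an empty string and for a string that starts with a
multi-byte char, since neither case changes what index 0 means.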
Ulrik Sverdrup 2016-03-23 21:57:44 +01:00
parent 80e7a1be35
commit f621193e5e
1 changed file with 4 additions and 1 deletion

@@ -1892,7 +1892,10 @@ impl StrExt for str {
     #[inline]
     fn is_char_boundary(&self, index: usize) -> bool {
-        if index == self.len() { return true; }
+        // 0 and len are always ok.
+        // Test for 0 explicitly so that it can optimize out the check
+        // easily and skip reading string data for that case.
+        if index == 0 || index == self.len() { return true; }
         match self.as_bytes().get(index) {
             None => false,
             Some(&b) => b < 128 || b >= 192,