Accept 0 as a valid str char boundary

Index 0 must be a valid char boundary (it is an invariant of str that it
contains valid UTF-8 data).

If we check explicitly for index == 0, there is no need to read the byte
at index 0, which avoids a trip to the string's memory. It also lets the
char boundary check on a slicing index be optimized out whenever that
index is zero.

With this change, the following examples all go from reading the byte at
index 0 and branching to a possible panic, to having the check optimized
away entirely.

```rust
pub fn split(s: &str) -> (&str, &str) {
    s.split_at(0)
}

pub fn both(s: &str) -> &str {
    &s[0..s.len()]
}

pub fn first(s: &str) -> &str {
    &s[..0]
}

pub fn last(s: &str) -> &str {
    &s[0..]
}
```
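
For readers who want to try the check outside libcore, here is a minimal sketch of the same logic as a free function. The name `is_char_boundary_sketch` and the free-function form are assumptions made for illustration; the actual change is the one-line edit to the `StrExt` method in the diff.

```rust
/// Hypothetical standalone version of the boundary check, for illustration only.
pub fn is_char_boundary_sketch(s: &str, index: usize) -> bool {
    // 0 and len are always boundaries. Testing them first means the
    // string's bytes are never read for those two cases.
    if index == 0 || index == s.len() {
        return true;
    }
    match s.as_bytes().get(index) {
        // Past the end of the string: not a boundary.
        None => false,
        // A byte starts a char unless it is a UTF-8 continuation byte,
        // i.e. of the form 0b10xxxxxx (128..=191).
        Some(&b) => b < 128 || b >= 192,
    }
}
```

For the string "héllo", where 'é' occupies bytes 1 and 2, this reports indices 0, 1, 3 and `s.len()` as boundaries and index 2 as not a boundary.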
2016-03-23 21:57:44 +01:00 · 2016-03-23 21:57:44 +01:00 · f621193e5e
parent 80e7a1be35
commit f621193e5e
1 changed file with 4 additions and 1 deletion
--- a/src/libcore/str/mod.rs
+++ b/src/libcore/str/mod.rs
@@ -1892,7 +1892,10 @@ impl StrExt for str {

    #[inline]
    fn is_char_boundary(&self, index: usize) -> bool {
-        if index == self.len() { return true; }
+        // 0 and len are always ok.
+        // Test for 0 explicitly so that it can optimize out the check
+        // easily and skip reading string data for that case.
+        if index == 0 || index == self.len() { return true; }
        match self.as_bytes().get(index) {
            None => false,
            Some(&b) => b < 128 || b >= 192,