Accept 0 as a valid str char boundary
Index 0 must be a valid char boundary (it is an invariant of str that it contains valid UTF-8 data).

If we check explicitly for index == 0, there is no need to read the byte at index 0. That avoids a trip to the string's memory and lets the compiler optimize out the slicing bounds check whenever the index is zero.

With this change, the following examples all go from reading the byte at index 0 and branching to a possible panic, to having the bounds check optimized away.

```rust
pub fn split(s: &str) -> (&str, &str) {
    s.split_at(0)
}

pub fn both(s: &str) -> &str {
    &s[0..s.len()]
}

pub fn first(s: &str) -> &str {
    &s[..0]
}

pub fn last(s: &str) -> &str {
    &s[0..]
}
```
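As a rough illustration of the invariant the message relies on, the sketch below checks that index 0 and index `len()` are boundaries for a few sample strings, including the empty string and strings that start with a multi-byte character. The helper name `boundaries_at_ends` and the sample strings are assumptions made for this example; they are not part of the commit.

```rust
// Illustrative check, not part of the commit: index 0 and index len()
// should be char boundaries for every &str.
// `boundaries_at_ends` is a hypothetical helper name.
fn boundaries_at_ends(s: &str) -> bool {
    s.is_char_boundary(0) && s.is_char_boundary(s.len())
}

fn main() {
    for s in ["", "a", "é", "日本語"] {
        assert!(boundaries_at_ends(s));

        // Splitting at 0 yields an empty head and the whole string as tail.
        let (head, tail) = s.split_at(0);
        assert_eq!(head, "");
        assert_eq!(tail, s);
    }
}
```

This only exercises observable behavior; whether the bounds check is actually elided depends on the optimizer and would have to be confirmed by inspecting the generated code.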
parent 80e7a1be35
commit f621193e5e
@@ -1892,7 +1892,10 @@ impl StrExt for str {
     #[inline]
     fn is_char_boundary(&self, index: usize) -> bool {
-        if index == self.len() { return true; }
+        // 0 and len are always ok.
+        // Test for 0 explicitly so that it can optimize out the check
+        // easily and skip reading string data for that case.
+        if index == 0 || index == self.len() { return true; }
         match self.as_bytes().get(index) {
             None => false,
             Some(&b) => b < 128 || b >= 192,
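The byte test in the last line of the hunk works because UTF-8 continuation bytes have the form `0b10xxxxxx`, which is the range 128 to 191; any other byte starts a character. A minimal sketch of the post-change logic as a free function follows, assuming a hypothetical name `is_boundary` and a hand-picked sample string; it is meant to illustrate the check, not to reproduce the library's code exactly.

```rust
// Hypothetical free-function version of the check shown in the diff.
fn is_boundary(s: &str, index: usize) -> bool {
    // 0 and len are always boundaries; testing them first means no
    // string data is read for those two cases.
    if index == 0 || index == s.len() {
        return true;
    }
    match s.as_bytes().get(index) {
        // Past the end of the string.
        None => false,
        // Continuation bytes are 0b10xxxxxx (128..=191); anything else
        // is the first byte of a character.
        Some(&b) => b < 128 || b >= 192,
    }
}

fn main() {
    let s = "aé"; // bytes: 0x61, 0xC3, 0xA9

    assert!(is_boundary(s, 0)); // start of the string
    assert!(is_boundary(s, 1)); // 0xC3 starts 'é'
    assert!(!is_boundary(s, 2)); // 0xA9 is a continuation byte
    assert!(is_boundary(s, 3)); // index == len
    assert!(!is_boundary(s, 4)); // out of range

    // The sketch agrees with the standard library on this input.
    for i in 0..=4 {
        assert_eq!(is_boundary(s, i), s.is_char_boundary(i));
    }
}
```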