correct the primitive char doc's use of bytes and code points

Previously the docs suggested that '❤️' doesn't fit in a char because it's 6 bytes. But that's misleading. 'a̚' also doesn't fit in a char, even though it's only 3 bytes. The important thing is the number of code points, not the number of bytes. Clarify the primitive char docs around this.
2016-02-15 20:18:16 -05:00 · 2016-02-15 20:18:16 -05:00 · 071b4b6f7b
commit 071b4b6f7b
parent 17d284b4b5
1 changed files with 11 additions and 21 deletions
--- a/src/libstd/primitive_docs.rs
+++ b/src/libstd/primitive_docs.rs
@ -50,18 +50,21 @@ mod prim_bool { }
 /// [`String`]: string/struct.String.html
 ///
 /// As always, remember that a human intuition for 'character' may not map to
-/// Unicode's definitions. For example, emoji symbols such as '❤️' are more than
-/// one byte; ❤️ in particular is six:
+/// Unicode's definitions. For example, emoji symbols such as '❤️' can be more
+/// than one Unicode code point; this ❤️ in particular is two:
 ///
 /// ```
 /// let s = String::from("❤️");
 ///
-/// // six bytes times one byte for each element
-/// assert_eq!(6, s.len() * std::mem::size_of::<u8>());
+/// // we get two chars out of a single ❤️
+/// let mut iter = s.chars();
+/// assert_eq!(Some('\u{2764}'), iter.next());
+/// assert_eq!(Some('\u{fe0f}'), iter.next());
+/// assert_eq!(None, iter.next());
 /// ```
 ///
-/// This also means it won't fit into a `char`, and so trying to create a
-/// literal with `let heart = '❤️';` gives an error:
+/// This means it won't fit into a `char`. Trying to create a literal with
+/// `let heart = '❤️';` gives an error:
 ///
 /// ```text
 /// error: character literal may only contain one codepoint: '❤
@ -69,8 +72,8 @@ mod prim_bool { }
 ///             ^~
 /// ```
 ///
-/// Another implication of this is that if you want to do per-`char`acter
-/// processing, it can end up using a lot more memory:
+/// Another implication of the 4-byte fixed size of a `char`, is that
+/// per-`char`acter processing can end up using a lot more memory:
 ///
 /// ```
 /// let s = String::from("love: ❤️");
@ -79,19 +82,6 @@ mod prim_bool { }
 /// assert_eq!(12, s.len() * std::mem::size_of::<u8>());
 /// assert_eq!(32, v.len() * std::mem::size_of::<char>());
 /// ```
-///
-/// Or may give you results you may not expect:
-///
-/// ```
-/// let s = String::from("❤️");
-///
-/// let mut iter = s.chars();
-///
-/// // we get two chars out of a single ❤️
-/// assert_eq!(Some('\u{2764}'), iter.next());
-/// assert_eq!(Some('\u{fe0f}'), iter.next());
-/// assert_eq!(None, iter.next());
-/// ```
 mod prim_char { }

 #[doc(primitive = "unit")]