correct the primitive char doc's use of bytes and code points
Previously the docs suggested that '❤️' doesn't fit in a char because
it's 6 bytes. But that's misleading. 'a̚' also doesn't fit in a char,
even though it's only 3 bytes. The important thing is the number of code
points, not the number of bytes. Clarify the primitive char docs around
this.
This commit is contained in:
parent
17d284b4b5
commit
071b4b6f7b
@ -50,18 +50,21 @@ mod prim_bool { }
|
||||
/// [`String`]: string/struct.String.html
|
||||
///
|
||||
/// As always, remember that a human intuition for 'character' may not map to
|
||||
/// Unicode's definitions. For example, emoji symbols such as '❤️' are more than
|
||||
/// one byte; ❤️ in particular is six:
|
||||
/// Unicode's definitions. For example, emoji symbols such as '❤️' can be more
|
||||
/// than one Unicode code point; this ❤️ in particular is two:
|
||||
///
|
||||
/// ```
|
||||
/// let s = String::from("❤️");
|
||||
///
|
||||
/// // six bytes times one byte for each element
|
||||
/// assert_eq!(6, s.len() * std::mem::size_of::<u8>());
|
||||
/// // we get two chars out of a single ❤️
|
||||
/// let mut iter = s.chars();
|
||||
/// assert_eq!(Some('\u{2764}'), iter.next());
|
||||
/// assert_eq!(Some('\u{fe0f}'), iter.next());
|
||||
/// assert_eq!(None, iter.next());
|
||||
/// ```
|
||||
///
|
||||
/// This also means it won't fit into a `char`, and so trying to create a
|
||||
/// literal with `let heart = '❤️';` gives an error:
|
||||
/// This means it won't fit into a `char`. Trying to create a literal with
|
||||
/// `let heart = '❤️';` gives an error:
|
||||
///
|
||||
/// ```text
|
||||
/// error: character literal may only contain one codepoint: '❤
|
||||
@ -69,8 +72,8 @@ mod prim_bool { }
|
||||
/// ^~
|
||||
/// ```
|
||||
///
|
||||
/// Another implication of this is that if you want to do per-`char`acter
|
||||
/// processing, it can end up using a lot more memory:
|
||||
/// Another implication of the 4-byte fixed size of a `char`, is that
|
||||
/// per-`char`acter processing can end up using a lot more memory:
|
||||
///
|
||||
/// ```
|
||||
/// let s = String::from("love: ❤️");
|
||||
@ -79,19 +82,6 @@ mod prim_bool { }
|
||||
/// assert_eq!(12, s.len() * std::mem::size_of::<u8>());
|
||||
/// assert_eq!(32, v.len() * std::mem::size_of::<char>());
|
||||
/// ```
|
||||
///
|
||||
/// Or may give you results you may not expect:
|
||||
///
|
||||
/// ```
|
||||
/// let s = String::from("❤️");
|
||||
///
|
||||
/// let mut iter = s.chars();
|
||||
///
|
||||
/// // we get two chars out of a single ❤️
|
||||
/// assert_eq!(Some('\u{2764}'), iter.next());
|
||||
/// assert_eq!(Some('\u{fe0f}'), iter.next());
|
||||
/// assert_eq!(None, iter.next());
|
||||
/// ```
|
||||
mod prim_char { }
|
||||
|
||||
#[doc(primitive = "unit")]
|
||||
|
Loading…
Reference in New Issue
Block a user