Improve concurrency chapter

2016-03-27 22:05:58 +05:30 · 2016-03-27 22:05:58 +05:30 · 5954fce848
commit 5954fce848
parent abb3a107e4
1 changed files with 124 additions and 18 deletions
--- a/src/doc/book/concurrency.md
+++ b/src/doc/book/concurrency.md
@@ -94,6 +94,54 @@ fn main() {
 }
 ```

+As closures can capture variables from their environment, we can also try to
+bring some data into the other thread:
+
+```rust,ignore
+use std::thread;
+
+fn main() {
+    let x = 1;
+    thread::spawn(|| {
+        println!("x is {}", x);
+    });
+}
+```
+
+However, this gives us an error:
+
+```text
+5:19: 7:6 error: closure may outlive the current function, but it
+                 borrows `x`, which is owned by the current function
+...
+5:19: 7:6 help: to force the closure to take ownership of `x` (and any other referenced variables),
+          use the `move` keyword, as shown:
+      thread::spawn(move || {
+          println!("x is {}", x);
+      });
+```
+
+This is because by default closures capture variables by reference, and thus the
+closure only captures a _reference to `x`_. This is a problem, because the
+thread may outlive the scope of `x`, leading to a dangling pointer.
+
+To fix this, we use a `move` closure as mentioned in the error message. `move`
+closures are explained in depth [here](closures.html#move-closures); basically
+they move variables from their environment into themselves. This means that `x`
+is now owned by the closure, and cannot be used in `main()` after the call to
+`spawn()`.
+
+```rust
+use std::thread;
+
+fn main() {
+    let x = 1;
+    thread::spawn(move || {
+        println!("x is {}", x);
+    });
+}
+```
+
 Many languages have the ability to execute threads, but it's wildly unsafe.
 There are entire books about how to prevent errors that occur from shared
 mutable state. Rust helps out with its type system here as well, by preventing
@@ -145,23 +193,64 @@ This gives us an error:
 ```

 Rust knows this wouldn't be safe! If we had a reference to `data` in each
-thread, and the thread takes ownership of the reference, we'd have three
-owners!
+thread, and the thread takes ownership of the reference, we'd have three owners!
+`data` gets moved out of `main` in the first call to `spawn()`, so subsequent
+calls in the loop cannot use this variable.

-So, we need some type that lets us have more than one reference to a value and
-that we can share between threads, that is it must implement `Sync`.
+So, we need some type that lets us have more than one owning reference to a
+value. Usually, we'd use `Rc<T>` for this, which is a reference counted type
+that provides shared ownership. It has some runtime bookkeeping that keeps track
+of the number of references to it, hence the "reference count" part of its name.

-We'll use `Arc<T>`, Rust's standard atomic reference count type, which
-wraps a value up with some extra runtime bookkeeping which allows us to
-share the ownership of the value between multiple references at the same time.
+Calling `clone()` on an `Rc<T>` will return a new owned reference and bump the
+internal reference count. We create one of these for each thread:

-The bookkeeping consists of a count of how many of these references exist to
-the value, hence the reference count part of the name.
+
+```ignore
+use std::thread;
+use std::time::Duration;
+use std::rc::Rc;
+
+fn main() {
+    let mut data = Rc::new(vec![1, 2, 3]);
+
+    for i in 0..3 {
+        // create a new owned reference
+        let data_ref = data.clone();
+
+        // use it in a thread
+        thread::spawn(move || {
+            data_ref[i] += 1;
+        });
+    }
+
+    thread::sleep(Duration::from_millis(50));
+}
+```
+
+This won't work, however, and will give us the error:
+
+```text
+13:9: 13:22 error: the trait `core::marker::Send` is not
+            implemented for the type `alloc::rc::Rc<collections::vec::Vec<i32>>`
+...
+13:9: 13:22 note: `alloc::rc::Rc<collections::vec::Vec<i32>>`
+            cannot be sent between threads safely
+```
+
+As the error message mentions, `Rc` cannot be sent between threads safely. This
+is because the internal reference count is not maintained in a thread-safe
+manner and can have a data race.
+
+To solve this, we'll use `Arc<T>`, Rust's standard atomic reference count type.

 The Atomic part means `Arc<T>` can safely be accessed from multiple threads.
 To do this the compiler guarantees that mutations of the internal count use
 indivisible operations which can't have data races.

+In essence, `Arc<T>` is a type that lets us share ownership of data _across
+threads_.
+

 ```ignore
 use std::thread;
@@ -182,7 +271,7 @@ fn main() {
 }
 ```

-We now call `clone()` on our `Arc<T>`, which increases the internal count.
+Similarly to last time, we use `clone()` to create a new owned handle.
 This handle is then moved into the new thread.

 And... still gives us an error.
@@ -193,14 +282,21 @@ And... still gives us an error.
                             ^~~~
 ```

-`Arc<T>` assumes one more property about its contents to ensure that it is safe
-to share across threads: it assumes its contents are `Sync`. This is true for
-our value if it's immutable, but we want to be able to mutate it, so we need
-something else to persuade the borrow checker we know what we're doing.
+`Arc<T>` by default has immutable contents. It allows the _sharing_ of data
+between threads, but shared mutable data is unsafe, and when threads are
+involved it can cause data races!

-It looks like we need some type that allows us to safely mutate a shared value,
-for example a type that can ensure only one thread at a time is able to
-mutate the value inside it at any one time.
+
+Usually when we wish to make something in an immutable position mutable, we use
+`Cell<T>` or `RefCell<T>` which allow safe mutation via runtime checks or
+otherwise (see also: [Choosing Your Guarantees](choosing-your-guarantees.html)).
+However, similar to `Rc`, these are not thread safe. If we try using these, we
+will get an error about these types not being `Sync`, and the code will fail to
+compile.
+
+It looks like we need some type that allows us to safely mutate a shared value
+across threads, for example a type that can ensure only one thread at a time is
+able to mutate the value inside it.

 For that, we can use the `Mutex<T>` type!

@@ -229,7 +325,17 @@ fn main() {
 Note that the value of `i` is bound (copied) to the closure and not shared
 among the threads.

-Also note that [`lock`](../std/sync/struct.Mutex.html#method.lock) method of
+We're "locking" the mutex here. A mutex (short for "mutual exclusion"), as
+mentioned, only allows one thread at a time to access a value. When we wish to
+access the value, we use `lock()` on it. This will "lock" the mutex, and no
+other thread will be able to lock it (and hence, do anything with the value)
+until we're done with it. If a thread attempts to lock a mutex which is already
+locked, it will wait until the other thread releases the lock.
+
+The lock "release" here is implicit; when the result of the lock (in this case,
+`data`) goes out of scope, the lock is automatically released.
+
+Note that the [`lock`](../std/sync/struct.Mutex.html#method.lock) method of
 [`Mutex`](../std/sync/struct.Mutex.html) has this signature:

 ```ignore