diff --git a/examples/Cargo.toml b/examples/Cargo.toml index d995d8ad..1e04a1db 100644 --- a/examples/Cargo.toml +++ b/examples/Cargo.toml @@ -1,6 +1,9 @@ [workspace] members = [ "hello-world", + "hello-world-sleep", + "hello-world-spawn", + "hello-world-join", "01_02_why_async", "01_04_async_await_primer", "02_02_future_trait", diff --git a/examples/hello-world-join/Cargo.toml b/examples/hello-world-join/Cargo.toml new file mode 100644 index 00000000..d745be1e --- /dev/null +++ b/examples/hello-world-join/Cargo.toml @@ -0,0 +1,8 @@ +[package] +name = "hello-world-join" +version = "0.1.0" +authors = ["Nicholas Cameron "] +edition = "2021" + +[dependencies] +tokio = { version = "1.40.0", features = ["full"] } diff --git a/examples/hello-world-join/src/main.rs b/examples/hello-world-join/src/main.rs new file mode 100644 index 00000000..cb297e3f --- /dev/null +++ b/examples/hello-world-join/src/main.rs @@ -0,0 +1,23 @@ +use tokio::{spawn, time::{sleep, Duration}}; + +async fn say_hello() { + // Wait for a while before printing to make it a more interesting race. + sleep(Duration::from_millis(100)).await; + println!("hello"); +} + +async fn say_world() { + sleep(Duration::from_millis(100)).await; + println!("world"); +} + +#[tokio::main] +async fn main() { + let handle1 = spawn(say_hello()); + let handle2 = spawn(say_world()); + + let _ = handle1.await; + let _ = handle2.await; + + println!("!"); +} diff --git a/examples/hello-world-sleep/Cargo.toml b/examples/hello-world-sleep/Cargo.toml new file mode 100644 index 00000000..d4a26048 --- /dev/null +++ b/examples/hello-world-sleep/Cargo.toml @@ -0,0 +1,8 @@ +[package] +name = "hello-world-sleep" +version = "0.1.0" +authors = ["Nicholas Cameron "] +edition = "2021" + +[dependencies] +tokio = { version = "1.40.0", features = ["full"] } diff --git a/examples/hello-world-sleep/src/main.rs b/examples/hello-world-sleep/src/main.rs new file mode 100644 index 00000000..bc4cbcfc --- /dev/null +++ b/examples/hello-world-sleep/src/main.rs @@ -0,0 +1,20 @@ +use std::io::{stdout, Write}; +use tokio::time::{sleep, Duration}; + +async fn say_hello() { + print!("hello, "); + // Flush stdout so we see the effect of the above `print` immediately. + stdout().flush().unwrap(); +} + +async fn say_world() { + println!("world!"); +} + +#[tokio::main] +async fn main() { + say_hello().await; + // An async sleep function, puts the current task to sleep for 1s. + sleep(Duration::from_millis(1000)).await; + say_world().await; +} diff --git a/examples/hello-world-spawn/Cargo.toml b/examples/hello-world-spawn/Cargo.toml new file mode 100644 index 00000000..23e53bd0 --- /dev/null +++ b/examples/hello-world-spawn/Cargo.toml @@ -0,0 +1,8 @@ +[package] +name = "hello-world-spawn" +version = "0.1.0" +authors = ["Nicholas Cameron "] +edition = "2021" + +[dependencies] +tokio = { version = "1.40.0", features = ["full"] } diff --git a/examples/hello-world-spawn/src/main.rs b/examples/hello-world-spawn/src/main.rs new file mode 100644 index 00000000..b4ca3462 --- /dev/null +++ b/examples/hello-world-spawn/src/main.rs @@ -0,0 +1,20 @@ +use tokio::{spawn, time::{sleep, Duration}}; + +async fn say_hello() { + // Wait for a while before printing to make it a more interesting race. + sleep(Duration::from_millis(100)).await; + println!("hello"); +} + +async fn say_world() { + sleep(Duration::from_millis(100)).await; + println!("world!"); +} + +#[tokio::main] +async fn main() { + spawn(say_hello()); + spawn(say_world()); + // Wait for a while to give the tasks time to run. + sleep(Duration::from_millis(1000)).await; +} diff --git a/src/SUMMARY.md b/src/SUMMARY.md index cfed0e43..c60ce942 100644 --- a/src/SUMMARY.md +++ b/src/SUMMARY.md @@ -11,6 +11,7 @@ - [Introduction](part-guide/intro.md) - [Concurrent programming](part-guide/concurrency.md) +- [Async and Await](part-guide/async-await.md) # Part 2: reference diff --git a/src/part-guide/async-await.md b/src/part-guide/async-await.md new file mode 100644 index 00000000..a6622078 --- /dev/null +++ b/src/part-guide/async-await.md @@ -0,0 +1,200 @@ +# Async and Await + +In this chapter we'll get started doing some async programming in Rust and we'll introduce the `async` and `await` keywords. + +`async` is an annotation on functions (and other items, such as traits, which we'll get to later); `await` is an operator used in expressions. But before we jump into those keywords, we need to cover a few core concepts of async programming in Rust, this follows from the discussion in the previous chapter, here we'll relate things directly to Rust programming. + +## Rust async concepts + +### The runtime + +Async tasks must be managed and scheduled. There are typically more tasks than cores available so they can't all be run at once. When one stops executing another must be picked to execute. If a task is waiting on IO or some other event, it should not be scheduled, but when that completes, it should be scheduled. That requires interacting with the OS and managing IO work. + +Many programming languages provide a runtime. Commonly, this runtime does a lot more than manage async tasks - it might manage memory (including garbage collection), have a role in exception handling, provide an abstraction layer over the OS, or even be a full virtual machine. Rust is a low-level language and strives towards minimal runtime overhead. The async runtime therefore has a much more limited scope than many other languages' runtimes. There are also many ways to design and implement an async runtime, so Rust lets you choose one depending on your requirements, rather than providing one. This does mean that getting started with async programming requires an extra step. + +As well as running and scheduling tasks, a runtime must interact with the OS to manage async IO. It must also provide timer functionality to tasks (which intersects with IO management). There are no strong rules about how a runtime must be structured, but some terms and division of responsibilities are common: + +- *reactor* or *event loop* or *driver* (equivalent terms): dispatches IO and timer events, interacts with the OS, and does the lowest-level driving forward of execution, +- *scheduler*: determines when tasks can execute and on which OS threads, +- *executor* or *runtime*: combines the reactor and scheduler, and is the user-facing API for running async tasks; *runtime* is also used to mean the whole library of functionality (e.g., everything in the Tokio crate, not just the Tokio executor which is represented by the [`Runtime`](https://docs.rs/tokio/latest/tokio/runtime/struct.Runtime.html) type). + +As well as the executor as described above, a runtime crate typically includes many utility traits and functions. These might include traits (e.g., `AsyncRead`) and implementations for IO, functionality for common IO tasks such as networking or accessing the file system, locks, channels, and other synchronisation primitives, utilities for timing, utilities for working with the OS (e.g., signal handling), utility functions for working with futures and streams (async iterators), or monitoring and observation tools. We'll cover many of those in this guide. + +There are many async runtimes to choose from. Some have very different scheduling policies, or are optimised for a specific task or domain. For most of this guide we'll use the [Tokio](https://tokio.rs/) runtime. It's a general purpose runtime and is the most popular runtime in the ecosystem. It's a great choice for getting started and for production work. In some circumstances, you might get better performance or be able to write simpler code with a different runtime. Later in this guide we'll discuss some of the other available runtimes and why you might choose one or another, or even write your own. + +To get up and running as quickly as possible, you need just a little boilerplate. You'll need to include the Tokio crate as a dependency in your Cargo.toml (just like any other crate): + +``` +[dependencies] +tokio = { version = "1", features = ["full"] } +``` + +And you'll use the `tokio::main` annotation on your `main` function so that it can be an async function (which is otherwise not permitted in Rust): + +```rust,norun +#[tokio::main] +async fn main() { ... } +``` + +That's it! You're ready to write some asynchronous code! + +The `#[tokio::main]` annotation initializes the Tokio runtime and starts an async task for running the code in `main`. Later in this guide we'll explain in more detail what that annotation is doing and how to use async code without it (which will give you more flexibility). + + +### Futures and tasks + +The basic unit of async concurrency in Rust is the *future*. A future is just a regular old Rust object (a struct or enum, usually) which implements the ['Future'](https://doc.rust-lang.org/std/future/trait.Future.html) trait. A future represents a deferred computation. That is, a computation that will be ready at some point in the future. + +We'll talk a lot about futures in this guide, but it's easiest to get started without worrying too much about them. We'll mention them quite a bit in the next few sections, but we won't really define them or use them directly until later. One important aspect of futures is that they can be combined to make new, 'bigger' futures (we'll talk a lot more about *how* they can be combined later). + +I've used the term 'async task' quite a bit in an informal way in the previous chapter and this one. I've used the term to mean a logical sequence of execution; analogous to a thread but managed within a program rather than externally by the OS. It is often useful to think in terms of tasks, however, Rust itself has no concept of a task and the term is used to mean different things! It is confusing! To make it worse, runtimes do have a concept of a task and different runtimes have slightly different concepts of tasks. + +From here on in, I'm going to try to be precise about the terminology around tasks. When I use just 'task' I mean the abstract concept of a sequence of computation that may occur concurrently with other tasks. I'll use 'async task' to mean exactly the same thing, but in contrast to a task which is implemented as an OS thread. I'll use 'runtime's task' to mean whatever kind of task a runtime imagines, and 'tokio task' (or some other specific runtime) to mean Tokio's idea of a task. + +An async task in Rust is just a future (usually a 'big' future made by combining many others). In other words, a task is a future which is executed. However, there are times when a future is 'executed' without being a runtime's task. This kind of a future is intuitively a *task* but not a *runtime's task*. I'll spell this out more when we get to an example of it. + + +## Async functions + +The `async` keyword is a modifier on function declarations. E.g., we can write `pub async fn send_to_server(...)`. An async function is simply a function declared using the `async` keyword, and what that means is that it is a function which can be executed asynchronously, in other words the caller *can choose not to* wait for the function to complete before doing something else. + +In more mechanical terms, when an async function is called, the body is not executed as it would be for a regular function. Instead the function body and its arguments are packaged into a future which is returned in lieu of a real result. The caller can then decide what to do with that future (if the caller wants the result 'straight away', then it will `await` the future, see the next section). + +Within an async function, code is executed in the usual, sequential way[^preempt], being async makes no difference. You can call synchronous functions from async functions, and execution proceeds as usual. One extra thing you can do within an async function is use `await` to await other async functions (or futures), which *may* cause yielding of control so that another task can execute. + +[^preempt]: like any other thread, the thread the async function is running on may be pre-empted by the operating system and paused so another thread can get some work done. However, from the function's point of view this is not observable without inspecting data which may have been modified by other threads (and which could have been modified by another thread executing in parallel without the current thread being paused). + +## `await` + +We stated above that a future is a computation that will be ready at some point in the future. To get the result of that computation, we use the `await` keyword. If the result is ready immediately or can be computed without waiting, then `await` simply does that computation to produce the result. However, if the result is not ready, then `await` hands control over to the scheduler so that another task can proceed (this is cooperative multitasking mentioned in the previous chapter). + +The syntax for using await is `some_future.await`, i.e., it is a postfix keyword used with the `.` operator. That means it can be used ergonomically in chains of method calls and field accesses. + +Consider the following functions: + +```rust,norun +// An async function, but it doesn't need to wait for anything. +async fn add(a: u32, b: u32) -> u32 { + a + b +} + +async fn wait_to_add(a: u32, b: u32) -> u32 { + sleep(1000).await; + a + b +} +``` + +If we call `add(15, 3).await` then it will return immediately with the result `18`. If we call `wait_to_add(15, 3).await`, we will eventually get the same answer, but while we wait another task will get an opportunity to run. + +In this silly example, the call to `sleep` is a stand-in for doing some long-running task where we have to wait for the result. This is usually an IO operation where the result is data read from an external source or confirmation that writing to an external destination succeeded. Reading looks something like `let data = read(...).await?`. In this case `await` will cause the current task to wait while the read happens. The task will resume once reading is completed (other tasks could get some work done while the reading task waits). The result of reading could be data successfully read or an error (handled by the `?`). + +Note that if we call `add` or `wait_to_add` or `read` without using `.await` we won't get any answer! + +What? + +Calling an async function returns a future, it doesn't immediately execute the code in the function. Furthermore, a future does not do any work until it is awaited[^poll]. This is in contrast to some other languages where an async function returns a future which begins executing immediately. + +This is an important point about async programming in Rust. After a while it will be second nature, but it often trips up beginners, especially those who have experience with async programming in other languages. + +An important intuition about futures in Rust is that they are inert objects. To get any work done they must be driven forward by an external force (usually an async runtime). + +We've described `await` quite operationally (it runs a future, producing a result), but we talked in the previous chapter about async tasks and concurrency, how does `await` fit into that mental model? First, let's consider pure sequential code: logically, calling a function simply executes the code in the function (with some assignment of variables). In other words, the current task continues executing the next 'chunk' of code which is defined by the function. Similarly, in an async context, calling a non-async function simply continues execution with that function. Calling an async function finds the code to run, but doesn't run it. `await` is an operator which continues execution of the current task, or if the current task can't continue right now, gives another task an opportunity to continue. + +`await` can only be used inside an async context, for now that means inside an async function (we'll see more kinds of async contexts later). To understand why, remember that `await` might hand over control to the runtime so that another task can execute. There is only a runtime to hand control to in an async context. For now, you can imagine the runtime like a global variable which is only accessible in async functions, we'll explain later how it really works. + +Finally, for one more perspective on `await`: we mentioned earlier that futures can be combined together to make 'bigger' futures. `async` functions are one way to define a future, and `await` is one way to combine futures. Using `await` on a future combines that future into the future produced by the async function it's used inside. We'll talk in more detail about this perspective and other ways to combine futures later. + +[^poll]: Or polled, which is a lower-level operation than `await` and happens behind the scenes when using `await`. We'll talk about polling later when we talk about futures in detail. + +## Some async/await examples + +Let's start by revisiting our 'hello, world!' example: + +```rust,edition2021 +{{#include ../examples/hello-world/src/main.rs}} +``` + +You should now recognise the boilerplate around `main`. It's for initializing the Tokio runtime and creating an initial task to run the async `main` function. + +`say_hello` is an async function, when we call it, we have to follow the call with `.await` to run it as part of the current task. Note that if you remove the `.await`, then running the program does nothing! Calling `say_hello` returns a future, but it is never executed so `println` is never called (the compiler will warn you, at least). + +Here's a slightly more realistic example, taken from the [Tokio tutorial](https://tokio.rs/tokio/tutorial/hello-tokio). + +```rust,norun +#[tokio::main] +async fn main() -> Result<()> { + // Open a connection to the mini-redis address. + let mut client = client::connect("127.0.0.1:6379").await?; + + // Set the key "hello" with value "world" + client.set("hello", "world".into()).await?; + + // Get key "hello" + let result = client.get("hello").await?; + + println!("got value from the server; result={:?}", result); + + Ok(()) +} +``` + +The code is a bit more interesting, but we're essentially doing the same thing - calling async functions and then awaiting to execute the result. This time we're using `?` for error handling - it works just like in synchronous Rust. + +For all the talk so far about concurrency, parallelism, and asynchrony, both these examples are 100% sequential. Just calling and awaiting async functions does not introduce any concurrency unless there are other tasks to schedule while the awaiting task is waiting. To prove this to ourselves, lets look at another simple (but contrived) example: + +```rust,edition2021 +{{#include ../examples/hello-world-sleep/src/main.rs}} +``` + +Between printing "hello" and "world", we put the current task to sleep[^async-sleep] for one second. Observe what happens when we run the program: it prints "hello", does nothing for one second, then prints "world". That is because executing a single task is purely sequential. If we had some concurrency, then that one second nap would be an excellent opportunity to get some other work done, like printing "world". We'll see how to do that in the next section. + +[^async-sleep]: Note that we're using an async sleep function here, if we were to use [`sleep`](https://doc.rust-lang.org/std/thread/fn.sleep.html) from std we'd put the whole thread to sleep. That wouldn't make any difference in this toy example but in a real program it would mean other tasks could not be scheduled on that thread during that time. That is very bad. + + +## Spawning tasks + +We've talked about async and await as a way to run code in an async task. And we've said that `await` can put the current task to sleep while it waits for IO or some other event. When that happens, another task can run, but how do those other tasks come about? Just like we use `std::thread::spawn` to spawn a new task, we can use [`tokio::spawn`](https://docs.rs/tokio/latest/tokio/task/fn.spawn.html) to spawn a new async task. Note that `spawn` is a function of Tokio, the runtime, not from Rust's standard library, because tasks are purely a runtime concept. + +Here's a tiny example of running an async function on a separate task by using `spawn`: + +```rust,edition2021 +{{#include ../examples/hello-world-spawn/src/main.rs}} +``` + +Similar to the last example, we have two functions printing "hello" and "world!". But this time we run them concurrently (and in parallel) rather than sequentially. If you run the program a few times you should see the strings printing in both orders - sometimes "hello" first, sometimes "world!" first. A classic concurrent race! + +Let's dive into what is happening here. There are three concepts in play: futures, tasks, and threads. The `spawn` function takes a future (which remember can be made up of many smaller futures) and runs it as a new Tokio task. Tasks are the the concept which the Tokio runtime schedules and manages (not individual futures). Tokio (in its default configuration) is a multi-threaded runtime which means that when we spawn a new task, that task may be run on a different OS thread from the task it was spawned from (it may be run on the same thread, or it may start on one thread and then be moved to another later on). + +So, when a future is spawned as a task it runs *concurrently* with the task it was spawned from and any other tasks. It may also run in parallel to those tasks if it is scheduled on a different thread. + +To summarise, when we write two statements following each other in Rust, they are executed sequentially (whether in async code or not). When we write `await`, that does not change the concurrency of sequential statements. E.g., `foo(); bar();` is strictly sequential - `foo` is called and afterwards, `bar` is called. That is true whether `foo` and `bar` are async functions or not. `foo().await; bar().await;` is also strictly sequential, `foo` is fully evaluated and then `bar` is fully evaluated. In both cases another thread might be interleaved with the sequential execution and in the second case, another async task might be interleaved at the await points, but the two statements are executed sequentially *with respect to each other* in both cases. + +If we use either `thread::spawn` or `tokio::spawn` we introduce concurrency and potentially parallelism, in the first case between threads and in the second between tasks. + +Later in the guide we'll see cases where we execute futures concurrently, but never in parallel. + + +### Joining tasks + +If we want to get the result of executing a spawned task, then the spawning task can wait for it to finish and use the result, this is called *joining* the tasks (analogous to [joining](https://doc.rust-lang.org/std/thread/struct.JoinHandle.html#method.join) threads, and the APIs for joining are similar). + +When a task is spawned, the spawn function returns a [`JoinHandle`](https://docs.rs/tokio/latest/tokio/task/struct.JoinHandle.html). If you just want the task to do it's own thing executing, the `JoinHandle` can be discarded (dropping the `JoinHandle` does not affect the spawned task). But if you want the spawning task to wait for the spawned task to complete and then use the result, you can `await` the `JoinHandle` to do so. + +For example, let's revisit our 'Hello, world!' example one more time: + + +```rust,edition2021 +{{#include ../examples/hello-world-join/src/main.rs}} +``` + +The code is similar to last time, but instead of just calling `spawn`, we save the returned `JoinHandle`s and later `await` them. Since we're waiting for those tasks to complete before we exit the `main` function, we no longer need the `sleep` in `main`. + +The two spawned tasks are still executing concurrently. If you run the program a few times you should see both orderings. However, the `await`ed join handles are a limit on the concurrency: the final exclamation mark ('!') will *always* be printed last (you could experiment with moving `println!("!");` relative to the `await`s. You'll probably need to change with the sleep times too to get observable effects). + +If we immediately `await`ed the `JoinHandle` of the first `spawn` rather than saved it and later `await`ed (i.e., written `spawn(say_hello()).await;`), then we'd have spawned another task to run the 'hello' future, but the spawning task would have waited for it to finish before doing anything else. In other words, there is no possible concurrency! You almost never want to do this (because why bother with the spawn? Just write the sequential code). + +### `JoinHandle` + +We'll quickly look at `JoinHandle` in a little more depth. The fact that we can `await` a `JoinHandle` is a clue that a `JoinHandle` is itself a future. `spawn` is not an `async` function, it's a regular function that returns a future (`JoinHandle`). It does some work (to schedule the task) before returning the future (unlike an async future), which is why we don't *need* to `await` `spawn`. Awaiting a `JoinHandle` waits for the spawned task to complete and then returns the result. In the above example, there was no result, we just waited for the task to complete. `JoinHandle` is a generic type and it's type parameter is the type returned by the spawned task. In the above example, the type would be `JoinHandle<()>`, a future that results in a `String` would produce a `JoinHandle` with type `JoinHandle`. + +`await`ing a `JoinHandle` returns a `Result` (which is why we used `let _ = ...` in the above example, it avoids a warning about an unused `Result`). If the spawned task completed successfully, then the task's result will be in the `Ok` variant. If the task panicked or was aborted (a form of cancellation, see [TODO](TODO)), then the result will be an `Err` containing a [`JoinError` docs](https://docs.rs/tokio/latest/tokio/task/struct.JoinError.html). If you are not using cancellation via `abort` in your project, then `unwrapping` the result of `JoinHandle.await` is a reasonable approach, since that is effectively propagating a panic from the spawned task to the spawning task.