An increasing number of mainstream languages, from Python and JS to Rust and C++, are jumping on the party bandwagon that is async/await₁. Let's explore why I hope it's just a fad.
What is async/await?
It's a language feature that allows us to write methods that do things in parallel while looking very similar to regular, synchronous methods. It might even seem that, if you use it from the start of a sizeable project, your entire program just magically scales across all the CPU cores.₁
That sounds pretty cool. What's your problem with it?
Let's create a simple example program. First we'll write it naively, then we'll see how we can use async/await to make it run faster by doing some things in parallel. Once we understand how it works, we will discover a fundamental problem₂ with that approach.
Finally, we will explore an alternative approach, which I hope you will find more reasonable.
Example program
This program will take a password from standard input and rate its strength, writing a score from 0 to 5 to the standard output. It will also do a dummy action that could in principle be done in parallel.
Let's declare a few fantasy methods. Imagine they are provided by an external library and each takes a second to complete.
We will use these instead of the normal, sane methods you would usually use. They will be the obvious targets for parallelization.
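Something like this, say (the signatures are my guesses, and Thread.Sleep stands in for a second of slow library work):

```csharp
// Our fantasy library methods. Each takes about a second to complete.
static void SendOverInternet(string message) => Thread.Sleep(1000);

static void Print(string text)
{
    Thread.Sleep(1000);
    Console.WriteLine(text);
}

static int Length(string text)
{
    Thread.Sleep(1000);
    return text.Length;
}

static bool Any(string text, Func<char, bool> predicate)
{
    Thread.Sleep(1000);
    return text.Any(predicate);
}
```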
Here is the Main method, which corresponds to the initial description of the program:
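A sketch (the exact wording of the output message is a guess):

```csharp
static void Main()
{
    ReportToNSA();
    var password = GetPasswordFromUser();
    var score = RatePassword(password);
    Print($"Your password's score is {score}/5.");
}
```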
Let's break it down.
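Starting at the top, here is a plausible GetPasswordFromUser (the name is used later in the post; the body is my guess):

```csharp
static string GetPasswordFromUser()
{
    Print("Please enter your password.");
    return Console.ReadLine();
}
```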
This method uses one of our slow fantasy methods: Print.
Let's move on to our dummy action:
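Perhaps something like this (the message is invented):

```csharp
static void ReportToNSA()
{
    SendOverInternet("Totally not your password: ********");
}
```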
Nothing suspicious here 😉. Just a call to another one of our slow methods.
Here is RatePassword:
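A sketch; RateLength is a name I've made up for the first criterion, which isn't named elsewhere:

```csharp
static int RatePassword(string password)
{
    return RateLength(password) + RateVariety(password);
}
```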
Rating is done by two criteria. Here is the first one:
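Assuming a length-based score (the thresholds are invented and need not reproduce the 3/5 from the sample run):

```csharp
static int RateLength(string password)
{
    var length = Length(password);

    var score = 0;
    if (length >= 8) score++;
    if (length >= 16) score++;
    return score;
}
```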
Note the call to a fantasy method Length.
And the second one is a bit more complex:
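For instance, one point per character class (the particular checks are my invention):

```csharp
static int RateVariety(string password)
{
    var score = 0;
    if (Any(password, char.IsLower)) score++;
    if (Any(password, char.IsUpper)) score++;
    if (Any(password, char.IsDigit)) score++;
    return score;
}
```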
There are a bunch of calls to Any, which just beg to be run in parallel.
Can we run it?
< Please enter your password.
> C0olPa55word!
< Your password's score is 3/5.
Pretty simple.
What can we take away from this?
Most of the program is linear in nature:
Get input → Process input → Produce output
There are, however, three points where we can split the flow:
ReportToNSA can run independently from the rest of the code
RatePassword can do both evaluations in parallel
RateVariety can further split up the logic into one thread per check
Enter async/await
If you know how it works, you can just skim the code snippets in this section to see the changes to the program.
What is it again?
There is a special type called Task (sometimes called a Future or a Promise). It represents a computation run in another thread. We can run a Task, check if it is completed, and we can wait for its result (waiting for a completed task returns the result immediately). Take a look:
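A minimal example:

```csharp
// Start a computation on another thread.
var task = Task.Run(() => 6 * 7);

// We can ask whether it has completed yet.
Console.WriteLine(task.IsCompleted);

// And we can wait for its result, blocking until it is ready.
int answer = task.Result; // 42
```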
There is another way to get the result from a task: await.
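The same example using await:

```csharp
static async Task<int> GetAnswer()
{
    var task = Task.Run(() => 6 * 7);
    int answer = await task; // suspends instead of blocking
    return answer;
}
```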
Note however that in order to use await, we had to mark the method as async and change the return type from a plain int to a Task<int>! This means that our method becomes inherently asynchronous. Keep this in mind, we will discuss it later.
How does it fit in with our program?
There are two ways one introduces Tasks into a program. We'll focus on one now, while the other one will be better shown later.
Libraries often provide asynchronous methods, so let's imagine our fantasy methods now perform code in another thread and modify them to reflect that.
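A sketch of their asynchronous incarnations, each running its second of work on another thread:

```csharp
// The fantasy methods, now asynchronous.
static Task SendOverInternet(string message) =>
    Task.Run(() => Thread.Sleep(1000));

static Task Print(string text) =>
    Task.Run(() => { Thread.Sleep(1000); Console.WriteLine(text); });

static Task<int> Length(string text) =>
    Task.Run(() => { Thread.Sleep(1000); return text.Length; });

static Task<bool> Any(string text, Func<char, bool> predicate) =>
    Task.Run(() => { Thread.Sleep(1000); return text.Any(predicate); });
```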
Let's go over their usages and see how they affect our program.
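ReportToNSA first (same invented message as before):

```csharp
static async Task ReportToNSA()
{
    await SendOverInternet("Totally not your password: ********");
}
```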
Now, we don't strictly need to await here, since SendOverInternet does not have a meaningful return value, but there is a hidden piece of information we preserve by awaiting: the duration of the operation. If we don't await, whoever calls ReportToNSA cannot know when the operation completes.
Moving on:
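GetPasswordFromUser, now asynchronous (a sketch):

```csharp
static async Task<string> GetPasswordFromUser()
{
    await Print("Please enter your password.");
    return Console.ReadLine();
}
```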
Again, Print doesn't return much, but we await it to make sure we don't ask for input before the prompt prints.
Let's interject here and talk a bit about why only async methods can use await and why the return type becomes a Task, even though the return statement looks like it returns a plain string.
Async methods (loosely speaking) get rewritten at compile time: whenever await is reached, the rest of the method gets broken off into another function, which the awaited Task uses to continue after it completes. In other words:
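For example, an async GetPasswordFromUser that awaits Print and then reads a line would come out roughly as (footnote 4 notes this is only approximate):

```csharp
// Roughly what the compiler produces: the code after the await
// becomes a continuation of the awaited Task.
static Task<string> GetPasswordFromUser()
{
    return Print("Please enter your password.")
        .ContinueWith(finishedPrint => Console.ReadLine());
}
```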
I won't go into what this means right now, just keep in mind that there is no magic, we are just structuring our logic a certain way.
Moving on:
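The first scoring method (RateLength being my stand-in name, with invented thresholds):

```csharp
static async Task<int> RateLength(string password)
{
    var length = await Length(password); // a Task<int>, unpacked to a plain int

    var score = 0;
    if (length >= 8) score++;
    if (length >= 16) score++;
    return score;
}
```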
This is the first time we use the return value of an awaited method. The Task<int> is handily unpacked to a plain int (keen readers will notice that this is just syntax sugar).
Here comes our biggest method again:
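A sketch, with the same invented checks as before:

```csharp
static async Task<int> RateVariety(string password)
{
    // Kick off every check first; they all run in parallel.
    var hasLower = Any(password, char.IsLower);
    var hasUpper = Any(password, char.IsUpper);
    var hasDigit = Any(password, char.IsDigit);

    // Await each result only where it is needed.
    var score = 0;
    if (await hasLower) score++;
    if (await hasUpper) score++;
    if (await hasDigit) score++;
    return score;
}
```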
All the calls to Any are performed one after another, so they all run in parallel. We await each result later, only when they are needed. The most common pitfall is awaiting results as soon as possible:
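Like so:

```csharp
// Pitfall: awaiting each call as soon as it is made.
var hasLower = await Any(password, char.IsLower);
var hasUpper = await Any(password, char.IsUpper);
```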
This makes the second statement wait for the first call to Any to complete and makes the code sequential, not parallel.
You should await the result when you use it, not when you first get it.
Finally, this is how it all propagates:
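RatePassword and Main become asynchronous too (a sketch):

```csharp
static async Task<int> RatePassword(string password)
{
    var lengthScore = RateLength(password);
    var varietyScore = RateVariety(password);
    return await lengthScore + await varietyScore;
}

static async Task Main()
{
    ReportToNSA(); // not awaited
    var password = await GetPasswordFromUser();
    var score = await RatePassword(password);
    Print($"Your password's score is {score}/5."); // not awaited either
}
```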
Note that we call ReportToNSA and Print without awaiting because we don't care when they complete.
Peering through the cracks
Tasks are a great abstraction for multi-threading. They are very easy to run and to get results from. But Task doesn't inevitably lead to this programming style.
There are two important factors that cause this style of programming:
Individual (simple) methods run Tasks instead of just performing their code on the calling thread.
Tasks also compose functions, via the ContinueWith method. This is the feature (ab)used by async/await₃.
What does this mean?
Because Tasks appear on the lowest levels of our code, they have to bubble up through the entire codebase if we want to preserve the asynchronous intention of those methods. Async/await is infectious. We end up using it all over the place just to have these few things run in parallel.
Is this bad? It's easy to use, looks fine and gives us power.
Function composition is integral to programming. It's the most basic tool in our arsenal that allows us to tackle complex problems.
You could imagine a language that uses asynchronous functions exclusively. Imagine you don't have to write async or await, and every int and bool and whatnot is wrapped in a Task implicitly. All function calls execute in parallel and results are awaited only when used. This is the direction in which async/await pulls our code. But real code, just like processors, is primarily sequential.
If this fictional language is purely functional, then there is only one concern: performance. Most computation is sequential - some algorithms cannot be parallelized by definition. Others wouldn't benefit from parallelization in the real world because of how computers work.
If not, then the code has side effects; our functions are methods that modify fields. Now we have race conditions. Race conditions don't have to be a big problem, but they become impossible to debug if every method is asynchronous.
In any existing language, when we mix regular and async functions, there is the "gotcha" that an async function actually runs in the calling thread until it hits the first await! If a method returns a Task, there is no way of knowing at a glance if the entire method runs asynchronously or just some tiny part that happens ten calls deep.
Moving on, a synchronous function is more basic than an asynchronous one in the real world. Every synchronous method can be made asynchronous and vice versa, but every procedure starts as a synchronous one. It can become asynchronous if the programmer explicitly starts another thread (Task) or if he propagates an already existing Task by queuing up more code on its thread (await or ContinueWith). These are the aforementioned two ways of introducing Tasks to a program.
For this reason (and one more which we will see a bit later) most higher level functions, such as Select (more often called map), expect regular functions:
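For example (passwords being some hypothetical collection of strings, and RatePassword the synchronous version):

```csharp
// Select wants a plain Func<string, int>.
IEnumerable<int> scores = passwords.Select(RatePassword);
```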
If we have an async function instead, we first have to convert it to a synchronous one:
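Like so:

```csharp
// Blocking on .Result turns Func<string, Task<int>> back into Func<string, int>.
IEnumerable<int> scores = passwords.Select(password => RatePassword(password).Result);
```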
Which is silly, because every asynchronous function has synchronous code at its core.
Even in our imaginary, purely functional, asynchronous language, I'd argue that, conceptually, the "asynchronousness" is a trait of the function call, not the function itself.
When a programmer declares a method as async, he declares a change in the programming paradigm for the caller, to one that is not feasible in reality. Luckily, the caller doesn't have to comply and can just call .Result on the Task, but she must be smart enough to understand the ramifications of using await (which is often presented as a solution to this situation).₅
In fact, if you've ever awaited an async method immediately:
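That is, written something like:

```csharp
var score = await RatePassword(password);
// ...everything below depends on score anyway...
```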
You didn't really need an async method: the code that follows depends on its result, which makes the logic sequential in nature, so it would have made more sense to just call .Result.
Armed with this new understanding, let's have another go at introducing parallelism to our program.
The reasonable Task
Starting with the original version of the program, where all our fantasy methods are synchronous, let's make a few changes:
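Main is the only place that changes (a sketch; whether to wait for the report before exiting is a judgement call):

```csharp
static void Main()
{
    // The one piece of genuinely independent work runs as a Task.
    var report = Task.Run(() => ReportToNSA());

    var password = GetPasswordFromUser();
    var score = RatePassword(password);
    Print($"Your password's score is {score}/5.");

    report.Wait(); // don't exit before the report is sent
}
```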
Here we mix asynchronous and synchronous code with intention. We identify the procedure that can benefit from running asynchronously, instead of spamming Tasks everywhere. Even though GetPasswordFromUser and RatePassword do some stuff in parallel internally, we don't care about that here. As described earlier, this code is linear in nature.
Get input → Process input → Produce output
Next up:
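RatePassword, running its two evaluations in parallel (a sketch):

```csharp
static int RatePassword(string password)
{
    var lengthScore = Task.Run(() => RateLength(password));
    var varietyScore = Task.Run(() => RateVariety(password));

    // Both Tasks are needed for the result, so neither leaks out.
    return lengthScore.Result + varietyScore.Result;
}
```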
This looks pretty similar to the async/await version. Only the syntax is different, and crucially, we didn't have to change the return type. This method is not asynchronous. Only a part of its logic can be run in parallel and the final result requires both of the Tasks completing. There is no point in leaking a Task to the calling code, since that's just one of the implementation details of this method.
Functions that act like a black box - taking some parameters and computing some definitive result - always look synchronous from the outside. This is the other reason why Select expects a synchronous function.
And finally:
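RateVariety, with the same invented checks:

```csharp
static int RateVariety(string password)
{
    var checks = new[]
    {
        Task.Run(() => Any(password, char.IsLower)),
        Task.Run(() => Any(password, char.IsUpper)),
        Task.Run(() => Any(password, char.IsDigit)),
    };
    return checks.Count(check => check.Result);
}
```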
Again, pretty similar to the async/await one. We just run each Any as a Task explicitly. And again, since we need all of them to complete to get a meaningful result, we don't pretend this entire method is asynchronous.
That's it! Only minor changes were required.
One drawback of this approach is that all of these lambdas are uglier than awaits. It would be nice if we had a run or fork keyword instead₆.
[1] https://en.wikipedia.org/wiki/Async/await Yes, I'm citing Wikipedia, this is a blog.
[2] More of a fundamental trait, not necessarily a problem; but a problem sounds more ominous. The problem does stem from it though. What? These are footnotes now as well. Sue me.
[3] That's not exactly how async/await abuses Task, but I don't want to talk about how Task is a crippled monad.
[4] ContinueWith doesn't work this way exactly, but a simple extension method could be written. I'm trying really hard not to talk about monads.
[5] Why does async/await exist if it is always detrimental? This is just a guess, but I would attribute it to a lapse in judgement. A mistake. Of course, this entire blog post could be wrong and I'm just missing something. Please let me know.
[6] Of course, you could just create one method that returns a Task here where it makes sense:
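Something like this, say (the name is my invention):

```csharp
// A hypothetical wrapper, introduced only at the spot where parallelism pays off:
static Task<bool> AnyInParallel(string text, Func<char, bool> predicate) =>
    Task.Run(() => Any(text, predicate));
```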
P.S. There is another tradeoff: if all methods were async, then you wouldn't even have to think what to run asynchronously. Everything that could be is run in parallel. But then you'd have to think about when to sync those methods when they inevitably try using the same resource of some kind.
Then again, not all methods are async, so even that is a pipe dream.
P.P.S. Did you know the Optima font doesn't have superscript 1, 2 and 3, but has other numbers?
While the basic premise of your argument holds merit—that async can be infectious and therefore used to excess in a detrimental way—there is a problem with the way you make your case, or at least with your solution as demonstrated. I'll note that your argument appears to assume that async is fundamentally multi-threaded and CPU bound (very often, neither is the case), although I imagine you are only doing so for the sake of illustration.
That (a)sync should be a feature of the call site rather than the callee makes a lot of sense in a lot of cases, but not always. Consider the notion that "there is no thread". A task may jump from thread to thread as it…