¶ Classifying Asynchrony
For a while, I've been looking for a good way to explain the relationship between promises, streams, callbacks, and event emitters. Handily, Domenic Denicola recently illustrated the problem well on IRC. To paraphrase him here: there are two axes, plurality and synchrony. Plurality means "how many times will the operation be executed." Synchrony is divided between "synchronous" and "asynchronous." Crossing these two axes yields four categories, each of which represents a class of operations.
Any useful evented platform must present consistent patterns addressing each of these classes of operation.
                N=1              N≠1
                 │                │
        ┌─────────────────┬──────────────┬─> plurality
        │                 │              │
sync  - │   Expressive    │  Sequential  │
        │                 │              │
        ├─────────────────┼──────────────┤
        │                 │              │
async - │  Transactional  │  Diapasonal  │
        │                 │              │
        ├─────────────────┴──────────────┘
        │
        ˅
    asynchrony
¶ Expressions & Sequences
Patterns for addressing expressive and sequential operations are built into the language.
Expressive operations happen once, synchronously, and succeed or fail. Functions are JS's "expressive" pattern. Failures are represented as exceptions; success is represented as a returned value.
Sequential operations represent a single operation happening repeatedly. Failure may happen only once, and halts the repetition. Loops, iterables, and recursion are JS's "sequential" patterns.
The most salient aspect of sequential and expressive operations is their behavior when nesting other operations. Expressive operations may nest sequential operations, but cannot expose their inner workings. Loops and other sequential patterns, by contrast, may be interleaved. Expressives only yield a single value, while sequentials may yield multiple values over the course of their execution. It bears mentioning at this point that neither expressives nor sequentials are able to nest asynchronous operations.
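As a rough illustration of the two synchronous classes, the sketch below pairs an expressive operation (a function that returns once or throws) with a sequential one (a generator that yields repeatedly until it finishes or fails). The names parsePort, parsePorts, and parseAll are made up for this example.

```javascript
// Expressive: runs once, synchronously, and either returns or throws.
function parsePort(text) {
  var port = Number(text);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new RangeError('invalid port: ' + text);
  }
  return port;
}

// Sequential: the same operation repeated, yielding one value per item.
// The first failure propagates to the caller and halts the sequence.
function* parsePorts(texts) {
  for (var i = 0; i < texts.length; i++) {
    yield parsePort(texts[i]);
  }
}

// An expression nesting a sequence: callers see a single result (or a
// single exception), not the individual steps.
function parseAll(texts) {
  return Array.from(parsePorts(texts));
}
```

Calling parseAll hides the iteration entirely, while a caller that consumes parsePorts directly can interleave its own work between values.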
These are familiar concepts to most programmers. I retread them because they have asynchronous counterparts that are not language level, and those counterparts share many of the same properties.
¶ Transactions
I am borrowing this term from the world of databases, though it's an inexact fit. Transactional operations complete at most once. On completion, they may signal error or success, and success may have associated values. Once a transactional operation is queued, there's no guarantee that it can be cancelled.
Transactions may nest other transactions as well as diapasons, sequences, and expressions. Akin to the nesting behavior of expressions, operations nested within a transaction become subject to the inability to signal any more than a single "pass or fail" result event.
Transactions encompass two distinct points in time: the point at which the operation is enqueued, and the point at which the operation completes.
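A minimal sketch of those two points in time, assuming an error-first callback as the completion signal. The helpers once and delayedDouble are hypothetical names invented for this illustration; the guard in once is one way to enforce the "at most once" rule.

```javascript
// Wrap an error-first callback so that it can fire at most once.
function once(callback) {
  var called = false;
  return function(err, value) {
    if (called) return;
    called = true;
    callback(err, value);
  };
}

function delayedDouble(n, callback) {
  var done = once(callback);

  // Point one: the operation is enqueued here and control returns to
  // the caller immediately.
  setTimeout(function() {
    // Point two: the operation completes with a single error or value.
    if (typeof n !== 'number') {
      return done(new TypeError('expected a number'));
    }
    done(null, n * 2);
  }, 10);
}

delayedDouble(21, function(err, result) {
  if (err) throw err;
  console.log('completed with', result);
});
```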
¶ Diapasons
I use this term in the sense of "the entire compass, range, or scope of a thing." A diapasonal operation is one that may execute zero or more times, yielding zero or more values, where no assumptions can be made about when the next execution will occur. Diapasonals may signal an execution, the end of execution, or an error to the client program. The end of execution and error signals may happen only once.
Diapasonals may interleave or nest other diapasonals, transactions, sequences, and expressions. Patterns used to address transactionals may be used to construct patterns to address diapasonals, but they are insufficient to address diapasonals by themselves.
Diapasons encompass at least two distinct points in time: the point at which the operation is enqueued, the point at which it completes, and all executions between those two points.
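To make the three signals concrete, here is a small sketch that is not tied to any Node pattern. The function countdown and the handler names value, end, and error are assumptions chosen for this example: value may fire zero or more times, while end or error fires at most once.

```javascript
function countdown(from, handlers) {
  var remaining = from;

  function step() {
    if (!Number.isInteger(remaining) || remaining < 0) {
      // Error signal: reported once, and nothing further is scheduled.
      return handlers.error(new RangeError('cannot count down from ' + from));
    }
    if (remaining === 0) {
      // End-of-execution signal: also reported once.
      return handlers.end();
    }
    // Execution signal: may happen any number of times, on later ticks.
    handlers.value(remaining);
    remaining -= 1;
    setImmediate(step);
  }

  // The operation is enqueued here; every signal arrives asynchronously.
  setImmediate(step);
}

countdown(3, {
  value: function(n) { console.log('value', n); },
  end: function() { console.log('end'); },
  error: function(err) { console.error('error', err.message); }
});
```

Counting down from 0 yields no values before end, which is the "zero or more" case.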
To recap:
- Expressions are operations that happen once and pass or fail.
- Sequentials are operations that happen zero or more times and pass or fail.
- Sequentials and expressions are synchronous. They may nest each other, but not asynchronous operations.
- Transactionals and expressions may only return a single error or value.
- Sequentials and diapasonals may return multiple values or a single error.
¶ How Does Node Handle Transactions and Diapasons?
Every time you see something like the following:
var fs = require('fs');
fs.readFile('some/path.txt', 'utf8', function(err, data) {
  if (err) throw err;
  // do something with "data"
});
You are looking at a transactional operation. Node presents transactions as APIs that follow the error-first callback pattern. fs.readFile is an excellent example of a transaction that nests other operations. All of the following transactions, expressions, and diapasons must complete in order for readFile to return successfully, and the failure of any one of them will end the transaction.
- The fs.open transaction must return an fd.
- The fs.fstat transaction must return an fs.Stats object.
- The size check expression must determine that Node can handle buffering the entire file.
- The read diapason must build up a list of buffers representing the file.
- Once the read diapason has completed, the fs.close transaction must successfully close the fd.
These contained operations are not visible to calling code: the outside world is oblivious to whether the open transaction has occurred yet or not. It may only rely upon the final callback to determine what operations actually occurred.
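The nesting described above can be sketched with the public fs calls. This is a simplified illustration under stated assumptions, not Node's actual readFile implementation: readWholeFile is a hypothetical name, the 64 KiB chunk size is arbitrary, and the size check against the maximum Buffer length is only one plausible form of that step.

```javascript
var fs = require('fs');
var MAX_LENGTH = require('buffer').constants.MAX_LENGTH;

function readWholeFile(path, callback) {
  fs.open(path, 'r', function(err, fd) {          // transaction: open
    if (err) return callback(err);

    // Close the fd, then report the single pass-or-fail result.
    function finish(err, data) {
      fs.close(fd, function(closeErr) {           // transaction: close
        var failure = err || closeErr;
        if (failure) return callback(failure);
        callback(null, data);
      });
    }

    fs.fstat(fd, function(err, stats) {           // transaction: fstat
      if (err) return finish(err);

      // expression: can the whole file be buffered?
      if (stats.size > MAX_LENGTH) {
        return finish(new RangeError('file too large to buffer'));
      }

      var chunks = [];
      var chunk = Buffer.alloc(64 * 1024);

      // diapason: reader and callback ping-pong until no data is left
      function readNext() {
        fs.read(fd, chunk, 0, chunk.length, null, function(err, bytesRead) {
          if (err) return finish(err);
          if (bytesRead === 0) return finish(null, Buffer.concat(chunks));
          // Copy the bytes out, because chunk is reused on the next read.
          chunks.push(Buffer.from(chunk.subarray(0, bytesRead)));
          readNext();
        });
      }
      readNext();
    });
  });
}
```

A caller of readWholeFile only ever sees one callback invocation, whichever nested step failed or however many reads it took.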
This illustrates one very important point: you can substitute error-first callbacks with promises. Promises are just a different pattern for handling the same class of problems. They come with their own assumptions, tradeoffs, and implementation details, but fundamentally, they both solve for the "transactional" problem.
var fs = require('fs').promises; // promise-returning variant of the fs API
fs.readFile('some/path.txt', 'utf8').then(function(data) {
  // do something with data
}, function(error) {
  // do something with error
});
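One way to see that the two patterns address the same class of problem is that an error-first API can be mechanically adapted into a promise-returning one. The helper below, toPromise, is a hypothetical minimal version written for this post; Node's built-in util.promisify does the same job more thoroughly.

```javascript
var fs = require('fs');

// Adapt a function that takes an error-first callback as its last
// argument into a function that returns a promise instead.
function toPromise(fn) {
  return function() {
    var self = this;
    var args = Array.prototype.slice.call(arguments);
    return new Promise(function(resolve, reject) {
      fn.apply(self, args.concat(function(err, value) {
        if (err) reject(err);
        else resolve(value);
      }));
    });
  };
}

// The same transaction, now signalling through a promise.
var readFile = toPromise(fs.readFile);
```

Either form still produces exactly one error or one value per operation.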
Just as importantly, Promises (and error-first callbacks) are not solutions to the diapason class of operations. Diapason solutions can be built on top of them, but by themselves they are not sufficient.
The read diapason mentioned above is a good representation of this. It's internally implemented as three functions: one reader, one error-first callback, and one finalizer. The reader and the callback ping-pong between themselves, storing state in a closure until no more data is available, at which point the finalizer is called. Ignoring the pattern used for transactions, this is probably the simplest way to represent a diapason, though it is not particularly composable or useful unless nested inside a transaction.
Node has two patterns that attempt to address diapasonal operations: EventEmitter and Stream. EventEmitters provide the ability to attach code to certain "topics" and non-locally execute those listeners. Individual topics may be considered diapasonal communication mechanisms: that is, a topic may be emitted multiple times over the course of time.
var http = require('http');
http.createServer().on('request', function(req, resp) {
  // runs once for every incoming request
});
Since there can be zero or more "request" events, the above example represents a diapason. This pattern would be hard to represent with an error-first callback or a Promise alone. At the very least, there has to be a mechanism for ping-ponging control between the instigator of a transaction and the completion of that transaction.
However, there is no way to indicate the termination of a topic to listeners, and errors sit as a topic all their own. Thus, EventEmitters are an insufficient mechanism for addressing diapasonal operations on their own.
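The gap is easy to demonstrate with a bare EventEmitter. In this sketch the topic names tick and done are assumptions made up for the example; only the special handling of the error topic comes from Node itself.

```javascript
var EventEmitter = require('events').EventEmitter;

var ticker = new EventEmitter();
var ticks = [];
var finished = false;

ticker.on('tick', function(n) { ticks.push(n); });

// 'done' is only a convention: EventEmitter does not treat it specially,
// so emitter and listener must agree on it out of band.
ticker.on('done', function() { finished = true; });

ticker.emit('tick', 1);
ticker.emit('tick', 2);
ticker.emit('done');

// Nothing stops the topic from firing again after the agreed "end".
ticker.emit('tick', 3);

// Errors travel on their own topic. With no 'error' listener attached,
// emitting one throws rather than reaching the 'tick' listeners.
var thrown = null;
try {
  ticker.emit('error', new Error('ticker failed'));
} catch (err) {
  thrown = err;
}

console.log(ticks, finished, thrown && thrown.message);
```

The tick listener has no way to learn from the topic itself that the emitter has ended or failed.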
¶ Streams
Streams are built on top of EventEmitters, and attempt to structure their use a bit further. They do this by implementing a state machine on top of a subset of available topics.
In Node, Streams have a connotation of strictly having to do with I/O, an association that was reinforced with the introduction of streams2. This association does Streams a disservice. EventEmitters, like error-first callbacks, do not attempt to address composability. Streams are the natural outcome of attempting to create a composable solution to the diapason problem. While in Node they're built on top of EventEmitters, this is largely for historical reasons. Modern greenfield implementations of streams, like the WHATWG streams spec or min-streams, strictly eschew building on top of the EventEmitter pattern.
Streams, as a general pattern outside of Node, are able to process anything -- not just binary data. Backpressure is optional and can be controlled by client code. For example: Node servers can be represented as a series of connections, which are themselves streams.
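var concat = require('concat-stream');
var http = require('http');
var stream = require('stream');
var server = http.createServer();
// Node does not expose connections as a stream, so adapt the "request"
// topic into an object-mode stream of {request, response} pairs.
var incomingConnections = new stream.PassThrough({objectMode: true});
server.on('request', function(request, response) {
  incomingConnections.write({request: request, response: response});
});
var responder = new stream.Writable({
  objectMode: true,
  write: function(connection, encoding, ready) {
    // do not apply backpressure to new connections
    ready();
    connection.request.pipe(concat(function(data) {
      connection.response.end('OK');
    }));
  }
});
// "listen" is a transactional operation
server.listen(function() {
  incomingConnections.pipe(responder);
});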
This might seem like an overapplication of the pattern, but it illustrates that anything that can be represented as an EventEmitter topic can be represented as a Stream; and that Promises and error-first callbacks cannot (by themselves) address the problem of diapasons.
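For a version with no network involved, the following sketch pushes plain objects through Node's built-in streams and lets the consumer set the pace. The names jobs, worker, and handled are invented for the example; Readable.from, Writable, and pipeline are standard parts of the stream module in current Node releases.

```javascript
var stream = require('stream');

var handled = [];

// Readable.from() turns any iterable into an object-mode readable stream.
var jobs = stream.Readable.from([{id: 1}, {id: 2}, {id: 3}]);

var worker = new stream.Writable({
  objectMode: true,
  highWaterMark: 1, // buffer at most one pending object
  write: function(job, encoding, ready) {
    handled.push(job.id);
    // Delaying ready() is how this consumer applies backpressure:
    // the source is not asked for more until the job is acknowledged.
    setImmediate(ready);
  }
});

// pipeline() reports the single end-or-error signal of the whole chain.
stream.pipeline(jobs, worker, function(err) {
  if (err) throw err;
  console.log('handled jobs:', handled);
});
```

Here the values, the end of the sequence, and any error all flow through one composable object, which is what the bare EventEmitter topic lacked.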
¶ Going Forward
Node's core pattern for dealing with transactional operations is the error-first callback. Switching a pattern so core to Node's operation represents a large amount of engineering effort. It may be done in the future, and I'm personally excited to see it happen, but it won't happen immediately, and there will need to be a long transitional phase. Replacing callbacks with promises will not replace the EventEmitter and Stream patterns for diapasonal operations.
On the other side of this, moving to a more purely callback-based API for streams does not mean that event emitters will start taking error-first callbacks, nor does it imply that existing methods would grow new callback parameters.
In either case, evolving the patterns Node has adopted to solve these operations will take time, and an understanding of how they interact with one another is key. In my next post, I will walk through the machinery of streams, explaining how Node streams currently work, and how WHATWG streams propose to improve upon them. Over the course of the post, we will construct our own stream implementation. Stay tuned!