¶ Two Bit-Twiddling Tricks
And now for something completely different: I'd like to share my two favorite bit-twiddling tricks with you. One I've used and abused for a while, and is taken from Sean Anderson's wonderful bestiary. The other, semi-related hack is something that only recently "clicked" for me, though I've been aware of its use for a while.
¶ For my first trick...
Modulo is a super useful operation in programming. It makes it easy to create ring buffers, wrap indexes through multi-dimensional arrays, and generally keep a value fenced between zero and an upper bound.
One issue — modulo can be a relatively expensive operation to perform in hot code. It's a floating point operation (at least in JS), which incurs some overhead. In cases where you control the size of the upper bound and the incoming data type is a positive small integer (<2^31), this trick might speed things up.
The trick is to exploit the base-two representation of the number in question. If your upper bound is a power of two, you can get away with a single bitwise AND operation against that upperbound minus one.
We'll illustrate this with 8 as our upper bound — we want
x % 8, so we're going to
x & 7.
┌───┬───┬───┬───┬───┬───┬───┬───┐ 8 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | └───┴───┴───┴───┴───┴───┴───┴───┘ We start with the upper bound, 8. Only a single bit is set. -1 1 -1 1 1 -1 1 1 1 ┌───┬───┬───┬───┐ ┌───┬───┬───┬───┐ ┌───┬───┬───┬───┐ ┌───┬───┬───┬───┐ ┌───┬───┬───┬───┐ | 1 | 0 | 0 | 0 |→| 1 | 0 | 0 | 0⃫ |→| 1 | 0 | 0⃫ | 0⃫ |→| 1 | 0⃫ | 0⃫ | 0⃫ |→| 0 | 1 | 1 | 1 | └───┴───┴───┴───┘ └───┴───┴───┴───┘ └───┴───┴───┴───┘ └───┴───┴───┴───┘ └───┴───┴───┴───┘ ─ 1 Subtracting one will set all bits to the right of the N-th power-of-two bit. Carrying works just like it does in base 10, it just happens more often in base two! ┌───┬───┬───┬───┬───┬───┬───┬───┐ 8 ─ 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | └───┴───┴───┴───┴───┴───┴───┴───┘ ┌───┬───┬───┬───┬───┬───┬───┬───┐ x | ? | ? | ? | ? | ? | ? | ? | ? | X comes in. All bits could be zero or one. We only care └─┬─┴─┬─┴─┬─┴─┬─┴─┬─┴─┬─┴─┬─┴─┬─┘ about the [0, 8) value range. | | | | | | | | ?&0 ?&0 ?&0 ?&0 ?&0 ?&1 ?&1 ?&1 ║ ║ ║ ║ ║ ║ ║ ║ ┌─╨─┬─╨─┬─╨─┬─╨─┬─╨─┬─╨─┬─╨─┬─╨─┐ 7 & x | 0 | 0 | 0 | 0 | 0 | ? | ? | ? | By bitwise-AND'ing, we mask off only the bits in the └───┴───┴───┴───┴───┴───┴───┴───┘ [0, 8) range. We proclaim victory and go eat a hoagie.
In JS, if you can be certain that the input types are SMI's (small integers), and your hot code path is performing a lot of modulo operations, you stand to see some performance gains!
I like this trick a lot: not only did it help speed up my inflate implementation, but I find that it's a helpful "gateway drug" to other bit twiddling tricks. For instance, now that we know that each column represents a power of two, it follows that shifting left is equivalent to multiplying your value by a power of two. From there, you can see that shifting right divides by a power of two — and so on, and so forth. Neat!
¶ For my next trick...
I've long been aware of this trick — it's incredibly common to see it used in VMs. For whatever reason though, I had a hard time visualizing how it worked until recently. Actually, the first trick illustrates very nicely how the second trick works.
Tagged pointers allow a language runtime to add metadata directly to pointers to objects. For example, a pointer could contain the type of the object it points to for quickly checking the validity of an operation before incurring the cost of a dereference. Alternatively, it could signal that the value of the pointer is a literal number, so that mathematical operations on "small" integer values doesn't incur a pointer dereference.
You may have intuited by now that tagged pointers are in use in all of the leading JS implementations. A single bit denotes whether a pointer refers to a target (be that a double precision float, an object, or another primitive value) or represents a SMI. This is why SMIs can only represent 2^31 different values.
The key to this trick is the knowledge that many memory allocators require that allocations be 8-byte aligned. This fact eluded me at first!
This means that there are three bits which will never be set by a valid pointer — pointers may only point to multiples of 8!
┌───┬───┬───┬───┬───┬───┬───┬───┐ valid pointer | ? | ? | ? | ? | ? | 0 | 0 | 0 | └───┴───┴───┴───┴───┴───┴───┴───┘ The least significant three bits will *never* be set on a valid pointer: they're free for us to use. ┌───┬───┬───┬───┬───┬───┬───┬───┐ SMI | ? | ? | ? | ? | ? | ? | ? | 1 | └───┴───┴───┴───┴───┴───┴───┴───┘ If the least significant bit is set, then all other bits represent an integer value. We need only shift right one to obtain the value. ┌───┬───┬───┬───┬───┬───┬───┬───┐ tagged pointer | ? | ? | ? | ? | ? | 1 | 1 | 0 | └───┴───┴───┴───┴───┴───┴───┴───┘ We can store extra metadata in the next two bits - the type of the referenced object, for example. We're limited to 2 bits, so only four values are available: the other bit is already used to indicate whether the value is a SMI or not! The original pointer value can be retrieved by AND-ing it with the complement of 7 (~7).