PLAN update

Jun 27, 2020 10 min read

I’m coming up on a chunk of vacation, and I’d like to work on a technical project during that time. I have a couple of plates already spinning: Boltzmann documentation, for example. The itch(es) I’d like to scratch are:

Parsing in Rust, possibly using pest or nom.
Learning more about WASM and WASI, possibly using them as a language compile target.
Revisiting control flow graph generation of JS in Rust, using ratel, RESSA, or esprit.
Doing something graphical: some sort of map generation would be neat. There’s quite a learning curve for me here; I’d probably be picking up Amethyst, Rendy, or wgpu.

Each of these has a research component: what is the state of these ecosystems? How do the various pieces fit together? What does it look like to work with these in a borrow-checked language like Rust?

So, breaking down the projects a bit:

Write a tiny language

Admittedly, there’s not a huge technical need for the sort of language I’d write, nor any sort of gap in existing tooling I’m attempting to address! I’m interested in “walking the path”: what does it look like to write a language today? What role might WASM play in early prototyping? What gaps exist?

And indeed, writing a scripting language is on my personal programming bucket list. I’ve written tons of parsers in my day (and a single control-flow-graph generator leveraging partial evaluation), but I haven’t had the chance yet to write an executable language.

I plan to start small: iterate on a LISP-like language at first. Targeting WAT (the “WebAssembly Text format”) and emitting WASM may be a good start. That would let me target Node’s WASI (“WebAssembly System Interface”) API to get started, or perhaps wasmtime. Eventually this could lead to writing a self-hosted runtime using rusty_v8 and uvwasi. (That would be a fun learning experience, since it’d take me through using bindgen.)
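A minimal sketch of what "LISP-like source in, WAT out" could look like, assuming a language with only 32-bit integers and binary `+`, `-`, and `*`. The `Expr` type and the `tokenize`, `parse`, `emit`, and `compile` functions are all hypothetical names invented for illustration. A real attempt would more likely lean on pest or nom for the parsing half.

```rust
/// A tiny AST: integer literals and operator calls such as `(+ 1 (* 2 3))`.
#[derive(Debug, PartialEq)]
enum Expr {
    Int(i32),
    Call(String, Vec<Expr>),
}

/// Split source text into parentheses and atoms.
fn tokenize(src: &str) -> Vec<String> {
    src.replace('(', " ( ")
        .replace(')', " ) ")
        .split_whitespace()
        .map(str::to_string)
        .collect()
}

/// Recursive-descent parse of one expression starting at `pos`.
fn parse(tokens: &[String], pos: &mut usize) -> Result<Expr, String> {
    let tok = tokens.get(*pos).ok_or("unexpected end of input")?;
    *pos += 1;
    match tok.as_str() {
        "(" => {
            let op = tokens.get(*pos).ok_or("missing operator")?.clone();
            *pos += 1;
            let mut args = Vec::new();
            // Running off the end here makes the nested `parse` call fail.
            while tokens.get(*pos).map(String::as_str) != Some(")") {
                args.push(parse(tokens, pos)?);
            }
            *pos += 1; // consume ")"
            Ok(Expr::Call(op, args))
        }
        ")" => Err("unexpected )".to_string()),
        n => n
            .parse::<i32>()
            .map(Expr::Int)
            .map_err(|e| format!("bad integer {n:?}: {e}")),
    }
}

/// Emit stack-machine instructions in post-order: operands first, then the operator.
fn emit(expr: &Expr, out: &mut Vec<String>) -> Result<(), String> {
    match expr {
        Expr::Int(n) => out.push(format!("i32.const {n}")),
        Expr::Call(op, args) => {
            let instr = match op.as_str() {
                "+" => "i32.add",
                "-" => "i32.sub",
                "*" => "i32.mul",
                other => return Err(format!("unknown operator {other:?}")),
            };
            if args.len() != 2 {
                return Err(format!("{op} expects 2 arguments, got {}", args.len()));
            }
            for arg in args {
                emit(arg, out)?;
            }
            out.push(instr.to_string());
        }
    }
    Ok(())
}

/// Compile one expression into a WAT module exporting a `main` function.
fn compile(src: &str) -> Result<String, String> {
    let tokens = tokenize(src);
    let mut pos = 0;
    let expr = parse(&tokens, &mut pos)?;
    if pos != tokens.len() {
        return Err("trailing tokens after expression".to_string());
    }
    let mut body = Vec::new();
    emit(&expr, &mut body)?;
    Ok(format!(
        "(module\n  (func (export \"main\") (result i32)\n    {}))",
        body.join("\n    ")
    ))
}
```

Calling `compile("(+ 1 (* 2 3))")` yields a module whose function body pushes the three constants, then runs `i32.mul` and `i32.add`. Because WebAssembly is a stack machine, a post-order walk of the S-expression is all the code generation this toy needs.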

Control flow graph generation

Six years ago (!!) I experimented with building a control flow graph generator for JavaScript. My initial goals were to:

Build tooling to search a corpus of JavaScript for library use, in support of estimating the potential fallout of making changes to Node.js builtin APIs.
Visualize code diffs from a novel angle: instead of looking at the textual change, two graphs would be compared and the differences would be displayed visually. (It would be easy to catch that new code introduces an exception edge when looking at two directed graphs, less so in textual form.)

I had some success; over the course of the year I spent at Wal-Mart Labs I wrote ESControl, ESToc, and ESSim: a control flow graph generator, dependency tracker, and simulator. However, looking back, I wish I had brought a bit more rigor to my work: early on in the project it struck me that due to the nature of JavaScript, very nearly every node on the control flow graph would have to draw an exception edge. That many edges with a common destination would’ve completely obscured the rest of the graph when displayed visually, and yet the exception edges represented valuable control flow information.

I cut that Gordian knot by backing into partial evaluation: as I walked the graph I would keep a stack state machine that would allow me to make assertions about certain exception edges as they appeared. For example, if I had drawn an exception edge for “x is undefined” earlier in the graph, I could omit the edge on the next (straightline) lookup of x:

'use strict'
x += 1 // draw an exception edge: what if "x" is undefined?
x += 2 // if we reached this point we definitely know that "x" is defined!
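The edge-pruning idea can be sketched in Rust for the straight-line case. This is not how ESControl worked internally. It assumes a hypothetical `Read` record for each statement that looks up a variable, and it tracks "known defined" facts in a plain set rather than a stack machine.

```rust
use std::collections::HashSet;

/// Hypothetical record of a straight-line statement that reads a variable,
/// e.g. `x += 1` on line 1 reads `x`.
struct Read<'a> {
    line: usize,
    var: &'a str,
}

/// Return the lines that still need an "is undefined" exception edge.
/// Once a read of `var` has survived, later straight-line reads of the
/// same variable cannot throw for that reason, so their edges are omitted.
fn exception_edges(stmts: &[Read<'_>]) -> Vec<usize> {
    let mut known_defined: HashSet<&str> = HashSet::new();
    let mut edges = Vec::new();
    for stmt in stmts {
        // `insert` returns true only the first time a variable is seen.
        if known_defined.insert(stmt.var) {
            edges.push(stmt.line);
        }
    }
    edges
}
```

For the two-line example above, only line 1 would get an edge. Branches and joins are deliberately left out of this sketch: at a join point the facts from each incoming path would have to be intersected before any edge could be dropped.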

Approaching the problem this way was expedient, but tipped the scale on other load-bearing decisions: because it had the stack machine information available to it, ESControl aggressively inlined functions. Eventually it reached the point where, for some small programs, I could generate ESControl IR and run the program step-by-step with the same output.

I’m interested in revisiting this in Rust for a couple of reasons:

I never got to the point of visualizing code diffs as control flow graph changes, and I’m still intrigued by that.
The memory usage characteristics of the JavaScript ESControl were not ideal; Rust would force more rigor in memory use.
I’m interested in taking door number two, so to speak: separating the control flow graph generation from a simplification step that would run the partial evaluation necessary to erase edges from the graph.

This would require familiarizing myself with the state of JS parsing in Rust. I would not be mad about that.

Map Generation

This one is a bit out there. At my $dayjob we’ve been tasked with rethinking how delivery fulfillment works; I’ve become enthusiastic about the idea of modeling this as actor-owned task queues. Put another way, the data the system operates on would work much like how real time strategy games model commands issued to units. From there we’d layer on systems responsible for generating and assigning efficient task queues. (This might not seem like much of a brain-wave, but to be honest the way the system works right now was clouding my framing of the problem.)
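A rough sketch of the "actor-owned task queue" shape, in the spirit of an RTS unit's command queue. Everything here is hypothetical: the `Task` variants, the `Actor` struct, and especially `assign`, whose "least-busy actor" policy is a stand-in for whatever real assignment system would be layered on top.

```rust
use std::collections::VecDeque;

/// Hypothetical commands an actor (say, a courier) can be issued.
#[derive(Debug, Clone, PartialEq)]
enum Task {
    MoveTo { x: i32, y: i32 },
    PickUp { order_id: u32 },
    DropOff { order_id: u32 },
}

/// Each actor owns its own queue; nothing else mutates it directly.
struct Actor {
    name: String,
    queue: VecDeque<Task>,
}

impl Actor {
    fn new(name: &str) -> Self {
        Actor { name: name.to_string(), queue: VecDeque::new() }
    }

    /// Pop the next command, the way a unit would on each simulation tick.
    fn step(&mut self) -> Option<Task> {
        self.queue.pop_front()
    }
}

/// Placeholder assignment policy: hand the task to the actor with the
/// shortest queue (the first such actor on ties) and return that actor's name.
fn assign(actors: &mut [Actor], task: Task) -> Option<&str> {
    let actor = actors.iter_mut().min_by_key(|a| a.queue.len())?;
    actor.queue.push_back(task);
    Some(actor.name.as_str())
}
```

Keeping the queue inside the actor means the assignment layer can be swapped out (for a metaheuristic, for instance) without touching how actors consume their commands.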

Of course, thinking about real-time strategy games got me thinking about game programming again; at the same time, I’ve been missing some of my more visual experiments. It’d be fun to build a terrain or city generator, then model unit task queues on top of that! Kat’s recent work in this space is inspiring, also. (The presentation Kat linked to earwormed me.)

Currently reading / open in tabs:

Yoshua Wuyts’ writeup on compilers: Yosh is super thoughtful and thorough; I’m excited to read more of his writing. (Compliments aside, you can imagine how a writeup on compilers might dovetail with my current interests!)

“Why we love Rust”: a nice grab-bag of tips & tricks learned while delivering an impressive low-latency video & audio streaming solution. I’m going to try integrating some of the tools they suggest, like cargo-flamegraph.

Essentials of Metaheuristics: linked by a co-worker, Eli. As you can imagine, optimizing delivery involves a lot of (NP-hard) problems, for which heuristic approaches are a good fit. My experience with this is a bit distant, so I am refreshing my memory by re-reading the simulated annealing article on Wikipedia. (This also dovetails with my desire to build a CFG diffing tool, which abuts the longest common subsequence problem: NP-hard for an arbitrary number of sequences, though a dynamic programming approach solves the two-sequence case in polynomial time. This is all to say: it’s up my alley, and look, there’s a Rust library for it!) I’m also taking a peek at OptaPlanner on his recommendation.
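For reference, the two-sequence case mentioned above is the classic dynamic programming table. The sketch below is a generic textbook version, not the Rust library alluded to. `lcs_len` is a name chosen for illustration, and how a CFG would be flattened into comparable sequences is left open.

```rust
/// Length of the longest common subsequence of `a` and `b`, computed with
/// the standard O(a.len() * b.len()) dynamic programming table.
fn lcs_len<T: PartialEq>(a: &[T], b: &[T]) -> usize {
    // table[i][j] holds the LCS length of a[..i] and b[..j].
    let mut table = vec![vec![0usize; b.len() + 1]; a.len() + 1];
    for i in 1..=a.len() {
        for j in 1..=b.len() {
            table[i][j] = if a[i - 1] == b[j - 1] {
                // Matching elements extend the best answer for both prefixes.
                table[i - 1][j - 1] + 1
            } else {
                // Otherwise drop one element from either side and keep the best.
                table[i - 1][j].max(table[i][j - 1])
            };
        }
    }
    table[a.len()][b.len()]
}
```

The function is generic over any `PartialEq` element, so the same table would work over bytes, tokens, or (under the assumption that a graph has been linearized somehow) node labels.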

Grain Language: a new language! Linked by Blaine. I like Blaine’s taste; if he’s enthusiastic about a language it’s probably worth checking out.

StaffEng.com: A great overview of what work in a staff engineering position looks like. I feel some relief at seeing the enumeration of the types of roles staff engineers play at companies. I am somewhat allergic to working at large companies with lots of process, so I’m not a great fit for teams that use staff engineers to enforce process, whether that’s serving as architecture review boards or tech leads responsible for assigning work. However, I have it on other folks’ authority that I’m a good teacher, and I know how to tactically deploy my technical skill in service of my larger strategies. I’m in my feelings about this one still – that last year at NPM really did a number on me, not least of all because the goalposts of what staff engineering meant changed all at once. It’s probably worth a more complete blog post in the future.

Other thoughts which may yet become blog posts:

I’ve been ruminating a bit on my career progression lately. One of the aspects of staff engineering I’m interested in is how we process negativity, or grumpiness. It seems to me that as a profession we tend to reward negativity, so I (& others) displayed more of it when mid-career. I’ve walked a lot of that back over the last four years, but false positivity isn’t tenable in the long run, either. I’ve been stewing on Steve Klabnik’s keen observation:

As a programmer, I think it behooves us to think about not just the values that we hold, but the values of the people that use our software hold and, as a programmer, you should use tools that align with your values. I really like programming languages and learning new ones, but there are some that I have seen where I’m like, “You know what? This language is not for me, so I’m just not going to use it.” I’m not going to denigrate any languages by naming them, but it’s true that I would be unhappy if I had to program in some languages and that’s because they value different things than I value and that’s totally chill.

Ceej relayed the following from her husband, David:

Anger is a useful signal: it means your values are being violated in some way!

Speaking of grumpiness because of violated values, I’d like to see a positive workflow for ES Modules framed, if possible. Right now the story is all veggies and no dessert.

Finally on this note: I’ve been thinking about React SSR quite a bit lately. The thought bouncing around in my head is: React’s origin as a tool for managing a live DOM makes it ill-suited for use as a server-side template language, which is how most React SSR solutions position it. It’s another thing I find myself grumping about, and it might be worth digging into in the future.