Chunking and Naming


In our continuing conversation about refactoring, I want to go a little abstract today, and talk about chunking and naming. Naturally, a topic this important has already been addressed by a stupid joke in the movie Airplane!, so we’ll start there.

A passenger is approached by a flight attendant, who asks if he might be able to help with a problem in the cockpit. He says, "The cockpit! What is it?" She says, "It’s the little room at the front of the plane where the pilots sit, but that’s not important right now."

The number of things one can hold in one’s head at one time — mental bandwidth — is quite small. If the things are entirely unrelated to one another, it’s on the order of 4-6 things.

(When the things are related, we can do better, though still a surprisingly small number.)

By contrast, the number of things one can need to hold in one’s head at one time — context bandwidth — is ridiculously large. If the context is "reality", it’s effectively infinite. But far smaller contexts than reality, everyday ones, still dwarf mental bandwidth.

As geeks, we operate in contexts that are enormous: about the smallest real-world shipping app you’re likely to encounter, say 5-10 KLOC, still contains at least hundreds of things we might need to hold in our heads.

So the context is huge, and our mental bandwidth is itty-bitty. How, then, do we ever pull this thinking business off at all? A key part of the answer is in what we call "chunking", and that dumb joke in Airplane! actually points right at it.

Here’s an experiment. We’ll get a book of old chess games, and pick positions out of it at random. We’ll show it on the screen for a few seconds, then ask our victims to re-create what they saw on a board right in front of them. We’ll score them on this.

If we take two kinds of victim, "random bozo I picked up at a sleazy bar I used to go to in south Philly", and "grandmasters", how do you think those two groups will score, compared to one another?

As you might imagine, the grandmasters dramatically outperform the random bozos in this experiment. (My god, is that, what is that on that guy’s shirt? Order another case of Lysol for the lab.)

Let’s tweak the experiment. Instead of picking random pages from our chess game collection, let’s make positions by randomly plopping pieces down on the board, and try again. Same question: how do you think the two groups will compare at our new task?

You’re likely assuming that the grandmasters will be better at this new experiment than the bozos. And you’d be right. But here’s a fascinating thing: in round 1, they did dramatically better, many times better. In round 2, they only do a little better.

a) How cool is that? and b) what’s going on here?

We can get a key insight by looking not so much at the scores of the two groups, but at the kind of mistakes they made when they didn’t re-create the positions successfully.

A mistake, in this experiment, involves pieces in the wrong place. And here’s the thing: random bozo mistakes were nearly always about individual pieces in the wrong place. Grandmaster mistakes were nearly always about whole subsets of pieces in the wrong place.

For instance, a "knight fork" in chess is a knight of one color that is one move away from capturing two other pieces. Next move, it could capture either A or B, hence the fork.

A random bozo’s mistaken reconstruction might get the knight, the A, or the B in the wrong place, off by one square. A grandmaster’s mistaken reconstruction, on the other hand, nearly always got all three pieces in the wrong place, off by one square.

Why? Because a grandmaster doesn’t experience the three individual pieces as individual pieces. She sees them as a single thing, the thing called a knight fork, not separate or individual, but a sub-assembly, a mental "chunk".
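
Here’s a tiny Python sketch of that idea. It’s strictly a toy model of my own, not anything taken from the experiments: squares are (file, rank) pairs, and `Chunk`, `shifted`, and `is_knight_fork` are made-up names. Slide the whole chunk one square over and every piece is in the wrong place, yet it’s still a knight fork. Let one piece drift on its own and the fork falls apart.

```python
from dataclasses import dataclass

# Toy model, not a chess engine: a square is a (file, rank) pair, each 0-7.
KNIGHT_JUMPS = [(1, 2), (2, 1), (2, -1), (1, -2),
                (-1, -2), (-2, -1), (-2, 1), (-1, 2)]


@dataclass
class Chunk:
    """A named group of pieces that gets remembered as one thing."""
    name: str
    pieces: dict  # piece label -> (file, rank)

    def shifted(self, d_file, d_rank):
        """Return the same chunk with every piece moved by the same offset."""
        moved = {label: (f + d_file, r + d_rank)
                 for label, (f, r) in self.pieces.items()}
        return Chunk(self.name, moved)


def is_knight_fork(chunk):
    """True if the knight is one jump away from both A and B."""
    kf, kr = chunk.pieces["knight"]
    reachable = {(kf + df, kr + dr) for df, dr in KNIGHT_JUMPS}
    return chunk.pieces["A"] in reachable and chunk.pieces["B"] in reachable


fork = Chunk("knight fork", {"knight": (4, 4), "A": (5, 6), "B": (6, 5)})

# A grandmaster-style mistake: the whole chunk slides one file over.
misremembered = fork.shifted(1, 0)

# A bozo-style mistake: one piece drifts on its own.
scrambled = Chunk("?", {**fork.pieces, "A": (5, 5)})
```

The grandmaster’s error moves one thing. The bozo’s error moves one of three things.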

(Let’s set the experiment aside for now. Suffice it to say that this isn’t new, exciting fringe science: these experiments are from the middle ’50s of the last century, and there are lots and lots of them.)

When we chunk, we take a bunch of parts, give them some sort of mental tag or label, and from that point forward, we are largely blind to the parts. We see them as a chunk and we manipulate them as a chunk.

Every chunk, notice, is not only a single element in one context, but it’s also its own little context. Humans think about big contexts by using chunks as a kind of partitioning & layering technique, to shoehorn very large contexts into collections of very small ones.

And that is a key part of how bears of very little brain, like us, tackle contexts of very great size, like programs. We do it by pushing and popping chunks into being "the current context" or "some other thing we’re, for the moment, blind to".

That blindness to the other context is actually captured pretty well in that silly joke. "It’s the little room at the front of the plane where the pilots sit, BUT THAT’S NOT IMPORTANT RIGHT NOW."

In the design of our code, we use words like "well-factored", and principles like SOLID, and concepts like coupling and dependency, fan-in and fan-out. And all of this is really about describing the "chunkness" of our codebases.

We’re talking about whether our code is well-chunked or poorly-chunked. Remember that grandmasters do dramatically better with positions from real games than they do with random positions? It’s cuz real games are well-chunked, and random positions are ill-chunked.

When we say "it’s spaghetti code" or "don’t touch that, you don’t know what it’s connected to", what we’re really saying is "it’s not well-chunked". We’re saying it isn’t organized in such a fashion that we can make some chunk our current context and be blind to everything else.

"The database! What is it?" "It’s the room where we store stuff between runs, but that’s not important right now."

"The observable property! What is it?" "It’s how we let the UI see the effect of our change, but that’s not important right now."

"The isBillableActivity! What is it?" "It’s the method that decides that the activity, the client, and regulations are legit, but that’s not important right now."
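
To make that last one concrete, here’s a hedged Python sketch. The only thing taken from above is the name isBillableActivity, spelled `is_billable_activity` here to follow Python convention. Everything else (the `Activity` and `Client` types, the three helper checks, and the rules inside them) is invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Activity:
    kind: str
    hours: float


@dataclass
class Client:
    name: str
    active: bool


# Hypothetical rules, made up for this example.
NON_BILLABLE_KINDS = {"training", "lunch"}


def activity_is_legit(activity):
    return activity.hours > 0


def client_is_legit(client):
    return client.active


def regulations_allow(activity):
    return activity.kind not in NON_BILLABLE_KINDS


def is_billable_activity(activity, client):
    """The chunk: one name that hides three separate decisions."""
    return (activity_is_legit(activity)
            and client_is_legit(client)
            and regulations_allow(activity))


def invoice_lines(activities, client):
    # Here we only care THAT an activity is billable, not why.
    return [f"{a.kind}: {a.hours}h" for a in activities
            if is_billable_activity(a, client)]
```

While I’m working inside `invoice_lines`, how billability gets decided is not important right now. When I pop into `is_billable_activity`, invoices are the thing I get to be blind to.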

The quality of my chunks dramatically influences the speed with which I can ship more value, because it lets me work in contexts that are far larger than my mental bandwidth.

I do the broad constellation of technical practices we usually shorthand to "TDD & Refactoring" primarily because they lead me to well-chunked stuff, and well-chunked stuff leads me to ship more value faster.

That’s it for now. Think about the chunks you’re working with today. Think about the extent to which they let you and your team say "but that’s not important right now" as you operate within them.

All I can say is, I picked a helluva week to quit sniffing glue. I gotta run for now. Have a great chunky day!
