TDD’s Goldilocks Challenge

Successful microtest TDD depends on our ability to solve a goldilocks challenge: when we test our code pieces, do we isolate them too much, too little, or just right? Finding the sweet spot will mean letting go of rulesets & purity.

As is my wont, let me remind you, geekery isn’t the only story going on around us. I write of it for my rest & comfort, and perhaps for yours, but it’s a break from more important things.

Black Lives Matter. Please keep working for change and supporting those who do.

The five premises we’ve been over (value, judgment, correlation, steering, and pieces) guide us in a complicated dance of software design. At the center of that dance is our awareness of, and our manipulation of, "dependency": how one part of the code uses other parts of it.

From the moment you wrote your first program, you produced a high-structure high-detail text that depends on the output of other high-structure high-detail text.

fun main(args: Array<String>) {
    println("Hello world!")
}

By "depends" here, I mean that your program, even little ol’ hello-world, cannot run without the existence and correct operation of println.

And tho it’s not clear when you’re getting started, it’s dependencies all the way down. All the software that runs on Von Neumann architectures works this way, down to the place where we mix sand & metal into tiny logic circuits. It’s all dependency. A uses B uses C uses forever.

Every program, no matter how big, has a top-most element, a place where it starts. And it depends on other elements, and so on. The app you’re reading this on has many millions of these dependencies before it bottoms out in the metal.

In professional geekery, our livelihood depends on our ability to arrange those dependencies to do what we want. Further, as we’ve made clear, we don’t do this once, but more or less constantly, throughout the lifetime of the codebase, both before and after we ship it.

TDD is predicated on the idea that we can go faster at all this if we have a rich & verified understanding of each piece in these huge assemblies of dependency. I won’t re-visit that case today, but will take it for granted.

How do we get that rich & verified understanding?

Two answers, one conceptual and one mechanical.

Conceptually, by isolating each piece, using it in the app we’ve called the making app, poking it and prodding it to see whether it does what we think it does.

Mechanically, we have an above-the-board answer (legal, simple, proper, decent) and a below-the-board answer (cheating). TDD’ers mix both approaches all the time. Which one we reach for depends entirely on the circumstance, hence the need for judgment.

Above the board: we replace the dependency, basically by changing the arguments, with something very simple in which we have extremely high confidence. A one-liner: instead of passing in a database connection, pass in the list of results that someone else got by using the database.
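
Here’s a minimal sketch of that above-the-board move, under some loud assumptions: Order, OrderSource, totalFromSource, and totalOf are all hypothetical names I made up for illustration, and OrderSource merely stands in for whatever your database connection looks like.

```kotlin
// Hypothetical domain type, invented for this sketch.
data class Order(val id: Int, val cents: Long)

// Hypothetical stand-in for a database connection.
interface OrderSource {
    fun fetchOrders(customerId: Int): List<Order>
}

// Before: the logic reaches through the connection, so testing it
// means supplying something that behaves like a database.
fun totalFromSource(source: OrderSource, customerId: Int): Long =
    totalOf(source.fetchOrders(customerId))

// After: the logic takes the list of results. Someone else uses the
// database; this piece depends only on List, which we trust.
fun totalOf(orders: List<Order>): Long = orders.sumOf { it.cents }

// A microtest needs no connection and no fake at all.
fun testTotalOf() {
    check(totalOf(emptyList()) == 0L)
    check(totalOf(listOf(Order(1, 250L), Order(2, 1_000L))) == 1_250L)
}
```

The change is only in the arguments: totalOf never sees the connection, so the test is a couple of plain list literals.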

Below the board: we create a faked testing version of the dependency and coerce the piece we’re testing to use our fake instead of the one it uses in the shipping app.

A one-liner: pass the database connection as before, but use an auto-mocker to pass a completely fake, completely controlled version of that connection.
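
A rough sketch of the below-the-board move, with assumptions labeled: Connection, CustomerReport, and FakeConnection are hypothetical names, and the fake is hand-rolled rather than generated, so it only approximates what an auto-mocker would hand you.

```kotlin
// Hypothetical connection abstraction, invented for this sketch.
interface Connection {
    fun query(sql: String): List<String>
}

// The piece under test still takes a connection, just as it would in the shipping app.
class CustomerReport(private val connection: Connection) {
    fun headline(): String {
        val names = connection.query("SELECT name FROM customers")
        return "${names.size} customers, first: ${names.firstOrNull() ?: "none"}"
    }
}

// Hand-rolled fake: returns canned rows and records what it was asked.
class FakeConnection(private val rows: List<String>) : Connection {
    val queries = mutableListOf<String>()
    override fun query(sql: String): List<String> {
        queries += sql
        return rows
    }
}

// The making app coerces the piece into using the fake.
fun testHeadline() {
    val fake = FakeConnection(listOf("Ada", "Grace"))
    check(CustomerReport(fake).headline() == "2 customers, first: Ada")
    check(fake.queries == listOf("SELECT name FROM customers"))
    check(CustomerReport(FakeConnection(emptyList())).headline() == "0 customers, first: none")
}
```

Notice that CustomerReport’s signature didn’t change at all. Only the thing on the other side of the connection did.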

Set aside the above-the-board technique for now; we’ll come back to it another time. It’s very powerful, it’s always our first choice, but it has limits. Instead, consider the below-the-board choice: cheat a fake dependency into place.

Remember that the element we’re testing has a dependency, and that has others, and they have others, in a roughly pyramidal dependency graph all the way down to the metal.

When we go below the board, we’re chopping that dependency graph right at the first node.

In the shipping app, A -> B -> C -> gigantic forever. But in the making app, it’s A -> FakeB -> modest forever. We just took a scalpel to the dependency graph.

We are isolating A, comparatively, so we can focus all our attention on getting a rich verified understanding of it.

That was a lot, but now we’re ready to talk about the goldilocks challenge.

Are we isolating the element under test too much, not enough, or just right?

Consider zero isolation. For all but the most trivial of elements, this presents problems almost immediately. (That’s why we avoid, as far as possible, large-scale tests.)

Some dependencies present difficulties — we call them awkwardness — in terms of runtime: they take too long. Some are awkward in result: it’s hard to find out what they did. Some are awkward in setup: it’s hard to rig them with the right data. And so on.

These problems mount rather quickly, and the effect of awkwardness is that it makes us less likely to write, read, run, or debug the tests. We lose the confidence-giving value of TDD (and typically the rhythm & profluence of it, too).

We lose.

Consider 100% isolation. Here the issue is a little more subtle, but just as real: every faked dependency in our making app means we are testing our element in an "unreal" setting. That unrealness can hide lots and lots of very real problems.

For tests against a faked dependency to be valuable, we must be very sure we know exactly what the real dependency does. How it reacts, what it considers legal input and correct output.
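
To see how a fake can hide a real problem, here’s a small sketch. Everything in it is assumed for illustration: NameStore, StrictNameStore, LenientFakeStore, and register are hypothetical, and the rule that the stricter store rejects blank names is one I invented to play the part of "what the real dependency does".

```kotlin
// Hypothetical dependency, invented for this sketch.
interface NameStore {
    fun save(name: String)
}

// Plays the part of the real dependency: assume it rejects blank names.
class StrictNameStore : NameStore {
    val saved = mutableListOf<String>()
    override fun save(name: String) {
        require(name.isNotBlank()) { "name must not be blank" }
        saved += name
    }
}

// A fake built from a wrong mental model: it accepts anything.
class LenientFakeStore : NameStore {
    val saved = mutableListOf<String>()
    override fun save(name: String) { saved += name }
}

// Element under test: forwards input without trimming or validating.
fun register(store: NameStore, rawName: String) = store.save(rawName)

// True when registering the name completes without an exception.
fun registers(store: NameStore, rawName: String): Boolean =
    runCatching { register(store, rawName) }.isSuccess

fun testFakeHidesProblem() {
    check(registers(LenientFakeStore(), "   "))   // passes against the fake
    check(!registers(StrictNameStore(), "   "))   // throws against the stricter store
}
```

The test against the fake is green, and it tells us nothing about the blank-name case, because the fake never knew the rule.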

That’s a lot of work, and all of it is mental work, done in Plato-land. If we could count on our ability in Plato-land to serve us in code-land, we wouldn’t be testing at all, would we?

Every faked dependency takes a tiny bite out of our confidence.

Take enough bites, we lose.

And, a reminder: this whole thing is about confidence. There’s no viable way to be certain, there’s only lesser and greater ways to gain reasonable confidence.

So what do we do?

Strive for that above-the-board answer, wherever we can get it. It works, and it works really well. Studying hexagonal architecture will help us with that.
Look for dependency "clumps", where every function/class/module on the inside is graceful, and only those on the border have to interact with awkward.
Learn the Strategy pattern and the Adapter pattern and the Observer pattern, really really well. All three of these morph the dependency lines, often in just such a way as to create those clumps.
Every time you don’t want to write a test, take a minute to wonder why, to open the possibility that a slightly different design wouldn’t have that problem. In particular, don’t just reach for your auto-mocker. Auto-mocking tools do most folks in most shops a disservice.
Finally, abandon intellectual purity. You don’t want 0% isolation and you don’t want 100% isolation. Don’t make rules about it, don’t enforce them on others. Make experience, and nurture judgment.
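
One way to picture a clump, as a sketch and not a prescription: the names Clock, SystemClock, isExpired, and Session are hypothetical, and I’m assuming the awkward dependency is the system clock. Only the thin adapter on the border touches it.

```kotlin
// Hypothetical border interface: the only place the clump meets the awkward world.
fun interface Clock {
    fun nowMillis(): Long
}

// Adapter on the border: wraps the ever-changing system clock.
class SystemClock : Clock {
    override fun nowMillis(): Long = System.currentTimeMillis()
}

// Inside the clump: graceful logic that takes plain values.
fun isExpired(issuedAtMillis: Long, ttlMillis: Long, nowMillis: Long): Boolean =
    nowMillis - issuedAtMillis >= ttlMillis

// Thin piece joining the border to the inside.
class Session(
    private val issuedAtMillis: Long,
    private val ttlMillis: Long,
    private val clock: Clock,
) {
    fun expired(): Boolean = isExpired(issuedAtMillis, ttlMillis, clock.nowMillis())
}

fun testExpiry() {
    // The interesting cases live inside the clump and need no clock at all.
    check(!isExpired(issuedAtMillis = 1_000L, ttlMillis = 500L, nowMillis = 1_499L))
    check(isExpired(issuedAtMillis = 1_000L, ttlMillis = 500L, nowMillis = 1_500L))
    // The joining piece gets one small check with a trivially simple clock.
    check(Session(1_000L, 500L, Clock { 2_000L }).expired())
}
```

Most of the testing lands on isExpired, which has no awkward collaborators. The border piece is so thin there’s little left in it to get wrong.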

TDD requires us to solve a goldilocks challenge: how much isolation do we choose? Too much, we lose. Too little, we lose.

The hardest part of learning TDD is building the judgment you need to get just the right amount of isolation in your making app.
