TDD Pro-Tip: Make Hard Problems Collections Of Toy Problems

Because TDD’ing little toy problems is so easy, I try to turn my big real-world problems into collections of them.

Yeah, it’s the steering premise rearing it’s ugly head again, though maybe from a different angle than we’ve undertaken before.

(The short premises video is here)

A lot of folks claim that TDD is just fine as long as you don’t work on their kind of problem. They mean that their daily bread is made from problems that are VERY HARD[tm], and that TDD could never make them more productive.

In my experience, such folks are laboring under multiple misunderstandings simultaneously. Some of these misunderstand TDD itself, and some of them seem to misunderstand computer programming altogether.

Here’s a reasonably difficult problem: given a list of Q’s, which are tuples of [P, T, S, L], where P is an integer and the types of the other don’t matter, perform a merge, such that the output is the smallest set of lists of Q such that 1) each Q appears once in the set of lists. 2) Q’s whose T’s, S’s, or L’s are equal and whose P’s form an unbroken sequence are on the same list.

(And there are some notes, too.)

Note: Solitaires, Q’s that don’t fit on any such list, exist, and must be returned.

Note: Do not return any identical lists, though a given Q may "want" to participate in several, make it participate in the longest one only.

Note: Throw away identical lists.

It sounds reasonably easy, doesn’t it? So easy, someone will immediately ask me about performance. My advice to them: get it right first, then we’ll worry about if it’s fast enough or small enough.

Now here’s the thing. Either you laugh and say "no-brainer", in which case it’s a toy, and TDD will work fine. Or you’re old enough to mess with it a little and say, as I did, damn. This thing’s gonna have to be de-composed.

Here’s a common misunderstanding about TDD: that it means standing on the outside after you’ve solved the problem and throwing test data at it. That little problem has mad combinatorics. Trying to capture every case from the outside will push you quickly beyond the limits of TDD-as-productivity-tool.

Here’s a common misunderstanding about computer programming: that there is ever any program that isn’t made up of parts. The skill of programming computers is always the skill of breaking things into parts. All programs decompose. All of them. This is the CT-thesis, the founding theory of computers. If it weren’t de-composable, all the way to the level of a Turing machine, you couldn’t write it.

When we put these pieces together and invert them — cuz they’re misunderstandings — we get how TDD, or at least my take on TDD actually works: break it into parts, and test the parts.

I’ve recently solved this problem, btw. It was grim. If one of you bright people actually has a no-brainer solution, let me know. Hell, I don’t know everything, maybe you can bang it out in a half-hour.

There were parts. My answer involves maybe 1000 lines of code, more or less classic "data structures plus algorithms" stuff.
Here are some of the parts I TDD’d on my way to getting the story done…

Can I find lists that are subsets of other lists? Can I split sorted sequences into consecutive runs? Can I (repeatedly, good lord) handle degenerates like empy lists and solitaires? Can I find winning lists when their are interlopers that don’t belong? When they’re unsorted? Along the way, of course, I also had to invent some data structures and some classes that expressed my algorithm. That involved refactoring, testing, and lots of rather surprising dumb-assedness on my part. Which brings us to one of the biggest misunderstandings of all…

A common misunderstanding about bugs is that most of them occur because of complex "rocket science" effects, far beyond the ken of anyone who isn’t a walking encylopedia of threading, functors, monads, patterns, idempotence, and so on and so forth.
Most bugs come from a single line of code, or a single overlooked case, or a single off-by-one error, or a single logic inversion.

Because parts.

Programs are complex machines built from simple machines. If the complex machine is borked, it’s because of something in the simple machines.
If the same complex program has the same bug on different runs on different machines, the odds of that bug being caused by anything other than the simple machines that make it up are vanishingly small.


I TDD toy problems and hard problems. I do that by breaking the hard problems into toy problems and TDD’ing them. The steering premise says "tests and testability are first-class participants of design". When I decompose my problems, I decompose them into testable parts. If the parts aren’t testable, I change my design until they are.

During the programming of that story, btw, I found easily a dozen or more bugs. Not one of them appeared in the top-level tests. I found every single one while testing the parts, and none by testing the whole.
Is it a lot of work, breaking hard problems into toy ones?

Don’t be ridiculous, of course it is.

But here’s the thing, Church-Turing means you had to do that anyway, whether you TDD’ed it or not.

The difference between TDDing a hard problem and not TDDing isn’t the de-composing. It is that, when I TDD it, I know each of the simple machines I’ve dreamed up does exactly what I wanted to.

Have a great Friday night or Saturday morning or idunno, whatever part of whatever day it is.

I’m gonna take a break from that stupid algorithm and go play my computer game.

Leave a Reply