On (Not) Using Mocking Frameworks

I’ve long been on record that I think using auto-mockers outside of legacy rescue situations is bad policy.

First, it’s easy to write "pseudo-tests" using an automocker.

Pseudo-tests are tests that appear to prove things about your code that they don’t actually prove.

Now, note, I’m not saying auto-mockers force one to write pseudo-tests. They don’t. But they do make it awfully easy. How? The combination of "don’t care" arguments in mocked method specs with hardwired returns makes it easy to write tests that skip logic in the tested unit.

Consider:

a = ...
b = ...
return x.c(a,b)

where x.c is automocked to take any parameters and return a hardwired value. The only thing our test tests is that those two elisions don’t throw. That’s not usually what we meant to test.

Now, again, I point out, we don’t have to do it this way. We could set up our mock call to take known values of a and b, or even generic values that it uses to return non-hardwired results. Either way, we’re getting somewhere. But you see how easy it is to do this kind of thing.
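Here’s a rough sketch of that difference using Python’s standard unittest.mock. Everything in it is made up for illustration: the compute function, the order dictionary, and the deliberate bug in how a is calculated are my assumptions, not code from any real project. The point is only that the "don’t care" version passes in spite of the bug, while the other two versions notice it.

```python
from unittest.mock import Mock


def compute(x, order):
    """Hypothetical unit under test, shaped like the snippet above."""
    a = order["quantity"] * 0   # deliberate bug: should be order["quantity"]
    b = order["unit_price"]
    return x.c(a, b)


ORDER = {"quantity": 3, "unit_price": 5}


def pseudo_test():
    """Any arguments accepted, hardwired return: passes despite the bug."""
    x = Mock()
    x.c.return_value = 42
    return compute(x, ORDER) == 42


def known_values_test():
    """Pin the arguments the collaborator is supposed to receive."""
    x = Mock()
    x.c.return_value = 42
    compute(x, ORDER)
    try:
        x.c.assert_called_once_with(3, 5)
    except AssertionError:
        return False
    return True


def computed_return_test():
    """Derive the return value from the arguments instead of hardwiring it."""
    x = Mock()
    x.c.side_effect = lambda a, b: a * b
    return compute(x, ORDER) == 15
```

With the bug in place, pseudo_test() reports success and the other two report failure. Fix the bug and all three pass.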

Second, the syntax of automockers, in general, is both brilliant and largely impenetrable to people who aren’t already quite skilled in parsing the language the tests are written in.

This is both a difficulty in itself, and a contributing factor to that first point.

Studying your automocker’s source code is nearly always taking a master class in the odd corners of your programming language. I greatly admire the ingenuity of automocker authors. It’s usually, to be honest, breath-taking craftwork.

The thing that makes that work so ingenious is its attempt at absolute generality. Problem is, that’s the same thing that makes it so hard to write complex mocking without copy/pasting code you don’t understand.

But when you hand-write your mocking setups and verifies, rather than copy/pasting them, the tiniest typo will normally spew an error message with 300 words and 19 nested angle-brackets. So. You don’t. You copy/paste code you don’t grok. We all know what happens when we do this.

Third, auto-mockers make it easy to write (possibly pseudo-) tests against poorly factored code instead of doing what would most help, which is refactoring it.

This is exactly why they’re so great in legacy rescue, but it amounts to encouraging us to make new legacy.

An auto-mocker is perfectly content to mock two methods of a class with 90 methods, for instance. But why do we have a class with 90 public methods? If we have a caller that needs to do two things, we’d likely be much better off with a callee that only does two things.

If you hand-rolled fakes off of interfaces, you’d never make a 90-method interface and hand-fake it. You’d get yourself a 2-method interface (or thereabouts), add it to the interfaces of the shipping class, and roll up a testing implementation for it.
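A minimal sketch of what that might look like in Python, using typing.Protocol for the interface. The names here (BalanceStore, InMemoryBalanceStore, deposit) are hypothetical, picked only to show the shape: a two-method interface, a hand-rolled testing implementation, and a caller that depends on nothing wider than those two methods.

```python
from typing import Protocol


class BalanceStore(Protocol):
    """Hypothetical narrow interface: only the two things the caller needs."""

    def balance(self, account_id: str) -> int: ...

    def save_balance(self, account_id: str, amount: int) -> None: ...


class InMemoryBalanceStore:
    """Hand-rolled testing implementation, backed by a plain dict."""

    def __init__(self, balances=None):
        self._balances = dict(balances or {})

    def balance(self, account_id):
        return self._balances.get(account_id, 0)

    def save_balance(self, account_id, amount):
        self._balances[account_id] = amount


def deposit(store: BalanceStore, account_id: str, amount: int) -> None:
    """Hypothetical caller: reads the balance, then writes the new one."""
    store.save_balance(account_id, store.balance(account_id) + amount)


# A result-style test: act, then inspect the fake's state.
store = InMemoryBalanceStore({"alice": 100})
deposit(store, "alice", 25)
assert store.balance("alice") == 125
```

The shipping class, however many other methods it has, would satisfy BalanceStore too, so the caller never sees the other 88.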

This is often particularly apparent when working in code that has no composed methods. You see yourself writing multiple setups against the same automocked object, because you’ve factored two things together that don’t "belong" together.

The deep answer is usually that the method you’re testing should have been a straight-line composed method with zero logic, and the methods you’ve extracted should be tested separately.
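As a sketch of that shape, with every function name invented for the example: quote is the composed method, a straight line with no branches or loops, and the three steps it calls carry the logic and can each be tested with plain values, no mocks required.

```python
def parse_amount(text: str) -> int:
    """Extracted step: turn user text like ' 1,200 ' into an integer."""
    return int(text.strip().replace(",", ""))


def apply_fee(amount: int, fee_percent: int) -> int:
    """Extracted step: add a whole-number percentage fee."""
    return amount + amount * fee_percent // 100


def format_total(amount: int) -> str:
    """Extracted step: render with thousands separators."""
    return f"{amount:,}"


def quote(text: str, fee_percent: int) -> str:
    """Composed method: straight-line, zero logic of its own."""
    amount = parse_amount(text)
    total = apply_fee(amount, fee_percent)
    return format_total(total)


# Each extracted step is tested on its own, by result.
assert parse_amount(" 1,200 ") == 1200
assert apply_fee(1200, 5) == 1260
assert format_total(1260) == "1,260"
```

Once the pieces are tested this way, there’s very little left in quote that a test could catch and a glance couldn’t.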

Fourth, and I suspect this one is going to outrage some folks: behavior-testing, as opposed to result-testing, is highly fragile-under-change. Refactoring breaks tests much more often when we test behavior.

A calls B twice, and A’s tests automock B. The new kid shows us how to do whatever B does in those two calls in one call with different params and different internal logic. Change B and test it, change A, and, though A’s results are identical before and after, A’s test fails.
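To make that concrete, here’s a small sketch with unittest.mock. The class B, its price and prices methods, and the two versions of A are all hypothetical. The behavior-style test pins down which calls A makes on B; the result-style test only checks what A returns.

```python
from unittest.mock import Mock, call


class B:
    """Hypothetical collaborator, shown after the new kid's change."""

    PRICES = {"apple": 3, "pear": 4}

    def price(self, name):
        return self.PRICES[name]

    def prices(self, names):
        return [self.PRICES[n] for n in names]


def total_before(b, names):
    """A as first written: calls B twice."""
    return b.price(names[0]) + b.price(names[1])


def total_after(b, names):
    """A after the refactoring: one call, same result."""
    return sum(b.prices(names))


def behavior_test(total_fn):
    """Mocks B and checks the exact calls A makes."""
    b = Mock()
    b.price.side_effect = [3, 4]
    try:
        total = total_fn(b, ["apple", "pear"])
    except TypeError:
        return False   # the mock was never told about prices()
    return total == 7 and b.price.call_args_list == [call("apple"), call("pear")]


def result_test(total_fn):
    """Uses a real B and checks only what A returns."""
    return total_fn(B(), ["apple", "pear"]) == 7
```

The result test passes for both versions of A. The behavior test passes for total_before and fails for total_after, even though the two return exactly the same thing.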

This, btw, is even more frequent and more painful if we’ve actually resolved the first issue I mentioned by using hard discipline. The more detail we put into A’s test’s setups, the easier it is to break A’s test, when it didn’t need to be broken at all.

There really are situations where testing behavior is the right thing to do. But in my experience, they’re comparatively rare. So rare, in fact, that they don’t justify me using an automocker at all, but instead force me to hand-roll a mock.

One more note:

I don’t advocate auto-mockers outside of active legacy rescue. Why do I advocate their usage inside active legacy rescue? Because auto-mockers let you sketch a throwaway pindown test in seconds, one that would take you hours to write without them.

(That phrase, "active legacy rescue", is meant to say that we are busily un-legacy-ing some code, refactoring it actively to make it what we want. To be super-clear: That’s not the same thing as just "working in legacy".)

I don’t advocate automockers outside of legacy rescue for these four reasons:

Pseudotests
Over-general syntax
Poor-design support
Behavioral fragility

You’re better off, short-, medium-, and long-term, using other tools and learning other techniques.
