Turning Implicit into Explicit

Turning implicit understanding into explicit code is a great productivity & quality step.

Let’s talk about some whys and hows for that idea.

As we forge into the topic, please do remember, this is just comfort food, not the main story. I proudly support my friends & family who are working for change in the world, and encourage them to keep at it.

Stay safe. Stay strong. Stay kind. Stay angry. Black lives matter.

An implicit understanding is anything a skilled geek would have to be told before being set loose in your codebase to make a change. Implicit understandings take lots of different forms, but at their simplest, they often involve hidden correlations.

Probably the most common form of implicit understanding is related to the smell we call primitive obsession. You’ll see a series of methods, and each method takes the same three arguments, an int for the ID, a string for the descriptor, and a decimal for the amount, let’s say.

Those three arguments are actually correlated: together, they uniquely identify a line item on an invoice. But there’s no "LineItem" thing. Instead, we just pass and pass and pass those three arguments. We know they’re correlated, but the code doesn’t say they’re correlated.

The fix here is pretty straightforward: make a LineItem, and pass that instead. Now my senior geek doesn’t need to be told about that correlation; it’s right there in the code. We have made our implicit understanding of the code into something direct and explicit.

Notice, too, that even in this simple case, the fix may be just the opening step in a long sequence of impressive transformations: perhaps those six methods that now all take a LineItem are best expressed as methods on LineItem. They’re things LineItems know how to do.
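Here’s a minimal sketch of that move in Python. The field names (item_id, descriptor, amount) and the two methods are my own illustrative assumptions, not from any particular codebase; the point is only that the correlation now lives in a type, and behavior can migrate onto it.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class LineItem:
    """Bundles the three values that used to travel as separate arguments."""
    item_id: int
    descriptor: str
    amount: Decimal

    def describe(self) -> str:
        # Formerly a free function taking (item_id, descriptor, amount).
        return f"#{self.item_id} {self.descriptor}: {self.amount}"

    def with_discount(self, percent: Decimal) -> "LineItem":
        # Returns a new LineItem; the original is left untouched.
        factor = (Decimal(100) - percent) / Decimal(100)
        return LineItem(self.item_id, self.descriptor, self.amount * factor)


item = LineItem(7, "widget", Decimal("20.00"))
discounted = item.with_discount(Decimal("10"))
```

The frozen dataclass is a design choice, not a requirement: it keeps a LineItem from drifting out of sync with itself after it’s been handed around.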

This "follow-up opening" aspect of converting implicit to explicit is quite common, no matter the underlying form of the case. The fix for these cases will often serve as a kind of change gateway, freeing us to move much more rapidly as we reshape the code for our customers.

Another common case of implicit and explicit: an object with a default constructor and a bunch of setters, some of which must be set to use that object correctly. Teammates who’ve been there a while know this, but your noob senior will have to code-study or be told.

Again, there’s at least one straightforward fix: kill off that default constructor and replace it with one that requires all the settings that must be present. You make an implicit runtime dependency an explicit compile-time one. (There are other fixes possible, too.)
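A rough sketch of the constructor fix, using a hypothetical ReportMailer class whose names are all invented for illustration. One caveat: Python has no compile step, so here the requirement is enforced when the object is constructed (or earlier, if you run a type checker), rather than at compile time as it would be in a statically typed language.

```python
class ReportMailer:
    """Hypothetical class: smtp_host and sender must be present to use it."""

    def __init__(self, smtp_host: str, sender: str, *, subject_prefix: str = ""):
        # Required settings are constructor parameters, so there is no way
        # to obtain a half-configured instance.
        if not smtp_host or not sender:
            raise ValueError("smtp_host and sender are required")
        self.smtp_host = smtp_host
        self.sender = sender
        self.subject_prefix = subject_prefix  # genuinely optional setting

    def subject_for(self, title: str) -> str:
        return f"{self.subject_prefix}{title}"


mailer = ReportMailer("smtp.example.test", "reports@example.test",
                      subject_prefix="[report] ")
```

Calling ReportMailer() with no arguments now fails immediately with a TypeError, instead of failing later when some method finally reaches for the missing setting.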

Another variant: like before, we have an object with setters, and one of those fields must be set to a legit value, but only if we’re going to call that one method that requires it.

Here there are at least three fixes:

Always require it, via constructor.
Require it to be passed to that method.
Create a wrapper for that method’s purpose and give that guy the required constructor argument.
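The second and third fixes might look something like this. Invoice, InvoiceRenderer, and the currency field are hypothetical stand-ins for "the object" and "the field that one method needs".

```python
class Invoice:
    def __init__(self, number: str):
        self.number = number

    # Fix 2: the method asks for what it needs, so it can't be forgotten.
    def render_for(self, currency: str) -> str:
        return f"Invoice {self.number} ({currency})"


class InvoiceRenderer:
    # Fix 3: a small wrapper that exists only for rendering and demands
    # the currency up front, in its constructor.
    def __init__(self, invoice: Invoice, currency: str):
        self._invoice = invoice
        self._currency = currency

    def render(self) -> str:
        return self._invoice.render_for(self._currency)


invoice = Invoice("A-100")
renderer = InvoiceRenderer(invoice, "EUR")
```

Either way, the setter for currency disappears from Invoice, and with it the thing your teammates had to remember.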

These cases are all fairly modest, but the implicit->explicit thing often shows up in richer forms, with more variants, more complexity, and, here’s the thing, even greater costs in terms of our productivity.

Now that everyone’s so service-crazy, significant apps are really many different programs running in orchestra. The user experiences one app, but the codebase is twenty apps big.

And every single program maintains its own configuration block, typically in a profile or registry system, pointing at all the others that it’s supposed to be using. The implicit understanding? All those URL strings have to be correlated across all those programs. Whoops.

One routinely sees this implicit understanding cost productivity and reduce quality. "Bugs" pop up that are just wrong environment connections. Developers set permissions and can get to the first two programs but not the third. Weird messages bang out on Slack about new URLs.

There are several fixes possible here. The least lovely one I’ve seen is a master database, but it’s viable. Stronger, still, would be using the true REST style, where returned messages actually include the correct links, instead of assuming that the client already has them.

Here we see two powerful aspects of the implicit->explicit conversion:

duplication-killing,
cheap testability.

Duplication-killing: If we put the UserQuery endpoint’s URL in the profile for the twenty apps that sometimes need it, we’ve got twenty copies of it. If we return it in our responses, we can make it a single copy.

Testability: I can’t test my downstream’s configuration file. I don’t even know who that downstream is or where it keeps that kind of thing. But I can easily test that the configuration I give any downstream who asks is correct and consistent.
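To make both points concrete, here’s a small sketch of a service that hands out its links instead of expecting callers to configure them. Everything named here (BASE_URL, the paths, user_response, links_are_consistent) is an assumption for illustration, not a real API.

```python
from urllib.parse import urlparse

BASE_URL = "https://api.example.test"  # the single copy
ENDPOINT_PATHS = {"user_query": "/users/query", "orders": "/orders"}


def links():
    # Every link a downstream might need is built from the one base URL.
    return {name: BASE_URL + path for name, path in ENDPOINT_PATHS.items()}


def user_response(user_id):
    # The response carries the links, so clients don't keep their own copies.
    return {"id": user_id, "links": links()}


def links_are_consistent(response):
    # The check I can run on my side: all links share one scheme and host.
    expected = urlparse(BASE_URL)
    for url in response["links"].values():
        parsed = urlparse(url)
        if (parsed.scheme, parsed.netloc) != (expected.scheme, expected.netloc):
            return False
    return True
```

A test that calls links_are_consistent(user_response(42)) runs entirely inside my own codebase, with no knowledge of who the downstream is.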

Another case, super-common in web environments, be they monolithed or microserviced: control-flow between steps in a multi-page application is maintained by an implicit state machine.

We know that you can’t get to page 3b of the process unless we’ve done 1 and 2 and 3a, but the code doesn’t know it. And because it doesn’t, we can’t test for it, and we can’t guarantee it.

And here’s something every long-term library writer knows only too well: if it’s possible to do the wrong thing at the wrong time, someone somewhere will find a way to do it. 🙂

One sees a lot of apps that started from some domain stuff in a database grow very gradually into a lot of "deduced state", here and there mixed with some "assumed state".

They’re often even accompanied by drawings that show the desired state machine, slightly out of date. But that machine is nowhere to be found in the code. And because it’s not, changing it becomes a matter of guesswork and folklore, and eventually, first-round QA rejection.

Notice, again, if that state machine is explicit in the codebase, I can test it for consistency before it ever gets in the hands of a UI. If it’s implicit, distributed here and there, deduced in some places and assumed in others, I can’t.
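One possible shape for that, sketched with the page names from the example above. The table layout and the helper names are my assumptions; a real app would likely carry more detail per transition.

```python
# The whole flow lives in one place: which page may follow which.
TRANSITIONS = {
    "1": {"2"},
    "2": {"3a"},
    "3a": {"3b"},
    "3b": set(),
}
START = "1"


def can_move(current, target):
    return target in TRANSITIONS.get(current, set())


def reachable(start):
    # Walk the table and collect every page we can get to from start.
    seen, frontier = {start}, [start]
    while frontier:
        page = frontier.pop()
        for nxt in TRANSITIONS.get(page, set()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen


def inconsistencies():
    # Consistency checks that need no UI: no dangling targets, no orphan pages.
    problems = []
    for page, targets in TRANSITIONS.items():
        for target in sorted(targets):
            if target not in TRANSITIONS:
                problems.append(f"{page} -> unknown page {target}")
    for page in sorted(set(TRANSITIONS) - reachable(START)):
        problems.append(f"{page} is unreachable from {START}")
    return problems
```

With this in place, "you can’t get to 3b without 3a" is a one-line assertion, and a change to the flow is a change to one table.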

Meta-flow environments, where we’re supposed to be using a database to describe the flow, are particularly prone to this problem. But the fix isn’t that hard, even there: you can prevent database editors from breaking it, or you can run heavyweight integrity tests. Both work.

What kind of approaches can we adopt to minimize/mitigate the amount of implicit understanding lurking in our code? There are several. Some are strictly technical, some more process-y. The key isn’t any one magic solution, it’s learning to see the problem-family.

At a hardcore geekery level:

Immutability is your very very good friend.
Create new objects at the drop of a hat.
Suspect config blocks that have more than one URL in them.
Learn the state pattern and especially how to TDD state machines.

At a process level:

Listen to an outsider senior encountering your codebase.
Turn flow diagrams into explicit code.
Avoid deducing state from domain fields that aren’t about state.
Take config bugs as real bugs.
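On the "avoid deducing state" item, a tiny illustration may help. The Order fields and the three states are invented for the example; the point is how easily a deduction from domain fields can disagree with the real state.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class OrderState(Enum):
    DRAFT = "draft"
    PAID = "paid"
    SHIPPED = "shipped"


@dataclass
class Order:
    paid_on: Optional[str]        # domain field: when payment arrived
    tracking_code: Optional[str]  # domain field: carrier's tracking code
    state: OrderState             # explicit: the state, stated outright


def deduced_state(order: Order) -> OrderState:
    # The implicit style: guessing state from fields that exist for other reasons.
    if order.tracking_code:
        return OrderState.SHIPPED
    if order.paid_on:
        return OrderState.PAID
    return OrderState.DRAFT


# Hypothetical case: a tracking code is reserved before the box leaves.
order = Order(paid_on="2020-06-01", tracking_code="TRK1", state=OrderState.PAID)
```

Here deduced_state(order) says SHIPPED while the explicit field says PAID. Only one of them was ever told what actually happened.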

I’m gonna link a quote from me, which feels a little creepy, but I think it’s relevant here:

"If you’re a geek, your entire job is the translation of human language, which is neither precise nor complete, into computer language, which is both. Doing this takes considerable human judgment. That’s why they pay you for it." @GeePawHill (posted by @mariopiogioiosa)

A key part of our job, as programmers, is making implicit human understandings explicit in the code. Doing this isn’t just a matter of intellectual purity, it has real consequences for our productivity and quality.

When you go into work this week, think about what’s implicit in your codebase but not explicit. As challenges flow into your team, think about which ones are harder to get at precisely because they derive from implicit understanding.

And get back to me. 🙂
