Large-Scale Refactorings

Given the recent refactoring-related muses, a respondent asked me to talk about large or larger scale refactoring work.

(I love it when people’s questions trigger my writing. Please do ask these things.)

First things first, “large scale refactoring” is really a colloquial expression, a shorthand we sometimes use, but in my experience there is no such thing, for two reasons.

The first reason is definitional. Remember, refactoring doesn’t mean “changing my design”, it means “changing my design without changing its function”. We almost never do this at large scale.

The marvel and the horror of this trade is the extraordinary demand pressure it operates under. Anyone who’s done time in the silicon mines knows this. It is the origin of nearly every bad process in the trade.

“Takin’ it off, here, boss!” says Cool Hand Luke.

They pay us to change function. They don’t pay us to change design, except insofar as it is required to transform function or to improve the function-transformation process.

So from a practical perspective, it basically just doesn’t happen. One doesn’t get large blocks of time where one is only transforming design, not function.


Having made that gritty little pedantic point, I still have to say, “I know what you mean, though.”

(Apologies for the pause. Back live in T minus one cigarette.)

So, if we don’t get large blocks of design change, what *do* we get? We get many small alternating blocks of design change and function change.

And that segues into the second reason there’s no such thing as large-scale refactoring: because all large-scale refactorings are just chains (usually intermittent) of small-scale ones.

This kind of work is a perfect fit for the change-harvester’s “local” and “oriented” approaches to change.

Local seems obvious: we make small changes because that’s all the time we have. Oriented, though, is a little more subtle.

In theory, it might be possible to plot out a perfect intermittent chain of steps that is precisely aimed at the large-scale goal. This is, conceptually, taking the long straight line segment from A to B and breaking it into many smaller line segments.

There is nothing inherently broken about this technique. It is, in fact, exactly the approach I take when I’m chaining refactorings together to make a modest or mid-scale transformation.

In the hands of a skilled geek who knows the affected part of her codebase quite well, it is possible to do this on a time scale up to about a couple of days. Nevertheless, there are at least three issues that make it less successful beyond that threshold.

One issue: remember that those steps are alternating with function-changing work, and the function-changing quite often impacts the code that, just last week, we were sure we knew how to transform.

A second issue: beyond that threshold of a couple of days, it is *extremely* rare that your geek can actually hold all the parts of the problem in her head successfully. If you plot “things to remember while changing” against “ease of change”, the relationship isn’t linear.

A third, and very real-life issue: if your geek is actually that good, she’s a key in your operation, and she very likely has numerous people making numerous other demands on her attention and focus.

So, precise aiming isn’t very effective. Does that mean we have to give up on large-scale transformation of our code base? No.

What we need, instead, is to adopt an oriented approach: imagine the large-scale goal on the horizon, turn to face it, and make a local change.

The oriented approach rejects the fear of rework, abandons faux-optimizations based on faux-straight-lines, and embraces ongoing continuous change. It leans in, in fact, to one of our greatest strengths as a species: rapid adaptability to context.

How’s this all work in practice? I’ll use a real example from a couple years back to show how it goes.

We got an app that manipulates a ton of upstream data in the form of dates and prices to produce a ton of analysis and prediction data also in the form of dates and prices. All of the prices we work with are *unit* prices, until, yeah, that day they weren’t.

Now, no big deal, right? We just look up the implicit lot sizes, take the lot prices, and turn them into unit prices, and Bob’syeruncle!

Suspect that word “just”.

The problem is that we have to manipulate all the objects associated with these prices in amazingly complex non-price-related ways, right up until the very end, when they’re finally in a form from which we can establish their lot sizes.

So this forced us to say “we have to turn the artist formerly known as Price (god I’m funny) into the artist known as AccumulatedDataThatProbablyWillEventuallyHoldAUnitPriceExceptIfItDoesntItWillNotThrowAFatalException.” (Grown-ups use *all* the letters in our names.)

This is a classic “large-scale refactoring”, except, as I say, it’s not. It’s a substantial change of design with a moderate change of function.

And tho the absence of that change was blocking one story, there were other stories in play, and we needed to keep feeding functional change to our customer, so we had to distribute that large transformation across a block of time with lots of interruptions.

It was not the end of the world, it took us, elapsed time, maybe a month. The shape of our process was intermittent, partial, and stepwise, just as it would have been had we aimed precisely. But it was also “wrong but better” at numerous points.

There was a logical ordering of steps, don’t get me wrong, especially right at the beginning. Step 1: Make price know whether it’s legit. Step 2: Make intermediate displays able to render a non-legitimate price however crudely.

These two steps, together, had to come first, because they were how we could keep the rest of the app running — including new changed function — while we worked our way towards the goal on the horizon.
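For the geeks who want to see the shape of those two steps, here is a minimal sketch in Python. It is not our actual code: the `Price` fields, the `is_legit` property, and the `render` function are all hypothetical names, and I am assuming that "legit" simply means "we know the lot size, so we can compute a unit price".

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class Price:
    """Hypothetical stand-in for the Price type described in the post."""
    amount: Decimal
    # Assumption: 1 means "already a unit price", None means "lot size unknown".
    lot_size: Optional[int] = 1

    @property
    def is_legit(self) -> bool:
        # Step 1: the price itself knows whether it can yield a unit price.
        return self.lot_size is not None

    def unit_amount(self) -> Optional[Decimal]:
        # An unresolved price returns None instead of raising.
        if self.lot_size is None:
            return None
        return self.amount / self.lot_size


def render(price: Price) -> str:
    # Step 2: a display renders a non-legit price crudely instead of failing.
    unit = price.unit_amount()
    if unit is None:
        return f"~{price.amount} (lot price, lot size unknown)"
    return f"{unit:.2f}"
```

The design choice worth noticing is that an unresolved price is an ordinary value, not an error, so every screen that already shows prices keeps working while the rest of the transformation is still in flight.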

After those first two steps, it was basically willy-nilly: each time we made an unrelated change for another story, we made sure it could deal with prices that hadn’t resolved. Sometimes, one of us would take an hour or two and *only* go find a dependent or two and go fix them.
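Here is a sketch of what "fixing a dependent" could look like, again in Python and again hypothetical. I am assuming a dependent that averages unit prices, with `None` standing in for a price that hasn't resolved yet; the function name and the choice to report a count of skipped prices are mine for illustration, not something from the real codebase.

```python
from decimal import Decimal
from typing import Iterable, Optional, Tuple


def average_unit_price(
    unit_prices: Iterable[Optional[Decimal]],
) -> Tuple[Optional[Decimal], int]:
    """Average the resolved unit prices and report how many were skipped.

    None stands for a price whose lot size is not yet known (an assumption
    made for this sketch).
    """
    resolved = []
    unresolved = 0
    for price in unit_prices:
        if price is None:
            unresolved += 1
        else:
            resolved.append(price)
    if not resolved:
        # Nothing usable yet: hand back "no answer" rather than blowing up.
        return None, unresolved
    return sum(resolved, Decimal(0)) / len(resolved), unresolved
```

Returning the skipped count alongside the average lets the caller decide whether to show the number, flag it as partial, or hold it back. Whether skipping is even the right behavior depends on the dependent, which is exactly the kind of thing to ask your product expert.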

Eventually, we came to a tipping-point: it was easier to stop and fix all remaining issues and kill off the old code entirely than it was to keep piecemealing the thing.

A *really* important factor: the extremely high level of straightforward and casual conversation within our team, including very much our product-expert maker, the person y’all usually call the PO.

This is getting long, and like everyone else on the planet, I’m kinda fragmented in attention right now.

I’m gonna close with some don’ts about large-scale transformations of code.

Don’t go dark. Socialize the hell out of what’s out on the horizon, including to your customers and customer proxies. People are surprisingly able to handle complications, if you only trust them.

Don’t go fatal. If you’re doing a transformation because you have a problem, the number one priority is not to fix the problem, it’s to keep the problem in this part of the forest from shutting down the rest of the forest.

Don’t wait to be sure you’re doing the right thing. This is *really* hard to get used to, and is the heart of the “oriented” change, but it is important. People worry about whether they’ll be wrong. Never worry about that: you’ll definitely be wrong. Worry about adapting to it.

Don’t stop producing visible change. Whatever your transformation is, it’s a change to a system that has both resolved and ongoing value. Visible change, even the change that isn’t the fix, is profoundly comforting to your users.

All those don’ts, but one do, especially for my geeks: learn strong brownfield technique, and obsess over the boundaries between the “things” in your code. The code works for you, you don’t work for the code.

I’m out, for now, I have a large-scale transformation to kontentment I’m working on today.

A Request of the Community

Normally, this is where I plug my site. Today, and for the next while, I have a more important request. I’m GeePaw. My GeeKid, Threy, needs some financial help I can’t give just now.

If you like my work, do me a solid, and go to the GoFundMe we set up for him.

Help if you can, please.

GeePaw Hill
