Refactoring Pro-Tip: Making Local Variables Maximally Local

When I tackle a long method, the first thing I do is make my local variables maximally local.

Consider this pseudo-code. It’s basic stuff, I’m betting virtually any geek can read it, but if not, lemme know.

longMethod( int y ) {
  int x;
  x = 0;
  // ... 
  // IRRELEVANCY #0: 87 lines that neither read nor write x
  // ...
  if( y == 17 ) {
    x = 1;
  }
  // ...
  // IRRELEVANCY #1: 87 more lines that neither read nor write x
  // ...
  if( someVeryInterestingConditionHavingNothingToDoWithX() ) {
    doTheThing(x);
    }
  // ...
  // IRRELEVANCY #2: 87 more lines that neither read nor write x
  // ... 
  }

In this chunk we have a long method. The first thing I’ll do when I decide to refactor a method like this — it’s a decision made in a context, not some rigorous algorithm — is study the locals, making the tiny little changes that will make them as local as I can make them.

Lines 2 & 3 of that method, the bare declaration and the separate assignment, might seem unusual. I hope they do, in fact, cuz we don’t write code like that anymore. But we used to. And it can still happen. When I’m caught up in the hurly-burly, I might have changed my mind five times on the way to getting to this ugliness.

Hold on, everybody, this one’s gonna be wild!!!! Okay, seriously, just join the damned declaration and its initialization into one line.
We’re on our way.

Now what I do is push that line down the block of code as far down as it can go and as far in as it can go. Down is obvious: I mean grab hold of the line in your editor and move it to right before its first use, at the same level of indent it’s at right now.

"In" means as far right in the indents as it will go and still compile. With slalom methods, an original author can easily forget to do this. (Note: I’mo tweak the starting code right now to show this, so hang tough for a second.)

Before we do "in", though, let’s look at that snippet: int x=0; if(y==17) x=1;

That’s an example of a single-assign masquerading as two assigns. We can tell this, cuz there are no lines that read or write x in between the first assign and the second.

We have a perfect little extraction to make this clear.

longMethod( int y ) {
  // ... 
  // IRRELEVANCY #0: 87 lines that neither read nor write x
  // ...
  int x = chooseX(y);
  // ...
  // IRRELEVANCY #1: 87 more lines that neither read nor write x
  // ...
  if( someVeryInterestingConditionHavingNothingToDoWithX() ) {
    doTheThing(x);
    }
  // ...
  // IRRELEVANCY #2: 87 more lines that neither read nor write x
  // ... 
  }

int chooseX(int y) { 
    if(y==17) return 1; 
    else return 0; 
    }

If we have a ternary operator, we could use that, too:

x = (y==17) ? 1 : 0;

Got me?

So now I can push it in. By pushing it in we’ve cut x’s lifetime from about 250 lines to 3. We’ve clearly indicated, too, that the other 250 lines are irrelevant.

longMethod( int y ) {
  // ... 
  // IRRELEVANCY #0: 87 lines that neither read nor write x
  // ...
  // ...
  // IRRELEVANCY #1: 87 more lines that neither read nor write x
  // ...
  if( someVeryInterestingConditionHavingNothingToDoWithX() ) {
    int x = chooseX(y);
    doTheThing(x);
    }
  // ...
  // IRRELEVANCY #2: 87 more lines that neither read nor write x
  // ... 
  }

Last little piece? Now it’s obvious that x isn’t just single-assign, it’s single-use. That is, we only use that little value one time. That means we can inline the variable declaration, yielding: doTheThing(chooseX(y)); The variable x has disappeared.
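
If you'd like to check that the start and the finish really behave the same, here's a minimal Java sketch of both shapes side by side. Everything around the original names is an assumption for illustration: the class name is made up, the boolean `interesting` stands in for someVeryInterestingConditionHavingNothingToDoWithX(), and doTheThing just records what it was handed. The irrelevancies stay as comments.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical harness, not real production code.
final class MaximallyLocalSketch {
    // Assumption: doTheThing only records its argument so we can compare runs.
    static final List<Integer> seen = new ArrayList<>();

    static void doTheThing(int x) {
        seen.add(x);
    }

    // Starting shape: x is declared up top and stays live for the whole method.
    static void longMethodBefore(int y, boolean interesting) {
        int x;
        x = 0;
        // ... irrelevancy #0 ...
        if (y == 17) {
            x = 1;
        }
        // ... irrelevancy #1 ...
        if (interesting) {
            doTheThing(x);
        }
        // ... irrelevancy #2 ...
    }

    // Ending shape: the single-assign is extracted and the single-use is inlined.
    static void longMethodAfter(int y, boolean interesting) {
        // ... irrelevancy #0 ...
        // ... irrelevancy #1 ...
        if (interesting) {
            doTheThing(chooseX(y));
        }
        // ... irrelevancy #2 ...
    }

    static int chooseX(int y) {
        return (y == 17) ? 1 : 0;
    }

    public static void main(String[] args) {
        longMethodBefore(17, true);
        longMethodAfter(17, true);
        System.out.println(seen); // prints [1, 1]
    }
}
```

The sketch only holds if the irrelevancies truly never touch x or y, which is exactly what the comments in the original claim.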

Okay, so let’s make some points about this process.

First, that was easy, wasn’t it? There was no point where we did anything scary or even tricky. We like that. We like that a lot. Momma dint raise no thinkers, and we did very little thinking. Instead, we performed a bunch of very tiny steps that made it so we didn’t have to.

Second, you followed that as I did it, without the slightest awareness of the domain. I didn’t have to go find an SME, research how charges work, study the internals of pseudo-language frameworks. All I had to do was read what was there.

Third, as easy as that was, we did a wonderful thing. We made the code work well with the body of the human reading it. When we started, x had to be ready-to-mind for 250 lines. At the end, it didn’t have to be ready-to-mind at all.

I cannot stress this enough: I DO MY BEST WORK WHEN I AM WELL WITHIN THE LIMITS OF MY HUMAN BODY. Remembering a value for 250 lines is not something human bodies do well.

Now, what about some variations that make this trickier? We had a two-way single-assign, but a lot of those are n-way. You have not an if but what may or may not be an explicit switch statement. It’s still a single-assign, but it’s hard to see.
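
Here's a rough Java sketch of what an n-way single-assign can look like when it's smeared down a method, and the same choice after extraction. The extra cases (42 and negative y) and the values they pick are invented for illustration, and the sketch assumes the branch conditions can't overlap, so their order doesn't matter.

```java
// Hypothetical example: only the y == 17 case comes from the original code.
final class NWaySingleAssign {
    // Before: one default and three scattered branches, each one writing x.
    // Nothing in between reads x, so this is still one decision.
    static int scattered(int y) {
        int x = 0;
        // ... lines that neither read nor write x ...
        if (y == 17) {
            x = 1;
        }
        // ... more lines that neither read nor write x ...
        if (y == 42) {
            x = 2;
        }
        // ... and more ...
        if (y < 0) {
            x = -1;
        }
        return x;
    }

    // After: the same decision, pulled into one place.
    static int chooseX(int y) {
        if (y == 17) return 1;
        if (y == 42) return 2;
        if (y < 0) return -1;
        return 0;
    }

    public static void main(String[] args) {
        for (int y : new int[] {-5, 0, 17, 42}) {
            System.out.println(y + " -> " + scattered(y) + " / " + chooseX(y));
        }
    }
}
```

If the conditions could overlap, the last matching branch would win in the scattered version, and the extracted version would have to test them in the reverse order to match.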

It’s also trickier when you don’t know whether those intervening irrelevancies have side-effects that might make moving the test-against-y around A Bad Thing[tm].
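
A small Java sketch of that hazard, under loud assumptions: here y is a field rather than a parameter, and the hypothetical irrelevancy() never mentions x but quietly writes y. Pushing the test-against-y below it changes the answer.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class: y as mutable state is an assumption, not the original code.
final class SideEffectHazard {
    int y;
    final List<Integer> seen = new ArrayList<>();

    SideEffectHazard(int y) {
        this.y = y;
    }

    // Records its argument so the two orderings can be compared.
    void doTheThing(int x) {
        seen.add(x);
    }

    // Looks irrelevant to x, but it has a side effect on y.
    void irrelevancy() {
        y = y + 1;
    }

    // Original order: y is tested before the side effect happens.
    void original() {
        int x = (y == 17) ? 1 : 0;
        irrelevancy();
        doTheThing(x);
    }

    // Declaration pushed down past the side effect: the test now sees the new y.
    void pushedDown() {
        irrelevancy();
        int x = (y == 17) ? 1 : 0;
        doTheThing(x);
    }

    public static void main(String[] args) {
        SideEffectHazard a = new SideEffectHazard(17);
        a.original();
        SideEffectHazard b = new SideEffectHazard(17);
        b.pushedDown();
        System.out.println(a.seen + " vs " + b.seen); // prints [1] vs [0]
    }
}
```

When y is a plain int parameter that nobody reassigns, as in the original snippet, this can't happen, which is part of why that case is so easy.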

It’s trickier, too, when a variable is re-used for different purposes. Hard to put this in a snippet, but like this:

x <- singleassign
dosomething with x
x <- singleassign
dosomethingelse with x.

That’s not two assigns, that’s a variable being reused for two different purposes.
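
One way to untangle that, sketched in Java with an invented example (the area and perimeter domain and every name here are assumptions): give each purpose its own variable. Each one is then an ordinary single-assign, and each can be made maximally local on its own.

```java
// Hypothetical example of one variable doing two jobs.
final class ReusedVariable {
    // Before: x means "area" for two lines, then "perimeter" for two more.
    static String describeBefore(int width, int height) {
        int x = width * height;
        String text = "area=" + x;
        x = 2 * (width + height);
        text = text + ", perimeter=" + x;
        return text;
    }

    // After: one variable per purpose, each assigned exactly once.
    static String describeAfter(int width, int height) {
        int area = width * height;
        int perimeter = 2 * (width + height);
        return "area=" + area + ", perimeter=" + perimeter;
    }

    public static void main(String[] args) {
        System.out.println(describeAfter(3, 4)); // prints area=12, perimeter=14
    }
}
```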

It’s trickier when the variable really is called "x".

Look, kids, if you’re not doing geometry in the code — and usually even if you are — "x" is a shite name.

In real life, I’d rename x as soon as I saw how it was used, in that doTheThing code.

So, yes, it’s not always as easy and obvious as this. But here’s the thing, this is usually the easiest and most obvious problem with a long method, so it’s the place I start anyway.

My actual technique looks like what Hofstadter called a "terraced scan", a subject for another muse. But in the beginning, the first ideas, the easiest nearest fixable owwies? They’re in making the local variables as local as they can be.

When I tackle a long method, the first thing I do is make my local variables maximally local.

Thanks, and have a quietly perturbing Wednesday night!
