I’ve been tech reviewing the second edition of Steve McConnell’s landmark book
Code Complete, due out in June. Bless his heart, he’s got an entire chapter devoted
to good variable naming practices. He touches on, but doesn’t fully
explore, two of the biggest sins in variable naming. Allow me to hop
up on my soapbox.

Bad variable names are all over the place.
Usually it’s a short name kept around for too long, like $n used for the length of an entire subroutine. The programmer might as well have been working in
TRS-80 BASIC, where only the first two characters of variable names were
significant, and we had to keep a handwritten lookup chart of names in
a spiral notebook next to the keyboard.


Sometimes you’ll find variables where all vowels have been removed as a shortening technique, instead of simple truncation, so you have $cstmr instead of $cust. I sure hope you don’t have to distinguish the customers from costumers!

There have also been intentionally bad variable names, where the writer
was more interested in being funny than useful. I’ve seen $crap
as a loop variable, and a colleague tells of overhauling
old code with a function called THE_LONE_RANGER_RIDES_AGAIN().
That’s not the type of bad variable name I mean.

Variable naming conventions can often turn into a religious war, but
I’m entirely confident when I declare The World’s Worst Variable Name to be:

$data

Of course it’s data! That’s what variables contain! That’s all they
ever can contain. It’s like you’re packing up your belongings to move to
a new house, and on the side of the box you write, in big black marker,
“matter.”

Even if it’s a function pointer, it’s data that tells the language what
function to run. Even if it’s undef or NULL, the fact that the variable
contains that value is significant in itself.

Variables should say what type of data they hold. Asking the question
“what kind” is an easy way to enhance your variable naming. I once saw
$data used when reading a record from a database table. The code
was something like:

    $data = read_record();
    print "ID = ", $data->{CUSTOMER_ID};

Asking the question “what kind of $data” turns up immediate ideas
for renaming. $record would be a good start. $customer_record
would be better still.
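
Here is a minimal sketch of that rename in Perl. The read_customer_record() sub and the values it returns are made up for illustration, standing in for whatever actually fetches the row; the point is only that the name now tells you what you are holding wherever it is used.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the database read in the snippet above.
# A real version would fetch a row; this one returns fixed sample values.
sub read_customer_record {
    return { CUSTOMER_ID => 1234, NAME => 'Example Customer' };
}

# The name answers "what kind of data?" at every point of use.
my $customer_record = read_customer_record();
print "ID = ", $customer_record->{CUSTOMER_ID}, "\n";
```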

I promised the two worst variable names, and I feel no fear of
disagreement as I declare The World’s Second Worst Variable Name to be:

$data2

More generally, any variable that relies on a numeral to distinguish it from a similar
variable needs to be refactored, immediately. Usually, you’ll see it like this:

    $total = $price * $qty;
    $total2 = $total - $discount;
    $total2 += $total * $taxrate;
    $total3 = $purchase_order_value + $available_credit;
    if ( $total2 < $total3 ) {
        print "You can't afford this order.";
    }

You can see this as an archaeological dig through the code.
At one point, the code only figured out the total cost of the order,
$total. If that’s all the code does, then $total is a fine name.
Unfortunately, someone came along later, added code for handling discounts
and tax rate, and took the lazy way out by putting it in $total2.
Finally, someone added some checking against the total that the user
can pay and named it $total3.

The real killer in this chunk of code is that if statement:

    if ( $total2 < $total3 )

You can’t read that without going back to figure out how each value was
calculated. You have to scroll back up to keep track of what’s what.

If you’re faced with naming something $total2, change the existing
name to something more specific. Spend the five minutes to name the
variables appropriately. Renaming is one of the easiest,
cheapest and safest forms of refactoring you can do, especially if
the names are confined to a single subroutine.

Let’s do a simple search-and-replace on the coding horror above:

    $order_total = $price * $qty;
    $payable_total = $order_total - $discount;
    $payable_total += $order_total * $taxrate;
    $available_funds = $purchase_order_value + $available_credit;
    if ( $payable_total < $available_funds ) {
        print "You can't afford this order.";
    }

The only thing that changed was the variable names, and already it’s
much easier to read. Now there’s no ambiguity as to what each of the
_total variables means. And look what we found: The comparison in
the if statement was reversed. The code complains when the payable total is
less than the available funds, which is exactly when the customer can
afford the order. Effective naming makes it obvious.
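
For what it’s worth, here is one way the corrected check might look, as a sketch rather than the real code. The order_exceeds_funds() sub, its named arguments, and the sample numbers are all hypothetical; tax is applied to the order total, as in the original $total2 line.

```perl
use strict;
use warnings;

# Hypothetical helper: true when the order costs more than the buyer can pay.
sub order_exceeds_funds {
    my (%order) = @_;

    my $order_total   = $order{price} * $order{qty};
    my $payable_total = $order_total - $order{discount};
    $payable_total   += $order_total * $order{taxrate};

    my $available_funds =
        $order{purchase_order_value} + $order{available_credit};

    # The comparison now reads the way the sentence does:
    # the order is too expensive when the payable total exceeds the funds.
    return $payable_total > $available_funds;
}

# Made-up sample values, purely for illustration.
my %sample_order = (
    price                => 10,
    qty                  => 5,
    discount             => 5,
    taxrate              => 0.10,
    purchase_order_value => 30,
    available_credit     => 10,
);

if ( order_exceeds_funds(%sample_order) ) {
    print "You can't afford this order.\n";
}
```

Pulling the arithmetic into a named sub is optional. The rename alone was enough to expose the bug; the sub just gives the comparison a name too.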

There is one exception to the rule that all variables ending with numerals are bad. If the entity itself is named with a number, then keep that as part of the name. A variable for the road running through town would be just fine as $route31. It would be silly to rename it as $route_thirty_one.



Finally, remember that all of these rules apply to subroutine and file naming as well. We often don’t spend enough time considering file names, but that’s a rant for another day.

What other naming sins drive you crazy?