Tuesday, August 3, 2010

Unnecessary short variable names irritation

There is always a discussion between programmers (and code reviewers, etc.) on how variables should be named. Besides the formatting (camel case, etc.), there is also the issue of the length of variable names. Some people really dislike, and even get irritated by, short, non-descriptive variable names. Today I encountered a blog post that even gave 5 reasons not to use short variable names. I find this discussion quite unnecessary, and I think people are really missing the point here.

The most important best practice in (readable) programming should be consistency. Having long or short variable names does not achieve this. Long variable names can be as confusing as short variable names.

An example: let’s take a variable called total_order. What does this say? Probably the total order amount, but does it include VAT, or discount, or both discount and VAT? Is the amount in US dollars or in euros? Maybe it is not even an amount but the total number of items in the order. Coming back to consistency: if the variable total_order is used in function x() as the total amount in USD and in function y() as the total amount in euros including discount, it really gets confusing. Having it called tot would force you to think about (and investigate) what it contains, which in some cases would catch nasty bugs that you would miss by relying on the name of the variable.
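To make that concrete, here is a minimal sketch of the situation. Python is used only for illustration; the parameters, the VAT rate and the exchange rate are made up for this example.

```python
# Hypothetical rates, invented for this illustration only.
VAT_RATE = 0.25
USD_TO_EUR = 0.5


def x(net_usd):
    # Here total_order is the total amount in USD, including VAT.
    total_order = net_usd * (1 + VAT_RATE)
    return total_order


def y(net_usd, discount_eur):
    # Here total_order is the total amount in euros, including discount
    # but excluding VAT: same name, different content.
    total_order = net_usd * USD_TO_EUR - discount_eur
    return total_order


usd_total = x(100.0)
eur_total = y(100.0, 10.0)
```

Both functions use the same descriptive-looking name, yet the two values cannot be compared or added without first checking what each one actually holds.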

Just as you can find reasons not to use short variable names, I can find good reasons not to use long variable names. For me, the following matters more than the length of variable names:

  1. Consistent use of variable names, e.g. using single-character (or two-character) variable names for temporary/looping purposes (like i or ix). This also means shortening variable names in the same manner everywhere. For example, a variable containing a total should be shortened to total throughout the program, and not sometimes to tot and other times to totals.
  2. Use of comments explaining what a variable is needed for and what it stores. This means that instead of using

    total_amount_vat_incl_discount

    I would prefer to see something like

    // contains total amount including VAT (with discount applied to it)
    totamount


    This clears up any misconception about what the variable is used for. In the case of total_amount_vat_incl_discount: is it the total amount including VAT, or is it the total amount of VAT?
  3. Readability. With long variable names, lines become unnecessarily long and formulas look more complex. You actually lose the overview of what is happening. For example, have a look at the following formula:

    ((total_order_amount_ex_vat - total_discount_customer - total_discount_sales_month) * ((1 + (vat_percentage/100)))/total_share_percentage) + total_amount_zero_vat

    (In this case there is too much total_ in the formula, and it is just too long.)
  4. Programming speed. Using long variable names also means that you need to type more (over and over again). There is a counterargument that you should use a good editor with auto-completion. However, in my experience auto-completion does not really help here. If you have variables like total_amount_incl_vat, total_amount_incl_discount, total_amount_ex_vat and total_amount_usd, you will get all of them as suggestions and you will still need to scroll to find the correct one. Most of the time I don't even use this functionality and just type the name, because I am quicker than the editor.
  5. Minimize copy-paste errors during programming. Copy-paste is very common during programming: you often use an existing piece of code as the basis for a new module/function. If you have very specific variable names, you also have to rename them. During this process you will sooner or later make a typing error or forget to rename one item, resulting in strange program behavior.

    Let me give a very basic example: you have a piece of code which does something in US dollars. You copy-paste it to a new function which does similar calculations in euros, and it has variables like amount_usd, amount_incl_usd, amount_usd_discount, etc. You will need to change them to amount_eur, amount_incl_eur, etc. During this process you will also make errors such as typing amount_inc_eur (missing l) instead of amount_incl_eur. It is just statistics: the more characters you need to type, the bigger the chance of typing errors.
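As a rough sketch of points 1 to 3, the formula from point 3 could be written with short names that are abbreviated in one consistent way and explained once in a comment. The function name, the abbreviations and the sample numbers are my own assumptions for illustration, not a recommended standard, and Python is used only as an example language.

```python
def order_total(tot_ex, disc_cust, disc_month, vat_pct, share_pct, tot_zero):
    # tot_ex     : total order amount excluding VAT
    # disc_cust  : total customer discount
    # disc_month : total sales-month discount
    # vat_pct    : VAT percentage (20 means 20%)
    # share_pct  : total share percentage (used as divisor, must not be 0)
    # tot_zero   : total amount at zero VAT
    net = tot_ex - disc_cust - disc_month
    vat = 1 + vat_pct / 100
    return net * vat / share_pct + tot_zero


# Sample values, invented for this illustration only.
result = order_total(1000.0, 100.0, 100.0, 25.0, 2.0, 50.0)
```

The calculation is the same as in the long formula, but the line doing the work stays short, and the meaning of each name is written down in one place instead of being guessed from the name.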

In the end, as mentioned earlier, the focus should be on consistency and readability rather than on the naming of variables itself, and you will need to find the right balance between what is too long and what is too short.

Happy programming
