These days it's hard to pick up a book about Visual Basic without the author trying to shove his own brand of coding standards down your throat. This isn't, of itself, a bad thing (coding standards are certainly vital for multi-developer projects), but what useful information is available is swamped by a glut of unqualified advice masquerading as authority, making it difficult for a beginner to know who to believe.

From this melting-pot of well-intentioned dogma emerges one fairly consistent coding convention: the use of 'Hungarian' prefixes on identifiers. Hungarian notation is so called for much the same reason that Reverse Polish notation isn't called 'reverse Łukasiewicz notation', and in the grossly adulterated form expounded by VB programming books it deals with the use of prefixes to denote data type in variable names.

So why do we need type prefixes on variable names? Well, there are some precedents, chiefly from weakly-typed languages such as C. In C you can assign any variable to just about any other variable and the compiler won't mind. You'll often get warnings about incompatibilities, but these are implementation-dependent and chosen by the compiler vendor rather than the language definition, and you can also turn them off. Assigning between incompatible types is dangerous because you can lose data or even end up with semantic nonsense. C has other pitfalls too, including the array/pointer equivalence and the traditional idiom of coercing freely between pointers and integers. These features add up to a very powerful language, but the opportunities for error are enormous.

In the kind of 'bare metal' programming environment offered by C it's so easy to lose track of what kind of data a particular variable holds that the use of type prefixes is a useful reality check. But what about Visual Basic? VB does have some annoying type coercion problems, although they aren't in the same league as the ones we find in C (we can't assign an object reference to an Integer variable, for example). Nevertheless, coercions are rife, and the VB literature doesn't help by teaching beginners the kind of sloppy coding that relies on them. Perhaps there really is a case for type prefixes in VB?
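
To make the kind of coercion in question concrete, here is a minimal sketch in classic VB. The procedure and variable names are invented for illustration; the point is only that both of the first two assignments compile without complaint, whether or not the names carry prefixes.

```vb
Option Explicit

' Hypothetical demo procedure; paste into a standard module.
Public Sub CoercionDemo()
    Dim total As Integer
    Dim entry As String

    entry = "42"
    total = entry          ' String silently coerced to Integer
    entry = total + 1      ' Integer result silently coerced to String

    ' The same steps with the conversions spelled out
    total = CInt(entry)
    entry = CStr(total + 1)

    Debug.Print entry
End Sub
```

If entry held non-numeric text, either form would fail at run time with a type mismatch rather than at compile time, so the explicit version buys readability, not safety.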

One of the more fundamental principles of programming is to write your code for the reader, and this philosophy is at the heart of type prefixing. The aim is to make the code easy to understand, and one of the ways to do that is by keeping the focus as local as possible. The more pages your reader has to flip to assimilate the details of what he's reading, the less likely he is to understand your code. By incorporating a type prefix on the identifier itself, you are cutting out some of this page-flipping.

But wait a minute, your reader is flipping pages? This may have rung true for C programmers in the eighties, but it's pretty unusual to find anyone debugging Visual Basic code on paper. The Visual Basic IDE is designed for browsing code, and since version 5.0, the identifier lookup (Shift-F2) function has been so good that we can almost always find out a variable's type with a single keystroke. This simple fact defuses one of the most compelling reasons for using type prefixes.

But even if we assume that type prefixes add value to Visual Basic code, they are becoming increasingly irrelevant as we move towards programming with objects. There are two main reasons for this. First, it is usual to omit type prefixes from public interfaces, so the more object types (classes) your programs rely on, the smaller the proportion of your code that will feature type prefixes. This idea also applies to named parameters, both in public interfaces and private ones (depending on where you draw the line).

The second reason that type prefixes are diminishing in usefulness is that they are inevitably applied only to intrinsic data types (Integer, String, Double, and so on). This means that as you begin to use more classes, user-defined types and enums, fewer and fewer of your variables warrant useful type prefixes and most end up carrying a generic 'o' or 'obj' prefix. The argument, then, is this: there are so many exceptions to the use of type prefixes that at some point we'll find ourselves writing more code without prefixes than with them. Even if this isn't strictly the case, producing code that uses type prefixes inconsistently undermines the intellectual manageability we are striving for.
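
A sketch of how this tends to look in practice. The enum, the user-defined type and all the variable names here are hypothetical, and the tags are just one commonly seen set, not a standard.

```vb
Option Explicit

' Hypothetical programmer-defined types
Private Enum OrderStatus
    osPending
    osShipped
End Enum

Private Type Address
    Street As String
    City As String
End Type

Public Sub PrefixDemo()
    ' Intrinsic types: the prefix says something specific
    Dim iCount As Integer
    Dim sName As String
    Dim dTotal As Double

    ' Everything else: a generic tag, or no tag at all
    Dim oOrders As Collection
    Dim Home As Address
    Dim Status As OrderStatus

    Set oOrders = New Collection
    Home.City = "London"
    Status = osShipped
End Sub
```

Half the declarations in this one procedure already fall outside the scheme, and the 'o' tag on oOrders tells the reader nothing that the declaration doesn't.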

The questions that really need to be addressed are: do type prefixes give us any benefits, and, if so, what are those benefits? All other things being equal, any prefix simply obscures the identifier name, and I'm no longer convinced that there are significant benefits to type prefixes in VB. Take controls, for example. We routinely prefix our control names with a suitable tag, yet this has nothing to do with enforcing data typing. Nine times out of ten, code that refers to a control simply sets a property or invokes a method - if the control doesn't support the property we're trying to assign, either we'll get a compile-time error, or, if we're assuming the default property (in which case we deserve all we get), VB may do a coercion. So what's the point?
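
As a sketch of the default-property case, assume a form containing a TextBox named txtQuantity (a hypothetical control; the name follows the usual tag convention). The tag tells us the control type, yet it does nothing to stop the coercions:

```vb
Option Explicit

' Form module; assumes a TextBox named txtQuantity exists on the form.
Private Sub Form_Load()
    Dim quantity As Integer

    txtQuantity = 5           ' default property (Text) assumed; 5 coerced to "5"
    quantity = txtQuantity    ' default property read and coerced back to Integer

    ' Naming the property and the conversion makes the intent visible
    txtQuantity.Text = CStr(5)
    quantity = CInt(txtQuantity.Text)
End Sub
```

Both versions compile. What separates them is whether the property and the conversion are written down, and that has nothing to do with the txt tag.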

There are only two reasons I can think of to prefix a control name with a tag: (i) to ensure we don't assign a reference to the control to an incompatible variable, and (ii) to give us a clue to the control type when we're browsing the code. We can discount (i) because it's only applicable in the rare situation where we're using As Object to fake polymorphism (or for other reasons, such as laziness or incompetence), and (ii) is tantamount to admitting we're going to choose poor names.

Something else I've never seen acknowledged is that what we call 'Hungarian' notation has very little to do with what Charles Simonyi proposed in his original paper. Taken as a whole, Simonyi's coding conventions are aimed at producing terse, precise names from a pool of standard words and abbreviations - i.e. they are ABSOLUTELY NOT about merely adding type prefixes to conventional variable names. Simonyi's 'Hungarian' system is deliberately designed to be so inflexible that, ideally, two different programmers coding the same program would come up with exactly the same variable names. This is anathema to the conventional wisdom of 'choosing meaningful variable names', as is his suggestion that we should use naked tags to name certain classes of variables (loop counters, for example).

Crucially, Simonyi's claims for the effectiveness of his coding scheme depend on two features that are missing from any VB variation I've ever seen: (i) that a tag (prefix) is chosen for every programmer-defined data type (enum, UDT or class in VB), and (ii) that the scheme is applied without exception. Interestingly, in Code Complete, Steve McConnell identifies point (i) and concludes that any scheme which doesn't do this results in 'a convention of little value'. He also notes that Hungarian notation 'encourages lazy, uninformative variable names'.

It's one thing to learn from the experience of others, but to adopt an arbitrary subset of conventions that somebody once found useful is something else entirely. There is some merit in having a standard set of semantic short-hands (Max, Min, First, Last, etc.) to use in constructing variable names, but even these aren't strictly necessary if the programmer chooses his names well. As Edsger Dijkstra said, 'Besides a mathematical inclination, an exceptionally good mastery of one's native tongue is the most vital asset of a competent programmer.' Maybe we should be giving literacy tests to our potential programmers.

If we're going to make rules for our programmers, let them be rules based on reason rather than mere tradition. And if a convention has outlived its usefulness, let's put it to sleep and get on with business.


© 1998 - 2009 Mark Hurst. All rights reserved.   Updated March 01, 2009