With the introduction of Class modules in VB4, Visual Basic became an altogether more sophisticated programming language. At last we could program with objects in VB, and if Visual Basic still
didn't offer the true flexibility of an object-oriented language, the exclusion of error-prone language features such as pointers and explicit memory management at least made it safer by an order of magnitude than C++. But not everyone sees it that way. Here's a typical VB urban myth that crops up again and again in newsgroups:
If I had a get-well card for every time I've heard this nonsense, my name would be Craig Shergold. What's even more disturbing is that I've also seen it in the mandatory VB6 coding standards of at
least one large company. Time to set the record straight. Let's start with some moral support from Bruce McKinney, who in Hardcore Visual Basic says:
"In real Visual Basic code, references are rarely destroyed explicitly by setting an object variable to Nothing. Instead, objects are created and destroyed automatically when you pass objects to
procedures, return them from functions, set or get them with properties, or destroy other objects that hold references to them."
Let's have a closer look at how this works. The term 'object variable' is an ambiguous one, so let's dispense with it right away. Two kinds of thing are involved when we work with objects in Visual
Basic: objects and variables. A variable can't contain an object, so what we usually talk about as an object variable is nothing of the sort - it's actually a variable that points to
an object. By 'points to', we mean that the variable can hold a value that somehow describes where an object is. In COM, which is the basis of all VB's object behaviour, this value is called an 'object reference'. From here on we'll use
pointer variable to mean the variable, and object reference to talk about what the variable contains. Here's a summary:
- Object
: an instance of a class, typically created with the New keyword.
- Object reference
: a data value that points to an object.
- Pointer variable
: a Visual Basic variable that can contain an object reference.
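As a minimal sketch of how these three terms relate, consider the fragment below. It assumes the project contains an otherwise empty class module named Class1; the variable names are illustrative only.

```vb
' Module1.bas - assumes a class module named Class1 exists in the project
Option Explicit

Sub Main()
    Dim ptr1 As Class1          ' pointer variable; holds Nothing until assigned
    Dim ptr2 As Class1          ' a second pointer variable

    Set ptr1 = New Class1       ' New creates the object; ptr1 receives an object reference
    Set ptr2 = ptr1             ' copies the reference, not the object

    Debug.Print ptr1 Is ptr2    ' True: both variables point to the same object
    Debug.Print ptr1 Is Nothing ' False: ptr1 still holds a reference
End Sub
```

There is one object here, two pointer variables, and two copies of the same object reference.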
Pointer variables are just like, say, integer variables except that for reasons unknown we have to use a different syntax to assign or compare pointer variables (Set and Is). A pointer variable doesn't
have to be a simple variable defined with a Dim statement, of course. It can also be a property of another object, a member of an array or collection, and so on. Although we can persuade VB to create objects
for us by using the New keyword, we have no direct control over when objects are destroyed. Object destruction, and the consequent freeing of resources, is controlled by the object itself and is triggered automatically
when the object has no more pointer variables pointing at it. The object knows when this is because it has a reference counter, which Visual Basic maintains behind the scenes as we manipulate our pointer variables.
The first object reference is created when we use the New keyword (Set ptr1 = New Class1, for example), and because we've saved this value in a pointer variable, VB increments the object's reference counter to 1.
If we then copy the reference to another pointer variable (Set ptr2 = ptr1), the counter is incremented again, and so on. But what causes the reference counter to be decremented? Because object references are
inextricably linked to pointer variables, the only way a reference can be destroyed (and hence the object's reference counter decremented) is when one of the following things happens to a pointer variable:
- We assign a different value to the pointer variable. This can be the special value Nothing or a reference to another object. The latter may be assigned from another pointer variable or it may be returned by the New keyword. Either way, a reference to the original object is freed, and if that was the only reference, the object is destroyed (not left in limbo, causing a 'memory leak' as another common misconception would have it).
- The pointer variable goes out of scope (not strictly correct in the case of static variables - see later). If it is a local variable, this will be when the containing function terminates. If it is a module-level
variable, it will be when the containing module is destroyed (more on this later too).
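Both cases in this list can be observed by putting a trace in the Terminate event. The sketch below assumes a hypothetical Class1 with a public Tag field added purely for labelling; the comments track the reference count as described in the text.

```vb
' --- Class1.cls (hypothetical class used only for this sketch) ---
Option Explicit

Public Tag As String

Private Sub Class_Terminate()
    ' Runs when the last reference to this object is released
    Debug.Print Tag & " destroyed"
End Sub

' --- Module1.bas ---
Option Explicit

Sub Main()
    Dim ptr1 As Class1
    Dim ptr2 As Class1

    Set ptr1 = New Class1       ' object A: reference count 1
    ptr1.Tag = "A"
    Set ptr2 = ptr1             ' object A: reference count 2

    Set ptr1 = Nothing          ' count drops to 1; A survives
    Debug.Print "ptr1 cleared"

    Set ptr2 = New Class1       ' ptr2 now points at object B; A's last reference is released
    ptr2.Tag = "B"
    Debug.Print "ptr2 reassigned"
End Sub                         ' ptr2 goes out of scope; B's last reference is released
```

Run from the IDE, the Immediate window should show "ptr1 cleared", "A destroyed", "ptr2 reassigned" and "B destroyed", in that order. Object A dies on reassignment and object B dies at End Sub, and neither needed its last pointer variable set to Nothing.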
The first of these descriptions should ring a bell, because it's widely (and incorrectly) regarded as the ONLY way to safely destroy objects. You should now realise, of course, that assigning Nothing
to a pointer variable doesn't necessarily result in the object's destruction since all it does is decrement the reference count. Crucially, the second point means that it is often unnecessary to assign Nothing
to the pointer variable, because it gets destroyed automatically when it goes out of scope. [NOTE: we should also say that 'going out of scope' is used here in the colloquial sense that a variable's lifetime always follows its scope, which isn't strictly true. This point is picked up later.]
This is usually where the argument starts to become confused, so we need to be very specific:
If a pointer variable is destroyed, whether explicitly or because it has gone out of scope, Visual Basic decrements the reference counter of the object that the variable was pointing at.
This is how Visual Basic is designed to work, and this is how it does work, yet two arguments come up again and again in opposition to this simple fact. They are:
- That reference counting is buggy.
- That we shouldn't rely on reference counting and so we should always set our variables to Nothing 'just to be safe'.
Point 1 is very hard to believe, since reference counting is fundamental to COM, and COM, in turn, is a well-established technology that permeates Windows at every level. But belief aside, nobody I have argued this
with has ever demonstrated such a bug. Point 2 would have some pragmatic merit if reference counting were demonstrably unreliable, but it isn't. As an argument in its own right, it's just superstitious nonsense.
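Anyone who doubts this can test it rather than argue about it. The following is one possible check, not a rigorous proof: it assumes a hypothetical global counter, g_LiveObjects, that Class1 increments in Initialize and decrements in Terminate.

```vb
' --- Class1.cls (hypothetical) ---
Option Explicit

Private Sub Class_Initialize()
    g_LiveObjects = g_LiveObjects + 1
End Sub

Private Sub Class_Terminate()
    g_LiveObjects = g_LiveObjects - 1
End Sub

' --- Module1.bas ---
Option Explicit

Public g_LiveObjects As Long    ' hypothetical counter of live Class1 instances

Private Sub UseObject()
    Dim ptr As Class1
    Set ptr = New Class1
    ' ... work with ptr; note there is no Set ptr = Nothing ...
End Sub

Sub Main()
    Dim i As Long
    For i = 1 To 1000
        UseObject
    Next i
    Debug.Print g_LiveObjects   ' 0 if every object was destroyed on exit from UseObject
End Sub
```

If reference counting really were unreliable, the counter would be left above zero after the loop.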
Invariably we find that a problem advanced in support of either of these arguments is down to something else, and often the 'something else' is that the programmer doesn't understand forms. I'm not going to go into the
specific peculiarities of forms here (they are explained at length in Chapter 1 and Chapter 6), but the most common source of misconceptions over VB object deallocation is a poor understanding of variable persistence. Variables persist according to different rules in different circumstances, and if
we don't understand the lifetime of a specific pointer variable, we can have no control over the lifetime of the corresponding object. We can partition all Visual Basic variables into four categories, as
follows (note that these terms are not generally used in the VB documentation):
- Automatic variables
: These are always local to a function, and are created and destroyed anew each time the function runs. Because automatic variables are allocated space on the stack, a recursive call to
the same function gets a separate copy of each variable.
- Fixed variables
: These are variables that have space allocated at program startup and which persist for the lifetime of the program. In VB, the only place we can create Fixed variables is in a BAS module or
a class with its instancing property set to GlobalMultiUse or GlobalSingleUse. (NOTE: this kind of variable was traditionally called static, but VB uses 'static' to mean something else.)
- Instance variables
: These are variables which are associated with classes, and which persist for the lifetime of the class instance. They are defined at the module level in a Class or Form file, and can be
Public or Private.
- Static variables
: These are local variables that persist between function calls. A static variable has the same persistence as a Fixed variable or an Instance variable, depending on whether it is defined in
a BAS module or a class module (forms count as classes).
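The four categories map onto declarations roughly as follows. This is only a sketch: Class1 and Class2 are hypothetical class modules, and the variable names are chosen to echo the category names used here, not anything in the VB documentation.

```vb
' --- Module1.bas ---
Option Explicit

Public g_Fixed As Class1                ' fixed: persists for the lifetime of the program

Sub Worker()
    Dim autoPtr As Class1               ' automatic: created and destroyed on every call
    Static staticPtr As Class1          ' static: local scope, but persists like g_Fixed

    Set autoPtr = New Class1            ' released automatically at End Sub
    If staticPtr Is Nothing Then
        Set staticPtr = New Class1      ' created once, survives between calls
    End If
End Sub

' --- Class2.cls (hypothetical) ---
Option Explicit

Private m_Instance As Class1            ' instance: persists as long as this Class2 object

Public Sub Worker()
    Static staticPtr As Class1          ' static in a class: persists like m_Instance
    If staticPtr Is Nothing Then
        Set staticPtr = New Class1      ' released when the Class2 instance dies
    End If
End Sub
```

Only autoPtr is cleaned up at the end of each call; the objects held by the other three variables live on until the module or class instance that owns the variable goes away, or until the variable is reassigned.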
You may have seen the different variable types discussed in relation to scope, but it's important to understand that persistence, not scope, is the issue here. We can consider fixed and automatic variables to
be at opposite ends of the persistence spectrum, and they have very different characteristics that influence all areas of programming. Traditional programming wisdom says it's usually preferable to use automatic
variables rather than other kinds, but it's only when we start to store object references in variables that the effects of unexpected persistence become impossible to ignore. So, accepting that it's good practice
to use automatic variables, and given that such variables auto-destruct when they go out of scope, does this mean we should never destroy object references in automatic pointer variables explicitly? Of course it doesn't - but let's be sure we know why we're doing it. At least three possible reasons come to mind:
- To optimise memory usage by freeing up objects as soon as possible. This is an acknowledged technique in garbage-collected languages such as Java.
- As a defensive coding measure. If we're finished with an object reference half-way down a function, setting the pointer variable to Nothing will show up any subsequent lines where we accidentally try to use the object.
- There's a bug in Visual Basic error handling that shows up in Terminate events, and one of the ways around it requires that we explicitly destroy any local objects in the Terminate event handler. For a detailed
description of this, see The Trouble with Terminate.
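The defensive use in the second point might look like the sketch below. It assumes a hypothetical Class1 with a public Tag field; the error trap is there only so that the fragment can print the error number instead of halting.

```vb
' Module1.bas - assumes a class module Class1 with a Public Tag As String field
Option Explicit

Sub Demo()
    Dim ptr As Class1

    Set ptr = New Class1
    ptr.Tag = "in use"          ' legitimate use of the object

    Set ptr = Nothing           ' finished with the object half-way down the function

    On Error Resume Next
    ptr.Tag = "oops"            ' accidental later use of a cleared pointer variable
    Debug.Print Err.Number      ' 91: Object variable or With block variable not set
    On Error GoTo 0
End Sub
```

Note that this only works if the variable is declared without New. A variable declared As New Class1 would quietly create a fresh object on the accidental access and hide the mistake.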
The most obvious consequence of storing an object reference in a fixed variable is that we need to clear it out manually (by assigning Nothing) when we've finished with the reference. This is uncontroversial,
but it's a world away from clearing out an automatic variable just before exiting a function. Things are a little more obscure if we consider instance variables, which are typically private variables defined at the
module level of a class. In this case what we need to consider is the lifetime of the class instance, because all module-level variables die with the instance. Again this is well-defined behaviour, so setting pointer
variables to Nothing willy-nilly in the Terminate event serves no useful purpose. One of the most insidious problems with classes is the creation of circular references, and although this is often cited as a
good reason to nuke all pointer variables explicitly, the argument is a poor one because it's merely a way of papering over the cracks in a bad design. There's also no satisfactory way to do it in a class module,
because code in the Terminate event doesn't run until the object's reference count reaches zero! In form modules we can use the Form_Unload event, but this is bad practice because Form_Unload really has nothing to do
with the form's lifetime. Dan Appleman gives an excellent discussion of circular references in Developing ActiveX Components with Visual Basic 5.0.
And so to forms. Forms are classes at heart, so everything
we've said about classes applies to them too. Unfortunately there are two extra things about forms that make our dealings with them more complicated than they have any right to be. I've said all of this at great length
in Chapter 6, so we'll have only the briefest summary here:
- Forms are effectively two-tier classes. There's a user-defined class that carries all of our code, and this contains an auto-instantiated VB-defined class that carries the controls. The idea of containment is important at run time because we can't get anywhere near a form's visual object until we've created an instance of the user-defined class. The visual object also holds a reference to the user-defined object, so even if we destroy all external references to our class instance it won't die until the visual object is unloaded.
- Visual Basic creates a fixed variable with global scope for each form we define. This variable has the same name as the class and is auto-instantiated (effectively defined with a statement like 'Public Form1 As
New Form1').
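In code, the two points above play out roughly like this. The sketch assumes the project contains a form named Form1; the procedure names are hypothetical, and the comments follow the behaviour as described in this summary.

```vb
' Module1.bas - assumes the project contains a form named Form1
Option Explicit

Sub UseHiddenGlobal()
    Form1.Show                  ' first use auto-instantiates the hidden global Form1
End Sub

Sub ReleaseHiddenGlobal()
    Unload Form1                ' unloads the visual object only
    Set Form1 = Nothing         ' clears the fixed global so the instance can terminate
End Sub

Sub UseExplicitInstance()
    Dim frm As Form1            ' automatic pointer variable; the hidden global is not used
    Set frm = New Form1
    frm.Show                    ' modeless, so this returns immediately
End Sub                         ' frm goes out of scope, but the loaded visual object
                                ' keeps the instance alive until the form is unloaded
```

The hidden global is a fixed variable, so it is one of the few places where assigning Nothing by hand is genuinely required. The explicit instance needs no such treatment, because its only pointer variable is automatic.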
There's no question that Visual Basic makes things unreasonably difficult with forms, and most anecdotes about buggy object deallocation can be traced back to here.