"Well, at least it compiled OK!"
The value of software testing
In the previous version of this book I started my chapter with
the rather bold statement, "It is my belief that software development is
one of the most complex tasks that human beings are called on to perform."
This assertion led to a few comments coming my way, along the lines of,
"Well, what about people who design nuclear reactors, or the space
shuttle?" and so on. If we had an imaginary list of all the jobs that
people actually do, sure, there would be a bunch of stuff above "Write
Software." However, for the purposes of this discussion we could actually
argue that "Write Software" could appear on the list twice. At some
point lower down in the list would be the basic "Write Software"
entry, but much higher up the list would be "Write Quality Software."
Quality consists of many attributes, such as ease-of-use, stability, the
clarity and depth of help files, the range of features, and even millennium
compliance! However, the most important attribute is that a piece of software
correctly performs the actions that it is designed to do. In this chapter I
will discuss ways in which to test Visual Basic code. By the time you reach the
end of the chapter, I hope you will have grasped the important concept that I
want to convey: testing should be planned before code is written, and all code
should be written with testing in mind.
Software development projects come in all shapes and sizes,
and they also vary in their nature. In terms of business applications, a
standalone telephone directory system for a medium-sized company probably
exists at the "relatively simple" end of the spectrum. With an
underlying Access database and a few data-bound controls, it doesn't take much
effort to put together, and because of the extensive use of pretested
components it probably wouldn't be too challenging in terms of testing, either.
However, at the other end of the spectrum might be a major banking application
that facilitates the processing of all branch-level account activity using a
remote ActiveX server set via a local SQL Server database, and then
synchronizes each local set of data with the central bank's DB2 database. A
system of this size is a much larger development effort that therefore requires
a more thorough and comprehensive planning and design exercise, and the result
is of course a greater amount of source code. A failure in
the code could result in a loss of money for the bank, which in turn could lead
to a loss of credibility in the business community. It is therefore imperative
that the system works as expected.
Another example, perhaps closer to home for many of us, is the
forthcoming Windows NT 5. Back in the days when Windows NT existed as version
3.51, it had about 5 million lines of code. The last released version of
Windows NT 4 Server was the Enterprise Edition, which included Internet
Information Server, Message Queue Server, and Transaction Server (all now
considered to be part of the base platform), and contained 16 million lines of
code in all. At the time of this writing (April 1998) it is estimated that the
final gold release of Windows NT 5 will have about 31 million lines of code.
This new code also includes some fairly radical alterations to the existing
technology (most notably the inclusion of Active Directory services). With so
much code, the testing effort will be massive. In this particular case the 400
or so programmers are joined by another 400 testers, providing a ratio of one
tester for every developer. The overall team also has around 100 program
managers and an additional 250 people on internationalization.
Although software development has formal guidelines that lay
down techniques for drawing up logic tables, state tables, flow charts, and the
like, the commercial pressures that are frequently placed on a development team
often mean that a system must be ready by a certain date no matter what. Some
people might choose to argue with this observation, but it happens
nevertheless. One of the biggest headaches for a software developer is time, or
rather, the lack of it. When your project lead sits you down and asks you to
estimate the amount of time that it will take to code up an application from a
design specification that he or she has given you, it is difficult to know
beforehand what problems will arise during the project. You are also faced with
a bit of a dilemma between giving yourself enough time to code it and not
wanting to look bad because you think that the project might take longer than
the project lead thinks it will. The logical part of your mind cuts in and
tells you not to worry because it all looks straightforward. However, as the
development cycle proceeds and the usual crop of interruptions comes and goes,
you find that you are running short of time. The pressure is on for you to
produce visible deliverables, so the quality aspect tends to get overlooked in
the race to hit the deadline. Code is written, executed once to check that it
runs correctly, and you're on to the next bit. Then, eventually, the
development phase nears its end. You're a bit late, but that's because of
(insert one of any number of reasons here), and you have two weeks of testing to
do before the users get it. The first week of the two-week test period is taken
up with fixing a few obvious bugs and meeting with both the users and technical
support over the implementation plan. At the beginning of the second week, you
start to write your test plan, and you realize that there just isn't time to do
it justice. The users, however, raise the alarm that there are bugs and so the
implementation date is pushed back by a month while the problems are sorted
out. When it finally goes live you get transferred onto another project but
quietly spend the next fortnight trying to get the documentation done before
anyone notices.
I dread to think how many systems have been developed under
these conditions. I'm not saying that all development projects are like this,
and the problems are slightly different when there is a team of developers
involved rather than an individual, but it's an easy trap to fall into. The
scenario I've described indicates several problems, most notably poor project
management. Even more detrimental to the quality of the final deliverable,
however, is the lack of coordinated testing. The reason I so strongly tie in
testing with the project management function is that a developer who is behind
schedule will often ditch the testing to get more code written. This is human
nature, and discipline is required (preferably from the developer) to follow
the project plan properly rather than give in to deadline urgency. It is very
important to report any slippage that occurs rather than cover it up and try to
fit in the extra work. The discipline I'm referring to involves writing a
proper test plan beforehand and then striving to write code that can be easily
tested. The project management process should ensure that the creation of the
test suite is also proceeding along with the actual development. I've seen
projects fall woefully behind schedule and be very buggy because of poor
project managers working with relatively good developers. If, among a team of
developers, there is one poor developer, the rest of the team will probably
make up for it. However, if the project manager is poor at performing the job,
the effect on the project can be disastrous, often because of resultant low
morale.
Software projects often run into trouble, more so when they
are team developments. Across the industry, about four in every five software
development projects overrun their planned time scales
and budgets, and fewer than half actually deliver everything that they are
supposed to. In the fight to deliver a bug-free system as quickly as possible,
project managers often end up negotiating with the end users for a reduction in
functionality so that the developers can concentrate on the key parts of the
system. The remaining functionality is often promised for delivery in the next
version.
In this chapter, I'll start by covering the formalities, that
is, descriptions of the various phases of testing that a well-managed
development project undergoes. I'll then outline a few tips that I think will
help with the testing of a Visual Basic program, and I'll finish up with a
discussion of test environments. I've also included a few Microsoft Word 97
Quality Tracking templates on the CD that accompanies this book. Although most
companies will have their own in-house versions of these templates, I've
included them as starting points for people who do not already use them. The
usage of each form should be self-explanatory because of its filename. The
files are called:
Build log.dot
Build report.dot
End user feedback log.dot
End user feedback report.dot
Test failure log.dot
Test failure report.dot
Notice that I have kept these templates generic; different
businesses have different requirements and audit standards, so the templates
can be modified as necessary. To install them on your machine, create a new
directory called TMS under the Templates subdirectory in your Microsoft Office
installation, and then copy the files to this location. Now start up Word, and
select the New command from the File menu. The templates should appear under
the TMS tab of the New dialog box.
It's very easy to think of the debugging process as being
synonymous with the testing process. Certainly, the distinction between the two
processes blurs on small systems at the unit testing stage (to be defined a bit
later). Other chapters in this book cover the debugging side of the software
development process, which should allow the distinction to become more
apparent.
The Purpose of Testing
Testing verifies that a software deliverable conforms
precisely to the functional and design specifications that have been agreed to
with the users. That's a formal definition. However, testing is also used in
the detection of bugs: not to prove that there are none, but to locate any that
are present. It is a sad fact that we all inadvertently code bugs into our
applications. The trick is to reduce the number of bugs in a deliverable to as
few as possible so that the system is completely operable. In an ideal world,
we would continue to hone and refine the application ad nauseam until it was
practically bug-free, but the users can't wait that long, unfortunately. As a
general rule, the rate at which bugs are found and eliminated falls off
exponentially; that is, it gets
harder to track down the remaining bugs as time goes by, but that doesn't mean that they
aren't there. When the product is released, they will pop up from time to time,
but the user's perception will be (we hope) that the application is stable and
robust.
The Formal Test Cycle
Before we get our teeth too deeply into the Visual Basic way
of programming, I think it's worth reviewing the different levels of testing
that apply to all software development projects regardless of the programming
language or target platform.
The nature of testing is so varied in its requirements that it
is difficult to give generalized definitions. What is appropriate for a small
(one- or two-person) application is totally unsuitable for a large
(twenty-person) development, whereas the amount of formality that accompanies a
large project would slow down the delivery of a small application by a wholly
unreasonable amount. With this in mind, I have tried where appropriate to
illustrate the relative scale that is necessary at each stage.
Unit/component testing
Unit testing is a test of a simple piece of code: in our case, a
subroutine, a function, an event, a method, or a Property Get/Let/Set. In
formal terms, it is the smallest piece of testable code. It should be
nonreliant on other units of code that have been written for this development
project because they will almost certainly be only partly tested themselves.
However, it is acceptable to call library routines (such as the Visual Basic
built-in functions) since you can be highly confident that they are correct.
The idea is to confirm that the functional specification of the unit has been
correctly implemented. An example of a unit would be a single user-defined
calculation.
|
Tip
Sometimes it is necessary to comment
out one piece of code to get another piece to work. This
might be necessary during the main development cycle when,
for example, the underlying code might be dependent on something
that has not yet been written or that contains a known bug.
If you have to comment out a piece of code, add a Debug.Print
statement just before or after it to highlight the fact
that you have done so. It's inevitable that you'll forget
to remove the leading apostrophe from time to time, and
adding a Debug.Print statement should save you from having
to find out the hard way.
|
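As a minimal sketch of this tip, a commented-out call might be flagged as shown
here. The routine names (SaveCustomer, ValidateCustomer, and
WriteCustomerRecord) are made up purely for illustration.

```vb
Option Explicit

Private Sub SaveCustomer()
    ' ValidateCustomer is a hypothetical routine that has not been written
    ' yet, so the call is commented out for now. The Debug.Print line
    ' is the reminder that something is missing.
    Debug.Print "SaveCustomer: call to ValidateCustomer is commented out"
    ' ValidateCustomer

    WriteCustomerRecord
End Sub

Private Sub WriteCustomerRecord()
    ' Stub standing in for the code that already works.
End Sub
```

When the project is run in the development environment, the reminder appears
in the Immediate window every time SaveCustomer executes.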
Component-level testing is the next level up from unit
testing. A component can have fairly straightforward functionality, but it is
just complex enough to warrant breaking down the actual implementation into
several smaller units. For example, a logical process could be specified that
calculates the monthly salary for an individual. This process might consist of
the following operations:
- Extract from the database the number of hours worked in the month.
- Calculate the amount of gross pay.
- Add a bonus amount (optional).
- Make all standard deductions from this amount.
Each operation will probably have different requirements. For
example, the database extraction will need error handling to allow for the
usual group of possibilities (user record not found, database service not
available, and so on). The calculations will need to prevent numeric type
errors (divide by zero, mainly), and if they are remote components, they will
have to raise fresh errors. Therefore, the entire component (for example,
CalcMonthlySalary) will consist of four smaller units (GetHoursForEmployee,
CalcGrossPay, GetBonusAmount, and CalcDeductions), but CalcMonthlySalary will
still be small enough to qualify as a unit (for testing purposes).
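To make the breakdown concrete, here is a rough sketch of how the component and
its four units might fit together. The routine names come from the text above,
but the parameters, the stubbed return values, the hourly rate, and the flat
20 percent deduction are all assumptions of mine, not a real payroll design.

```vb
Option Explicit

' The component: combines the four smaller units.
Public Function CalcMonthlySalary(ByVal EmployeeID As Long) As Currency
    Dim Hours As Double
    Dim Gross As Currency

    Hours = GetHoursForEmployee(EmployeeID)
    Gross = CalcGrossPay(Hours, 12.5)           ' assumed hourly rate
    Gross = Gross + GetBonusAmount(EmployeeID)
    CalcMonthlySalary = Gross - CalcDeductions(Gross)
End Function

Private Function GetHoursForEmployee(ByVal EmployeeID As Long) As Double
    ' Stub: the real unit would query the database and handle errors
    ' such as "record not found" or "database not available".
    GetHoursForEmployee = 160
End Function

Private Function CalcGrossPay(ByVal Hours As Double, _
                              ByVal HourlyRate As Currency) As Currency
    CalcGrossPay = Hours * HourlyRate
End Function

Private Function GetBonusAmount(ByVal EmployeeID As Long) As Currency
    ' Stub: no bonus in this sketch.
    GetBonusAmount = 0
End Function

Private Function CalcDeductions(ByVal Gross As Currency) As Currency
    ' Assumed flat 20 percent deduction, for illustration only.
    CalcDeductions = Gross * 0.2
End Function
```

Because each unit is a separate function, each one can be tested on its own
before CalcMonthlySalary is tested as a whole.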
To test a defined unit, a series of scenarios should be
devised that guarantees every line of code will be executed at some time (not
necessarily in the same test). For example, if a function includes an
If..Then..Else statement, at least two test scenarios should be devised, one to
cover each path of execution. If it is a function that is being tested,
defining the expected result of the test is generally easier because the return
value of the function can be tested for correctness or reasonableness. However,
if you are testing a subroutine, you can check only the effect(s) of calling
the routine because there is no return value. I generally have a bias toward
writing routines as functions where this is reasonable. For some operations,
particularly GUI manipulation, it is not so necessary or beneficial because an
error should generally be contained within the routine in which it occurred.
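The following sketch shows what a pair of scenarios for an If..Then..Else might
look like. CalcPayWithOvertime and its overtime rule (time and a half beyond
160 hours) are invented for this example; the point is only that each branch
gets at least one test, with the boundary value thrown in for good measure.

```vb
Option Explicit

' Hypothetical unit under test: hours beyond 160 are paid at 1.5 times
' the normal rate.
Public Function CalcPayWithOvertime(ByVal Hours As Double, _
                                    ByVal Rate As Currency) As Currency
    If Hours <= 160 Then
        CalcPayWithOvertime = Hours * Rate
    Else
        CalcPayWithOvertime = (160 * Rate) + ((Hours - 160) * Rate * 1.5)
    End If
End Function

Public Sub TestCalcPayWithOvertime()
    ' Scenario 1: exercises the Then branch (no overtime).
    Debug.Assert CalcPayWithOvertime(100, 10) = 1000

    ' Scenario 2: exercises the Else branch (10 hours of overtime).
    Debug.Assert CalcPayWithOvertime(170, 10) = 1750

    ' Boundary: exactly 160 hours should still take the Then branch.
    Debug.Assert CalcPayWithOvertime(160, 10) = 1600

    Debug.Print "TestCalcPayWithOvertime finished"
End Sub
```

Running TestCalcPayWithOvertime from the Immediate window will stop on any
Debug.Assert line whose condition is False.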
In a small system, the developer would likely perform this
level of testing. In a larger system, the developer would still perform the
initial test, but a separate individual would most likely conduct a more formal
version of the test.
Integration Testing
This is the next level up and is concerned with confirming
that no problems arise out of combining unit components into more complex
processes. For example, two discrete functions might appear to test
successfully in isolation, but if function B is fed the output of function A as
one of its parameters, it might not perform as expected. One possible cause
might be incorrect or insufficient data validation. Using the previous example
of the calculation of the single net salary figure, the actual system might
implement a menu or command button option to calculate the salaries for all
employees and produce a report of the results. It is this entire routine that
qualifies as an integration test.
As with unit testing, it is important to write test plans that
will execute along all conceivable paths between the units. Integration
testing, by its nature, will probably be performed by a dedicated tester, except
for small projects.
System Testing
System testing is concerned with the full build of an
application (or application suite). At this level, the emphasis is less on bug
hunting per se, and more on checking that the various parts of the system
correctly interact with each other. The level of testing that would be
conducted at this phase would be more systemwide. For example, it could include
correct initialization from the Registry, performance, unexpected termination
of resources (for example, database connections being terminated when other
parts of the system still expect them to be there), logon failures, error
recovery and centralized error handling (if appropriate), correct GUI behavior,
and correct help file topics, to name just a few.
A system test is conducted on a complete build of the
application under construction or at least on a specified phase of it. Ideally,
it should be in the state in which the end user will see it (for example, no
test dialog boxes popping up and no "different ways of doing things until
we code that part of the interface"). Therefore, it should be as complete
as possible. In my opinion, the testing cycle should also include the system
installation task and not just the execution of the application. If you are
developing software for corporatewide use, it is highly unlikely that you will
be performing the installations. Most large corporations have dedicated
installation teams, and these people are still end users in that they will be
running software that you have generated. On the other hand, if you are
developing commercial software, the setup program is the first thing the
"real" user will see. First impressions count. The Setup Wizard has
matured into a very useful tool, but you should still test its output.
User Acceptance Testing
User acceptance testing happens when a tested version of the
specified deliverable is made available to a selected number of users who have
already received training in the use of the system. In this scenario, the users
chosen to perform the tests will be expected to give the system the kind of
usage that it will receive in real life. The best way to perform this testing
is to get the users to identify an appropriate set of data for the system test
and to enter it into the system themselves. This data is most useful if it is
real rather than hypothetical. Whatever kind of processing the system performs
can then be instigated (for example, printing reports) and the results
scrutinized carefully. Ideally, the development and testing team will have
little or no input into this process, other than to answer questions and to
confirm the existence of any bugs that crop up. Apart from this careful input
of prepared data, the system should also be used "normally" for a
while to determine the level of confidence that can be attributed to the
system. If this confidence level is satisfactory, the system can be signed off
and a system rollout can commence. If possible, a partial rollout would
initially be preferable, not only for prolonged confidence tests, but also to
ease the burden on the support team. These people will encounter more queries
as to the use of the system during these early days than at any other time
during its lifetime, so if the volume of queries can be spread out, so much the
better. It also gives them an idea of the most frequently asked questions so
that they can organize their knowledge base accordingly.
Regression Testing
Regression testing is the repetition of previously run tests
after changes have been made to the source code. The purpose is to verify that
things in the new build still work according to specification and that no new
bugs have been introduced in the intervening coding sessions. Although it is
impossible to quantify precisely (some have tried), figures that I have come
across from time to time suggest that for every ten bugs that are identified
and cleared, perhaps another four will be introduced. This sounds like a
realistic figure to me, although I would apply it more to process code rather
than event handlers, which are more self-contained (which, of course, is a
benefit of the object-based model that Visual Basic employs). As you continue
each test/debug iteration, the overall number of bugs in the code should
decrease correspondingly until a shippable product exists.
Code Reviews
The code review (or inspection) process is a major part of the
software quality cycle, and it is also one of the most important. It is an
acknowledgment that the creation of test scripts or the use of automated
testing packages only goes so far in assuring the quality of the code.
Computers do not yet possess the levels of reasoning necessary to look at a
piece of code and deduce that it is not necessarily producing the result
specified by the design document. I guess when that day comes, we'll all be out
of a job.
The code review is the process whereby the human mind reads,
analyzes, and evaluates computer code, assessing the code in its own right
instead of running it to see what the outcome is. It is, as the name suggests,
a thorough examination of two elements:
- The code itself
- The flow of the code
A code review should also ascertain whether the coding style
used by the developer violates whatever in-house standards might have been set
(while making allowances for personal programming styles). On a fairly large
project a review should probably be conducted twice. The first review should be
early on in the development, for example when the first few substantial pieces
of code have been written. This will allow any bad practices to be nipped in
the bud before they become too widespread. A subsequent review much later in
the development cycle should then be used to verify that the code meets the
design criteria.
The value of this process should not be taken lightly; it's a
very reliable means of eliminating defects in code. As with anything, you
should start by inspecting your own code and considering what the reviewer is
going to be looking for. The sorts of questions that should come up are along
these lines:
- Has the design requirement been met?
- Does it conform to in-house development standards?
- Does the code check for invalid or unreasonable parameters
(for example, a negative age in a customer record)?
- Is the code Year 2000 compliant?
- Are all handles to resources being closed properly?
- If a routine has an early Exit Sub or Exit Function statement, is everything
tidied up before it leaves? For example, an RDO handle could still be open.
(The current versions of Windows are much better than their predecessors were
at tidying up resources, but it's still sloppy programming not to close a
resource when you are done with it.)
- Are all function return codes being checked? If not, what is the point of the
function being a function instead of a subroutine?
- Is the code commented sufficiently?
- Are Debug.Assert statements used to their best advantage? We've been waiting a
long time for this, so let's use it now that we have it.
- Are there any visible suggestions that infinite loops can occur? (Look for such
dangerous constructs as Do While True.)
- Is the same variable used for different tasks within the same procedure?
- Are algorithms as efficient as possible?
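To illustrate the question about tidying up before an early exit, here is one
way a routine might be arranged so that every path leaves through the same
cleanup code. I've used an ordinary file handle rather than an RDO handle to
keep the sketch self-contained; CountLines and its return value of -1 on
failure are my own assumptions.

```vb
Option Explicit

' Hypothetical routine: returns the number of lines in a text file,
' or -1 if anything goes wrong.
Public Function CountLines(ByVal FileName As String) As Long
    Dim FileNum As Integer
    Dim LineText As String
    Dim IsOpen As Boolean

    On Error GoTo ErrHandler

    FileNum = FreeFile
    Open FileName For Input As #FileNum
    IsOpen = True

    Do While Not EOF(FileNum)
        Line Input #FileNum, LineText
        CountLines = CountLines + 1
    Loop

CleanUp:
    ' Single exit point: both the normal path and the error path
    ' arrive here, so the handle is always closed.
    If IsOpen Then
        IsOpen = False      ' cleared first so a failing Close cannot loop
        Close #FileNum
    End If
    Exit Function

ErrHandler:
    CountLines = -1
    Resume CleanUp
End Function
```

A reviewer can then check one place for cleanup instead of hunting for every
Exit Function in the routine.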
Testing Visual Basic Code
When you're writing a piece of code in any language, it is
important to continually ask yourself, "How am I going to test this?"
There are several general approaches that you can take.
Partner With Another Developer
One good approach to testing is to partner with another
developer with the understanding that you will test each other's code. Then, as
you type, you will be asking yourself, "How would I test this if I were
looking at this code for the first time? Would I understand it, and would I
have all the information that I needed?" Some questions are inevitable,
but I have found that if you know from the outset that somebody else is going
to perform a unit-level test on your code without the same assumptions or
shortcuts that you have made, that is excellent news! How many times have you
spent ages looking through your own code to track down a bug, only to spot it
as soon as you start to walk through it with another developer? This is because
we often read what we think we have written rather than what we actually have
written. It is only in the process of single-stepping through the code for the
benefit of another person that our brains finally raise those page faults and
read the information from the screen rather than using the cached copy in our
heads. If you're looking at somebody else's code, you don't have a cached copy
in the first place, so you'll be reading what is actually there. One further
benefit of this approach is that it will prompt you to comment your code more
conscientiously, which is, of course, highly desirable.
Test As You Go
Testing as you go has been written about elsewhere, but it is
something that I agree with so strongly that I'm repeating it here. As you
produce new code, you should put yourself in a position where you can be as
certain as possible of its performance before you write more code that relies
on it. Most developers know from experience that the basic architecture needs
to be in place and stable before they add new code. For example, when writing a
remote ActiveX server that is responsible for handling the flow of data to and
from Microsoft SQL Server, you will need a certain amount of code to support
the actual functionality of the server. The server will need some form of
centralized error handler and perhaps some common code to handle database
connections and disconnections. If these elements are coded, but development
continues on the actual data interfaces before these common routines are
tested, the first time you try to run the code, there will be many more things
that can go wrong. It's common sense, I know, but I've seen this sort of thing
happen time and again.
The first and most obvious way to test a new piece of code is
to run it. By that, I don't mean just calling it to see whether the screen
draws itself properly or whether the expected value is returned. I mean
single-stepping through the code line by line. If this seems too daunting a
task, you've already written more code than you should have without testing it.
The benefit of this sort of approach is that you can see, while it's still
fresh in your mind, whether the code is actually doing what you think it's
doing. This single concept is so important that Steve Maguire devotes an entire
chapter to it in his book Writing Solid Code (Microsoft Press, 1995).
|
Tip
Sometimes you will need to code routines that perform actions that will be
difficult or impossible to reverse. When such routines fail, they might leave
your application in an unstable state. An example might be a complicated file
moving/renaming sequence. Your ability to test such code will be limited if you
know that it might fail for unavoidable reasons. If you can predict that a
sequence of operations might fail and that you can't provide an undo facility,
it helps the user to have a trace facility. The idea is that each action that
is performed is written to a log window (e.g. a text box with the multiline
property set to True). If the operation fails, the user has a verbose listing
of everything that has occurred up to that point and can therefore take
remedial action.
|
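A trace facility of this kind need not be elaborate. The sketch below assumes a
form containing a text box named txtTrace whose MultiLine property was set to
True at design time; the Trace and MoveReport routines are hypothetical names
of my own.

```vb
Option Explicit

' Appends one timestamped line to the trace window.
Private Sub Trace(ByVal Message As String)
    txtTrace.SelStart = Len(txtTrace.Text)
    txtTrace.SelText = Format$(Now, "hh:nn:ss") & "  " & Message & vbCrLf
End Sub

' A two-step move that cannot be undone automatically if it fails
' partway through, so each step is logged before it is attempted.
Private Sub MoveReport(ByVal Source As String, ByVal Target As String)
    On Error GoTo Failed

    Trace "Copying " & Source & " to " & Target
    FileCopy Source, Target

    Trace "Deleting " & Source
    Kill Source

    Trace "Move completed"
    Exit Sub

Failed:
    Trace "FAILED: " & Err.Description
End Sub
```

If the Kill statement fails, the last lines of the log tell the user that the
copy succeeded and the original file is still in place.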
Create Regular Builds
I have always been a fan of regular system builds. They force
the development team to keep things tidy. If everybody knows that whatever they
are doing is going to have to cooperate with other parts of the system every
Friday (for example), it is less likely that horrendously buggy bits of
half-completed code will be left in limbo. Code left in this state will
sometimes not be returned to for several weeks, by which time the original
developer will have forgotten which problems were outstanding and will have to
rediscover them from scratch.
If I'm writing a set of remote ActiveX servers, I will
generally try to have a new build ready each Monday morning for the other
developers to use. If I'm working on a large GUI-based system, I will probably
look more toward a build every other week. It's hard to be precise, however,
because there are always influential factors and, of course, you need the
necessary numbers of staff to do this. If you are in a team development, I
suggest that this is something you should discuss among yourselves at the
beginning of the coding cycle so that you can obtain a group consensus as to
what is the best policy for your particular project. It is likely that you will
get some slippage, and you might well decide to skip the occasional build while
certain issues are resolved, but overall, creating regular builds will allow
everybody to get a feel for how the project is shaping up.
If you consider yourself to be a professional developer or
part of a development team, you should be using a source code control system
(e.g. Microsoft Visual SourceSafe). I recommend that you only check in code
that will not break a build. This helps maintain the overall quality of the
project by keeping an up-to-date, healthy version of the system available at
all times.
Writing Test Scripts At The Same Time That You Code
Having stepped through your code, you need to create a more
formal test program that will confirm that things do indeed work. Using a test
script allows for the same test to be run again in the future, perhaps after
some changes have been made. The amount of test code that you write is really a
matter of judgment, but what you're trying to prove is that a path of execution
works correctly and any error conditions that you would expect to be raised are
raised. For critical pieces of code (the mortgage calculation algorithm for a
bank, for example) it might be worthwhile to actually write the specific code a
second time (preferably by someone else) and then compare results from the two.
Of course, there is a 50 percent chance that if there is a discrepancy, it is
in the test version of the algorithm rather than the "real" version,
but this approach does provide a major confidence test. I know of a company
that was so sensitive about getting the accuracy of an algorithm correct that
they assigned three different developers to each code the same routine. As it
happened, each piece of code produced a slightly different answer. This was
beneficial because it made the analyst behind this realize that he had not
nailed down the specification tightly enough. This is a good example of the
prototype/test scenario.
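As a small illustration of comparing two versions of the same calculation, the
test script below checks a hand-coded loan payment routine against Visual
Basic's built-in Pmt function. MonthlyPayment is a hypothetical routine, the
loan figures are arbitrary, and the tolerance of half a cent is my own choice.

```vb
Option Explicit

' Hypothetical "real" version of the algorithm: the standard annuity
' formula. It does not handle a zero interest rate.
Public Function MonthlyPayment(ByVal Principal As Double, _
                               ByVal AnnualRate As Double, _
                               ByVal Months As Long) As Double
    Dim r As Double
    r = AnnualRate / 12
    MonthlyPayment = Principal * r / (1 - (1 + r) ^ (-Months))
End Function

Public Sub TestMonthlyPayment()
    Dim Expected As Double
    Dim Actual As Double

    ' Second opinion: Pmt returns a negative figure (money paid out)
    ' for a positive loan amount, hence the sign change.
    Expected = -Pmt(0.06 / 12, 360, 100000)
    Actual = MonthlyPayment(100000, 0.06, 360)

    ' The two versions should agree to within half a cent.
    Debug.Assert Abs(Actual - Expected) < 0.005

    Debug.Print "TestMonthlyPayment finished"
End Sub
```

Because the script is just another procedure, it can be rerun unchanged after
every modification to MonthlyPayment.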
Decide Where To Put Test Code
This might seem like a strange heading, but what we need to
consider is whether the nature of the development warrants a dedicated test
harness program or whether a bolt-on module to the application itself would be
suitable. Let's examine this further.
A major component (for example, a remote ActiveX server) has
clearly defined public interfaces. We want to test that these interfaces all
work correctly and that the expected results are obtained, and we also need to
be sure that the appropriate error conditions are raised. Under these
circumstances, it would be most suitable to write an application that links up
to the remote server and systematically tests each interface. However, let's
say a small, entirely self-contained GUI-based application is being created (no
other components are being developed and used at the same time for the same
deliverable). In this case, it might be more appropriate to write the test code
as part of the application but have the test interfaces (for example, a menu
item) only be visible if a specific build flag is declared.
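One way to arrange this is with conditional compilation. In the sketch below,
TEST_BUILD is a flag name I have made up; it could be supplied in the
Conditional Compilation Arguments box of the Project Properties dialog (for
example, TEST_BUILD = 1). The form is assumed to have a menu item named
mnuTest, and RunAllTests is a placeholder for whatever test code you write.

```vb
Option Explicit

Private Sub Form_Load()
    ' Show the test menu only in a test build. If TEST_BUILD is not
    ' defined at all, it is treated as 0 and the #Else branch is compiled.
#If TEST_BUILD Then
    mnuTest.Visible = True
#Else
    mnuTest.Visible = False
#End If
End Sub

#If TEST_BUILD Then
' Everything in this block is left out of a normal build.
Private Sub mnuTest_Click()
    RunAllTests
End Sub

Private Sub RunAllTests()
    ' Placeholder: call the individual test routines from here.
    Debug.Print "Running tests..."
End Sub
#End If
```

With the flag removed, neither the menu handler nor the test code is compiled
into the application.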
Ensure Source Code Coverage During Testing
A series of test scripts should, of course, run every single
line of code in your application. Every time you have an If statement, or a
Select Case statement, the number of possible execution paths increases
rapidly. This is another reason why it's so important to write test code at the
same time as the "real" code: it's the only way you'll be able to keep
up with every new execution path.
The Visual Basic Code Profiler (VBCP) add-in is able to report
the number of times each line of code is executed in a run. Using VBCP while
testing your code will allow you to see which lines have been executed zero
times, enabling you to quickly figure out which execution paths have no
coverage at all.
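To show what a line-count report of this kind amounts to, here is a small sketch in Python. It is not VBCP and makes no claim about how VBCP works internally; it uses Python's sys.settrace hook to count how often each line of one routine runs, then lists the lines that were never reached. The routine classify and the helper names are invented for the example.

```python
import dis
import sys
from collections import Counter


def classify(n):
    if n < 0:
        return "negative"
    if n == 0:
        return "zero"
    return "positive"


def count_lines(func, calls):
    """Count executions of each line of func, keyed by offset from its def line."""
    hits = Counter()
    code = func.__code__

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None                  # ignore every other routine
        if event == "line":
            hits[frame.f_lineno - code.co_firstlineno] += 1
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        for args in calls:
            func(*args)
    finally:
        sys.settrace(previous)
    return hits


def uncovered_lines(func, calls):
    """Return offsets of executable lines that the given calls never ran."""
    code = func.__code__
    executable = {
        line - code.co_firstlineno
        for _, line in dis.findlinestarts(code)
        if line is not None
    }
    executable.discard(0)                # the def line itself
    return sorted(executable - set(count_lines(func, calls)))
```

Calling uncovered_lines(classify, [(5,), (-1,)]) reports offset 4, the "zero" branch, which tells you that a test with n = 0 is still missing.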
Understanding The Test Data
This is an obvious point, but I mention it for completeness.
If you are responsible for testing a system, it is vital that you understand
the nature and meaning of whatever test data you are feeding to it. This is one
area in which I have noticed that extra effort is required to coax the users
into providing the necessary information. They are normally busy people, and
once they know that their urgently needed new system is actually being
developed, their priorities tend to revert to their everyday duties. Therefore,
when you ask for suitable test data for the system, it should be given to you
in a documented form that is a clearly defined set of data to be fed in. This
pack of test data should also include an expected set of results to be
achieved. This data should be enough to cover the various stages of testing
(unit, integration, and system) for which the development team is responsible.
You can bet that when the users start user acceptance testing, they will have a
set of test data ready for themselves, so why shouldn't they have a set ready
for you? Coax them, cajole them, threaten them, raise it in meetings, and get
it documented, but make sure you get that data. I realize that if you are also
the developer (or one of them), you might know enough about the system to be
able to create your own test data on the users' behalf, but the testing process
should not make allowances for any assumptions. Testing is a checking process,
and it is there to verify that you have understood the finer points of the
design document. If you provide your own test data, the validity of the test
might be compromised.
Get The Users Involved
The intended users of a system invariably have opinions while
the system is under development and, if given the opportunity to express these
opinions, can provide valuable feedback. Once a logical set of requirements
starts to become a real-life set of windows, dialog boxes, and charts that the
user can manipulate, ideas often start to flow. This effect is the true benefit
of prototyping an application because it facilitates early feedback. It is
inevitable that further observations will be forthcoming that could benefit the
overall usability or efficiency of the finished result. Unless you are working
to very tight deadlines, this feedback should be encouraged throughout the
first half of the development phase (as long as the recommendations that users
make are not so fundamental that the design specification needs to be changed).
A good way of providing this allowance for feedback is to make a machine
available with the latest system build that is accessible to anybody. This will
allow people to come along at any time and play. This is a very unstructured
approach, but it can lead to a lot of useful feedback. Not only can design
flaws be spotted as the system progresses, but other pairs of eyes become
involved in the debugging cycle.
To make this informal approach work, it is necessary to
provide a pile of blank user feedback forms that anybody can fill out and leave
in some prearranged in-tray for the attention of the development team. A
nominated individual should be responsible for maintaining a log of these
feedback reports and should coordinate among the development team any actions
that arise out of them. I've included a sample feedback form on the
accompanying CD (see the list of Word templates at the beginning of this chapter).
Of course, a more elegant and up-to-date approach would be to use an
intranet-based electronic form that captures all such feedback and bug reports.
Having extolled the virtues of allowing the users to give you
continual feedback, I must point out one disadvantage with this approach. If
the currently available build is particularly buggy or slow (or both), this
could quite possibly cause some anxiety among the users and thus could earn the
system a bit of bad publicity before it gets anywhere near to going live.
Again, common sense is the order of the day. Some users are familiar with the
development cycle and will take early-build system instabilities in their
stride, but others won't. Make the most of the users and the knowledge that
they can offer, but don't give them a reason to think that the final system
will be a dog!
Track Defects
I mentioned earlier the importance of good project management,
and now we are going to return to this concept. Depending on the size and
structure of your project, the responsibility for keeping a record of the
defects will either rest with a dedicated test lead or with the project lead
(who is therefore also the test lead). Developers will find bugs in their own
code, and in many cases will fix them there and then. However, some bugs will
be too elusive to track down quickly, or there might not be time to fix them,
so they should be raised with the test lead. Faults will also be raised by the
users during their own testing, and also by anybody else involved in the test
program. Unfortunately, it is quite possible that faults may continue to be
found after the application has been released.
A suitable defect-tracking system will allow for the severity
of defects to be graded to different levels (from show-stopper to irrelevant),
and for each defect to be allocated to a specific member of the development
team. Ideally it should also tie in with the local email system. It is
important to maintain an efficient means of tracking all defects so that the
overall health of the project can be continually monitored. Toward the end of a
development of any size there is normally considerable pressure from the
business for it to be released. Before this can happen, however, the project
lead and the user lead will need to continually review the defect status list
until a point is reached when the user lead is satisfied with the correctness
of the system. This can only be properly achieved by maintaining a thorough,
central log of the current health of the system.
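A defect log of this kind needs very little structure. The sketch below, in Python, shows one possible shape: a severity grade, an owner, and an open/closed flag. The field names, the four severity levels, and the release rule (no open show-stoppers or major defects) are assumptions for illustration; in practice the rule is whatever the project lead and user lead agree on.

```python
from collections import Counter
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    SHOW_STOPPER = 1
    MAJOR = 2
    MINOR = 3
    IRRELEVANT = 4


@dataclass
class Defect:
    defect_id: int
    summary: str
    severity: Severity
    assigned_to: str
    is_open: bool = True


def open_by_severity(defects):
    """Summarize the health of the project: open defects per severity grade."""
    counts = Counter(d.severity for d in defects if d.is_open)
    return {grade.name: counts.get(grade, 0) for grade in Severity}


def ready_for_release(defects):
    """Assumed rule: nothing graded MAJOR or worse may still be open."""
    return not any(d.is_open and d.severity <= Severity.MAJOR for d in defects)
```

The summary from open_by_severity is the figure that the project lead and user lead would review together as release pressure builds.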
Test Plans
A test plan is analogous to the main design document for a
system. Though focused entirely on how the system will be tested rather than on
what should be in the system, the test plan should be written with the same
degree of seriousness, consideration, and checking as the main design document
because it determines the quality of the system. The secret of a good plan is
that it should allow any team member to continue in your absence. One day in
the future, you will have moved on, but the system will still be there.
Companies very rarely stand still these days, and changes to their working
practices (and therefore to the system) will follow. Whatever changes need to be
made, the new development team will be tremendously encouraged if they have
test scripts that are documented and are known to work from the start.
Test plans have other purposes than the reasons I describe
above. They provide a formal basis from which to develop repeatable (that is,
regression) tests. As systems evolve or as new builds are created during the
debug cycle, it is essential to know that the existing stability of the system
has not been broken. This can best be achieved through an ability to run the
same tests over and over as each new build is produced. Also, test plans
provide a basis from which the test strategy can be inspected and discussed by
all interested parties.
A good test plan will start with a description of the system
to be tested, followed by a brief discussion of the test's objectives. The
following elements should be included in the plan:
- The objectives of the test exercise.
- A description of how the tests will be performed. This will explain the various
  degrees of reliance that will be made on key testing components, such as
  rerunnable test scripts, manual checklists, end-user involvement, and so on.
- A description of the environment in which the test will occur. For example, if
  your organization supports several base environment configurations, you should
  clearly state which of them you will be testing against.
- A listing of the test data that will need to be made available for the tests to
  be valid.
- A discussion of any restrictions that might be placed on the test team that
  could have an impact on the reliability of the test results. For example, if
  you are testing a system that is likely to be used by a very large number of
  people and that accesses a central database, it might be difficult for you to
  simulate this level of volume usage.
- A declaration of the relative orders of importance that you are placing on
  different criteria (for example, your concern for robustness compared to that
  of performance).
- Any features that you will not be testing, with a commentary explaining why not
  (to enlighten those who come after you).
- An intended test schedule showing milestones. This should tie into the overall
  project plan.
Then, using the same breakdown of functionality as was
presented in the design specification, start to list each test scenario. Each
scenario should include:
- A reference to the item to be tested
- The expected results
- Any useful comments that describe how these test results can definitely confirm
  that the item being tested actually works properly (success criteria)
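The three parts of a scenario map naturally onto a small record, which also makes the scenarios repeatable from build to build. The following Python sketch is one possible layout, not a prescribed format; the stock_value routine and the two sample scenarios are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable


def stock_value(stock_level, unit_price):
    """Hypothetical item under test."""
    return stock_level * unit_price


@dataclass
class Scenario:
    reference: str              # the item in the design specification
    action: Callable[[], Any]   # how the item is exercised
    expected: Any               # the expected result
    success_criteria: str       # why a match shows that the item works


def run_scenarios(scenarios):
    """Run every scenario and return (reference, verdict, actual) rows."""
    report = []
    for scenario in scenarios:
        actual = scenario.action()
        verdict = "PASS" if actual == scenario.expected else "FAIL"
        report.append((scenario.reference, verdict, actual))
    return report


SCENARIOS = [
    Scenario("StockValue, normal case", lambda: stock_value(10, 250), 2500,
             "Value equals level times price for a typical record."),
    Scenario("StockValue, empty stock", lambda: stock_value(0, 250), 0,
             "An item with no units held contributes nothing."),
]
```

Because the list is data, the same scenarios can be rerun unchanged against each new build, which is what makes them usable as regression tests.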
Test Scripts
A test script can be either a set of instructions to a user or
to another piece of code. Generally speaking, I am referring to code-based test
scripts in this section. So a good test script should be approached in the same
way as the code that it is supposed to be testing. Therefore, it should be
designed, documented, commented, and tested. Tested? No, that doesn't
necessarily mean writing a test script for it, but it does mean single-stepping
through your test code while it runs to ensure that it is doing what you expect
it to do. If the code that you are testing is a particularly important piece,
the test code should be inspected and walked through as with any normal code.
The following rules apply to test scripts:
- Test script functionality should be kept in sync with the application code.
- The version/revision number of the test script must be the same as the
  application.
- Test scripts should be version controlled, just like the application code. Use
  Microsoft Visual SourceSafe (or an equivalent) to keep track of any changes
  that you make. That way, if you need to roll back to an earlier version of the
  code for any reason, you will have a valid set of test scripts to go with it.
Stubs and Drivers
An application is basically a collection of software units
connected by flow-control statements. The best time to test each individual
unit is immediately after it has been written, if for no other reason than it
is fresh in your mind (and because if you don't do it now, you'll never have
the time later). Of course, having a software unit that relies on a call to
another unit is only testable if you either comment out the call or substitute
a dummy implementation. This dummy is known as a stub. Conversely, if you are
testing a unit that would normally be called by a higher-level unit, you can
create a temporary calling routine, called a driver. Let's take a closer look
at these concepts.
Stubs
A stub is a temporary replacement piece of code that takes the
place of a unit that has yet to be written (or made available by another
developer). The implementation of the stub can be simple or somewhat complex,
as conditions require. For instance, either it can be hard-coded to return a
set value, or it can perform any of the following:
- Provide a validation of the input parameters.
- Provide a realistic delay so as not to convey a false impression that your new
  application is lightning-fast.
- Provide a quick-and-dirty implementation of the intended functionality of the
  unit that you are substituting. Be careful not to be too quick-and-dirty;
  otherwise, you'll waste valuable time debugging throwaway code.
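A stub combining several of these behaviors might look like the sketch below. It is written in Python for illustration; the routine name, the eight-character limit on the stock code, and the fixed return value are all assumptions. The stub reports each call through a log callable, validates its input, optionally waits, and then returns a hard-coded price.

```python
import time


def get_unit_price_stub(stock_code, delay_seconds=0.0, log=print):
    """Stand-in for a price lookup that has not been written yet."""
    # Report the input so that an unexpected value is visible immediately.
    log(f"get_unit_price_stub called with stock_code={stock_code!r}")

    # Validate the parameter as the real unit is expected to.
    if not isinstance(stock_code, str) or not 1 <= len(stock_code) <= 8:
        raise ValueError(f"bad stock code: {stock_code!r}")

    # Pretend to do some work so that timings are not flattering.
    time.sleep(delay_seconds)

    # Hard-coded result; the real unit would look this up.
    return 9.99
```

Passing a list's append method as log (instead of print) lets a test script inspect exactly what the caller sent.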
A useful task that you can perform with a stub is to pass the
input parameters into the debug window. In most cases, this will merely show
you what you expect to see, but it will occasionally throw up a parameter value
that you never expected. Although you would have (probably) found this out
anyway, you will have immediately been given a visible sign that there is
something wrong. While formalized testing is a good method of identifying bugs,
so is the commonsense observation process ("that can't be right").
Drivers
These either contain or call the unit that you are testing,
depending on the nature of the code. For a simple unit of code such as a
calculation routine, a dedicated piece of test code in another module is
sufficient to call the piece of code being tested and to check the result. The
idea of using a driver is to provide a basic emulation of the calling
environment to test the unit.
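For a calculation routine, a driver can be as small as the following sketch (Python, for illustration). The unit discounted_price and its test cases are invented for the example; prices are held in whole pence to keep the arithmetic exact.

```python
def discounted_price(price_pence, percent):
    """Hypothetical unit under test: apply a percentage discount."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return price_pence * (100 - percent) // 100


def drive_discounted_price():
    """Driver: call the unit as its real caller would and collect failures."""
    cases = [((1000, 10), 900), ((999, 0), 999), ((1000, 100), 0)]
    failures = []
    for args, expected in cases:
        actual = discounted_price(*args)
        if actual != expected:
            failures.append(f"{args}: expected {expected}, got {actual}")

    # The error path matters as much as the normal results.
    try:
        discounted_price(1000, 101)
        failures.append("a discount over 100 percent was accepted")
    except ValueError:
        pass
    return failures
```

An empty list from the driver means that every case behaved as expected; anything else is a readable description of what went wrong.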
The advent of the ActiveX interface now means that it is
possible to invoke a test container simply by creating a reference to your
piece of code in a new instance of Visual Basic. This does, of course, mean
that your code must be given a public declaration and so on, but this
client/server-based approach truly leads to flexibility in your systems. And of
course, if you are creating ActiveX documents, you can test your development
only in a driver-style environment (for example, Microsoft Internet Explorer).
Planning a Code Component
As I said at the beginning, the most important concept that I
want this chapter to convey is the necessity of writing testable code. Less
experienced Visual Basic programmers have a tendency to implement a lot of
functionality directly beneath command buttons. I've certainly seen instances
where there are several hundred lines of code behind a button: code that
updates the screen, displays and processes the results from a File Open dialog
box, reads from the registry, performs a database access, and then writes to
the screen again. I've even seen code like this exist in two places: behind a
command button and again underneath a menu item. (Cut and paste can be such a
useful facility.)
When you write a piece of code it needs to be as
function-specific as possible. Therefore the monolithic code block that I've
just described should be broken down into small routines. First of all there
should be very little code behind a button: ideally, a call to another routine,
but a small number of relatively safe commands is acceptable. If there is a
need for a large amount of processing, there should be one procedure that
controls the overall flow and control of the process, and that procedure calls
function-specific routines. In the description I gave above, the database
access should be a separate routine, as should the registry code, and so on.
This is good practice, and there is no taboo in having many small private
routines attached to a form, a module, a class, or whatever. The testing is so
much easier this way, and it also makes for much more specific error handling
code. It's also tidier, of course.
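The shape being recommended can be sketched as follows. The example is in Python rather than Visual Basic, and the routine names, the DefaultFolder setting, and the use of plain dictionaries in place of the registry and the database are all assumptions made to keep it self-contained.

```python
def read_default_folder(settings):
    """Registry access lives in one routine."""
    return settings.get("DefaultFolder", ".")


def load_records(database, table):
    """Database access lives in one routine."""
    return list(database.get(table, []))


def summarize(records, folder):
    """Pure formatting, with nothing to set up before testing it."""
    return f"{len(records)} record(s) loaded (default folder: {folder})"


def process_import(settings, database, table):
    """Controls the overall flow and delegates each step."""
    folder = read_default_folder(settings)
    records = load_records(database, table)
    return summarize(records, folder)


def on_import_click(settings, database, show):
    """Button handler: one call to the controlling routine, one screen update."""
    show(process_import(settings, database, "StockList"))
```

Each small routine can now be exercised by a driver on its own, and the same process_import call can sit behind both a button and a menu item without any cut and paste.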
It's important not to get too formal with the coding, though;
we'd never get it delivered. The code that goes into making a Microsoft Windows
application can be divided into two categories: process specific and interface
specific. In very general terms the test code that goes into the
process-specific sections is what needs to be planned beforehand because it's
the process that actually gets the user's work done. The interface-specific
elements of the system are still important, but they demand a much greater
degree of spontaneous interaction from the user.
To illustrate the planning process I have invented a
functional specification for a small DLL and have included an associated test
plan. Most real-world requirements will be more comprehensive than this but I
don't feel that additional detail would add any extra weight to the point that
I'm trying to get across. All software should be written from a functional
specification (or design document, if you prefer). However, you'll find that if
you write the test plan at the same time as (or immediately after) the
functional specification, you will continually identify test scenarios that
will prompt you to go back to the functional specification to add necessary
error handling directives. This happened to me while I was writing the test
script specification for the example DLL, even though it's only a simple
demonstration.
Functional Specification
Create a prototype ActiveX DLL component
(SERVERDATA.DLL) that encapsulates the StockList table of the SERVERDATA.MDB
database. The table is defined as shown here.
| Field name | Data type | Description |
| ID | AutoNumber | ID number for each record |
| StockCode | Text (length = 8) | Stock code to identify item |
| Name | Text (length = 50) | Name of stock item |
| StockLevel | Number (long integer) | Number of units currently held |
| UnitPrice | Currency | Price of each unit |
General requirements
The following characteristics should be defined for the DLL
component:
1. The DLL component does not explicitly need any startup
code.
2. The DLL component should have two classes defined.
i. CStockList, which provides a logical wrapper around the StockList table. Its
Instancing property should be set to MultiUse.
ii. CStockItem, which acts as a single record representation of the StockList
table. Its Instancing property should be set to PublicNotCreatable.
3. Database access should be performed via ActiveX Data Objects (ADO).
4. The database should be opened during the first call upon it, and should be
closed during the Terminate event of the CStockList class. For the purposes of
this prototype it can be assumed that the client application will not generate
any database activity from the CStockItem class once the CStockList class has
been destroyed.
CStockList
Implement the following interface:
Add method
This method should create a new record in the StockList table
and populate that record with the parameter data. It should check that the
StockCode value has not already been used and that all numeric values are at
least zero (if there is an error then a negative value is returned).
Input parameters:
- StockCode As String
- Name As String
- StockLevel As Long
- UnitPrice As Currency
Count property
(read-only)
This property should return the number of records in the
StockList table.
Item function
This function should create, instantiate, and return an object
of type CStockItem for the record identified by the StockCode parameter. This
function should raise an error in the event of the record not being found.
Input parameters:
1. StockCode As String
ItemList function
This function should return a collection of all StockCode
values that exist within the StockList table.
Input parameters:
None
StockValue property
(read-only)
This property should return the sum of each record's
StockLevel field multiplied by its UnitPrice field.
Remove method
This method should delete the record that matches the supplied
StockCode value.
Input parameters:
1. StockCode As String
CStockItem
Implement the following interface:
Name property
(read/write)
Let/Get for the Name field.
StockCode property
(read/write)
Let/Get for the StockCode field.
StockLevel property
(read/write)
Let/Get for the StockLevel field.
UnitPrice property
(read/write)
Let/Get for the UnitPrice field.
StockValue property
(read-only)
Get property only. Calculated dynamically and is the product
of the UnitPrice field and the StockLevel field.
Update method
This method should apply any changes that are made to any of
the read/write properties.
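To make the shape of the two interfaces concrete, here is an in-memory sketch in Python. It is not the SERVERDATA.DLL implementation: a dictionary stands in for the StockList table, there is no ADO or database handling, and names follow Python conventions. Two behaviors are assumptions that the specification does not state: Remove raises an error when the record is missing, and Update rejects a StockCode that another record already uses.

```python
from decimal import Decimal


class StockItem:
    """One record of the table. Obtain instances through StockList.item()."""

    def __init__(self, owner, record_id):
        self._owner = owner
        self._id = record_id
        row = owner._rows[record_id]
        self.stock_code = row["StockCode"]
        self.name = row["Name"]
        self.stock_level = row["StockLevel"]
        self.unit_price = row["UnitPrice"]

    @property
    def stock_value(self):
        return self.stock_level * self.unit_price

    def update(self):
        """Apply property changes (assumed: a clashing StockCode is refused)."""
        clash = self._owner._find(self.stock_code)
        if clash is not None and clash != self._id:
            raise ValueError("StockCode already in use")
        self._owner._rows[self._id] = {
            "StockCode": self.stock_code, "Name": self.name,
            "StockLevel": self.stock_level, "UnitPrice": self.unit_price,
        }


class StockList:
    """In-memory stand-in for the wrapper around the StockList table."""

    def __init__(self):
        self._rows = {}        # ID -> record
        self._next_id = 1      # mimics the AutoNumber field

    def _find(self, stock_code):
        for record_id, row in self._rows.items():
            if row["StockCode"] == stock_code:
                return record_id
        return None

    def add(self, stock_code, name, stock_level, unit_price):
        """Return the new ID, or -1 if validation fails (as the spec asks)."""
        if self._find(stock_code) is not None or stock_level < 0 or unit_price < 0:
            return -1
        record_id = self._next_id
        self._next_id += 1
        self._rows[record_id] = {
            "StockCode": stock_code, "Name": name,
            "StockLevel": stock_level, "UnitPrice": unit_price,
        }
        return record_id

    @property
    def count(self):
        return len(self._rows)

    def item(self, stock_code):
        record_id = self._find(stock_code)
        if record_id is None:
            raise KeyError(stock_code)
        return StockItem(self, record_id)

    def item_list(self):
        return [row["StockCode"] for row in self._rows.values()]

    @property
    def stock_value(self):
        rows = self._rows.values()
        return sum((r["StockLevel"] * r["UnitPrice"] for r in rows), Decimal(0))

    def remove(self, stock_code):
        record_id = self._find(stock_code)
        if record_id is None:
            raise KeyError(stock_code)   # assumed behavior, not in the spec
        del self._rows[record_id]
```

Writing even a throwaway model like this tends to expose the questions the specification leaves open, such as what Remove should do when nothing matches.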
Test Script Specification
(The idea here is that we want to methodically check
each member of the public interface. Some of the test routines will
automatically make use of some of the other members.)
Objective: To ensure that
each member in the CStockList and CStockItem classes has been run at least
once to ensure correct behavior. This is a component-level test that will be
performed by the development team.
Test methodology: The two
classes are implemented in an ActiveX DLL. This allows for the creation of a
dedicated test harness application that will act as a normal client program.
For this initial test, sample data will be hard-coded into the test harness
application. (The possibility exists to extend this in the future so that test
data will be read from a data file.)
Scope of this strategy: This
is a generic document that outlines a method of testing without providing any
test data. Reasonable test data can be created as required.
Test environment: Windows 98
(full installation), run-time files as installed by Microsoft Visual Basic 6.
No service packs are applied to these products at this time.
SERVERDATA.DLL
Test 1
Members used: Add, Count,
Item, StockValue
Intent: Check that a record
is added successfully.
1. Formulate the data for a new record that doesn't already exist in the table.
2. Create a reference to a CStockList object.
3. Call the Item method using the new StockCode value (from step 1) and verify
   that this raises a "not found" error. This is to check that the
   record doesn't already exist. If it does exist, this test data has already been
   used and so the rest of the test should be aborted.
4. Call the Count property. Check that this tallies with the number of records
   currently in the table (via Microsoft Access?).
5. Call the StockValue property to establish the value Y of total stock currently
   held.
6. Calculate the value X of the new item of stock that is to be added by
   multiplying the StockLevel value with the UnitPrice value.
7. Call the Add method with the new stock data.
8. Call the StockValue property and verify that it is equal to the value of X + Y.
9. Call the Item function to obtain a reference to a CStockItem object. Verify
   that each property matches the original data. Release the reference to the
   CStockItem object.
10. Release the reference to the CStockList object.
SERVERDATA.DLL
Test 2
Members used: Add
Intent: Prove that a new
record with a duplicate key value will be rejected.
1. Create a reference to a CStockList object.
2. Attempt to add a record that already exists. A predictable error should be
   raised (i.e. client error handler should include a check for this specific
   error code being raised).
3. Release the reference to the CStockList object.
SERVERDATA.DLL
Test 3
Members used: Remove
Intent: Check that an
attempt to delete a record that doesn't exist will fail gracefully.
1. Create a reference to a CStockList object.
2. Attempt to remove a record that doesn't exist. A predictable error should be
   raised.
3. Release the reference to the CStockList object.
SERVERDATA.DLL
Test 4
Members used: Item, Update
Intent: Prove that the
CStockItem.Update method will reject an attempt to modify a record where the
StockCode value would be the same as an existing record.
1. Create a reference to a CStockList object.
2. Attempt to rename a StockCode value to an existing value. A predictable error
   should be raised.
3. Release the reference to the CStockList object.
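Test 1 and Test 3 translate almost step for step into harness code. The sketch below is in Python and runs against a deliberately tiny in-memory stand-in (FakeStockList) so that the block is self-contained; a real harness would instead hold a reference to the component itself. The stand-in, the sample record, and the use of KeyError as the "not found" error are assumptions.

```python
from decimal import Decimal


class FakeStockList:
    """Minimal stand-in exposing only the members these two tests use."""

    def __init__(self, rows=()):
        self._rows = {row[0]: tuple(row) for row in rows}

    @property
    def count(self):
        return len(self._rows)

    @property
    def stock_value(self):
        rows = self._rows.values()
        return sum((level * price for _, _, level, price in rows), Decimal(0))

    def add(self, code, name, level, price):
        if code in self._rows:
            raise ValueError("duplicate StockCode")
        self._rows[code] = (code, name, level, price)

    def item(self, code):
        return self._rows[code]          # KeyError acts as "not found"

    def remove(self, code):
        del self._rows[code]             # KeyError acts as "not found"


def run_test_1(stock, known_count):
    """Test 1: check that a record is added successfully."""
    new = ("WIDGET01", "Widget", 12, Decimal("2.50"))    # step 1
    try:                                                 # step 3
        stock.item(new[0])
        return "ABORTED: test data already used"
    except KeyError:
        pass
    assert stock.count == known_count                    # step 4
    y = stock.stock_value                                # step 5
    x = new[2] * new[3]                                  # step 6
    stock.add(*new)                                      # step 7
    assert stock.stock_value == x + y                    # step 8
    assert stock.item(new[0]) == new                     # step 9
    return "PASS"


def run_test_3(stock):
    """Test 3: removing a record that doesn't exist must fail gracefully."""
    try:
        stock.remove("NOSUCH01")
    except KeyError:
        return "PASS"
    return "FAIL: no error raised"
```

Here known_count is the record count obtained independently of the component (step 4 suggests looking at the table directly). Running run_test_1 a second time against the same data returns the ABORTED message, which is the behavior step 3 asks for.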
Performance Testing
Performance testing is somewhat less rigid in its
documentation requirements than the other types of testing. It is concerned
with the responsiveness of the system, which in turn depends on the efficiency
of either the underlying code or the environment in which the system is
running. For example, a database system might work fine with a single tester
connected, but how does it perform when 20 users are connected? For many
systems, performance is just a matter of not keeping the user waiting too long,
but in other cases, it can be more crucial. For example, if you are developing
a real-time data processing system that constantly has to deal with a flow of
incoming data, a certain level of performance expectation should be included in
the design specification.
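The "one tester versus 20 users" question can be explored with a very small load driver. The Python sketch below starts one thread per simulated user and records how long each call takes. The operation being driven (locked_update, which holds an exclusive lock for 10 milliseconds) is a made-up stand-in for a database update that locks a table; the timings it produces illustrate the effect and say nothing about any real database.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

table_lock = threading.Lock()


def locked_update(user_id):
    """Hypothetical operation: hold an exclusive table lock while working."""
    with table_lock:
        time.sleep(0.01)


def simulate_users(operation, users):
    """Run operation once per user, all at once.

    Returns (total elapsed seconds, list of per-user latencies).
    """
    def timed_call(user_id):
        start = time.perf_counter()
        operation(user_id)
        return time.perf_counter() - start

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=users) as pool:
        latencies = list(pool.map(timed_call, range(users)))
    return time.perf_counter() - start, latencies
```

With a single user the call returns almost at once; with 20 users the exclusive lock serializes the work, so the unluckiest user waits roughly twenty times as long. That is the kind of result a single-tester run will never show.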
Performance is partly up to the efficiency of the network
subsystem component within Windows, but it is also up to you. For example, if
you are accessing a database table, what kind of locks have you put on it? The
only way to find out how it will run is through volume testing. But performance
is also a matter of perception. How many times have you started a Windows
operation and then spent so long looking at the hourglass that you think it has
crashed, only to find two minutes later that you have control again? The
Windows Interface Guidelines for Software Design (Microsoft Press, 1995)
offers very good advice on how to show the user that things are still running
fine (using progress bars, for instance).
Profiling your code is an obvious step to take when
performance is an issue, particularly for processor-intensive operations.
Profiling can point out where the most time is being consumed in a piece of
code, which in turn will show you the most crucial piece of code to try to
optimize.
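The principle behind profiling can be shown with a crude stand-in for a profiler: wrap each routine so that the time spent inside it is accumulated, then ask which routine consumed the most. The Python sketch below does exactly that; the decorator, the two sample routines, and the hottest helper are invented for the example, and a real profiler reports far more detail.

```python
import time
from collections import defaultdict
from functools import wraps

timings = defaultdict(float)    # routine name -> total seconds spent inside it


def timed(func):
    """Accumulate the wall-clock time spent in func under its name."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            timings[func.__name__] += time.perf_counter() - start
    return wrapper


@timed
def total_by_loop(n):
    """Processor-intensive version: add the numbers one at a time."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total


@timed
def total_by_formula(n):
    """Same result from the closed-form expression."""
    return n * (n + 1) // 2


def hottest():
    """Name of the routine in which the most time has been spent so far."""
    return max(timings, key=timings.get)
```

After both routines have been called with a large n, hottest() names total_by_loop, which is therefore the piece of code worth trying to optimize first.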
Preparing a Suitable Test Environment
If you are testing a system for a corporate environment, it's
a good idea to have a dedicated suite of test machines. As a result, machines
are available for end users to try out the new system without being an
inconvenience to you, and they can also focus on the task at hand by being away
from their own work environment. More important, it means that you are not
running the software on a machine that might contain other versions of the
system (or at least some of its components) that you are developing.
The nature, size, and variety of the test environment will
inevitably depend on the size of your organization. A large corporation will
conceivably have dedicated test rooms containing a dozen or so test machines,
which will not only offer the scope to test the software under different
conditions but will also allow for a degree of volume testing (several
different users using the same network resources at the same time, particularly
if you have developed a product that accesses a shared database). If you work
for a small software house or you are an independent developer, chances are you
will not be able to provide yourself with many testing resources.
Most software these days has one of two target markets. The
software is either intended for some form of corporate environment, or for
commercial sale. Corporate environments can normally provide test environments,
and if you work for a small software house or you are an independent developer,
you will probably be expected to perform system testing on site anyway.
(Obviously your user-acceptance tests must be on site.) If, however, there is
no mention of this during your early contract negotiations or project planning,
it is worth asking what sort of test facilities your employer or client will be
able to provide for you. It's better to arrange this at the outset rather than
muddle your way through a limited test.
Test Machine Configurations
If you are testing a system for a corporate environment, it is
worthwhile having two different types of test machine configurations. A
"plain-vanilla," or basic, configuration gives you a benchmark with
which to work. A second configuration that contains a typical corporate build
will highlight any problems you might encounter. Let's examine them in more
detail.
The plain-vanilla configuration
In this configuration, a plain-vanilla machine is configured
solely for the purpose of testing your new system. Preferably, the hard disk
will have been formatted to remove everything relating to previous
configurations. The machine should then be loaded with the following:
- The version of Windows you are testing against.
- The minimum network drivers that you need to get your configuration to work. By
  this, I mean that if your corporate environment runs on a TCP/IP-based
  protocol, check that the machine is not also running NetBEUI or IPX/SPX.
- The build (and only that build) of the system that you are testing.
This test will allow you to assess the performance in a pure
environment. Whatever problems arise during this test are either
straightforward bugs in your system or are fundamental problems in the way that
you are trying to work with the Windows environment. By testing in such an
uncontaminated environment, you can be sure that the problems are between you
and Windows and that nothing else is causing any problems that arise at this
stage.
I have a personal reason for being so particular about this
point. A few years back, I was writing a client/server system using a
non-Microsoft development tool. The product itself was buggy, and it was
difficult to get any form of stable build from it at all. Eventually,
everything that went wrong was blamed on the development tool. Because we
concentrated on trying to get some common routines built first, my co-developer
and I did not attempt to hook up to the Microsoft SQL Server service for a
couple of weeks. When we did try, it wouldn't work. We both blamed the tool
again. Because we had seen it done in a training course, we knew that it should
work. Therefore, we reasoned, if we tried doing the same thing in different
ways, we eventually would find success. We didn't. Only when we came to try
something else that involved a connection to SQL Server did we find that it was
the current Windows configuration that was at fault. We decided to reload
Windows to get a plain-vanilla environment, and, sure enough, we got our
database connection. As we reloaded each additional component one by one, we
found out that the antivirus terminate-and-stay-resident (TSR) program that we
were both using was interfering with the SQL Server database-library driver!
When we changed to a different antivirus tool, the problem went away.
The corporate configuration
Having gained a benchmark against what works and what doesn't,
you can then repeat the tests against a typical corporate environment. For
example, your company might have several standard configurations. The base
environment might consist of Windows 95, Microsoft Office (a mixture of
standard and professional editions), a couple of in-house products (an internal
telephone directory application that hooks up to a Microsoft SQL Server service
somewhere on the network), and a third-party communication package that allows
connectivity to the corporate mainframe. Typically, additional install packs
are created that add department-specific software to the base environment. For
example, the car fleet department will probably have an off-the-shelf car pool
tracking system. Allowances need to be made in your testing platforms to take
into account more diverse variations of the corporate base environment, but
only if the software that you have developed is likely to run in this
environment, of course.
In a perfect world, there would be no problem running your new
system in these environments. However, inconsistencies do occur. Products
produced by such large companies as Microsoft are tested so widely before they
are commercially released for sale that issues such as machine/product
incompatibility are addressed either internally or during the beta test cycle.
(Indeed, the various flavors of Windows currently available do contain the
occasional piece of code that detects that it is running on a specific piece of
hardware or with a specific piece of software and makes allowances
accordingly.) One of these inconsistencies can be attributed to executable file
versions. For example, different versions of the WINSOCK.DLL file are available
from different manufacturers. Only one of them can be in the Windows System or
System32 directory at any time, and if it's not the one you're expecting,
problems will occur.
Another problem that can arise in some companies, incredible as it
seems, is that key Windows components can be removed from the corporate
installation to recover disk space. Many large corporations made a massive
investment in PC hardware back when a 486/25 with 4 MB of RAM and a 340 MB hard
disk was a good specification. These machines, now upgraded to 16 MB of RAM,
might still have the original hard disks installed, so disk space will be at a
premium. This is less of a problem nowadays with the relative cheapness of more
powerful machines, so if your organization doesn't suffer from this situation,
all is well, but it is a common problem out there. I am aware of one
organization, for example, that issued a list of files that could be
"safely" deleted to recover a bit of disk space. Apart from the games and the
help files for programs such as Terminal and the Object Packager (ever use
that? Me neither), the list also included the file MMSYSTEM.DLL. This file is a key
component of the multimedia system. In those days (Windows 3.1), very few of
the users had any multimedia requirements, so the problem went unnoticed for a
while. The fix was obviously quite straightforward, but the missing file would
still have caused problems. If your attitude is "Well, that's not my problem,"
you are wrong. You need to be aware of anything that is going to prevent your
system from running properly at your company, and if a show-stopping bug is not
discovered until after the rollout, you'll be the one who looks bad, no matter
who you try to blame.
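Both problems described above, the wrong version of a shared file such as
WINSOCK.DLL and a deleted component such as MMSYSTEM.DLL, can be caught by a
simple check run on each test configuration before the main tests start. What
follows is a minimal sketch only, written in Python rather than Visual Basic
for brevity. The function names, the manifest layout, and the use of a SHA-256
digest to stand in for "the version you're expecting" are all assumptions made
for illustration; they are not taken from any particular tool.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file's contents."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so that large files need not fit in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_environment(system_dir, manifest):
    """Compare the files in system_dir against a manifest.

    manifest (a hypothetical structure) maps a file name to the expected
    SHA-256 digest, or to None when only the file's presence matters.
    Returns a list of (file_name, problem) tuples; an empty list means
    that nothing was flagged.
    """
    problems = []
    for name, expected in manifest.items():
        candidate = Path(system_dir) / name
        if not candidate.is_file():
            problems.append((name, "missing"))
        elif expected is not None and sha256_of(candidate) != expected:
            problems.append((name, "unexpected version"))
    return problems
```

The digests in the manifest would be recorded once from a machine on which the
application is known to work, and the check would then be repeated on each
corporate configuration. A non-empty result does not prove that the
application will fail, but it tells you where to look first.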
A good indication of the amount of effort that went into
producing a build of the first version of Windows NT can be found in the book
Show-Stopper: The Breakneck Race to Create Windows NT and the Next Generation
at Microsoft, by G. Pascal Zachary (Free Press, 1994). Not only is it an interesting
read, it also describes well the role of the testing teams within a large
development environment, one larger than most of us will be exposed to during
our careers, I dare say. Above all, the book conveys the structure and
discipline that must be maintained in large developments.
A Final Word of Caution
And now for the bad news: once you have completed the testing,
your application or component will still probably have bugs in it. This is the
nature of software development, and the true purpose of testing is, unfortunately,
to reduce the number of bugs to a level at which they do not detract
from the usefulness and feel-good factor of the product. That purpose includes
eliminating "showstopper" bugs entirely; there is still no excuse for shipping
something that has this degree of imperfection. In running through the testing
cycles, you will have reduced the number of apparent bugs to zero. At the very
least, everything should work OK. However, users are going to do things to your
system that you would never have imagined, and this will give rise to problems
from time to time. In all likelihood, they will trigger the occasional failure
that cannot apparently be repeated. It does happen occasionally, and the cause is
most typically that the pressures (commercial or otherwise) on the project
management team to deliver become so strong that the team succumbs to the
pressure and rushes the product out before it's ready. They then find that the
users come back to them with complaints about the stability of the product.
Sometimes you just can't win.