I dare say that the universally accepted “wisdom” holds that testers are less qualified professionals than developers, that one becomes a tester if one is not good enough to be a developer, that good testers are expected to eventually move into development, and that professional testing is a disposable asset because, after all, one can always hire school students to do the job.
In this article I briefly explore the issue of tester intelligence, then move on to possible ways to optimize the use of human intelligence in software testing.
Cogito, ergo sum
“I think, therefore I am,” said Descartes many years ago, and this statement, so true of any human endeavor, is even more valid for that complex and brain-intensive activity named software construction. Indeed, building software is hard, complex and – unfortunately – very much error prone.
What about software testing?
When it comes to sheer verification of conformance to specs, software testing clearly requires a lower level of expertise than development: we don’t care how much intelligence has been put into a piece of software as long as it conforms to the specs.
However, when it comes to deliberately hunting for malfunctions in software, things change tremendously. The tester is no longer the patient, hard-working bookkeeper of features described above. He is more like a hacker, trying to exploit any little clue extracted from the system under test.
A hacker is anything but dumb and so is the creative professional tester. Creative software testing is a highly intelligent activity requiring cognitive processes as complex as the ones used for development (but different, of course). Needless to say, such good testers are hard to find.
Alas, like any intelligent human activity, creative software testing is expensive. Are there ways to optimize it?
Yes, there are.
When the artificial flavor is good
Looked at abstractly, creative, bug-oriented software testing has three defining traits:
- it is a search problem because the tester has to search the space of all possible test scenarios and select only the ones leading to bugs.
- it is a non-decidable problem because the decision whether a certain behavior is indeed defective resides outside the testing process since it usually requires input from external sources (the developer, the rest of the team).
- it is an infinite-space search problem because the total number of possible scenarios is extremely large, practically infinite.
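To make these traits concrete, here is a toy brute-force search over the scenario space of an invented, deliberately buggy counter; real scenario spaces are far too large for this, which is exactly why smarter search is needed (the system, spec and bug below are all made up for illustration):

```python
import itertools

# Hypothetical system under test: a counter with a seeded bug.
def sut_step(state, op):
    if op == "inc":
        return state + 1
    if op == "reset":
        # Bug: reset silently fails once the counter exceeds 2.
        return 0 if state <= 2 else state
    return state

# Reference model: what the spec says should happen.
def spec_step(state, op):
    return state + 1 if op == "inc" else 0

def find_bug(max_len=4):
    """Enumerate a finite slice of the scenario space and return the
    first operation sequence on which SUT and spec disagree."""
    for n in range(1, max_len + 1):
        for scenario in itertools.product(["inc", "reset"], repeat=n):
            s = t = 0
            for op in scenario:
                s = sut_step(s, op)
                t = spec_step(t, op)
            if s != t:
                return scenario
    return None

print(find_bug())  # -> ('inc', 'inc', 'inc', 'reset')
```

Even this tiny system needs a four-step scenario to expose its defect; the number of candidate scenarios grows exponentially with length, which is the infinite-space problem in miniature.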
Does that sound familiar? If it doesn’t, it should: chess players, stock traders, investment businessmen, military or economic strategists, business executives, medical diagnosticians and many others have been playing with such things for years – and they are all endowed with pretty darn good levels of intelligence.
Not only that: such problems have been tackled by computer science, too. The field is named Artificial Intelligence (AI); it deals with problems intractable by ordinary methods and has yielded notable successes in domains like medical diagnosis, oil drilling, automated vision, game playing and others.
Does Artificial Intelligence have any place in software testing? Some attempts have been devised so far yet there is definitely room for better.
Let’s take a closer look.
A proven path: theorem proving
Theorem proving is a field of Artificial Intelligence that deals with the automated means to prove that a certain statement is true or false. Rooted in mathematical logic, theorem proving employs sophisticated methods of symbolic processing.
The principle of using theorem proving in software testing is very simple:
- testing conformance to specifications is equivalent with: given a software program P and a set of specifications S, is the statement “P and S” true? In other words, does S hold for any run of P?
- bug-oriented testing is equivalent with: given a software program P and a set of specifications S, is the statement “P and not S” true? In other words, are there runs of P that break the specs S?
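The second formulation can be illustrated, under heavy simplification, by a bounded search for a counterexample: a brute-force stand-in for what a real theorem prover would establish symbolically. The program and spec below are toy assumptions, not a real prover:

```python
def buggy_abs(x):
    """Toy program P: absolute value with a seeded defect."""
    if x < 0 and x != -7:   # bug: -7 slips through un-negated
        return -x
    return x

def spec_holds(x):
    """Spec S: the result equals the mathematical absolute value."""
    return buggy_abs(x) == abs(x)

def find_counterexample(domain):
    """Bounded stand-in for deciding "P and not S": return a witness x
    for which the spec is violated, or None if none exists in the domain."""
    for x in domain:
        if not spec_holds(x):
            return x
    return None

print(find_counterexample(range(-10, 11)))  # -> -7
```

A genuine prover reasons over all runs symbolically rather than enumerating a finite domain, which is precisely why the transformations discussed next are needed.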
A theorem prover cannot work with a computer program in its direct, binary form, nor can it understand specs in natural language. So, a test system based on theorem proving must perform some transformations upon both the system under test and the specs to bring them into a form the theorem prover can work with.
These transformations are problematic and they wonderfully reveal the limits of the method:
- not every system’s internals are expressible as a formula that can be processed automatically. Hence, some approximations must take place, approximations that introduce a difference between the system under test and its formal expression.
- not every specification in natural language can be translated into a format that can be processed automatically. Again, some approximations must take place.
To summarize the method in terms of pros and cons, one can say:
- Pros: it is a precise and rigorous method, based upon the time-tested, solid foundation of mathematical logic.
- Cons: it requires approximations and it relies on access to the system’s internals. This means it cannot do black-box testing.
A career in modeling
Capturing the essentials of a system’s behavior, apart from specifications, can be achieved via models. Curiously, despite modeling being essential to other branches of engineering, the software industry has recognized its importance only recently.
A software model is an artifact that represents, in simplified form, the behavior of the software system being modeled. Think of a model as the maquette of the software system being built.
Software models can be executable or non-executable. Executable models are often state-based, i.e. they mimic the most important states of the system as well as the transitions from one state to another. When each state specifies the invariants that the system must satisfy, the model becomes a verification tool, too.
Models are viable for software testing by the means of Artificial Intelligence because:
- the elements of a model (states, transitions) can be processed by automatic means out of the box – AND -
- the behavior of a model is much simpler than the actual behavior, hence it is easier to process automatically.
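As a sketch, assuming a toy state-based model of a document editor's save logic (the states and transitions are invented), the model can be walked automatically to yield abstract test sequences:

```python
from collections import deque

# Hypothetical state-based model: states and transitions of an
# editor's save logic, as (state, action) -> next-state pairs.
TRANSITIONS = {
    ("clean", "edit"): "dirty",
    ("dirty", "save"): "clean",
    ("dirty", "edit"): "dirty",
    ("clean", "save"): "clean",
}

def explore(start="clean", max_depth=3):
    """Breadth-first walk of the model, yielding every action sequence
    up to max_depth; each sequence doubles as an abstract test case."""
    queue = deque([(start, [])])
    while queue:
        state, path = queue.popleft()
        if len(path) >= max_depth:
            continue
        for (src, action), dst in TRANSITIONS.items():
            if src == state:
                yield path + [action], dst
                queue.append((dst, path + [action]))

for actions, end_state in explore():
    print(actions, "->", end_state)
```

Even this four-transition model yields fourteen sequences at depth three; realistic models explode much faster, which is where AI search techniques come in.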
As simple as they are, models are still too complex to be explored exhaustively. Their state space is big enough to require Artificial Intelligence techniques, especially from the realm of search algorithms.
Because models resemble what the user sees more closely (in contrast with the theorem-proving approach, where the system under test becomes a mathematical formula), they are closer to the real-life work of the tester. However, because it relies on search algorithms that are empirical in nature (i.e. not as theoretically sound as mathematical logic), model-based testing does not enjoy the theoretical precision that theorem proving has.
To summarize the method in terms of pros and cons, one can say that:
- Pros: it represents the behavior of a system and not the internals, therefore it supports black-box testing better.
- Cons: it adds extra work to build the model, it is not as theoretically sound as theorem proving, and modeling requires ignoring details of the system that may prove important later on.
Darwin was right
Both methods presented so far require a view of the system under test to be built upfront: either
- the translation of the system under test into a formalism that can be maneuvered by the theorem prover – OR -
- the manually built representation of the system’s behavior as a model.
Either way, the view of the system under test is static: it does not evolve over time. Once the view is constructed, the test engine uses it unchanged for as long as testing goes. If one wants a better, more detailed view of the system under test, one must reconstruct it from scratch.
Such an approach obviously involves a lot of work just to recode information that already exists in the system. Wouldn't it be nice to have a testing system that doesn't require that much information upfront but learns the system while testing it?
Such a system would depend very much on how the learned information gets represented. One way to represent that information is the test scenarios themselves, since any test scenario carries some information about the system under test. This means that more scenarios carry more information, and a lot of scenarios carry a lot of information.
Such a system would not store all the scenarios but only the ones carrying the maximum of test-related information, i.e. the ones revealing the most defects or exercising the most system states. To avoid writing all the scenarios by hand, the system has to start with a finite set of hand-crafted scenarios and generate new scenarios automatically, based on the existing ones.
One method of generating new scenarios is by “mixing” existing ones to produce “offspring” while retaining the best “children” and discarding the sub-optimal ones. Since the “mixing” is akin to how the genes of a child’s parents mix to produce the genome of the child, these methods got called genetic algorithms.
These algorithms make up an evolutionary model of software testing because they work upon a population of test scenarios that, with each new generation, becomes better fit to the “environment” represented by the system under test.
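A minimal sketch of this evolutionary scheme, assuming an invented editor-like system, a coverage-style fitness, splice crossover and point mutation (all of these are illustrative choices, not a prescribed design):

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

ACTIONS = ["open", "type", "save", "close"]

def fitness(scenario):
    """Toy fitness: count distinct states a hypothetical editor reaches.
    A real system would run the scenario and measure coverage or failures."""
    state, seen = "idle", set()
    for a in scenario:
        if a == "open":
            state = "editing"
        elif a == "type" and state == "editing":
            state = "dirty"
        elif a == "save" and state == "dirty":
            state = "editing"
        elif a == "close":
            state = "idle"
        seen.add(state)
    return len(seen)

def crossover(a, b):
    """Splice two parent scenarios at a random cut point."""
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(s):
    """Replace one random action: the source of new genetic material."""
    s = list(s)
    s[random.randrange(len(s))] = random.choice(ACTIONS)
    return s

def evolve(pop_size=20, length=6, generations=15):
    pop = [[random.choice(ACTIONS) for _ in range(length)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]          # keep the fittest half
        children = [mutate(crossover(random.choice(parents),
                                     random.choice(parents)))
                    for _ in range(pop_size - len(parents))]
        pop = parents + children                # elitism: best survive intact
    return max(pop, key=fitness)

best = evolve()
print(best, fitness(best))
```

Because the fittest half survives each generation unchanged, the best fitness in the population can only improve over time, which is the "better fit to the environment" property in miniature.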
The success or failure of genetic algorithms relies on two elements:
- the individual actions which, when applied in succession, form the scenario. These actions are atomic, mostly stateless and mostly independent computational units that try to mimic an atomic piece of functionality of the system under test. Unfortunately, not all systems support such atomic, sequential and independent decomposition.
- the mixing procedure must be fast in order to ensure a high rate of generational renewal of the population. If the above-mentioned actions are truly stateless and independent, the mixing procedure is simple. Yet, a higher rate of inter-dependent actions requires more intelligent mixing procedures which may prove too slow.
To summarize the method in terms of pros and cons, one can say that:
- Pros: it requires little effort upfront and the more it runs the more intelligent, better informed results it produces.
- Cons: performance degrades for systems whose functionality does not support decomposition into atomic, stateless and independent actions.
The previous sections showed three fields of Artificial Intelligence that have been used in software testing. Other fields of AI are good candidates, too; to my knowledge to date, they haven't been considered as such.
This section presents them, along with the reasons why considering them good candidates for software testing makes sense.
Rule-based expert systems
Rule-based expert systems are one field of Artificial Intelligence that has enjoyed considerable commercial success. Expert systems have been used in areas like medicine and mining and, albeit very expensive, they've saved large amounts of money for the ones who used them.
A rule-based expert system has two parts:
- a set of “rules” processed by an inference engine that represents the “thinking” of the system.
- a set of “facts” that represents the “memory” of the system.
When presented with a problem, the expert system applies the rules upon the facts in order to draw a conclusion related to the problem. A solution may consist of a yes/no answer, a sequence of steps leading to a result, or an explanation for a certain conclusion.
We may consider that a hypothetical expert system for software testing should have the following elements:
- a set of “rules” indicating standard procedures to test a testable element.
- a set of “facts” containing knowledge about testable elements: UI controls, APIs, protocols, data structures, hardware ports, etc.
When presented with a problem – i.e. a description of the system in terms of both structure and functionality – such an expert system would yield test procedures while explaining the reasons for choosing them. Expected outcomes would be: producing a test plan, yielding several standard test scenarios or proposing quality metrics.
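A minimal sketch of such a hypothetical system, with made-up rules and facts, reduces to matching rules against known testable elements and recording why each match fired:

```python
# Hypothetical rule base: each rule pairs a kind of testable element
# with a standard procedure for testing it.
RULES = [
    ("text_field", "try empty, very long, and non-ASCII input"),
    ("numeric_field", "try 0, negative numbers, and values past the maximum"),
    ("button", "click twice in quick succession; click while disabled"),
    ("file_api", "pass a missing path, a directory, and a read-only file"),
]

def propose_tests(facts):
    """Apply every rule whose element kind appears among the facts,
    returning (element, procedure, explanation) triples: the inference
    step, with the explanation standing in for the system's reasoning."""
    plan = []
    for kind, procedure in RULES:
        for element in facts:
            if element["kind"] == kind:
                plan.append((element["name"], procedure,
                             f"rule for '{kind}' matched '{element['name']}'"))
    return plan

facts = [
    {"kind": "text_field", "name": "username"},
    {"kind": "button", "name": "submit"},
]
for name, procedure, why in propose_tests(facts):
    print(f"{name}: {procedure}  ({why})")
```

Real expert systems chain rules through an inference engine rather than doing a single matching pass, but the shape of the output (a test plan plus its justification) is the same.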
Neural networks
Neural networks are a field of Artificial Intelligence that has been used successfully in pattern recognition. They have proved capable of recognizing patterns like forms, images, handwriting or even voices. Their principle consists in a decision scheme based on the parallel work of tiny decisional elements named neurons, which are inter-linked in various ways. More complex patterns require more complex neural networks.
Neural networks have been designed with static information in mind. This means that they can recognize patterns like images or handwriting, but they are not fit to classify motion (unless, of course, motion is decomposed into individual frames, though I am not sure whether such an approach has been tried to date). For testing, this means that reasoning upon the dynamics of a software system is unlikely to succeed with neural networks.
Yet, neural networks could be used to reason upon static aspects of software, such as GUI layout. For example, writing a program that tells whether the GUI controls of a form are harmoniously arranged is nearly impossible to do in conventional ways, yet it becomes tangible with neural networks.
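As a toy illustration, a single hand-weighted neuron can already separate a tidy column of controls from a scattered one. The features, weights and example layouts below are all invented; a real network would learn its weights from layouts labeled as harmonious or awkward by human reviewers:

```python
def neuron(inputs, weights, bias):
    """One artificial neuron: weighted sum followed by a step activation."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) + bias > 0 else 0

def layout_features(controls):
    """Crude features for a list of (x, y, width, height) controls:
    spread of left edges and spread of vertical gaps."""
    xs = [c[0] for c in controls]
    x_spread = max(xs) - min(xs)
    ys = sorted(c[1] for c in controls)
    gaps = [b - a for a, b in zip(ys, ys[1:])]
    gap_spread = max(gaps) - min(gaps) if gaps else 0
    return [x_spread, gap_spread]

# Hand-set weights stand in for training: misalignment and uneven
# spacing both push the neuron's output toward "awkward" (0).
WEIGHTS, BIAS = [-1.0, -1.0], 5.0

tidy  = [(10, 10, 80, 20), (10, 40, 80, 20), (10, 70, 80, 20)]
messy = [(10, 10, 80, 20), (37, 55, 80, 20), (12, 62, 80, 20)]
print(neuron(layout_features(tidy), WEIGHTS, BIAS))   # -> 1 (harmonious)
print(neuron(layout_features(messy), WEIGHTS, BIAS))  # -> 0 (awkward)
```

Judging "harmony" in general would of course take many such neurons, richer features and actual training data; the point is only that the problem becomes expressible at all.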
Case-based reasoning
Case-based reasoning is a newer field of Artificial Intelligence that deals with problems that resist analytical description and hence aren't tractable by analytic processes, no matter how advanced. Case-based reasoning is the automatic counterpart of the “that's the way we do it” from real life that we hear so often from people with great empirical experience who know they are right but cannot explain why.
A case-based reasoning system consists of a large database of problems, their characteristics and their resolutions. These are the cases. When a new problem arrives, the system tries to match the new situation to one of the existing cases and to propose a solution based on existing precedents. The solution bears no logical explanation, since there is no apparent, logical correlation between a problem, its characteristics and its resolution. Yet it works, because a correlation does exist; it is just intractable by computational means.
Case-based reasoning may work for software testing considering that software developers, being all humans, most likely make similar mistakes when faced with a similar design. Hence, a case-based testing system doesn’t need to deeply analyze the system’s structure or behavior: provided with a description of the design, the case-based testing system might look up into the database of preceding cases to pinpoint the most probable vulnerabilities.
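A minimal sketch of this idea, with an entirely invented case base, reduces to a nearest-neighbor lookup over precedents:

```python
# Hypothetical case base: past designs (as feature sets) and the
# defects that were eventually found in them.
CASES = [
    ({"multithreaded", "shared_cache"}, "race condition on cache eviction"),
    ({"file_upload", "user_input"}, "path traversal via crafted filename"),
    ({"retry_loop", "network_io"}, "duplicate requests on timeout"),
]

def similarity(a, b):
    """Jaccard overlap between two feature sets."""
    return len(a & b) / len(a | b)

def probable_vulnerability(design):
    """Match the new design against precedent and return the defect of
    the most similar past case: no causal explanation, just precedent."""
    best = max(CASES, key=lambda case: similarity(design, case[0]))
    return best[1]

new_design = {"multithreaded", "shared_cache", "network_io"}
print(probable_vulnerability(new_design))  # -> race condition on cache eviction
```

The value of such a system grows with the size and quality of its case base, exactly as the experienced tester's intuition grows with the projects they have seen.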
Software testing, like any other creative endeavor, requires a significant amount of intelligence. Various fields of Artificial Intelligence have been used to replace or complement the intelligence of software testers which, like any human intelligence, is slow, expensive and error prone.
This article presented an overview of several usages of Artificial Intelligence in software testing while proposing other fields of AI as good candidates for the same purpose, along with reasons for such proposals.