![]() |
seminar
outline
last updated December
1, 1997
|
formalWARE
project
formalWARE
formalWARE
|
Probabilistic
Analysis of Software Testing for a Rare Condition
Lee White
Problem Statement Suppose we have a system that monitors a stream of inputs with the objective of detecting a particular condition "C". You can imagine that this system is implemented mainly as software and that the specification and implementation of condition "C" are very complex. For instance, the system could be a computer-based system in a chemical plant that must detect the rare occurrence of a dangerous combination of physical conditions (involving temperature, pressure and viscosity in producing a particular chemical). To gain confidence that the implementation
is correct with respect to its specification, we want to perform some lab
testing. In particular, we are interested in false positives, i.e.,
instances where the system incorrectly reports that the inputs satisfy
condition "C". We have three fundamental
2) How many lab test cases are required to achieve this level of confidence? 3) What assumptions would be needed to make the lab test inputs representative of normal operation? With respect to 2), how do we compute n in the above statement? Ideally, we would like to have a formula with a variable R for the "acceptable rate for false positives", e.g., R = 0.1% (i.e., 1 in 1000 false positives). With respect to 3), for the analogy with opinion polls to hold up, we will have to assume that the lab test cases are independent except that they are also representative of the kinds of inputs that would be applied to the system during normal operation. What are the issues which need to be taken into account to ensure this is correct, and what does this imply about the statement of this problem? First Model: 1) Description of 4 Cases:
This will be a multinomial distribution in these four probabilities. 3) However, because these four events are mutually disjoint and exhaustive, we can obtain the marginal distribution for p1 simply as a Binomial distribution. As the problem is symmetric in all the variables, the same analysis will work for all cases, most importantly for p2. 4) Now either we know probability p1 a priori, or we don't; next consider what is involved in estimating p1, and the number of tests required. Estimate p1 Parameter for First Model:
a) Each test should be statistically independent of other tests. b) Statistical properties of the system should not vary with time. c) We need to know the OPERATIONAL INPUT DISTRIBUTION of the system. 6) The most important and subtle of these assumptions is c): it involves a complete description of the probabilities and assumptions of all the inputs to a system. An example is given of a telephone switching system. 7) For our analysis, we will assume the first model describes the operational input distribution, unless we are given more information. 8) The Binomial distribution is rather tedious to use for computation. We will try to approximate the Binomial distribution with either the Normal or the Poisson distribution. If n*p1*(1-p1) > 50, we can use the Normal approximation, which is easy to compute. 9) In deciding how many tests are
required, we need to get an estimate p1' which is close to p1:
11) How small should DELTA be? It makes sense that if p1 is 0.001,then DELTA should be 0.00025, or more generally, 25% of the estimated value of p1. 12) Once the problem has been set up in this way, with values of DELTA and ALPHA selected, there are a number of methods for estimating n; we will use and illustrate one such method; the details of doing this estimation are not as important as setting up the problem. 13) The number of tests n is now given for p1=0.01, 0.001, and 0.0001. It is shown that the factor n*p1*(1-p1) is 130 for our assumptions, well above the criterion of 50 or more; so the Normal approximation assumption is justified. Analysis of First Model: 14) Now that we can assume that p1 is known accurately, what is the effect of running tests on the system, and how can we describe the outcome of these tests? Again, how many tests are required for certain levels of false positive rates? 15) Again use Poisson and Normal
approximations to the Binomial but this time we will find that the number
of tests are smaller, and the Normal approximations are not so accurate;
the Poisson approximation will be, however.
Second Model: 1) Given condition C:
pfp: Probability of reporting C when C has not occurred (related to false positives, and very small) pmc: Probability of reporting not C when C has actually occurred (related to misclassifying C condition, and very small)
4) We encounter a new problem with
this model if we try to estimate probability parameters to find the probability
of false positives (or probability of misclassification of the C condition):
5) Assume that we can control the C condition during the lab tests; then here is a method to determine the desired probabilities: a) Find pfp ACCURATELY with the C condition off; count positives; how many tests? b) Find pmc ACCURATELY with the C condition on; count negatives. c) Next submit random tests and
count positives; these positives should be calculated by the following
formula:
e) Finally evaluate pfp*(1-pc) for false positive rate.
Two Cases: Defective item or not (analogous to condition C) Two Cases: Inspected item or not (analogous to reporting condition C) (Note: Second Model is slightly more complex, since there are four cases analogous to inspection, not two.)
|