50.003 - Auto Test Generation Part 2¶

Learning Outcomes¶

By the end of this unit, you should be able to

Apply Genetic algorithm for software testing.
Apply Symbolic execution for software testing.
Explain how to use symbolic execution for software verification.

Recap¶

Recall from the last unit, we learned a special version fuzzing technique which leverages on the feedback given by the test reports.

graph
    N1-->N2
    N2("Execute Test Cases")-->N3
    N3("Collect Feedback")-->N1("Generate Test Cases")

In each iteration, we take the feedback from the previous test execution to generate the next test cases, hoping the next test cases will perform better. The process of translating the feedback to a new version of test case might be quite tricky to implement as a rule-based system.

Genetic algorithm¶

A possible way to automate the process of feedback-to-generate-better-test-cases is to adopt the genetic algorithm.

Genetic Algorithms were invented to mimic some of the processes observed in natural evolution. The idea with GA is to use this power of evolution to solve optimization problems. The father of the original Genetic Algorithm was John Holland who invented it in the early 1970's.

The main idea is to represent the set of all possible results, good or bad, intermediate or final as an entire population. Any subset of the entire population is considered as a generation.

The algorithm starts with randomly pick a subset as the first generation. The following steps are taken in each iteration

compute the fitness score for each member in the current generation \(G_i\).
pick a subset of \(G_i\), say \(G_i'\) such that all members in \(G_i'\) are fitter than the rest by a threshold.
Use members in \(G_i'\) as "parents" to "breed" the next generation \(G_{i+1}\).
set the current generation as \(G_{i+1}\).

The above loop terminates when either * the maximum number of iterations is reached or * the overall fitness meets the minimum requirement or * the fitness increment is smaller than threshold \(fit(G_{i+1}) - fit(G_i) < \epsilon\).

Using GA in test generation¶

for software testing, it is natural to apply GA.

Let the set of all possible test inputs be the entire population. Each generatation is a proper subset of the entire population.
Let the code coverage score be the fitness score for each test input.
Viewing each test input as a sequence of bytes. There are at least two possible "breeding" operations.
Cross-over. Taking two test inputs, say A and B from the \(G_i'\), and pick some random byte offsets that are common to both. Replace the bytes in A by those bytes from B based on the offsets. The result of this operation will be the member of the next generation.
Mutatation. Take one test input say C from \(G_i'\), randomly pick a position and change the byte at the position which form a new input in the next generation.

Limitation of GA based test generation¶

GA automates the search of best test cases for the given test subject. However it might take many iterations to reach the optimal set. Consider

function example(x,y) {
    let a = [[1],[2],[3],[4],[5]]
    if (y == 42342531) {
        a[x][0] = y // x must be [0..4]
    }
}

In the above example, it will take a long time for the GA test case search to discover the bug is prensent when x=5 and y=42342531.

To reach that branch fast, we need to exploit the code structure of example(), it is kind of white box fuzzing.

Symbolic execution¶

Symbolic exeuction is a kind of dynamic analysis that tries to reach to all executable paths in a function. It differs from concreate execution and testing for not giving concrete values to the function being invoked, instead, some symbolic arguments are given so as to collect the information. It differs from static analysis techniques as it retains a per path level of information for analysis, (which could also be a drawback, we will discuss it shortly).

Given the function example(x,y), invoking it with a concrete value say example(100,42342531) is known as a concrete execution. Symbolic execution on the other hand, does not invoke the funciton with concrete values, instead symoblic values are passed in.

Supose A and B are two symbolic values, we can think of them as logical terms, not variable.

function example(A,B){
   // A can be anything, B can be anything
   let a = [[1],[2],[3],[4],[5]]
   if (B == 42342531) {
       // B must be 42342531
       a[A][0] = B // 0 <= A < 5
   } 
}

From the above, we find that for the path 3->4->6->7 to be covered, B must be 42342531. By doing so, we can figure out the test case to cover that path.

Besides generating test cases, Symbolic execution can detect bugs, dead codes and redunant assertion.

For example, if we add an assertion statement at line 6,

function example(A,B){
   // A can be anything, B can be anything
   let a = [[1],[2],[3],[4],[5]]
   if (B == 42342531) {
       // B must be 42342531
       assert(B>0) // B > 0
       a[A][0] = B // 0 <= A < 5
   } 
}

The symbolic execution framework generates the following set of constraints for the path 3->4->6->7->8,

{ B == 42342531 && 
  B > 0 && 
  0 <= A && 
  A < 5 
}

since the first constraint implies the second one, the second constraint is redundant.

Similarly if we have the following code

function example(A,B){
   // A can be anything, B can be anything
   let a = [[1],[2],[3],[4],[5]]
   if (B == 42342531) {
       // B must be 42342531
       if (B<0) { // B < 0
         a[A][0] = B // 0 <= A < 5
       } 
   }
}

The symbolic exuection framework generates the constraint set for the path 3->4->6->7->8->9->10, as

{
    B == 42342531 && 
    B<0 && 
    0 <= A && 
    A < 5  
}

which can not be satisfied hence this path of code is not reachable.

Finally if we have the following code

function example(A,B){
   // A can be anything, B can be anything
   let a = [[1],[2],[3],[4],[5]]
   if (B == 42342531) {
       // B must be 42342531
       A = B // A == B 
       a[A][0] = B //0 <= A < 5
       
   }
}

we have the following set of constraints

{ B == 42342531 && B == A && 0 <= A < 5 
}

which is unsatifiable hence a bug must be present.

As we observe, symbolic execution proceeds by path following. For each path it traverse, it generates a set of constraints,

if it is an assignment and the LHS is first time being assigned, x = e, it generates an equality constraint x == e.
if it is an re-assignment x = e, we need to create a fresh copy of x, say the last defined/assigned of x was x_i, then add x_{i+1} == e to the constraint sets, and rename all the use of x to x_{i+1} in the statements that come after the current one.
if it is an if p { s1 } else { s2 }, it generate two paths with the first path having p and the second path having !(p).
if it is a while p { s }, it generates a path into the loop body with p as part of the constraint set, and another path with !p as it exits the loops.

We can also add extra constraints such as the reference of the array elements must be within range.

For example,

function sum_upto(n) {
   let s = 0;
   let i = 0;
   while (i < n) {
       s = s + i;
       i = i + 1;
   }
   return s;
}

we first "get" rid of the while loop by unrolling it once and one.

Path 1:

function sum_upto(N) {
   let s = 0;
   let i = 0;
   assert(i >= N) // !(i<n)
   return s;
}

Path 2:

function sum_upto(N) {
   let s = 0;
   let i = 0;
   assert(i < N) // (i<n)
   s = s + i;
   i = i + 1;
   assert(i >=N) // !(i<n)
   return s;
}

We then rename the variables used in the assignments into single assignment form.

Path 1:

function sum_upto(N) {
   let s0 = 0;
   let i0 = 0;
   assert(i0 >= N) // !(i<n)
   return s0;
}

Path 2:

function sum_upto(N) {
   let s0 = 0;
   let i0 = 0;
   assert(i0 < N) // (i<n)
   s1 = s0 + i;
   i1 = i0 + 1;
   assert(i1 >=N) // !(i<n)
   return s1;
}

Then we have two versions of the above.

One version of not entering the while loop body.

{
   s0 == 0 &&
   i0 == 0 &&
   !(i0 < N)  
}

And another version of executing the while loop body once.

{
   s0 == 0 &&
   i0 == 0 &&
   i0 < N && 
   s1 == s0 + i0 &&
   i1 == i0 + 1 &&
   !(i1 < N)
}

We could have more versions of having while loop being unrolled more than once, we just need to be adding extra superscripts to distinguish variables being re-assigned in different iterations.

We finally can send these versions to constraint solvers to check for satisfiability.

Limitation of Symbolic execution¶

Symbolic execution was a nice framework with strong formalism behind and highly useful. There are several issues required more care.

Limitation of the solvers¶

It relias on the solver's capabilities. Hence in case of complex inequality, we might not get an answer.

function f(x, y, z) {
    if (x*x*x*x + y*y*y < z*z) {
        assert(false);
    }
}

Existing SMT solvers support theories on linear integer arithmetic, bit vectors, string, etc. Existing SMT solvers are not particularly scalable or efficient for certain theories e.g., non-linear arithmetic.

In such cases symbolic execution will not (always) be able to evaluate which side of the branch should be taken.
Thus, symbolic execution may randomly generate some inputs to find one of the feasible execution, losing opportunities to cover a large portion of code.

Memory object aliasing¶

Keeping track of constraints among variables with primitive values are straight-forward, it is not that simple when dealing with reference to objects in the heap.

For instance consider

class A {
   constructor(x) { this.attr = x};
}
function foo(N) {
   let a = new A(N); // a == mloc1 && mloc1.attr == N
   if (a.attr == N) {
       let b = a; // b == a
       b.attr = N+1; // mloc1.attr == N +1 
   }
}

If we naively construct a set of constraint from a path from 5→6→7→8→9, we have the following

{
  a == mloc1 && 
  mloc1.attr == N &&
  mloc1.attr == N &&
  b == a &&
  mloc1.attr == N+1
}

which is not satisfiable, but the above path of executions should be valid. Clearly we need a better model to manage to heap objects.

Treatment of loops¶

Symbolic Execution does not cover all the paths, e.g. we have to fix a limit to how many time the loop can be unrolled, or require the programmers to annotate the loop with invariant (which is uncommon for day-to-day programming). For example, if a program add an invariant constraint of the while loop s <= i * (i-1) / 2, we could verify that for all versions of s_k and i_k, satifies the invariant contraints, for example

s0 == 0 && i0 == 0  && i0 == N  ==> s0 <=  i0 * (i_{-1} - 1) / 2

i_{-1} is undefined but since i0 is 0, hence the above holds

s1 == s0 + i0 && i1 == i0 + 1 && i0 == 0 && i0 == N  ==> s1 <=  i1 * (i0 - 1) / 2

This generalizes to for all k. Thus, when N >=0, We can then use the invariant plus the post condition i==N as the approximiated constraint set for the loop for the subsequent paths, this is to like combining static analysis with dynamic analysis.