iDiscovery: Feedback-Driven Dynamic Invariant Discovery

This is the homepage for iDiscovery, a novel tool for infer precise program invariants. As we know, program invariants can help software developers identify program properties that must be preserved as the software evolves, however, formulating correct invariants can be challenging. In this work, we introduce iDiscovery, a technique which leverages symbolic execution to improve the quality of dynamically discovered invariants computed by Daikon. Candidate invariants generated by Daikon are synthesized into assertions and instrumented onto the program. The instrumented code is executed symbolically to generate new test cases that are fed back to Daikon to help further refine the set of candidate invariants. This feedback loop is executed until a fix-point is reached. To mitigate the cost of symbolic execution, we present optimizations to prune the symbolic state space and to reduce the complexity of the generated path conditions. We also leverage recent advances in constraint solution reuse techniques to avoid computing results for the same constraints across iterations. Experimental results show that iDiscovery converges to a set of higher quality invariants compared to the initial set of candidate invariants in a small number of iterations.

System Requirements

Currently, iDiscovery toolkit has been tested on Mac/Linux systems with:

Downloads

The iDiscovery distribution mainly contains two parts:

Installation

The installation of the iDiscovery toolkit mainly contains the following steps:

After these steps, iDiscovery should be ready to run.

iDiscovery Main Steps

Configuration

Before starting iDiscovery, the user need to set the configuration file, iDiscovery.cfg:

//The path of the iDiscovery project idiscdir="/Path/To/jpf-idisc" //The path of the Daikon library daikondir="/Path/To/daikon-bin/daikon.jar" //The name of the project under test, e.g., sub="rjc" //The full name for the class under test, e.g., fullsub="rjc.RJCMain" //The full name for the method under test (following Symbolic PathFinder's naming format), e.g., meth="rjc.rjc.MainSymbolic(double, double, double, double, double, double, double[], double[])" //The path of the original file for the class under test, e.g., origjavafile="/Path/To/rjc/source/rjc/rjc.java" //The path for storing the file with synthesized assertions, e.g., genjavafile="$idiscdir/src/examples/rjc/rjc.java" //The path for project jpf property file when symbolic execution sub_jpf="src/examples/rjc/rjc.jpf" //The path for project jpf property file when symbolic execution using Green solution reuse, e.g., sub_green_jpf="src/examples/rjc/rjc-green.jpf" //The binary dir for the original program under test, e.g., source_bin="/Path/To/rjc/bin"

Executing iDiscovery

After configuration, iDiscovery can simply be executed using the ./iDiscovery.sh command (located in the root directory of iDiscovery).

The complete format for executing iDiscovery is:

./iDiscovery.sh [option] ...

The current implementation supports the following options:


For example, the following command will apply all three optimizations:

./iDiscovery.sh -s -r -g

Results

After iDiscovery execution, all the results are stored in the results directory of iDiscovery. The results for each specific execution mode are stored in a sub-directory named by that execution mode, e.g., separation-restriction-ungreen stores the results for using both optimizations but not green. The results will have two parts:

Illustration

Tool and Subject Setup

We first download both the iDiscovery tool and the rjc artifact, and put them into different program locations. Then we execute the following command to copy the rjc subject into the src/examples directory of iDiscovery. The reason is that in each iteration, iDiscovery will apply symbolic execution on the program under the src/examples directory. Therefore, we should first make a complete copy of the project under test in that directory. Then, in each iteration iDiscovery will automatically replace the corresponding Java file in the src/examples directory with the one including synthesized assertions.

cp -r /path/to/rjc /path/to/iDiscovery/src/examples

iDiscovery requires the user to download the Daikon tool. Users can refer to the Daikon homepage for details9. Then, we can execute the following command in the jpf-idisc directory to build iDiscovery:

ant -f build.xml build

Follow the steps discussed before to set up the configuration for running iDiscovery.

iDiscovery Execution

We first try iDiscovery without optimizations using the following command:

cd /path/to/jpf-idisc</br> ./iDiscovery.sh

iDiscovery automatically created a results directory in its root directory. It created a sub-directory named unseparation-unrestriction-ungreen, which includes both the invariant generation and symbolic execution information. This execution of iDiscovery timed out (exceed the 20 hour limit) after 4 iterations.

To improve the efficiency of symbolic execution, we now try iDiscovery with only the violation restriction optimization using the following command:

cd /path/to/jpf-idisc ./iDiscovery.sh -r

There will be a sub-directory named unseparation-restriction-ungreen in the result directory, which includes both the invariant generation and symbolic execution information. For this configuration, the execution of iDiscovery finished within 4 iterations but symbolic execution still took nearly 14 hours to complete. In terms of effectiveness, this configuration falsified 96 incorrect invariants while augmenting 2 new invariants in our evaluation of iDiscovery.

The third time, we try iDiscovery with both the assertion separation and violation restriction optimizations using the following command:

cd /path/to/jpf-idisc ./iDiscovery.sh -sr (or, ./iDiscovery.sh -s -r) (or ./iDiscovery.sh -r -s)

iDiscovery created a sub-directory named separation-restriction-ungreen in the result directory, which includes both the invariant generation and symbolic execution information. This time, the execution finished within 4 iterations and symbolic execution only took 5 minutes and 29 seconds in our evaluation. The invariant inference results are the same as the previous configuration.

Finally, if we also choose to enable the solution reuse optimization by adding the option -g, the symbolic execution time would be further reduced. However, note that the current version of Green does not support the CORAL solver which is required for solving the constraints generated for this example (and potentially other artifacts).