A tool for generating nonlinear numerical invariants for C and Java programs. DIG uses dynamic analysis to infer invariants over program execution traces and applies symbolic execution to inferred invariants.
DIG is an invariant generation tool that discovers program properties at arbitrary program locations (e.g., loop invariants, post conditions). DIG focuses on numerical invariants and currently supports the following numerical relations:
x+y=5
, x*y=z
, x*3y^3 + 2*zw + pq + q^3 = 3
-4 <= x <= 7, -2 <= - x - y <= 10
max(x,y) <= z + 2
, and nonlinear forms, e.g., x^2 −xy −xz +yz =0
is (x −y)=0 v (x − z)=0
x == 0 (mod 4), x+y == 1 (mod 5)
A[i] = B[C[3*i+2]]
t1 = 2^x, t2 = log(n)
, and have DIG infer invariants over terms.DIG’s numerical relations (in particular, nonlinear relations) have been used in many research projects, including
TSE21
, ICSE12
, ICSE14
, ASE17
, FSE17
, TOSEM13
)O(N^2)
or O(NM)
(ASE17
, FSE17
)T(n)=3*T(n/2) + f(n)
, (SEAD20
, OOPSLA21
)OOPSLA20
)A[i] = B[C[2i + 3]]
, (ICSE12
, TOSEM13
)dig-nla
and dig-nla-scaling
.
various examples demonstrating the power of DIG.
A good starting point to understand DIG and its usage at high level is reading our ICSE’22 tool and TSE’21 papers .
FAQs might be useful for researches, e.g., what makes DIG different than others?
bash
# clone DIG
$ git clone --depth 1 https://github.com/dynaroars/dig.git
# Then go to DIG's directory
$ cd dig # in DIG's directory
# build the docker image, will take some time to install and build everything
$ docker build . -t='dig'
...
...
# then run dig
$ docker run -it dig
# docker will drop you into a Linux prompt like below
$ root@b53e0bd86c11:/dig/src#
# now you can run DIG -- the more CPUs/CORES your machine has, the faster DIG will run.
# run DIG on a trace file
root@931ac8632c7f:/dig/src# time ~/miniconda3/bin/python3 -O dig.py ../examples/traces/cohendiv.csv -log 4
...
...
# or on a C program
# FASTER: restrict nonlinear eqts to degree 2 and but don't generate inequalities or minmax invariants and
root@931ac8632c7f:/dig/src# time ~/miniconda3/bin/python3 -O dig.py ../benchmark/c/nla/cohendiv.c -maxdeg 2 -noieqs -nominmax -log 4
# If you have some error SARL expected version X but found Y, then edit ~/.sarl and change from X to Y
# SLOWER: but gives everything
root@931ac8632c7f:/dig/src# time ~/miniconda3/bin/python3 -O dig.py ../benchmark/c/nla/cohendiv.c -log 4
...
# to update DIG to the latest from github, do a git pull in the main DIG directory in the Docker
root@931ac8632c7f:/dig/src# git pull
...
...
DIG can generate invariants from a trace file (a plain text semi-colon separated csv
file consisting of concrete values of variables) or a program (a C file .c
).
csv
file consisting of concreting program execution traces as shown below.txt
# in DIG's src directory
$ less ../test/traces/cohendiv.csv
vtrace1; I q; I r; I a; I b; I x; I y
vtrace1; 4; 8; 1; 4; 24; 4
vtrace1; 16; 89; 1; 13; 297; 13
vtrace1; 8; 138; 4; 76; 290; 19
vtrace1; 0; 294; 8; 192; 294; 24
vtrace1; 64; 36; 4; 16; 292; 4
...
vtrace2; I x; I y; I q; I r
vtrace2; 280; 24; 11; 16
vtrace2; 352; 11; 32; 0
vtrace2; 22; 298; 0; 22
vtrace2; 274; 275; 0; 274
vtrace2; 2; 287; 0; 2
...
txt
# in DIG's src directory
tnguyen@origin ~/d/src (dev)> time ~/miniconda3/bin/python3 -O dig.py ../tests/traces/cohendiv.csv -log 3 (base)
settings:INFO:2021-10-29 13:51:40.966898: dig.py ../tests/traces/cohendiv.csv -log 3
alg:INFO:analyzing '../tests/traces/cohendiv.csv'
alg:INFO:check 546 invs using 181 traces (0.26s)
alg:INFO:simplify 544 invs (2.35s)
vtrace1(17 invs):
1. a*y - b == 0
2. q*y + r - x == 0
3. -q <= 0
4. -y <= -1
5. a - b <= 0
6. r - x <= 0
7. b - r <= 0
8. a - x <= -5
9. -b + y <= 0
10. -x + y <= -6
11. -q - r <= -8
12. -r - x <= -16
13. -x - y <= -10
14. a + 2 - max(q, r, y) <= 0
15. y + 2 - max(b, q, r, 0) <= 0
16. -q === 0 (mod 2)
17. -r - x === 0 (mod 2)
vtrace2(8 invs):
1. q*y + r - x == 0
2. -q <= 0
3. -r <= 0
4. q - x <= 0
5. r - x <= 0
6. r - y <= -1
7. -q - r <= -1
8. -x - y <= -10
cohendiv.c
programc
// in DIG's src directory
// $ less ../test/cohendiv.c
#include <stdio.h>
#include <stdlib.h>
void vassume(int b){}
void vtrace1(int q, int r, int a, int b, int x, int y){}
void vtrace2(int q, int r, int a, int b, int x, int y){}
void vtrace3(int q, int r, int x, int y){}
int mainQ(int x, int y){
vassume(x >= 1 && y >= 1);
int q=0;
int r=x;
int a=0;
int b=0;
while(1) {
vtrace1(q, r, a, b, x, y);
if(!(r>=y))
break;
a=1;
b=y;
while (1){
vtrace2(q, r, a, b, x, y);
if(!(r >= 2*b))
break;
a = 2*a;
b = 2*b;
}
r=r-b;
q=q+a;
}
vtrace3(q, r,x, y);
return q;
}
void main(int argc, char **argv){
mainQ(atoi(argv[1]), atoi(argv[2]));
}
vtraceX
where X
is some distinct number and call that function at that location.
For example, in cohendiv.c
, we call vtrace0
, vtrace1
at the head of the outter and inner while loops find loop invariants and vtrace2
before the function exit to find post conditions.vtraceX
takes a list of arguments that are variables in scope at the desired location. This tells DIG to find invariants over these variables.cohendiv.c
and discover the following invariants at the vtracesX
locations:sh
$ time ~/miniconda3/bin/python3 -O dig.py ../tests/cohendiv.c -log 3
settings:INFO:2021-10-29 13:51:11.038391: dig.py ../tests/cohendiv.c -log 3
alg:INFO:analyzing '../tests/cohendiv.c'
alg:INFO:got symbolic states at 4 locs in 4.21s
alg:INFO:got 69 ieqs in 1.11s
alg:INFO:got 377 minmax in 1.69s
alg:INFO:got 6 eqts in 5.50s
alg:INFO:check 452 invs using 680 traces (0.33s)
alg:INFO:simplify 452 invs (1.40s)
* prog cohendiv locs 4; invs 29 (Eqt: 5, MMP: 1, Oct: 23) V 6 T 3 D 2; NL 5 (2) ;
-> time eqts 5.5s, ieqs 1.1s, minmax 1.7s, simplify 1.8s, symbolic_states 4.2s, total 11.5s
rand seed 1635533471.04, test 62
tmpdir: /var/tmp/dig_92233634043151007_2nugp63w
vtrace0(2 invs):
1. -y <= -1
2. -x <= -1
vtrace1(12 invs):
1. a*y - b == 0
2. q*y + r - x == 0
3. -r <= 0
4. -a <= 0
5. -y <= -1
6. q - x <= 0
7. a - q <= 0
8. b - x <= 0
9. r - x <= 0
10. a - b <= 0
11. -q - r <= -1
12. min(q, y) - b <= 0
vtrace2(8 invs):
1. a*y - b == 0
2. q*y + r - x == 0
3. -q <= 0
4. -y <= -1
5. r - x <= 0
6. b - r <= 0
7. a - b <= 0
8. -b + y <= 0
vtrace3(7 invs):
1. q*y + r - x == 0
2. -q <= 0
3. -r <= 0
4. q - x <= 0
5. r - x <= 0
6. r - y <= -1
7. -r - x <= -1
-noss
option disables symbolic states and thus makes DIG behaves as a pure dynamic invariant generation tools. Here, DIG runs the program on random inputs, collects traces, and infers invariants. It does not use symbolic states and thus does not require symbolic execution tools, but it can have spurious results.sh
$ time ~/miniconda3/bin/python3 -O dig.py ../tests/cohendiv.c -log 3 -noss -nrandinps 10
settings:INFO:2021-10-23 12:37:15.965808: dig.py ../tests/cohendiv.c -log 3 -noss -nrandinps 10
alg:INFO:analyzing '../tests/cohendiv.c'
alg:INFO:analyzing '../tests/cohendiv.c'
infer.eqt:WARNING:18 traces < 35 uks, reducing to deg 2
infer.eqt:WARNING:38 traces < 84 uks, reducing to deg 2
infer.eqt:WARNING:50 traces < 84 uks, reducing to deg 2
alg:INFO:testing 678 invs using 106 traces (0.30s)
alg:INFO:simplify 670 invs (3.26s)
vtrace1 (17 invs):
1. a*y - b == 0
2. q*y + r - x == 0
3. -a <= 0
4. -r <= 0
5. -y <= -9
6. a - b <= 0
7. a - q <= 0
8. r - x <= 0
9. q - x <= -6
10. b - x <= -2
11. -a - r <= -2
12. -x - y <= -16
13. min(q, r, x) - b <= 0
14. a + q === 0 (mod 2)
15. a - q === 0 (mod 2)
16. -a - q === 0 (mod 2)
17. -a + q === 0 (mod 2)
vtrace2 (17 invs):
1. a*y - b == 0
2. q*y + r - x == 0
3. -q <= 0
4. -y <= -9
5. r - x <= 0
6. b - r <= 0
7. -b + y <= 0
8. -r + y <= -2
9. -q - x <= -12
10. min(a, b, q) - y - 1 <= 0
11. b + 2 - max(a, q, r, y) <= 0
12. q === 0 (mod 2)
13. -q === 0 (mod 2)
14. r - x === 0 (mod 4)
15. r + x === 0 (mod 2)
16. -r + x === 0 (mod 4)
17. -r - x === 0 (mod 2)
vtrace3 (9 invs):
1. q*y + r - x == 0
2. -r <= 0
3. -q <= 0
4. -y <= -9
5. r - x <= 0
6. r - y <= -1
7. -q - x <= -6
8. -q - r <= -3
9. -x - y <= -16
benchmark/c/nla
contains many programs having nonlinear invariants.src/settings.py
lists all the defaut parameters). Use -h
or --help
to see options that can be passed into DIG. Below we show several useful ones.x^7
). This can take time and so we can specify DIG to search for equalities no more than some maximum degree X
using the option -maxdeg X
. This will make DIG runs faster (with the cost of not able to find equalities with higher degrees than X
).-noeqts
, -noieqs
, -nominmax
, nocongruences
sh
$ ~/miniconda3/bin/python3 -O dig.py ../tests/cohendiv.c -log 3 -maxdeg 2 -noieqs #find equalities up to degree 2 and do not infer inequalities
...
2
variables with coefs in in the set {-1,0,1}
). We can customize DIG to find more expression inequalities (of course, with the trade-off that it takes more time to generate more expressive invs).Sqrt1.java
to demonstratesh
$ ~/miniconda3/bin/python3 -O dig.py ../benchmark/c/nla/sqrt1.c -nominmax -nocongruences # find default, octagonal, ieq's.
...
1. 2*a - t + 1 == 0
2. 4*s - t**2 - 2*t - 1 == 0
3. -a <= 0
4. a - n <= 0
5. -n + t <= 2
6. -s + t <= 0
$ ~/miniconda3/bin/python3 -O dig.py ../benchmark/c/nla/sqrt1.c -nominmax -nocongruences -ideg 2 # find nonlinear octagonal inequalities
...
1. 2*a - t + 1 == 0
2. 4*s - t**2 - 2*t - 1 == 0
3. -a <= 0
4. a - n <= 0
5. -s + t <= 0
6. -n + t <= 2
7. -s**2 + t**2 <= 0
$ ~/miniconda3/bin/python3 -O dig.py ../benchmark/c/nla/sqrt1.c -nominmax -nocongruences -icoefs 2 # find linear inequalities with coefs in {2,-1,0,1,2}
...
1. 2*a - t + 1 == 0
2. 4*s - t**2 - 2*t - 1 == 0
3. -a <= 0
4. a - n <= 0
5. -n + 2*t <= 6
6. -2*n + s <= 2
7. -2*s + 2*t <= 0
csv
file consisting of program traces and it will infer invariants just over those traces (i.e., pure dynamic).d
(-maxdeg d
) will tell DIG not to search for nonlinear invariants with degree more than d
or disabling certain types of invariants if you’re not intested in them (e.g., -nominmax
to disable the computation of min/max properties)sh
@article{nguyen2021using,
title={Using symbolic states to infer numerical invariants},
author={Nguyen, Thanhvu and Nguyen, KimHao and Dwyer, Matthew B},
journal={IEEE Transactions on Software Engineering},
volume={48},
number={10},
pages={3877--3899},
year={2021},
publisher={IEEE}
}
sh
@inproceedings{nguyen2012using,
title={Using dynamic analysis to discover polynomial and array invariants},
author={Nguyen, ThanhVu and Kapur, Deepak and Weimer, Westley and Forrest, Stephanie},
booktitle={2012 34th International Conference on Software Engineering (ICSE)},
pages={683--693},
year={2012},
organization={IEEE}
}
Technical information about DIG and projects build upon DIG can be found from these papers. The tool
paper (ICSE22
) and Symbolic States
paper (TSE21
) are probably a good starting point.
Distinguish Paper Award