SMT Theories
A key feature of SMT is that the entire problem is parameterized by the choice of a theory \(T\). This is important because it means that SMT is an algorithmic framework, rather than a fixed algorithm. Thus, if a particular problem cannot easily be encoded in any existing theory supported by SMT solvers, one option is to add support for a new theory which is better suited to the problem. In fact, this is exactly the process by which many of the theories supported by modern SMT solvers were added.
Theories can be used alone or in arbitrary combinations. Besides the theory,
other parameters related to the syntax of formulas include whether or not to
enable quantifiers and whether to disallow or limit the use of certain theory
operations. In the SMT-LIB standard, and in solvers that support it, these pa-
rameters are configured by specifying a logic. A logic identifies the theory
(or theories) being used and optionally imposes syntactic restrictions on the
allowed formulas. Users can provide the SMT solver with a predefined logic name
(like QF_LIA, QF_BV and UF
seen earlier) to specify which logic is to be
used. By default (i.e., if no logic name is provided), SMT solvers typically
enable all the theories they support and allow all operations. This is
equivalent to using the special logic name ALL
. However, solvers are often
tuned with specific heuristics for specific logics. Thus, it is advisable to
provide the solver with the most specific logic name possible. In this section,
we discuss the most common theories and logics supported by SMT solvers, with
examples of each.
Core Theory and Uninterpreted Symbols
The SMT-LIB standard defines a core theory which consists of a core signature
with a fixed interpretation that is always present, regardless of which other
theories are being used. The core theory defines the Boolean sort Bool
(BoolSort()
in Python), the Boolean theory constants true
and
false
(BoolVal(True)
and BoolVal(False)
in Python),
and the operators =
, not
, and
, or
, xor
, and
=>
(==
, Not
, And
, Or
,
Xor
, and Implies
in Python), all with the usual meanings.
The equality symbol =
is polymorphic: it can be applied to two terms
of the same sort, for any predefined or user-declared sort. There are also two
more polymorphic operators that require a bit more explanation. The
distinct
(Distinct
) operator takes two or more arguments of
the same sort and returns true
exactly when all the arguments have
pairwise distinct values. The ite
(If
) operator takes three
arguments, the first of which must be of Boolean sort. The other two arguments
can have any sort as long as it is the same for both. The meaning of the
ite
operator is the second argument when the first argument is true, and
the third argument otherwise.
The simplest logic that builds on the core theory is QF_UF, short for
“quantifier-free uninterpreted functions.” This logic disallows quantifiers
and does not define any new symbols beyond those in the core theory. However,
it allows the user to extend the signature with new sorts and symbols. The SMT
solver is allowed to interpret these symbols in any way it chooses. This is
why they are referred to as uninterpreted: the solver does not impose any
restrictions on the interpretation (besides the declared arity and rank). The
following example illustrates the use of uninterpreted symbols as well as the
And
and Distinct
operators.
Example 4. Let \(f\) be a unary function from \(U\) to \(U\), for some set \(U\). Check that, whatever the meaning of \(f\), if \(f(f(f(x)))=x\) and \(f(f(f(f(f(x)))))=x\), then \(f(x)=x\).
Solutions for this example are shown below.
1(set-logic QF_UF)
2(declare-sort U 0)
3
4(declare-fun f (U) U)
5(declare-const x U)
6
7(assert (and (= (f (f (f x))) x) (= (f (f (f (f (f x))))) x)))
8(assert (distinct (f x) x))
9
10(check-sat)
1from cvc5.pythonic import *
2U = DeclareSort("U")
3f = Function("f", U, U)
4x = Const("x", U)
5
6s = SolverFor('QF_UF')
7s.add(And((f(f(f(x))) == x), (f(f(f(f(f(x))))) == x)))
8s.add(Distinct(f(x), x)) # negation of f(x) = x
9print(s.check())
10
We can derive \(f(x) = x\) from the first assertion by performing a series of substitutions, and thus the problem is unsatisfiable.
1unsat
1unsat
Now, we present a simple example that illustrates the ite
operator. It
also shows that in Python, we can use !=
instead of
Distinct
to assert that two terms are distinct.
Example 5. Suppose we know that \(x\) is either equal to \(y\) or \(z\), depending on the value of the Boolean \(b\). Suppose we further know that \(w\) is equal to one of \(y\) or \(z\). Does it follow that \(x\) equals \(w\)?
Solutions are shown below.
1(set-logic QF_UF)
2(set-option :produce-models true)
3
4(declare-sort U 0)
5
6(declare-const b Bool)
7(declare-const x U)
8(declare-const y U)
9(declare-const z U)
10(declare-const w U)
11
12(assert (= x (ite b y z)))
13(assert (or (= w y) (= w z)))
14(assert (distinct x w))
15
16(check-sat)
17(get-model)
1from cvc5.pythonic import *
2U = DeclareSort("U")
3b = Const("b", BoolSort())
4x, y, z, w = Consts("x y z w", U)
5
6s = SolverFor('QF_UF')
7s.add(x == (If(b, y, z)))
8s.add(Or((w == y), (w == z)))
9s.add(x != w)
10
11if s.check() == sat:
12 m = s.model()
13 print("\n".join([str(k) + " : " + str(m[k]) for k in m]))
The SMT-LIB and Python outputs when using cvc5 are as follows.
1sat
2(
3; cardinality of U is 2
4; rep: (as @U_0 U)
5; rep: (as @U_1 U)
6(define-fun b () Bool true)
7(define-fun x () U (as @U_1 U))
8(define-fun y () U (as @U_1 U))
9(define-fun z () U (as @U_0 U))
10(define-fun w () U (as @U_0 U))
11)
1 b : True
2 w : (as @U_0 U)
3 x : (as @U_1 U)
4 y : (as @U_1 U)
5 z : (as @U_0 U)
The result tells us that it does not follow that \(x=w\). The model
gives us a counterexample to that claim. Because the sort U
is
uninterpreted, the model returned by cvc5 must choose an interpretation for
it. Here, cvc5 tells us that it is interpreting U
as a set with two
elements, named @U_0
and @U_1
. The model then specifies that
x
and y
have one value and z
and w
have the other,
so x
is not equal to w
.
Arithmetic
Though there are many tools available for arithmetic reasoning, SMT solvers are unique in their ability to reason efficiently about arbitrary Boolean combinations of arithmetic constraints, as well as to combine arithmetic reasoning with reasoning about other theories. It is important to note that SMT solvers reason precisely about both integer and real arithmetic. That is, they use arbitrary-precision arithmetic as opposed to machine integer or floating-point approximations. This means that SMT solvers are not susceptible to the numerical errors that can arise, for instance, when using floating-point arithmetic to approximate real arithmetic. It also means that for problems whose complexity lies mainly in the arithmetic reasoning, as opposed to Boolean reasoning, SMT solvers are typically slower than tools that use floating-point approximations. The underlying algorithms for arithmetic reasoning in SMT solvers are based on standard techniques that have been adapted to the SMT context, such as the Simplex algorithm [R20] and Cylindrical Algebraic Decomposition [R2].
There are a large number of logics to choose from within the arithmetic umbrella, with reasoning over reals generally more efficient than reasoning over integers, and reasoning over less expressive formulas generally more efficient than reasoning over more expressive ones. We briefly discuss the various logics here.
Difference Logic
In difference logic, every arithmetic constraint must be of the form \(x - y \bowtie c\) or \(x \bowtie c\), where \(\bowtie\ \in\{=, <, >, \le, \ge\}\), and \(c\) is a numeric theory constant. If \(x\) and \(y\) range over integers, we call it integer difference logic, and if they range over reals, we call it real difference logic. Efficient algorithms exist for both [R17], [R36]. The names of these logics are QF_IDL and QF_RDL, respectively. One application for difference logic is job shop scheduling [R41].
Example 6. Suppose we have 3 jobs to complete on 2 machines. Job 1 requires machine 1 for 10 minutes and then machine 2 for 5 minutes. Job 2 requires machine 2 for 20 minutes and then machine 1 for 5 minutes. And Job 3 requires machine 1 for 5 minutes and then machine 2 for 5 minutes. Can all jobs be completed in 30 minutes?
To solve the problem, we create integer variables for the start times of each task within each job. We assert that the start times are non-negative, each task within each job doesn’t start until the previous task finishes, and tasks on each machine don’t overlap. Finally, we check that each task finishes on time.[1]
1(set-logic QF_IDL)
2(set-option :produce-models true)
3
4(declare-const j11 Int)
5(declare-const j12 Int)
6(declare-const j21 Int)
7(declare-const j22 Int)
8(declare-const j31 Int)
9(declare-const j32 Int)
10
11(assert (and (>= j11 0) (>= j12 0) (>= j21 0) (>= j22 0) (>= j31 0) (>= j32 0)))
12(assert (and (>= (- j12 j11) 10) (>= (- j22 j21) 20) (>= (- j32 j31) 5)))
13(assert (and (or (>= (- j22 j11) 10) (>= (- j11 j22) 5))
14 (or (>= (- j31 j11) 10) (>= (- j11 j31) 5))
15 (or (>= (- j31 j22) 5) (>= (- j22 j31) 5))))
16(assert (and (or (>= (- j21 j12) 5) (>= (- j12 j21) 20))
17 (or (>= (- j32 j12) 5) (>= (- j12 j32) 5))
18 (or (>= (- j32 j21) 20) (>= (- j21 j32) 5))))
19(assert (and (<= j12 25) (<= j22 25) (<= j32 25)))
20
21(check-sat)
22(get-model)
1from cvc5.pythonic import *
2
3j11,j12,j21,j22,j31,j32 = Ints("j11 j12 j21 j22 j31 j32")
4
5s = SolverFor('QF_IDL')
6s.add(And([x >= 0 for x in [j11, j12, j21, j22, j31, j32]]))
7s.add(And(j12 - j11 >= 10, j22 - j21 >= 20, j32 - j31 >= 5))
8s.add(And(Or(j22 - j11 >= 10, j11 - j22 >= 5),
9 Or(j31 - j11 >= 10, j11 - j31 >= 5),
10 Or(j31 - j22 >= 5, j22 - j31 >= 5)))
11s.add(And(Or(j21 - j12 >= 5, j12 - j21 >= 20),
12 Or(j32 - j12 >= 5, j12 - j32 >= 5),
13 Or(j32 - j21 >= 20, j21 - j32 >= 5)))
14s.add(And(j12 <= 25, j22 <= 25, j32 <= 25))
15
16print(s.model() if s.check() == sat else "unsat")
The output is as follows.
1sat
2(
3(define-fun j11 () Int 5)
4(define-fun j12 () Int 20)
5(define-fun j21 () Int 0)
6(define-fun j22 () Int 20)
7(define-fun j31 () Int 15)
8(define-fun j32 () Int 25)
9)
1[j11 = 5, j12 = 20, j21 = 0, j22 = 20, j31 = 15, j32 = 25]
Exercise 4. What is the minimum amount of time that it will take to complete all of the jobs in Example 6? What is the minimum if we change the requirements so that Job 2 only requires machine 2 for 15, and all other requirements remain the same?
Linear Arithmetic
The logic of linear arithmetic allows arithmetic constraints to have any form
that is equivalent to \(\sum c_i x_i + b \bowtie 0\), where \(b,c_i\)
are numeric theory constants and \(\bowtie\ \in\{=,<,>,\le,\ge\}\). As
before, there are both integer and real variants, QF_LIA and
QF_LRA, respectively. One can also mix the two with QF_LIRA
. Note
that, according to the SMT-LIB standard, when using QF_LIRA
, integers and
reals should not be mixed in the same linear sum, but most solvers (including
cvc5 and z3) are more permissive and do allow mixed terms.
Example 1 is a good example of a simple QF_LIA problem.
Exercise 5. Repeat Exercise 1, but change the logic to QF_LRA, change the types of the variables from
Int
toReal
, and append.0
to each numeric constant. Now, what output does the solver give?
Nonlinear arithmetic
Moving up the expressiveness hierarchy, we next have logics for quantifier-free nonlinear arithmetic. In these logics, arbitrary polynomials are allowed in constraints. The logic QF_NRA is for nonlinear arithmetic over the reals, which is decidable but with doubly exponential complexity [R2]. On the other hand, the same logic over integers, QF_NIA, is undecidable. cvc5 implements a decision procedure for QF_NRA based on a combination of heuristic pruning and cylindrical algebraic coverings [R29]. cvc5 and other tools implement incomplete heuristic procedures for QF_NIA.
Example 7. Find a solution for \(x^2y + yz + 2xyz + 4xy + 8xz + 16 = 0\).
1(set-logic QF_NRA)
2(set-option :produce-models true)
3
4(declare-const x Real)
5(declare-const y Real)
6(declare-const z Real)
7
8(assert (= (+ (* x x y) (* y z) (* 2.0 x y z) (* 4.0 x y) (* 8.0 x z) 16.0) 0.0))
9(check-sat)
10(get-model)
1from cvc5.pythonic import *
2x, y, z = Reals("x y z")
3s = SolverFor('QF_NRA')
4s.add(x*x*y + y*z + 2*x*y*z + 4*x*y + 8*x*z + 16 == 0)
5print(s.model() if s.check() == sat else "unsat")
6
The output is as follows.
1sat
2(
3(define-fun x () Real (- 2.0))
4(define-fun y () Real (- 7.0))
5(define-fun z () Real (/ (- 44) 5))
6)
1[x = -2, y = -7, z = -44/5]
Arrays
Consider the following Python function which swaps two elements in a dictionary.
def swap(a,i,j):
tmp = a[i]
a[i] = a[j]
a[j] = tmp
If a[i]
and a[j]
happen to be equal, the dictionary
a
is unchanged by the function. To prove this fact, we could try
modeling dictionaries as uninterpreted functions. However, asserting that two
functions are equal is not allowed in first-order logic. Alternatively, we
could use a quantifier to assert that two functions return the same output when
given the same input, for any input. However, we would like to avoid
quantifiers when possible, as their use puts us in an undecidable logic.
Fortunately, the SMT-LIB standard includes a theory of arrays [R30], which can
help in this situation. The theory is perhaps more accurately viewed as a
theory of mutable maps and is parameterized by two sorts, one for the index
(corresponding to the key type of the dictionary) and one for the elements
(values in the dictionary). For example, the SMT-LIB sort (Array Int
Real)
represents arrays indexed by integers and containing reals. Note that
SMT arrays are always total, in the sense that they have an element for every
value in the index sort. In particular, an array indexed by Int
is
conceptually infinite.
The theory has two operators: select
, which takes an array and an index
and returns the element at that index, and store
, which takes an array
\(a\), an index \(i\), and an element \(e\), and returns a new
array that is the result of updating \(a\) with the element \(e\) at
index \(i\).
Typically, the theory of arrays is used in combination with other theories that
make sense for the index and element sorts. For example, the logic
QF_DT
allows quantifier-free formulas with variables that range over
integers and arrays of integers. The simplest logic with arrays is
QF_AX, in which all the sorts must be uninterpreted.
In the example below, we encode the above problem using the array theory.
Example 8. For the Python program above, show that, for arbitrary index and element sorts, if
a[i]
anda[j]
are equal, then so area
andswap(a,i,j)
.
1(set-logic QF_AX)
2(declare-sort I 0)
3(declare-sort E 0)
4(declare-fun i () I)
5(declare-fun j () I)
6(declare-fun tmp () E)
7(declare-fun a_in () (Array I E))
8(declare-fun a_out () (Array I E))
9
10(assert (= tmp (select a_in i)))
11(assert (= a_out (store (store a_in i (select a_in j)) j tmp)))
12(assert (= (select a_in i) (select a_in j)))
13(assert (distinct a_in a_out))
14
15(check-sat)
1from cvc5.pythonic import *
2I = DeclareSort("I")
3E = DeclareSort("E")
4i, j = Consts("i j", I)
5tmp = Const("tmp", E)
6array = ArraySort(I, E)
7a_in, a_out = Consts("a_in, a_out", array)
8
9s = SolverFor('QF_AX')
10
11s.add(tmp == a_in[i])
12s.add(a_out == (Store(Store(a_in, i, a_in[j]),
13 j, tmp)))
14s.add(a_in[i] == a_in[j])
15s.add(a_in != a_out)
16
17print(s.check())
18
The output is as follows.
1unsat
1unsat
Exercise 6. Another property of
swap
that we can prove is that ifa[i]
anda[j]
are distinct, thenswap
would changea
. Modify the solution for Example 8 to prove this property.
Bit-vectors
Consider a simple implementation (written in a C-like syntax) for computing the absolute value of a 32-bit integer: \(abs(x) := x < 0 \;?\; {-x} : x\). Instead of branching on \(x < 0\), it is possible to compute the absolute value of \(x\) with three or four branch-free operations [R28] as follows. Let \(\mathit{xrs}\) be an abbreviation for the arithmetic right shift (\(\mathop{>\kern-.3em>}_s\)) of \(x\) by \(31\) bits. Note that the result of this operation is either \(0\) or \(-1\) (all bits set to 1), depending on the most significant bit (MSB) of \(x\): if the MSB of \(x\) is 0, \(\mathit{xrs}\) is 0; otherwise, \(\mathit{xrs}\) is -1. Three branchless alternatives for computing the absolute value of \(x\) are as follows.
\(\mathit{abs}_1(x) := (x \oplus \mathit{xrs}) - \mathit{xrs}\)
\(\mathit{abs}_2(x) := (x + \mathit{xrs}) \oplus \mathit{xrs}\)
\(\mathit{abs}_3(x) := x - ((x \mathop{<\kern-.3em<} 1) \mathrel{\&} \mathit{xrs})\)
These branchless versions of \(\mathit{abs}(x)\) make use of the 32-bit versions of the bit-wise operations exclusive or (\(\oplus\)), bit-wise and (\(\mathrel{\&}\)), logical shift left (\(\mathop{<\kern-.3em<}\)), and arithmetic shift right (\(\mathop{>\kern-.3em>}_s\)).
We can use an SMT solver to prove whether the branchless versions are equivalent to the original implementation. Note that integers, as discussed in Arithmetic, are not a good fit, as it is difficult to model the bitwise operators using the arithmetic operators. However, the SMT-LIB standard includes a theory of fixed-size bit-vectors, which defines the bit-precise semantics of fixed-size machine integers. The name for the quantifier-free logic containing just this theory is QF_BV. Using this logic, we can easily check the equivalence of the absolute value computations.
Example 9. Show that the first branchless alternative \(\mathit{abs}_1\) is equivalent to \(\mathit{abs}\).
1(set-logic QF_BV)
2
3(declare-const x (_ BitVec 32))
4
5; orignal abs(x) version
6(define-fun abs ((x (_ BitVec 32))) (_ BitVec 32) (ite (bvslt x (_ bv0 32)) (bvneg x) x))
7
8; encode branchless abs(x)
9(define-fun xrs () (_ BitVec 32) (bvashr x (_ bv31 32)))
10(define-fun abs1 ((x (_ BitVec 32))) (_ BitVec 32) (bvsub (bvxor x xrs) xrs))
11
12(assert (distinct (abs x) (abs1 x))) ; check if abs() and abs1() are equivalent
13
14(check-sat)
1from cvc5.pythonic import *
2x = Const("x", BitVecSort(32))
3xrs = x >> 31
4s = SolverFor('QF_BV')
5s.add(If(x < 0, -x, x) != (x ^ xrs) - xrs) # prove abs() == abs1()
6print(s.model() if s.check() == sat else "unsat")
The output is as follows.
Exercise 7. Show that the second and third branchless alternatives \(\mathit{abs}_2\) and \(\mathit{abs}_3\) are equivalent to \(\mathit{abs}\).
Datatypes
Built into the SMT-LIB language is a mechanism for defining (algebraic)
datatypes. Datatypes are highly useful in applications for reasoning
about data structures like records, lists, and trees [R7]. The
quantifier-free logic name is QF_DT
.
Example 10. Model a binary tree containing integer data. Find trees \(x\) and \(y\) such that \((i)\) the left subtree of \(x\) is the same as the right subtree of \(y\) and \((ii)\) the data stored in \(x\) is greater than \(100\).
Note that we need both datatypes and integer arithmetic for this example.
cvc5 supports the logic name QF_DTLIA
, but z3 does not.
Fortunately, we can always use ALL
for the logic if a
more specific logic is not available.
1(set-logic QF_DTLIA)
2(set-option :produce-models true)
3
4(declare-datatypes ((Tree 0)) (((node (data Int) (left Tree) (right Tree)) (nil))))
5
6(declare-fun x () Tree)
7(declare-fun y () Tree)
8
9(assert ((_ is node) x))
10(assert ((_ is node) y))
11
12(assert (= (left x) (right y)))
13(assert (> (data x) 100))
14
15(check-sat)
16(get-model)
1from cvc5.pythonic import *
2decl = Datatype("tree")
3decl.declare("node", ("data", IntSort()), ("left", decl), ("right", decl))
4decl.declare("nil")
5Tree = decl.create()
6
7x, y = Consts("x y", Tree)
8
9s = SolverFor('ALL')
10s.add(Tree.is_node(x))
11s.add(Tree.is_node(y))
12s.add(Tree.left(x) == Tree.right(y))
13s.add(Tree.data(x) > 100)
14
15print(s.model() if s.check() == sat else "unsat")
16
The output from cvc5 is:
1sat
2(
3(define-fun x () Tree (node 101 nil (node 0 nil nil)))
4(define-fun y () Tree (node 0 (node 0 nil (node 0 nil nil)) nil))
5)
1[x = node(101, nil, node(0, nil, nil)), y = node(0, node(0, nil, node(0, nil, nil)), nil)]
Exercise 8. Show that a tree cannot be equal to its own left subtree.
Floating-Point Arithmetic
The most common representation of real numbers in hardware and software is
the binary floating-point number representation system as defined by the
IEEE Standard 754-2019 for Floating-Point Arithmetic [R27].
Floating-point numbers are encoded as a triple of bit-vectors: the fractional
part (the significand), the exponent (a power of 10 by which the
significand is multiplied), and a sign bit.
This representation is of limited range and precision, and thus, the domain of
floating-point numbers is finite. It also includes special values for
representing errors as not-a-number and for plus and minus infinity.
In SMT-LIB, the IEEE-754 standard is formalized as the theory of floating-point
arithmetic [R11]. The quantifier-free logic name is QF_FP
.
Example 11. The SMT-LIB standard supports a fused multiplication and addition operator
fp.fma
. Given three single precision floating-point numbers \(a\), \(b\), and \(c\), show that the floating-point fused multiplication and addition of \(a\), \(b\), and \(c\) is different from first multiplying \(a\) and \(b\) and then adding \(c\).
1(set-logic QF_FP)
2(set-option :produce-models true)
3(declare-const a Float32)
4(declare-const b Float32)
5(declare-const c Float32)
6(declare-const rm RoundingMode)
7(assert (distinct (fp.fma rm a b c) (fp.add rm (fp.mul rm a b) c)))
8(check-sat)
9(get-model)
10(get-value ((fp.fma rm a b c)))
11(get-value ((fp.add rm (fp.mul rm a b) c)))
1from cvc5.pythonic import *
2
3a, b, c = FPs("a b c", Float32())
4rm = Const("rm", RNE().sort())
5s = SolverFor('QF_FP')
6s.add(Distinct(fpFMA(rm, a, b, c), fpAdd(rm, fpMul(rm, a, b),c)))
7result = s.check()
8m = s.model()
9print(m)
10print(f'fpFMA(rm, a, b, c) = {m.eval(fpFMA(rm, a, b, c))}')
11print(f'fpAdd(rm, fpMul(rm, a, b),c) = {m.eval(fpAdd(rm, fpMul(rm, a, b),c))}')
The output from cvc5 is as follows.
1sat
2(
3(define-fun a () (_ FloatingPoint 8 24) (fp #b1 #b01111110 #b01010101010101010101011))
4(define-fun b () (_ FloatingPoint 8 24) (fp #b1 #b01111110 #b11111111111111111111111))
5(define-fun c () (_ FloatingPoint 8 24) (fp #b1 #b01111110 #b11111111111111111111111))
6(define-fun rm () RoundingMode roundTowardPositive)
7)
8(((fp.fma rm a b c) (fp #b1 #b01111101 #b01010101010101010101001)))
9(((fp.add rm (fp.mul rm a b) c) (fp #b1 #b01111101 #b01010101010101010101000)))
1[a = -1.3333333730697632*(2**-1), b = -1.9999998807907104*(2**-1), c = -1.9999998807907104*(2**-1), rm = RTP()]
2fpFMA(rm, a, b, c) = -1.333333134651184*(2**-2)
3fpAdd(rm, fpMul(rm, a, b),c) = -1.3333330154418945*(2**-2)
Exercise 9. Modify the solution to Example 11 to show that floating-point addition is not associative, i.e., \(a + (b + c) \not= (a + b) + c\).
Strings
It is often necessary to reason about string data when reasoning about
programs. Reasoning about bit-vector representations of strings has the
disadvantage that it requires fixing the string length up front. Also, the
theory of bit-vectors does not include many of the utility functions for
strings that exist in string libraries in programming languages. The SMT-LIB
theory of strings provides support for variable-length strings and a large set
of string operations. The quantifier-free logic name is QF_S
. Typically,
though, we use QF_SLIA
since we need arithmetic to reason about string
lengths.
Example 12. Given two strings,
x1
andx2
, each consisting of no more than two characters, is it possible to build the string"abbaabb"
using only 3 string concatenations (where each concatenation may use any previous result includingx1
andx2
)?
We can solve this problem by building a circuit of string concatenations and using nondeterministic choice to pick the inputs for each concatenation.
1(set-logic QF_SLIA)
2(set-option :produce-models true)
3
4(declare-const p1 Bool)
5(declare-const p2 Bool)
6(declare-const p3 Bool)
7(declare-const p4 Bool)
8(declare-const p5 Bool)
9(declare-const p6 Bool)
10(declare-const p7 Bool)
11(declare-const p8 Bool)
12(declare-const p9 Bool)
13(declare-const p10 Bool)
14(declare-const p11 Bool)
15(declare-const p12 Bool)
16
17(declare-const x1 String)
18(declare-const x2 String)
19(declare-const x3 String)
20(declare-const x4 String)
21(declare-const x5 String)
22
23(declare-const i1 String)
24(declare-const i2 String)
25(declare-const i3 String)
26(declare-const i4 String)
27(declare-const i5 String)
28(declare-const i6 String)
29
30(define-const result String "abbaabb")
31
32(assert (and (<= (str.len x1) 2) (<= (str.len x2) 2)))
33
34(assert (= i1 (ite p1 x1 x2)))
35(assert (= i2 (ite p2 x1 x2)))
36(assert (= x3 (str.++ i1 i2)))
37
38(assert (= i3 (ite p3 x1 (ite p4 x2 x3))))
39(assert (= i4 (ite p5 x1 (ite p6 x2 x3))))
40(assert (= x4 (str.++ i3 i4)))
41
42(assert (= i5 (ite p7 x1 (ite p8 x2 (ite p9 x3 x4)))))
43(assert (= i6 (ite p10 x1 (ite p11 x2 (ite p12 x3 x4)))))
44(assert (= x5 (str.++ i5 i6)))
45
46(assert (= x5 result))
47
48(check-sat)
49(get-model)
1from cvc5.pythonic import *
2p, x, i = {}, {}, {}
3for k in range(1, 13): p[k] = Bool("p" + str(k))
4for k in range(1, 6): x[k] = String("x" + str(k))
5for k in range(1,7): i[k] = String("i" + str(k))
6
7result = StringVal("abbaabb")
8
9s = SolverFor('QF_SLIA')
10s.add(And(Length(x[1]) <= 2, Length(x[2]) <= 2))
11
12s.add(i[1] == If(p[1], x[1], x[2]))
13s.add(i[2] == If(p[2], x[1], x[2]))
14s.add(x[3] == Concat(i[1], i[2]))
15
16s.add(i[3] == If(p[3], x[1], If(p[4], x[2], x[3])))
17s.add(i[4] == If(p[5], x[1], If(p[6], x[2], x[3])))
18s.add(x[4] == Concat(i[3], i[4]))
19
20s.add(i[5] == If(p[7], x[1], If(p[8], x[2], If(p[9], x[3], x[4]))))
21s.add(i[6] == If(p[10], x[1], If(p[11], x[2], If(p[12], x[3], x[4]))))
22s.add(x[5] == Concat(i[5], i[6]))
23
24s.add(x[5] == result)
25
26print(s.model() if s.check() == sat else "unsat")
The output is as follows:
1sat
2(
3(define-fun p1 () Bool true)
4(define-fun p2 () Bool false)
5(define-fun p3 () Bool true)
6(define-fun p4 () Bool true)
7(define-fun p5 () Bool false)
8(define-fun p6 () Bool false)
9(define-fun p7 () Bool false)
10(define-fun p8 () Bool false)
11(define-fun p9 () Bool true)
12(define-fun p10 () Bool false)
13(define-fun p11 () Bool false)
14(define-fun p12 () Bool false)
15(define-fun x1 () String "a")
16(define-fun x2 () String "bb")
17(define-fun x3 () String "abb")
18(define-fun x4 () String "aabb")
19(define-fun x5 () String "abbaabb")
20(define-fun i1 () String "a")
21(define-fun i2 () String "bb")
22(define-fun i3 () String "a")
23(define-fun i4 () String "abb")
24(define-fun i5 () String "abb")
25(define-fun i6 () String "aabb")
26)
1[i1 = "a", i2 = "bb", i3 = "a", i4 = "abb", i5 = "abb", i6 = "aabb", p1 = True, p10 = False, p11 = False, p12 = False, p2 = False, p3 = True, p4 = True, p5 = False, p6 = False, p7 = False, p8 = False, p9 = True, x1 = "a", x2 = "bb", x3 = "abb", x4 = "aabb", x5 = "abbaabb"]
Exercise 10. Use SMT to determine how many concatenations are needed to get
"abbaabb"
ifx1
andx2
are both restricted to have a length of 1.
Quantifiers
We saw an example of quantified formulas in Example 3.
Quantifiers can be enabled in SMT solvers by dropping QF
from the logic name.
However, enabling quantifiers typically increases the complexity of the
decision problem significantly. In fact, solving UF
problems is equivalent
to solving the decision problem for first-order logic, Hilbert’s original
Entscheidungsproblem, which is undecidable. And although LIA, LRA, and
NRA are decidable, the decision procedures are expensive. For these reasons,
SMT solvers mostly handle quantifiers by attempting to find quantifier
instantiations that, together with the other quantifier-free assertions, are
unsatisfiable. For problems that are expected to be unsatisfiable, this
approach can be quite effective. Moreover, by using different instantiation
techniques and effort levels, a wide variety of problems can be solved.
cvc5 supports several techniques for handling quantified formulas, which can
vary based on the logic. By default, cvc5 limits its effort so that it
usually returns quickly with an answer of either unsat
or
unknown
. For logics that include uninterpreted functions, it uses a
combination of E-matching [R31] and conflict-based instantiation [R40]. In
case the user wants to invest more effort, these techniques can be supplemented
with techniques such as enumerative instantiation [R38] (option
enum-inst
). For logics that admit quantifier elimination (e.g.,
quantified linear arithmetic or bit-vectors), it uses counterexample-guided
quantifier instantiation [R34], [R39], which is a complete procedure for
these logics.
By default, cvc5 will generally not attempt to determine that an input with
quantified formulas is satisfiable. However, more advanced techniques can be
used to answer sat
in the presence of quantified formulas, including
finite model finding [R37] (option finite-model-find
), model-based
quantifier instantiation [R23] (option mbqi
), and syntax-guided
quantifier instantiation [R35] (option sygus-inst
).
In general, to set options that are not on by default, we can use the
setOption
solver method in Python, as shown below.
from cvc5.pythonic import *
s = SolverFor('UF')
s.setOption('enum-inst', True)
s.setOption('finite-model-find', True)
s.setOption('mbqi', True)
s.setOption('sygus-inst', True)
Non-standard Theories
cvc5 and z3 support several theories that are not (yet) part of the SMT-LIB standard. We discuss a few of them briefly here, focusing on those supported by cvc5. More documentation about non-standard theories, including reference tables describing the supported operators can be found on the cvc5 website.
Sequences
The theory of sequences brings together features of the theories of arrays
and strings. Similar to arrays, sequences are parameterized by the sort of
their elements. So we can declare a sequence of integers, a sequence of
bit-vectors, and so on. Like strings, sequences have a variable but finite
length and can be concatenated together. The sequence theory is enabled
whenever the string theory is enabled (e.g., by using the logic name QF_S
or
QF_SLIA
. Note that z3 also supports a theory of sequences that is mostly
(but not entirely) compatible with the cvc5 version.
Example 13. Let
x
be a sequence of integers. Find a value forx
such that the first and last elements sum to 9, and if we concatenatex
with itself, then \((3,4,5)\) appears as a subsequence.
1(set-logic QF_SLIA)
2(set-option :produce-models true)
3
4(declare-const x (Seq Int))
5(declare-const y (Seq Int))
6(declare-const z (Seq Int))
7
8(assert (> (seq.len x) 0))
9(assert (= (+ (seq.nth x 0) (seq.nth x (- (seq.len x) 1))) 9))
10
11(assert (= y (seq.++ x x)))
12(assert (= z (seq.++ (seq.unit 3) (seq.unit 4) (seq.unit 5))))
13(assert (seq.contains y z))
14
15(check-sat)
16(get-model)
1from cvc5.pythonic import *
2x, y, z = Consts("x y z", SeqSort(IntSort()))
3
4s = SolverFor('QF_SLIA')
5
6s.add(Length(x) > 0)
7s.add(x[0] + x[Length(x) - 1] == 9)
8s.add(y == Concat(x,x))
9s.add(z == Concat(Unit(IntVal(3)), Unit(IntVal(4)), Unit(IntVal(5))))
10s.add(Contains(y, z))
11
12print(s.model() if s.check() == sat else "unsat")
13
The output is as follows.
1sat
2(
3(define-fun x () (Seq Int) (seq.++ (seq.unit 4) (seq.unit 3) (seq.unit 4) (seq.unit 5)))
4(define-fun y () (Seq Int) (seq.++ (seq.unit 4) (seq.unit 3) (seq.unit 4) (seq.unit 5) (seq.unit 4) (seq.unit 3) (seq.unit 4) (seq.unit 5)))
5(define-fun z () (Seq Int) (seq.++ (seq.unit 3) (seq.unit 4) (seq.unit 5)))
6)
1[x = (seq.++ (seq.unit 4) (seq.unit 3) (seq.unit 4) (seq.unit 5))(), y = (seq.++ (seq.unit 4) (seq.unit 3) (seq.unit 4) (seq.unit 5) (seq.unit 4) (seq.unit 3) (seq.unit 4) (seq.unit 5))(), z = (seq.++ (seq.unit 3) (seq.unit 4) (seq.unit 5))()]
Exercise 11. Show that it’s not possible to have sequences
x
,y
, andz
such thatx
is a proper prefix ofy
,y
is a proper prefix ofz
, andz
is a proper prefix ofx
.
Finite Fields
cvc5 can reason about constraints over finite fields of order \(p\),
where \(p\) is any prime. It relies on the fact that a field of order
\(p\) is isomorphic to the integers modulo \(p\). The quantifier-free
logic name for finite fields is QF_FF
. At the time of writing, this
theory is not supported by other SMT solvers.
Example 14. In a finite field of order 13, find two elements such that their sum and product are both equal to the multiplicative identity in the field.
Running this example requires a GPL build of cvc5, as explained in the Introduction.
1(set-logic QF_FF)
2(set-option :produce-models true)
3
4(define-sort F () (_ FiniteField 13)) ; F is an alias of sort (_ FiniteField 13)
5
6(declare-const x F)
7(declare-const y F)
8
9(assert (= (ff.add x y) (as ff1 F)))
10(assert (= (ff.mul x y) (as ff1 F)))
11
12(check-sat)
13(get-model)
1from cvc5.pythonic import *
2x, y = FiniteFieldElems("x y", 13)
3s = SolverFor("QF_FF")
4s.add(x + y == 1)
5s.add(x * y == 1)
6print(s.model() if s.check() == sat else "unsat")
7
The output is as follows.
1sat
2(
3(define-fun x () (_ FiniteField 13) #f4m13)
4(define-fun y () (_ FiniteField 13) #f10m13)
5)
1[x = 4, y = -3]
Exercise 12. In a finite field of order 13, find an element such that if you square it twice you get the multiplicative identity.
Finite Sets
cvc5 has support for the theory of finite sets. This theory supports basic
set operations like membership, union, and intersection, as well as constraints
on a set’s cardinality. The quantifier-free logic name is QF_FS
. At
the time of writing, this theory is not supported by other SMT solvers.
Example 15. Verify that union distributes over intersection.
1(set-logic QF_FS)
2
3(declare-sort U 0)
4
5(declare-const A (Set U))
6(declare-const B (Set U))
7(declare-const C (Set U))
8
9(assert (not (= (set.union A (set.inter B C)) (set.inter (set.union A B) (set.union A C)))))
10
11(check-sat)
1from cvc5.pythonic import *
2S = DeclareSort("S")
3A, B, C = [Set(i, S) for i in ["A","B","C"]]
4s = SolverFor('QF_FS')
5s.add(Not((A | (B & C)) == ((A | B) & (A | C))))
6print(s.check())
7
As, expected, the result is unsat
1unsat
1unsat
Exercise 13. Does set difference distribute over intersection? If not, find a counterexample.
Combinations of Theories
So far, we have mostly seen examples of how to pose queries that involve a single theory. Part of the appeal of SMT solvers is their ability to mix reasoning about different theories. This can be done in a natural way. Any well-sorted formula is allowed, and all sort constructors can take any other sort as an argument.
One slight complication is the question of how to specify the logic name. It
is always safe to use ALL
as the logic name, though as mentioned above,
it may be more efficient to give a more precise logic name. When mixing
theories, cvc5 allows any logic name that follows the following rules.
First, the logic name must start with the prefix QF_
if the intent is to
limit reasoning to quantifier-free formulas. The rest of the logic name can
include any of the following components: \((i)\) A
for
arrays; \((ii)\) UF
for uninterpreted functions; \((iii)\)
BV
for bit-vectors; (iv) FP
for floating-point numbers;
\((v)\) DT
for datatypes; (vi) S
for strings and sequences;
\((vii)\) either IDL
, RDL
, LIA
, LRA
,
LIRA
, NIA
, NRA
, or NIRA
for arithmetic;
\((viii)\) FF
for finite fields; and \((ix)\) FS
for
finite sets. Thus, for example, QF_AUFDTBVLRA
allows formulas that are
quantifier-free and mix arrays, uninterpreted functions, datatypes,
bit-vectors, and linear real arithmetic.
Example 10, Example 12, and Example 13 illustrate
combinations of theories.