An efficient open-source automatic theorem prover for satisfiability modulo theories (SMT) problems.
View on GitHub
Downloads
Documentation
Tutorials
Blog
Join the Conversation
People
Publications
Awards
Third Party Applications
Acknowledgements
Newsletter
by Alex Ozdemir
cvc5 is an SMT solver with support for many theories: integers, reals, bit-vectors, algebraic datatypes, separation logic, sets, bags, and more. This means that is can find satisfying solutions for formulas involving these kinds of objects. In this post, I explain the process of adding a new theory to cvc5: the theory of prime-order finite fields.
This post serves as a record of the patches I made to cvc5 to add this theory, and why I made them. This post is not likely to be very comprehensible to someone who doesn’t know both how to use an SMT solver and a bit about how they are implemented. For information about how they are implemented, the textbook “A Handbook of Satifiability” is a good place to start. For more information on how to use cvc5, check out our examples.
The theory we’d like to add is that of finite fields. A finite field is a finite set of values with \(\times\) and \(+\) operations which respect the usual properties: associativity, inverses, commutativity, and distributivity.
Our focus will be on prime fields, which can be represented as integers modulo a fixed prime \(p\), written \(\mathbb{F}_p\) or \(\mathbb{F}\) for brevity.
So, we’d like cvc5 to be able to determine things like this:
Our first task is to identify the sorts, operators, predicates, and constants for this theory:
We begin by creating a concrete value type for prime field elements.
This is just adding a file & class to /util
. Roughly, the class looks like
this:
class FiniteFieldValue {
public:
FiniteFieldValue(const Integer& val, const Integer& size);
const Integer& getValue();
const Integer& getFieldSize();
friend bool operator==(const FiniteFieldValue&, const FiniteFieldValue&);
friend bool operator!=(const FiniteFieldValue&, const FiniteFieldValue&);
friend bool operator<(const FiniteFieldValue&, const FiniteFieldValue&);
friend bool operator>(const FiniteFieldValue&, const FiniteFieldValue&);
friend bool operator<=(const FiniteFieldValue&, const FiniteFieldValue&);
friend bool operator>=(const FiniteFieldValue&, const FiniteFieldValue&);
friend FiniteFieldValue operator+(const FiniteFieldValue&, const FiniteFieldValue&);
friend FiniteFieldValue operator-(const FiniteFieldValue&, const FiniteFieldValue&);
friend FiniteFieldValue operator-(const FiniteFieldValue&);
friend FiniteFieldValue operator*(const FiniteFieldValue&, const FiniteFieldValue&);
friend FiniteFieldValue operator/(const FiniteFieldValue&, const FiniteFieldValue&);
FiniteFieldValue recip() const;
// ...
};
The PR that creates the class is here. In that PR, it’s name is
FiniteField
, which gets changed to FiniteFieldValue
later.
Note that Integer
is cvc5’s internal multiprecision integer type.
Next, we add boilerplate that defines a new theory in cvc5. The PR is here.
This boilerplate is organized around a shell-script called the “kinds file”. It declares basic theory and term utilities.
Term utilities include:
FfSize
: a type for the cardinality of a fieldFiniteFieldProperties::computeCardinality
: computes the aboveFiniteFieldEnurator
: enumerates all elements of a fieldCONST_FINITE_FIELD
FiniteFieldValue
from the last PRFINITE_FIELD_MULT
FINITE_FIELD_ADD
FINITE_FIELD_NEG
Each theory utility is materialized as a class. We’ll discuss them in more detail in a moment.
TheoryFiniteFieldsRewriter
: term rewriterTheoryFiniteFields
: the theory solverFiniteFieldConstantTypeRule
: for CONST_FINITE_FIELD
FiniteFieldFixedFieldTypeRule
: for the other kinds (multiplication and addition)The kinds file itself is a shell script that uses a number of cvc5-defined shell functions to declare all of the above information. So, essentially the file contains a bunch of pointers to difference C++ classes that implement different parts of the theory. Below, there is a rendering of the file with additional comments (and some omissions).
theory THEORY_FF \
::cvc5::internal::theory::ff::TheoryFiniteFields \
"theory/ff/theory_ff.h"
typechecker "theory/ff/theory_ff_type_rules.h"
rewriter ::cvc5::internal::theory::ff::TheoryFiniteFieldsRewriter \
"theory/ff/theory_ff_rewriter.h"
# declare a kind (FINITE_FIELD_TYPE) for the field sort itself
constant FINITE_FIELD_TYPE \
struct \
FfSize \
"::cvc5::internal::FfSizeHashFunction" \
"util/ff_val.h" \
"finite field type"
# meta-data about the field sort
cardinality FINITE_FIELD_TYPE \
"::cvc5::internal::theory::ff::FiniteFieldProperties::computeCardinality(%TYPE%)" \
"theory/ff/theory_ff_type_rules.h"
enumerator FINITE_FIELD_TYPE \
"::cvc5::internal::theory::ff::FiniteFieldEnumerator" \
"theory/ff/type_enumerator.h"
well-founded FINITE_FIELD_TYPE \
true \
"(*cvc5::internal::theory::TypeEnumerator(%TYPE%))" \
"theory/type_enumerator.h"
# declare a kind (CONST_FINITE_FIELD) for field values
constant CONST_FINITE_FIELD \
class \
FfVal \
::cvc5::internal::FfValHashFunction \
"util/ff_val.h" \
"a finite-field constant; payload is an instance of the cvc5::internal::FfVal class"
## declare other ff kinds and typing rules
operator FINITE_FIELD_MULT 2: "multiplication of two or more field elements"
operator FINITE_FIELD_NEG 1 "unary negation of a field element"
operator FINITE_FIELD_ADD 2: "addition of two or more field elements"
typerule CONST_FINITE_FIELD ::cvc5::internal::theory::ff::FiniteFieldConstantTypeRule
typerule FINITE_FIELD_MULT ::cvc5::internal::theory::ff::FiniteFieldFixedFieldTypeRule
typerule FINITE_FIELD_NEG ::cvc5::internal::theory::ff::FiniteFieldFixedFieldTypeRule
typerule FINITE_FIELD_ADD ::cvc5::internal::theory::ff::FiniteFieldFixedFieldTypeRule
endtheory
A rewriter extends TheoryRewriter
and must implement:
preRewrite
and postRewrite
.
For now we implement both trivially, returning
RewriteResponse(REWRITE_DONE, original_term)
.
The theory solver extends Theory
and is considerably more complex.
Again, we implement it trivially for now:
in void
methods we do nothing, and in methods that return something, we
return a default/null value.
One set of non-trivial methods is those that interact with the theory’s
equality engine.
We create members of of type
TheoryInferenceManager
and
TheoryEqNotifyClass
, initialize them, and implement the following:
bool TheoryFiniteFields::needsEqualityEngine(EeSetupInfo& esi)
{
esi.d_notify = &d_eqNotify;
esi.d_name = "theory::ff::ee";
return true;
}
void TheoryFiniteFields::finishInit()
{
Assert(d_equalityEngine != nullptr);
d_equalityEngine->addFunctionKind(Kind::FINITE_FIELD_MULT);
d_equalityEngine->addFunctionKind(Kind::FINITE_FIELD_NEG);
d_equalityEngine->addFunctionKind(Kind::FINITE_FIELD_ADD);
}
The first method says that we need an equality engine for out theory. The second registers our function symbols with that engines (so it can do congruence closure).
The type checker is a collection of type checking rules.
Each rule is a class with a static, public method called computeType
.
Here is an example: our type checking rule for addition and multiplication
nodes:
TypeNode FiniteFieldFixedFieldTypeRule::computeType(NodeManager* nodeManager,
TNode n,
bool check)
{
TNode::iterator it = n.begin();
TypeNode t = (*it).getType(check);
if (check)
{
if (t.getKind() != kind::FINITE_FIELD_TYPE)
{
throw TypeCheckingExceptionPrivate(n, "expecting finite-field terms");
}
TNode::iterator it_end = n.end();
for (++it; it != it_end; ++it)
{
if ((*it).getType(check) != t)
{
throw TypeCheckingExceptionPrivate(
n, "expecting finite-field terms from the same field");
}
}
}
return t;
}
The arguments have the following meaning:
nodeManager
is a Node Manager that can be used to build terms.n
is the node that we are type-checkingcheck
is true if we should check all children’s type for consistency with
this node’s type. Otherwise, we should just compute this node’s type.Now, it’s time to implement a non-trivial rewriter. The PR is here. A rewriter is responsible for normalizing and/or simplifying the representation of theory terms. Finite field terms contain negations, additions, and multiplications. For example
\[xy + (-xy) + 7 + (x + 2)(y + 3) + 2\]The goals for our rewriter are:
So we might rewrite the above to
\[(x + 2)(y + 3) + 9\]The SMT-wide rewriter rewrites terms in a downward pass, followed by an upward
pass. For each term, it allows the theory for that term to specify a rewrite.
A theory specifies a rewriter by implementing a subclass of TheoryRewriter
:
class TheoryFiniteFieldsRewriter : public TheoryRewriter
{
public:
RewriteResponse postRewrite(TNode node) override;
RewriteResponse preRewrite(TNode node) override;
};
The downward pass hook is TheoryRewriter::preRewrite
.
The upward pass hook is TheoryRewriter::postRewrite
.
On our downward pass, we eliminate negation.
On our upward pass, we normalize \(n\)-ary operators and combine like terms.
Inside cvc5, a statistics system is used to record and review dynamic information about a solve attempt. For example, there are statistics that record how often each theory reports a conflict, reports a lemma, and propagates a literal. There are statistics that count total node allocations and measure total solve time. And many more.
Pre-emptively, we create an object to hold the finite field solver’s statistics, here.
At some point, you need to implement the internal logic of your theory. For a simple theory, this means taking a conjunction of theory literals, and determining:
In the case of finite-fields, you can map the literals to set of multivariate polynomials \(P\) such that a model corresponds to a common root \(\vec x\). Each polynomial corresponds to one field literal; if a subset of polynomials has no common roots, then the corresponding subset of literals is an UNSAT core.
Anyways, our algebraic task is to find a common root for \(P\), and if not, attempt to identify a root-free subset of \(P\). To do this, we build a few pieces of machinery:
This machinery makes up the core logic of our theory. However, we have not yet connected that logic to the Theory interface.
The theory of finite fields is responsible for fields of all sizes. However, each field is independent of the others: there are no conversion operators between fields, and the only predicate (equality) takes arguments of the same field. Thus, we implement the theory of finite fields by implementing a “sub-theory” that handles the field \(\mathbb{F}_p\), where \(p\) is fixed.
This class is worth discussing in more detail because its interface foreshadows the larger theory. Here is the public interface:
class SubTheory : protected EnvObj, protected context::ContextNotifyObj
{
public:
SubTheory(Env& env, FfStatistics* stats, Integer modulus);
void preRegisterTerm(TNode term);
void notifyFact(TNode fact);
void postCheck(Theory::Effort effort);
bool inConflict() const;
const std::vector<Node>& conflict() const;
const std::unordered_map<Node, Node>& model() const;
private:
// ...
};
Now, let’s discuss each method:
SubTheory(env, stats, modulus)
modulus
: the integer \(p\)preRegisterTerm(term)
(= x y)
might not be pre-registered, but x
and y
will
benotifyFact(fact)
postCheck
).postCheck(effort)
STANDARD
: after Boolean constraint propagation in the SAT solver; a
full Boolean assignment has not yet been found.FULL
: after a full Boolean assignment has been found
LAST_CALL
: after a model is availablebool inConflict()
const std::vector<Node>& conflict()
const std::unordered_map<Node, Node>& model()
We implement our subtheory in this PR,
with a few fixes in a later PR.
Essentially, our implementation waits
for FULL
effort, and then runs the common root finder
from the previous section.
The final step is implementing the theory solver itself and modifying cvc5’s APIs and parser to make the the theory solver accessible to cvc5’s users. The PR is here. This is the PR where we change the name of the class that represents finite field values, to the diff is quite messy. A cleaner (but incomplete) diff can be found here: the initial commit of the PR.
A laundry list of changes that we make:
TheoryFiniteFields
:
InferenceId::ARITH_FF
In the foregoing, we developed a number of FF-internal data structures that
were incremental.
Specifically, we built data structures for incrementally building a set of polynomials,
finding their common roots,
and blaming the lack of common roots on a subset.
However, in our implementation of the sub-theory,
we didn’t actually use these data structures very incrementally:
we pushed polynomials to them incrementally (as we we passed new FF facts),
but we only triggered a search for common roots at FULL
effort.
Our choice to not use incrementality was guided by experiments
that showed it didn’t improve performance.
In a later PR, we made the data structures non-incremental,
which simplified the codebase significantly
and also improved performance a bit.
It didn’t affect the interface between the theory of finite fields and the
rest of the SMT solver at all.
To give some more detail, before this patch, the core data structure was an incremental Ideal:
class IncrementalIdeal : EnvObj
{
public:
IncrementalIdeal(Env& env, CoCoA::ring polyRing);
// Add new generators
void pushGenerators(std::vector<CoCoA::RingElem>&& gens);
// Remove the last batch of generators
void pop();
// Is the ideal the whole ring?
bool idealIsTrivial();
// For a trivial ideal, return a (sub)list of generator indices that generate it.
const std::vector<size_t>& trivialCoreIndices();
// For a trivial ideal, return a (sub)list of generators that generate it.
std::vector<CoCoA::RingElem> trivialCore();
// For a non-trivial idea, check whether there is a base-field variety member.
bool hasSolution();
// For a non-trivial idea with a base-field variety member, get it.
const std::vector<CoCoA::RingElem>& solution();
private:
// ...
};
After this patch, that data structure was eliminated, and replaced by a single function that attempted to find a zero for a set of polynomials: returning either a zero or a subset of the polynomials that had no zero.
Note: this content is cross-posted to our wiki page and to Alex’s blog.
Check out our GitHub Discussions if you have any questions.
Back to All Blog Posts