1 Introduction: from algebraic soft constraints to practical solvers

Many (perhaps most) industrial combinatorial optimization problems tend to be over-constrained [67]. The most common remedy is to iteratively refine the initial constraint model by manually weakening or dropping constraints until a solution can be found. However, this approach does not work if the actual problem instance, i.e., all input parameters, is only available at runtime. This is the case, for instance, when a system is intended to act autonomously (e.g., smart homes or smart grids). Simply failing with unsatisfiable is not an option—instead, a compromise solution is necessary. To accomplish this, a constraint model has to be written with the intention of graceful degradation in the first place. Some constraints have to be softened if necessary or even ignored.

A second driving force is the need to model users’ preferences to discriminate the set of feasible solutions. In discrete optimization and decision theory, this role is assumed by the objective (or, utility), a function mapping variable assignments to some numeric codomain such as the rational numbers. The goal is to minimize or maximize the function’s value. While many optimization problems feature a natural choice for a numeric objective (e.g., minimizing the makespan in scheduling, or minimizing the encompassing area in packing), adequately eliciting and modeling preferences in real-world settings is generally more involved and spans its own area of research [37]. The essence is an ordering relation over the available options, i.e., solutions.

Directly assessing this relation is impractical due to the exponential number of choices in a combinatorial setting [37]. Instead, several more compact preference formalisms for various use cases have been proposed. For example, preferences could be stated numerically as penalties incurred for violated soft constraints (as in weighted constraints [2]) or as degrees ranging from 0% to 100% satisfaction (as in fuzzy constraints [52]). Alternatively, one could state comparative constraint preferences [55], such as that constraint “two nurses should be on night shift” is more important than constraint “nurse Jones should be off-duty” if no solution satisfies both desirable constraints simultaneously (see Fig. 1 for an example), or that the value “white wine” is more desirable than “red wine” for a variable “drink” if another variable “meal” is assigned value “fish”, as in CP-nets [16]. For these comparative formalisms, inventing a numeric objective function that represents the modeler’s ordering relation is a cognitively demanding task in its own right. If users fail to easily assign numeric values, this quantification should be distinct from the elicitation of preferences. Adequately modeling preferences then amounts to an exercise in “objective engineering”.

Fig. 1

Left: A rostering problem involving three nurses ni with (comparative) constraint preferences. Not all three constraints can be satisfied simultaneously, e.g., c1 forces that either n2 or n3 take the night shift which conflicts with c2 or c3. There are solutions satisfying two out of three constraints. The graph depicts a partial ordering U of the constraints with c1 being most important and c2 being incomparable to c3. Center: An ordering over sets of violated constraints defined inductively by the two above rules (called the Smyth-ordering). Right: The Hasse diagram of ≺ over the valuation space: No violation (∅) is best, and, e.g., {c2, c3} is better than {c1, c2} since it violates the less important constraint c3 instead of c1

Both motivations have been recognized for many years (see Section 2), leading to a unified theory of soft constraints that subsumes over-constrained problems and preferences [43]. It offers a more general treatment of satisfaction (or violation) degrees as an ordered set accompanied by a combination operation (to combine the valuations of several soft constraints for an assignment) and dedicated top and bottom elements, i.e., an algebraic structure. Instead of working with well-known specific orderings, such as \((\mathbb {N}, \leq )\), calculations and orderings are studied from an abstract algebra perspective (see Fig. 1 right for an example). The leading frameworks are c(onstraint)-semirings [12] and (totally ordered) valuation structures [58], i.e., ordered monoids. These abstractions serve to both find general complexity-theoretic results and devise search and propagation algorithms for a broad class of problems. Moreover, by means of product operators such as a direct product (for Pareto orderings) and a lexicographic product, complex valuation structures can be formed from elementary ones, allowing for modular specification and runtime combinations [29, 57]. Generalized variants of branch-and-bound, soft arc consistency [19], and non-serial dynamic programming techniques [10] such as bucket elimination or cluster tree elimination [22] use these structures. In addition, some global constraints (including alldifferent and gcc) with dedicated propagators have been generalized to a soft variant [67], usually by considering one (integer) cost variable that measures violation.

However, ready-to-use implementations for modern constraint platforms are rare—with Toulbar2 [2], a solver designed for (integer) cost function networks as a specific valuation structure, being the exception to the rule. Using cost function networks is justified since many concrete formalisms can be reduced to them in polynomial time [58]—possibly at the expense of totalizing a partial order. Still, to use Toulbar2, the problem has to be encoded in a flat format, lacking the convenient high-level abstractions that constraint modeling languages such as Essence [28], OPL [66], or MiniZinc [44] provide. Consider, e.g., reusable predicates and (partial) functions [63], option types [42], or safe indexation with decision variables, to name a few.

For novice constraint modelers, crafting new soft constraint models is hard without language tools, and the algorithmic potential of solvers remains restricted to expert users. To improve the situation for Toulbar2, a Python interface has recently been provided via the Numberjack platform [34]. To express soft constraints in conventional constraint models, on the other hand, constraint violations have to be reified in one or more variables that are then, e.g., minimized in an ad-hoc fashion. In any case, users either have to use cost function networks or encode soft constraints in existing languages without any support. For practical purposes, previous efforts in designing abstract frameworks for a variety of soft constraint formalisms are thus, unfortunately, nullified.

Therefore, this paper reduces the gap between abstract soft constraint frameworks and practical solvers by exploiting and extending MiniZinc [44], a constraint modeling language, and MiniSearch [49], a scripting language for customized MiniZinc searches, and contributes:

  • MiniBrass, a modeling language for soft constraint problems based on algebraic structures that is compiled into MiniZinc/MiniSearch code to inherit its support for a broad variety of solvers. Our language comes with an extensible type system including common types from the literature (weighted constraints [60], cost function networks, fuzzy constraints [52], constraint preferences [55], probabilistic constraints [24]).

  • A formal foundation for the semantics of the types implemented in MiniBrass which includes the systematic derivation of partially-ordered valuation structures from partial orders using category-theoretical arguments. In the course of the derivation, we survey the adequacy of abstract frameworks in the literature with respect to model expressiveness and algorithmic efficiency—with an emphasis on expressiveness. Our results show how to extend any partially-ordered valuation structure to a c-semiring, if needed.

  • The MiniBrass library providing reusable order predicates and combination functions, implemented in MiniZinc with existing global constraints.

  • An empirical evaluation using modified benchmark problems from the MiniZinc benchmark library that are supplemented with explicit soft constraints in different formalisms. We compare the solving performance of classical constraint solvers working on encoded soft constraint problems to that of a dedicated soft constraint solver (Toulbar2), the influence the used formalism has on solving time, and the efficiency of generic soft constraint search heuristics.

After giving a broad overview of related work in Section 2, we survey common definitions of soft constraint problems using algebraic structures in Section 3 and conclude that partially ordered valuation structures (PVS, [29]) are well-balanced in terms of generality and specificity. Therefore, we show how any partial order over solutions can be canonically transformed into a PVS with the “lowest overhead”. This construction of the free PVS employs the language of category theory and has been presented, in brief, in [39] without any rationale for the construction. Furthermore, we shed light on the relationship between PVS and c-semirings by exemplifying bucket elimination on the free c-semiring. Section 4 describes the design of the MiniBrass language including its type system, concepts, and toolchain. We provide use cases and examples of existing soft constraint formalisms in MiniBrass, including structured types such as the free PVS or constraint preferences, combinations of PVS, and morphisms.

Section 5 concludes by evaluating MiniBrass on a set of (slightly modified) benchmark problems taken from the MiniZinc challenges [65] to show its feasibility. The central questions are whether an encoding-based approach using a set-based ordering is i) feasible and ii) competitive to integer minimization and dedicated soft constraint solving.

2 Related work

Pioneering attempts to generalize hard constraints were discussed in partial constraint satisfaction [27]: A metric measures the distance of an assignment to the solution space of the original problem. Proposed distance choices included the number of required domain items to “patch” the assignment or the number of violated constraints. The latter is better known as Max-CSP. To distinguish constraints, various formalisms have been explored. In weighted constraint problems (WCSP), each constraint is assigned a fixed penalty that is incurred if it is violated. As mentioned before, Toulbar2 is a dedicated WCSP solver using search strategies (e.g., [3, 53]) and soft constraint propagation and filtering [19]. Also, [6] offers WSimply, a specification language for weighted CSP with a transformation to SMT.

Similar to WCSP, constraints can also be ranked qualitatively: either in layers of importance, such as in constraint hierarchies [15], or only comparatively, as done in the aforementioned constraint preferences (see Fig. 1). [46] introduced the notion of meta-constraints to explicitly talk about constraints in other constraints, such as “B has to hold only if A is violated”. Solvers provide reified variants for several constraints and cost values; MiniBrass relies on that technique. Although this seems as if we are restricted to relatively simple arithmetic constraints, recent efforts have been made to systematically provide more reified variants of global constraints [9].

Instead of placing weights on constraints themselves, fuzzy constraints [52] consider the constraints’ relation of valid assignments as fuzzy sets with a membership function declaring how strongly an assignment satisfies the constraint, ranging from 0 to 1. Another proposal suggested to interpret soft constraints probabilistically, leading to probabilistic constraints [24]: For every soft constraint, we have a probability \(p_{i}\) that the constraint is actually present. An assignment \(\theta \) is judged by the probability of it being an “actual solution”. Assuming independence, we obtain this probability by multiplying \(1-p_{i}\) for all violated soft constraints (i.e., all violated constraints have to be absent if \(\theta \) still counts as a solution).
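The probability computation for probabilistic constraints can be made concrete with a minimal sketch; the constraint names and presence probabilities below are hypothetical, not taken from the paper:

```python
from functools import reduce

# Hypothetical presence probabilities p_i for three soft constraints.
p = {"c1": 0.75, "c2": 0.5, "c3": 0.5}

def solution_probability(violated):
    """Probability that an assignment theta is an 'actual solution':
    every violated constraint must be absent; assuming independence,
    this is the product of (1 - p_i) over all violated constraints."""
    return reduce(lambda acc, c: acc * (1 - p[c]), violated, 1.0)

assert solution_probability([]) == 1.0             # nothing violated
assert solution_probability(["c2"]) == 0.5
assert solution_probability(["c1", "c2"]) == 0.125
```

Note that an assignment violating nothing gets probability 1, i.e., the neutral element of multiplication, matching the PVS view developed in Section 3.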

As mentioned in Section 1, valued constraints map an assignment to a totally ordered monoid (i.e., a valuation structure) [58]. Similarly, the c-semiring soft constraint framework labels each assignment with a value in a semiring. In addition to the multiplication operator, it foresees an additive operation acting as the supremum of the induced partial order (i.e., \(a \leq b \Leftrightarrow a + b = b\)) for comparing solutions. Having a supremum operation available is particularly useful for variable elimination approaches, as we discuss in Section 3.5. However, many relevant partial orders do not admit least upper bounds, such as, e.g., a Pareto-ordering of two orders or the Smyth-ordering presented in Fig. 1.
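As a concrete instance of the induced order, consider the standard weighted c-semiring over costs, where the additive operation is min (lower cost is better); this is a sketch of the order definition only, with names of our choosing:

```python
# Weighted c-semiring sketch: <N ∪ {inf}, min, +, inf, 0>.
# The "additive" operation picks the better (cheaper) of two costs.
INF = float("inf")

def oplus(a, b):
    return min(a, b)

def leq(a, b):
    """Induced partial order: a <= b  iff  a (+) b == b."""
    return oplus(a, b) == b

# Lower cost is *better*, hence higher in the induced order:
assert leq(5, 3) and not leq(3, 5)
assert leq(INF, 0)   # inf (total violation) is bottom, 0 is top
```

Here ⊕ = min is idempotent and the induced order is total; the point of the surrounding paragraph is precisely that such a supremum need not exist for, e.g., Pareto- or Smyth-ordered valuations.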

To compare frameworks, [33] presented an encoding of some types of constraint hierarchies as c-semirings—except for a “worst case” semantics. Later, [57] provided a mathematical explanation for the impossibility of directly encoding this missing constraint hierarchy type by noting the presence of so-called “collapsing elements” (introduced by [29]) that equalize distinct elements (i.e., \(a < b\) but \(c \cdot a = c \cdot b\)) and are prohibitive for lexicographic products. However, this line of reasoning was purely theoretical and has not yet been implemented and evaluated empirically.

As previously stated, solvers for general semiring-based or valued constraints are lacking. Most papers offer closed ad-hoc implementations focusing on one particular type, such as [51] or [14] for fuzzy CSP. By contrast, [41] provides a formulation of c-semiring-based soft constraint problems as “weighted semiring Max-SAT” that uses the semiring values and ordering as “weights”. The encoded problems are solved with basic implementations of branch-and-bound and GSAT, outperforming the fuzzy solver CONFLEX (which is not available anymore). However, these algorithms do not rely on the supremum operator of a c-semiring and could be run as well with partial valuation structures (see Section 3.5). In addition, the approach remained rather prototypical (random instances with up to 120 variables and 20 constraints), only supported strict domination search for partially ordered search spaces (see Section 4.5), and did not offer a public API to their system—which brings us to modeling languages.

MiniBrass is built on top of the MiniZinc environment which itself is a subset of the Zinc language. There are variations and extensions such as stochastic MiniZinc [48] for problems involving uncertainties, MiningZinc for constraint-based data mining [31], and MiniSearch [49] for customizable search. MiniZinc is a high-level modeling language that is compiled to the flat file format FlatZinc that is understood by many constraint, MIP, or SAT solvers. MiniSearch provides facilities to access a search tree at the solution level, making queries such as “fetch the next solution; when found, constrain the next solution to improve” (in terms of, e.g., some partial order)—effectively resulting in a form of propagation-based branch-and-bound. For abstract soft constraint models, we found this to be the right level of granularity—as opposed to a fine-tuned programmable search since we can only rely on the existence of an ordering predicate and a way to combine individual valuations. Moreover, with MiniSearch, a search strategy has to be defined just once and can be used by any FlatZinc solver instead of implementing custom search for each solver. MiniSearch does so by generating multiple FlatZinc files. In addition, there is native search tree interaction for Gecode [59].

Probably closest to our approach, [5] proposed the higher-level language concept “w-MiniZinc” which would extend MiniZinc to weighted CSP (but only weighted CSP). However, their approach was never implemented. The same authors provide a similar concept with WSimply [6] but, again, stuck to weighted constraints and did not involve other (abstract) types of soft constraints such as those defined by c-semirings and valuation structures. MiniBrass, by contrast, is designed to be easily extended with new types (fuzzy, probabilistic, constraint preferences, etc.) and puts a layer of abstraction on top of MiniZinc, being its target language of compilation. Moreover, modular specifications such as those offered by products in MiniBrass are not available in either w-MiniZinc or WSimply. In addition, the syntactical features offered by w-MiniZinc are also available in MiniBrass (see Section 4.2.1), along with more pre-defined types to choose from and the ability to define new types and to assemble complex preference structures from smaller ones.

Other constraint modeling languages include Essence [28] and OPL [66]. While OPL, due to the existence of OPL Script, would also be suited to host a soft constraint modeling language, Essence does not offer search combinators or programmable search. There, we could only work with repeated solver calls or numeric (integer) objectives—effectively recreating the facilities that MiniSearch already offers, albeit with fewer compatible solvers. OPL, on the other hand, is tied to the CP/MIP solver IBM ILOG CPLEX, whereas MiniZinc supports a broad variety of solvers—a property found useful in our evaluation in Section 5.

Clearly, other areas study preferences with different emphases, ranging from game theory, databases [38], the social sciences [1], mechanism design [45] to multi-agent systems in general [62]. Often, a preference relation is represented by numeric utilities that can be translated to weighted or fuzzy constraints. CP-nets [16] provide the most common qualitative preference language used in the above domains. Users specify total orders over the domain of a variable depending on an assignment to other variables. For instance, \(x_{1} = d_{1}, \dots , x_{n} = d_{n} : y = w_{1} \succ {\dots } \succ y = w_{k}\) indicates that if variables \(x_{i}\) are assigned to \(d_{i}\), then variable y should preferably be assigned \(w_{i}\) rather than \(w_{i + 1}\). By applying these rules transitively under a ceteris paribus assumption (all other things being equal), generally a preorder over assignments is induced. In terms of solution ordering, it is well-known that soft constraints and CP-nets are formally incomparable [43]. Compared to constraint preferences, CP-nets require users to rank domain values whereas constraint preferences are settled on a coarser level: solutions satisfying an important constraint A are better than solutions satisfying a less important constraint B—ceteris paribus. The former is obviously better suited to problems involving rather small domains whereas the latter aims at ordering a large number of solutions in equivalence classes of manageable size.

Regarding the specification and aggregation of preferences in multi-agent settings, (computational) social choice provides formal foundations by means of axiomatizing desirable properties and postulating appropriate voting rules [17, 62]. Little attention has yet been devoted to the combination of social choice with soft constraint problems consisting of n preference structures [20] even though the prevalent heterogeneity calls for such approaches. As of now, MiniBrass only supports Pareto-style and lexicographic combinations but has voting-based aggregation as a future goal. By contrast, the overall objective in distributed constraint optimization problems (DCOP) is usually a sum of local cost functions which amounts to the special case of a weighted CSP [25] as opposed to more generic frameworks.

3 Formal foundations: soft constraints and algebraic structures

We begin by reviewing our notation for conventional constraint satisfaction problems as well as soft constraint problems and then discuss how the algebraic structures underlying MiniBrass are obtained from partial orders. Although these sections are rather formal, the presented constructions and orders are implemented in the MiniBrass library (see Section 4).

3.1 Soft constraint satisfaction and optimization on partial valuation structures

As usual, a constraint (satisfaction) problem \(\mathit {CSP} = (X, D, C)\) is described by a set of decision variables X, their associated family of domains \(D = (D_{x})_{x \in X}\) containing possible values, and a set of (hard) constraints C that restrict valid assignments. An assignment \(\theta \) over scope X is a mapping from X to D, written as \(\theta \in [X \to D]\), such that each variable x maps to a value in \(D_{x}\). A (hard) constraint \(c \in C\) is understood as a map \(c : [X \to D] \to \mathbb {B}\) where we write \(\theta \models c\) to express that \(\theta \) satisfies c (i.e., \(c(\theta ) = \mathit {true}\)) and \(\theta \not \models c\) to express that \(\theta \) violates c. Each constraint has a scope \(\mathit {sc}(c) \subseteq X\), i.e., the variables that actually influence its truth value. An assignment \(\theta \) is a solution if \(\theta \models c\) holds for all \(c \in C\). The restriction of an assignment \(\theta \) to a scope \(X^{\prime } \subseteq X\) is explicitly written as \(\theta \mathord {\downarrow }X^{\prime }\).

We obtain constraint optimization problems (COP) by adding an objective function \(f : [X \to D] \to P\) where \((P, {\leq _{P}})\) is a partial order, that is, \(\leq _{P}\) is a reflexive, antisymmetric, and transitive relation over P. Elements of P are interpreted as solution degrees, denoting quality. Without loss of generality, we interpret \(m <_{P} n\) as solution degree m being strictly worse than n and restrict our attention to maximization problems. Hence, a solution degree m is optimal with respect to a constraint optimization problem \(\mathit {COP}\) if for all solutions \(\theta \) it holds either that \(f(\theta ) \leq _{P} m\) or \(f(\theta ) \parallel _{P} m\), expressing incomparability w.r.t. \(\leq _{P}\). It is reachable if there is a solution \(\theta \) such that \(f(\theta ) = m\). Non-reachable optimal solution degrees appear, e.g., as upper bounds. A solution \(\theta ^*\) is optimal if \(f(\theta ^*)\) is optimal.
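The definition of optimal solution degrees can be exercised on a small example; the following sketch uses hypothetical degrees under a Pareto order on pairs (both components to be maximized, as in the maximization convention above):

```python
def pareto_leq(m, n):
    """Componentwise (Pareto) order on pairs: m <= n iff both
    components of m are at most those of n."""
    return m[0] <= n[0] and m[1] <= n[1]

def optimal(degrees, leq):
    """A degree m is optimal iff every reachable degree d satisfies
    d <= m or is incomparable to m (neither d <= m nor m <= d)."""
    return [m for m in degrees
            if all(leq(d, m) or not leq(m, d) for d in degrees)]

degrees = [(1, 3), (2, 2), (0, 0), (2, 1)]
# (1, 3) and (2, 2) are mutually incomparable and dominate the rest.
assert optimal(degrees, pareto_leq) == [(1, 3), (2, 2)]
```

With a partial order, there may thus be several optimal degrees rather than a single maximum, which is exactly why incomparability appears in the definition.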

A soft constraint satisfaction problem (SCSP) is defined as a COP where i) the objective is decomposable into multiple objectives (i.e., soft constraints) defined on their respective scopes and ii) the codomain of the objective admits additional algebraic and ordered structure for modeling purposes, such as valuation structures [58] or c-semirings [12]. Minimal requirements are that solution degrees obtained from the soft constraints should be combined using a binary operation, called multiplication, that there should be a neutral element representing complete satisfaction, and that multiplication should be monotone with respect to the ordering so that additional violation can only harm the quality further. These properties are captured by partially ordered valuation structures (PVS).

Definition 1 (PVS)

A PVS \((M, {\cdot _{M}}, \varepsilon _{M}, {\leq _{M}})\) is a partially ordered commutative monoid where the multiplication \(\cdot _{M}\) is monotone w.r.t. the partial ordering \(\leq _{M}\) and \(\varepsilon _{M} \in M\) is both the neutral element w.r.t. \(\cdot _{M}\) and the top element w.r.t. \(\leq _{M}\). That is, \((M, {\leq _{M}})\) is a partial order and the following axioms hold for all \(m, n, o \in M\): (1) \(m \cdot _{M} n = n \cdot _{M} m\); (2) \(m \cdot _{M} (n \cdot _{M} o) = (m \cdot _{M} n) \cdot _{M} o\); (3) \(m \cdot _{M} \varepsilon _{M} = m\); (4) \(m \leq _{M} \varepsilon _{M}\); (5) \(m \leq _{M} n \Rightarrow m \cdot _{M} o \leq _{M} n \cdot _{M} o\).

A PVS M is bounded if there also exists a minimal element \(\bot \in M\) to represent complete dissatisfaction. A valuation structure [58] is a bounded PVS where \(\leq _{M}\) is a total ordering. If M and N are PVS, so are \(M \times N\), the direct (Cartesian) product, and \(M \ltimes N\), the lexicographic product—as long as some conditions on M hold, as was shown in [29, 57]. Both products have pairs of elements of the underlying sets of M and N as their elements and combination is applied component-wise. For two PVS M and N, we can construct the ordering of the direct product as follows:

$$(m, n) \leq_{M \times N} (m^{\prime}, n^{\prime}) \Leftrightarrow m \leq_{M} m^{\prime} \wedge n \leq_{N} n^{\prime} $$

which corresponds to a Pareto-ordering over the underlying orderings of M and N. Similarly, the ordering of the lexicographic product is defined as

$$(m, n) \leq_{M \ltimes N} (m^{\prime}, n^{\prime}) \Leftrightarrow (m <_M m^{\prime}) \vee (m = m^{\prime} \wedge n \leq_N n^{\prime}) $$

It allows us to express hierarchical relationships between PVS (we discuss products in MiniBrass further in Section 4.4).
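Both product orderings can be sketched directly from the two definitions above. The sketch models a PVS as a plain triple of combination, neutral element, and ordering predicate; the names are ours and not part of the MiniBrass API, and the side conditions on M required for the lexicographic product ([29, 57]) are not checked:

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class PVS:
    combine: Callable[[Any, Any], Any]
    neutral: Any
    leq: Callable[[Any, Any], bool]

# Weighted PVS (N, +, 0, >=): larger penalties are worse, i.e. lower.
W = PVS(lambda m, n: m + n, 0, lambda m, n: m >= n)

def direct(M, N):
    """Direct product: componentwise combination, Pareto ordering."""
    return PVS(lambda a, b: (M.combine(a[0], b[0]), N.combine(a[1], b[1])),
               (M.neutral, N.neutral),
               lambda a, b: M.leq(a[0], b[0]) and N.leq(a[1], b[1]))

def lex(M, N):
    """Lexicographic product: first component decides, ties broken by N."""
    return PVS(lambda a, b: (M.combine(a[0], b[0]), N.combine(a[1], b[1])),
               (M.neutral, N.neutral),
               lambda a, b: (M.leq(a[0], b[0]) and not M.leq(b[0], a[0]))
                            or (a[0] == b[0] and N.leq(a[1], b[1])))

P = direct(W, W)
assert P.leq((3, 2), (1, 2))          # dominated in both components
assert not P.leq((3, 2), (2, 3))      # Pareto-incomparable
L = lex(W, W)
assert L.leq((3, 9), (2, 0))          # first component decides
```

A usage consequence worth noting: under the direct product, (3, 2) and (2, 3) are incomparable, whereas the lexicographic product totalizes such pairs whenever the first components differ.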

Furthermore, to allow for structure-preserving mappings between PVS, we define a PVS-homomorphism from a PVS \((M, {\cdot _{M}}, \varepsilon _{M}, {\leq _{M}})\) to a PVS \((N, {\cdot _{N}}, \varepsilon _{N}, {\leq _{N}})\) to be given by a mapping \(\varphi : M \to N\) such that \(\varphi (\varepsilon _{M}) = \varepsilon _{N}\), \(\varphi (m \cdot _{M} n) = \varphi (m) \cdot _{N} \varphi (n)\), and \(m \leq _M n \Rightarrow \varphi (m) \leq _N \varphi (n)\) (order-preservation). For a c-semiring, we need an idempotent additive operation \(\oplus \) that is used to induce the ordering \(\leq \) by letting \(m \leq n \Leftrightarrow m \oplus n = n\). Due to generality, we first restrict our attention to PVS-based soft constraints and extend our discussions to c-semirings in Section 3.4. Thus, we define a soft constraint \(\mu \) over a PVS M as a map \(\mu : [X \to D] \to M\), we denote the set of soft constraints by S and write a \(\mathit {SCSP}\) as \((X, D, C, (M, {\cdot }_{M}, \varepsilon _{M}, {\leq _{M}}), S)\) which can be seen as a \(\mathit {COP}\) \((X, D, C, (M, {\leq _{M}}), f)\) where

$$ f(\theta) = {\Pi}_{M} \{ \mu(\theta) \mid \mu \in S \} $$
(1)

using \(\cdot _{M}\) to aggregate solution degrees of all soft constraints evaluated on an assignment.

Example 1

Consider again the rostering problem in Fig. 1 and let \((X, D)\) be as depicted and use \(\mathsf {U} = (\{ c_{1}, c_{2}, c_{3} \}, {\leq _{\mathsf {U}}})\) with \({\leq _{\mathsf {U}}} = \{(c_{2}, c_{1}), (c_{3}, c_{1})\}^{*}\) as a partial order denoting urgency of constraints. For \(C = \{c_{1}, c_{2}, c_{3}\}\) as hard constraints, the solution space is empty. Instead, we can convert each hard constraint \(c_{i}\) into a soft constraint \(\mu _{i}\) by choosing a suitable PVS M. For instance, we could use the PVS \((\mathbb {N}, +, 0, {\geq })\) and interpret each valuation as a penalty incurred for a violated soft constraint. The sum of penalties ought to be minimized. With weights w = [2,1,1], we define \(\mu _{i}(\theta ) = w_{i}\) if \(\theta \not \models c_{i}\) and \(\mu _{i}(\theta ) = 0\) otherwise. Letting \(C = \emptyset \) and S = {μ1, μ2, μ3}, the solution \(\theta = \{ n_{1} \mapsto \mathtt {night}, n_{2} \mapsto \mathtt {night}, n_{3} \mapsto \mathtt {off} \}\) is optimal with \(f(\theta ) = {\sum }_{\mu _{i} \in S} \mu _{i}(\theta ) = 1\). The solution degree 0, being top in M, is not reachable.
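Example 1 can be replayed by brute-force enumeration. Since Fig. 1 is not reproduced here, the concrete readings of c1 to c3 below are hypothetical stand-ins, chosen so that no assignment satisfies all three; with the weights w = [2, 1, 1] they reproduce the optimal penalty of 1 and the unreachable top degree 0:

```python
from itertools import product

SHIFTS = ["day", "night", "off"]
w = {"c1": 2, "c2": 1, "c3": 1}

def violated(theta):
    """Hypothetical constraint readings (not the paper's exact ones):
    c1: at least two nurses on night shift, c2: n1 off-duty, c3: n2 off-duty."""
    n1, n2, n3 = theta
    out = []
    if [n1, n2, n3].count("night") < 2: out.append("c1")
    if n1 != "off": out.append("c2")
    if n2 != "off": out.append("c3")
    return out

def f(theta):
    """Combined valuation in the PVS (N, +, 0, >=): sum of penalties."""
    return sum(w[c] for c in violated(theta))

best = min(f(t) for t in product(SHIFTS, repeat=3))
assert best == 1   # one weight-1 constraint must be violated; 0 is unreachable
```

The minimization over the summed penalties corresponds to maximization in the PVS ordering \(\geq\), where 0 is the (here unreachable) top element.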

3.2 Looking for free partial valuation structures

In Example 1, the choice of \(M = (\mathbb {N}, +, 0, {\geq })\) seems rather obvious, given that the weighting \(\mathbf {w} = [2,1,1]\) is consistent with the intuition that \(c_{1}\) should be weighted higher than \(c_{2}\) and \(c_{3}\). However, interpreting \(\mathbf {w}\) as a function \(w : S \to \mathbb {N}\), we see that w is clearly only a monotone function (not an order-isomorphism). It totalizes \(\mathsf {U}\) by making the incomparable elements \(c_{2}\) and \(c_{3}\) equal. Alternatively, \(\mathbf {w} = [3,1,1]\) would be consistent, as would be \(\mathbf {w} = [3,2,1]\), although our intuition tells us that making \(c_{2}\) more important than \(c_{3}\) certainly is a bad idea. We could try another PVS, say \(M^{\prime } = (2^{S}, {\cup }, \emptyset , {\supseteq })\) to denote solution degrees as sets of violated soft constraints that are combined by union. Then, however, \(c_{1}\), \(c_{2}\), and \(c_{3}\) would each contribute equally to a solution’s quality, contrary to our model marking \(c_{2}\) and \(c_{3}\) as less urgent than \(c_{1}\). Certainly, we want any mapping \(\varphi \) into a PVS N to preserve the given order: \(\varphi (p) \leq _{N} \varphi (q)\) whenever \(p \leq _{\mathsf {U}} q\); otherwise we invert ordering decisions.

Our point is: there are infinitely many PVS that represent \(\mathsf {U}\) to a certain degree—but the essential question is, what are the minimum requirements in terms of comparability that any such PVS has to fulfill? Which PVS is, in this sense, the best, i.e., most general, one?

To find an answer, we consider the more general question of how to “convert” any partial order P into a partial valuation structure. As first presented in [39] and proved in Section 3.3, we can indeed lift any P to \(\mathit {PVS}\langle {P}\rangle \)—i.e., construct a suitable combination operation and neutral element: We take as elements \(\mathcal {M}_{\text {fin}}(P)\), the set of finite multisets composed from elements in P, such as \(\{\!\{c_{1}, c_{1}, c_{2}\}\!\} \in \mathcal {M}_{\text {fin}}(\mathsf {U})\). Two multisets are combined using multiset union, with the empty multiset \(\{\!\{\}\!\}\) being the neutral element. Finally, a compatible ordering (with \(\{\!\{\}\!\}\) being top) is found inductively by applying the Smyth-ordering on sets (see Fig. 1) to multisets (then written as \(\preceq ^{P}\)):

Definition 2 (Smyth-ordering over Multisets)

The Smyth-ordering on \(\mathcal {M}_{\text {fin}}(P)\) is the binary relation \({ \preceq ^{P} } \subseteq \mathcal {M}_{\text {fin}} (P) \times \mathcal {M}_{\text {fin}} (P)\), given by the reflexive-transitive closure of

$$T \uplus \{\!\{p\}\!\} \preceq^{P} T \uplus \{\!\{q\}\!\} \ \text{ if }\ p \leq_{P} q \qquad\qquad T \uplus \{\!\{p\}\!\} \preceq^{P} T $$

Intuitively, when we compare two multisets according to \(\preceq ^{P}\), we have to match every element q on the right side with a dominated element \(p = h(q)\) on the left side such that \(p \leq _{P} q\) and h is injective (see Lemma 1). There may be additional elements on the left.

For any elements \(p, p^{\prime }\) in a partial order P, we have \(\{\!\{p\}\!\} \preceq ^{P} \{\!\{p^{\prime }\}\!\}\) if and only if \(p \leq _{P} p^{\prime }\). Note the monotonicity of the Smyth-ordering with respect to multiset union; if \(T \preceq ^{P} U\), then \(T \uplus V \preceq ^{P} U \uplus V\), since this holds for both defining clauses of the ordering. Antisymmetry is proved in Section 3.3. As an example, we have \(\{\!\{c_{1}\}\!\} \preceq ^{\mathsf {U}} \{\!\{c_{2}\}\!\}\), if we read \(c_{2} \rightarrow c_{1}\) as \(c_{1} <_{\mathsf {U}} c_{2}\). In conclusion, \(\{\!\{c_{1}, c_{3}\}\!\} = \{\!\{c_{1}\}\!\} \uplus \{\!\{c_{3}\}\!\} \preceq ^{\mathsf {U}} \{\!\{c_{2}\}\!\} \uplus \{\!\{c_{3}\}\!\} = \{\!\{c_{2}, c_{3}\}\!\}\).
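The injective-matching characterization can be checked by brute force over all candidate matchings; the following sketch uses the edge reading from the example (c1 below c2 and c3 in U, since c1 is most important to satisfy) and our own function names:

```python
from itertools import permutations

def leq_U(p, q):
    """Partial order U over {c1, c2, c3}: reading the edges c2 -> c1 and
    c3 -> c1 as c1 < c2 and c1 < c3 (assumption taken from the example)."""
    return p == q or (p == "c1" and q in ("c2", "c3"))

def smyth_leq(T, U, leq=leq_U):
    """T is Smyth-below U iff every element q of U can be matched
    injectively to some h(q) in T with h(q) <= q; extra elements may
    remain unmatched in T (making T only worse)."""
    if len(T) < len(U):
        return False
    return any(all(leq(p, q) for p, q in zip(perm, U))
               for perm in permutations(T, len(U)))

# {c1, c2} is worse than {c2, c3}: it violates c1 instead of c3 (cf. Fig. 1).
assert smyth_leq(["c1", "c2"], ["c2", "c3"])
assert not smyth_leq(["c2", "c3"], ["c1", "c2"])
# Additional violations can only harm:
assert smyth_leq(["c1", "c2", "c3"], ["c2", "c3"])
```

This exponential matching check is for illustration only; the MiniBrass library encodes the ordering as constraints rather than enumerating matchings.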

Since \(\mathit {PVS}\langle {P}\rangle \) can be the codomain of any \(\mathit {SCSP}\), soft constraints \(\mu _{i}\) can map to arbitrary multisets in \(\mathcal {M}_{\text {fin}}(P)\). We derive a particularly interesting instance instead (constraint preferences), if we convert a boolean soft constraint \(c_{i}\) into \(\mu _{i}(\theta )\) which maps to \(\{\!\{\}\!\}\) if \(\theta \) satisfies \(c_{i}\) and to \(\{\!\{c_{i}\}\!\}\) otherwise. In the context of constraint preferences, the Smyth-ordering is called single-predecessor-dominance in Section 4 since—everything else being equal—a single predecessor can be dominated by a more important constraint due to the first clause of the ordering.

Figure 2 displays how we can encode a partial order P as either a weighted PVS or using \(\mathit {PVS}\langle {P}\rangle \) by, e.g., representing \(c_{1}\) as \(w(c_{1}) = 1\) or as \(\{\!\{c_{1}\}\!\}\), respectively. Notice that a weighting \(w : P \to \mathbb {N}\) can be “emulated” from \(\mathcal {M}_{\text {fin}}(P)\) by defining its “lifted” version \(w^{\sharp } : \mathcal {M}_{\text {fin}}(P) \to \mathbb {N}\) on the level of multisets, \(w^{\sharp }(T) = {\sum }_{p \in T} w(p)\) (arrow from right to left in the diagram). The converse, however, does not work: Once, e.g., \(c_{2}\) and \(c_{3}\) are both mapped to 1, we cannot “extract” information back about the origin to design a mapping \({\eta }^{\sharp }\) that can map from \(\mathbb {N}\) into \(\mathcal {M}_{\text {fin}}(\mathsf {U})\) since \({\eta }^{\sharp }(1)\) would have to simultaneously be equal to \(\{\!\{c_{2}\}\!\}\) and \(\{\!\{c_{3}\}\!\}\). This tells us (not surprisingly) that \(c_{2}\) and \(c_{3}\) do not necessarily have to be treated as equal elements, i.e., there exist other, more general, partial valuation structures encoding \(\mathsf {U}\) that keep them as distinct elements.
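The lifting of a weighting to multisets can be sketched in a few lines; the weights are those of Example 1 and the point is that the lifted map collapses distinct multisets, so no inverse mapping back into the multisets can exist:

```python
# Lifted weighting w#(T) = sum of w(p) over all elements p of the
# multiset T (multisets modeled as lists; names are ours).
w = {"c1": 2, "c2": 1, "c3": 1}

def w_sharp(T):
    return sum(w[p] for p in T)

assert w_sharp([]) == 0                    # neutral element maps to neutral
assert w_sharp(["c1", "c2"]) == 3
assert w_sharp(["c2"]) == w_sharp(["c3"])  # distinct multisets collapse to 1
```

Since w# is a sum of weights, it is compatible with multiset union (a homomorphism into the weighted PVS), but the last assertion shows the loss of information discussed above.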

Fig. 2

Encoding preferences given by the partial order U as two different PVS: Weighted(U) and PVS⟨U⟩. Highlighted paths show possible improvement steps during optimization. There can be no mapping η♯ since the distinct elements c2 and c3 are unified to 1 in Weighted(U) and would need to be represented by {{c2}} and {{c3}} in PVS⟨U⟩, respectively

It was not a coincidence that we found a lifted mapping \({w}^{\sharp }\) from \(\mathit {PVS}\langle {P}\rangle \) to \(\mathit {Weighted}(P)\) that agrees with applying w directly on P. Instead, \(\mathit {PVS}\langle {P}\rangle \) has the universal mapping property [39]: Any order-preserving function \(\varphi \) from P into the underlying partial order of a PVS can be decomposed into the form \({\varphi }^{\sharp } \circ \eta \) in a unique way. Thus, \(\mathit {PVS}\langle {P}\rangle \) is also called the “free PVS over P”. Practically, this means that we can always safely convert P into the free PVS before mapping to another PVS (e.g., if we need an integer objective for our implementation, see Section 4.3) without losing any information. Conversely, \(\mathit {Weighted}(P)\) is not free, as we cannot return to \(\mathit {PVS}\langle {P}\rangle \) once P is mapped to \(\mathit {Weighted}(P)\). Since free objects are unique up to isomorphism [54, p. 147], \(\mathit {PVS}\langle {P}\rangle \) can be seen as the most general PVS over a partial order. We prove this fact in Lemma 2 in Section 3.3.

Our original question, “how to formulate an ordering over constraints as a PVS with the least overhead”, thus boils down to the search for a free construction. Similar instances are the free monoid or the free group over a set. We can capture this task formally using the language of category theory (hinted at in Fig. 2), which studies, inter alia, algebraic structures along with their structure-preserving mappings. This perspective further enables us to treat the transformation from a partial order into a PVS and that from a PVS into a c-semiring uniformly. The subsequent sections hence draw on basic knowledge of category theory when they derive the free PVS and the free c-semiring, respectively. In Appendix A, we introduce the categorical concepts relevant to free constructions with the well-known free monoid over a set. Readers familiar with basic category theory may safely skip the appendix and readers familiar with term algebras may check the categorical presentation. As category theory has not been used extensively in constraint programming (except for [23]), we refer to excellent introductory material, e.g., [7, 8, 47].

As a very brief introduction to follow the proof obligations of the subsequent sections, we note that a category \(\mathcal {C}\) refers to a collection of so-called objects and morphisms that generalize functions. For instance, the category \(\text {Set}\) has conventional sets as objects and functions as morphisms, whereas the category \(\text {Mon}\) has monoids as objects and monoid-homomorphisms as morphisms. A functor F between categories \(\mathcal {C}\) and \(\mathcal {D}\) is a mapping that sends every \(\mathcal {C}\)-object A to a \(\mathcal {D}\)-object \(F(A)\) and every \(\mathcal {C}\)-morphism \(\varphi : A \to B\) to a \(\mathcal {D}\)-morphism \(F(\varphi ) : F(A) \to F(B)\). For example, for every set A there is an associated monoid \((A^{*}, {::}, \varepsilon )\) with words over A and concatenation. We can use this to define a functor F : Set →Mon by \(F(A) = (A^{*}, {::}, \varepsilon )\) and \(F(f : A \to B) : A^{*} \to B^{*}\) with \(F(f)([a_{1}, \ldots , a_{n}]) = [f(a_{1}), \ldots , f(a_{n})]\), such that, in particular, F(f)(w1 :: w2) = F(f)(w1) :: F(f)(w2). Conversely, there is the underlying functor \(|{{-}}| : \text {Mon} \to \text {Set}\) with \(|{(A, {\cdot }, \varepsilon )}| = A\) and \(|{\varphi : (A_{1}, {\cdot _{1}}, {\varepsilon _{1}}) \to (A_{2}, {\cdot _{2}}, {\varepsilon _{2}})}| = \varphi : A_{1} \to A_{2}\) yielding the underlying set of a monoid.

This operator \(|{-}|\) is a convention present in category-theoretical arguments. It allows us to distinguish structures from their underlying sets and must not be confused with set cardinality. We follow this convention in the remainder of the paper and, e.g., will write a partial order as \(P = (|{P}|, \leq _{P})\).

Using the above notions, we can now formally state what a free object is:

Definition 3 (Free object)

Given two categories \(\mathcal {A}\) and \(\mathcal {B}\) and a functor \(G : \mathcal {B} \to \mathcal {A}\), the free object \(F(A)\) in \(\mathcal {B}\) over an object A of \(\mathcal {A}\) is characterized by a unit morphism \(\eta _{A} : A \to G(F(A))\) in \(\mathcal {A}\) such that for every \(\mathcal {A}\)-morphism \(\varphi : A \to G(B)\) with B an object of \(\mathcal {B}\), there is a unique lifting \(\mathcal {B}\)-morphism \({\varphi }^{\sharp } : F(A) \to B\) satisfying \(G({\varphi }^{\sharp }) \circ \eta _{A} = \varphi \).

A free object \(F(A)\) is unique up to isomorphism and the composition of two free constructions yields another free construction [54, Ch. 3]. Incidentally, the monoid \((A^{*}, \cdot , \varepsilon )\) is the free monoid over a set A (see Appendix A). A free object does not have to exist. We need to prove a particular free construction (e.g., free monoid or free PVS) by choosing the appropriate categories, functors, and, of course, the free object itself.
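To make Definition 3 concrete, the free monoid case can be sketched in a few lines of Python; the target monoid (integers under addition) and the function `phi` are illustrative choices, not part of the formal development:

```python
from functools import reduce

def eta(a):
    """Unit morphism: embed an element of the set A as a one-letter word."""
    return [a]

def lift(phi, op, unit):
    """Unique lifting of phi : A -> |M| to a monoid homomorphism A* -> M."""
    return lambda word: reduce(op, (phi(a) for a in word), unit)

# Target monoid M = (int, +, 0); phi maps each letter to its length.
phi = len
phi_sharp = lift(phi, lambda m, n: m + n, 0)

# The triangle commutes: phi#(eta(a)) == phi(a) ...
assert phi_sharp(eta("abc")) == phi("abc")
# ... and phi# is a monoid homomorphism: concatenation maps to addition.
w1, w2 = ["a", "bb"], ["ccc"]
assert phi_sharp(w1 + w2) == phi_sharp(w1) + phi_sharp(w2)
```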

3.3 The free partial valuation structure over a partial order

Motivated by the goal of finding the most general PVS to encode constraint preferences, the search for the free PVS over a partial order P answers a more fundamental problem:


Which ordering decisions always have to hold if we extend any partial order with a combination operation (multiplication) and neutral top element?

More formally, this is the case if we have several soft constraints \(\mu _{1}, \ldots , \mu _{n}\) that each grade an assignment \(\theta \) in the same partial order P and we take a product \(\mu _{1}(\theta ) \cdot {\ldots } \cdot \mu _{n}(\theta )\). Which ≼-relations must certainly hold if we compare \(\mu _{1}(\theta ) \cdot {\ldots } \cdot \mu _{n}(\theta )\) with \(\mu _{1}(\theta ^{\prime }) \cdot {\ldots } \cdot \mu _{n}(\theta ^{\prime })\)? How shall we even represent these products?

A seemingly obvious choice would be to collect all soft constraints’ valuations as a set, i.e., \(\{p_{1},\ldots ,p_{n}\}\). Each \(p \in P\) could then individually be represented by the unit morphism \(\eta (p) = \{p\}\) and then combined using set union. Since \(\emptyset \) should be top in a PVS, we aim to order the sets by size and according to P. That means, we want \(X \preceq \emptyset \) for any set X and \(\eta (p_{1}) = \{p_{1}\} \preceq \{ p_{2} \} = \eta (p_{2})\) if \(p_{1} \leq _{P} p_{2}\). Both cases are covered by the Smyth-ordering over sets (cf. Section 3.2). However, that approach does not yield a proper PVS if we consider that we can multiply elements \(\{p_{1}\}\) with themselves: Assuming \(p_{1} \leq _{P} p_{2}\), also \(\{p_{1}\} \preceq \{p_{2}\}\) holds. Combining with \(\{p_{1}\}\) on both sides yields \(\{p_{1}\} \preceq \{p_{1}, p_{2}\}\), by monotonicity. But, by the definition of the Smyth-ordering, we also have \(\{p_{1}, p_{2}\} \preceq \{p_{1}\}\) and thus antisymmetry is violated.
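The antisymmetry failure just described is easy to reproduce mechanically. A sketch, assuming the set-based Smyth preorder "every element of U has an inferior in T" and an illustrative two-element chain p1 ≤ p2:

```python
def leq(p, q):
    """Order on the illustrative chain P = {p1, p2} with p1 <= p2."""
    return p == q or (p, q) == ("p1", "p2")

def smyth(T, U):
    """Set-based Smyth preorder: T is dominated by U iff every element
    of U has an 'inferior' in T."""
    return all(any(leq(t, u) for t in T) for u in U)

T, U = {"p1"}, {"p1", "p2"}
# Both directions hold although T != U, so antisymmetry fails over sets.
assert smyth(T, U) and smyth(U, T) and T != U
# The empty set is top: every set is dominated by it.
assert smyth(T, set())
```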

It turns out that the idempotency of set union is the culprit, in particular the fact that \(\eta (p_{1}) \cup \eta (p_{1}) = \eta (p_{1})\). This fact is not required by the PVS axioms. Instead, commutativity and associativity provide a hint about the underlying set of the free PVS: The free monoid over a set A uses \(A^{*}\), finite lists over A, embedded by \(\eta ^{\prime }(a) = [a]\) and combined with concatenation \({::}\) since we only need associativity: \(\eta ^{\prime }(a) :: (\eta ^{\prime }(b) :: \eta ^{\prime }(c)) = (\eta ^{\prime }(a) :: \eta ^{\prime }(b) ) :: \eta ^{\prime }(c) = [a,b,c]\) (see Appendix A). For the free PVS, we additionally need to equate \(\eta (a) \cup \eta (b)\) with \(\eta (b) \cup \eta (a)\), but again, not necessarily \(\eta (a) \cup \eta (a)\) with \(\eta (a)\). This is precisely what we achieve with \(\mathcal {M}_{\text {fin}}(P)\), finite multisets over P combined by multiset union \(\uplus \). Taking plain sets over P would additionally assume idempotency and is thus too specific.Footnote 4

Figure 3 instantiates Definition 3 for the task of proving that \(\mathit {PVS}\langle {P}\rangle \) is indeed the free PVS over a partial order P. We start in the category \(\text {PO}\) of partial orders as objects and monotone functions as morphisms and map to \(\text {PVS}\), the category of partial valuation structures as objects and PVS-homomorphisms as morphisms.

Fig. 3

Diagram of the free PVS over a partial order. For an arbitrary PVS M that we map into from a partial order P using φ, we can lift this mapping to φ♯ : PVS⟨P⟩ → M such that φ = PO(φ♯) ∘ ηP. Consequently, PVS⟨P⟩ only identifies and orders elements as absolutely required by the PVS axioms—it is most general

To switch between partial orders and partial valuation structures we need appropriate functors. First, the (free) functor \(\mathit {PVS}\langle {-}\rangle : \text {PO} \to \text {PVS}\):

$$\begin{array}{@{}rcl@{}} &&\mathit{PVS}\langle{P}\rangle = (\mathcal{M}_{\text{fin}}(|{P}|), {\uplus}, \emptyset, {\preceq^{P}}) \text{,} \\ &&\mathit{PVS}\langle{\varphi : P \to Q}\rangle = \lambda \{\!\{ p_{1}, \dots, p_{k} \}\!\} \in \mathcal{M}_{\text{fin}}(|{P}|) \,.\, \{\!\{ \varphi(p_{1}), \dots, \varphi(p_{k}) \}\!\} \text{.} \end{array} $$

In the other direction, the (forgetful) functor \(\mathit {PO} : \text {PVS} \to \text {PO}\) is defined by

$$\begin{array}{@{}rcl@{}} \mathit{PO}(M) = (|{M}|, {\leq_{M}}) \text{,} \\ \mathit{PO}(\varphi : M \to N) = \varphi \text{.} \end{array} $$

Starting from a partial order P, commutativity and associativity motivate the underlying set \(\mathcal {M}_{\text {fin}}(P)\). We can also justify each rule of the Smyth-ordering over multisets by applying Definition 3. First, as each \(p \in |{P}|\) is embedded as \(\eta _{P}(p) = \{\!\{p\}\!\}\) and \(\eta _{P}\) is a monotone function, we have that \(\{\!\{p\}\!\} \preceq ^{P} \{\!\{p^{\prime }\}\!\}\) whenever \(p \leq _{P} p^{\prime }\). This ensures that the ordering of P is preserved over the embedded counterparts. The other rule stems from the fact that the neutral element is the top of the ordering in a PVS—which is the most prevalent choice in soft constraints [43]. This implies \(m \cdot _{M} n \leq _{M} m\) since \(n \leq _{M} \varepsilon _{M} \Rightarrow m \cdot n \leq _{M} m \cdot \varepsilon _{M} = m\), by monotonicity. Consequently, for the free PVS, \(T \uplus U \preceq ^{P} U\) needs to hold, as does \(T \preceq ^{P} \emptyset \), both of which are represented by the above rule. Dually, we would have \(U \preceq ^{P} T \uplus U\), had we defined the neutral element to be the bottom of the ordering.

Next, we have to confirm that \(\mathit {PVS}\langle {P}\rangle \) is a partial valuation structure, to begin with. Associativity and commutativity of \(\uplus \) and neutrality of \(\emptyset \) with respect to \(\mathcal {M}_{\text {fin}}(P)\) are obvious; we have already discussed reflexivity and transitivity of \(\preceq ^{P}\) as well as monotonicity of \(\uplus \) with respect to \(\preceq ^{P}\) in Section 3.2. To show antisymmetry of \(\preceq ^{P}\), we prove a result that also turns out to be useful later on when we implement the Smyth-ordering as a MiniZinc predicate to be used in search. To do so, we introduce a bit of notation to “unfold” a multiset T into a set representation \(\mathcal {S}(T)\). Formally, for a multiset T containing exactly the distinct elements \(x_{1}, \dots , x_{n}\) (i.e., \(x_{i} \neq x_{j}\) if \(i \neq j\)) with multiplicities \(l_{1}, \dots , l_{n} > 0\), let \(\mathcal {S}(T) = \bigcup _{1 \leq i \leq n} \{ (j, x_{i}) \mid 1 \leq j \leq l_{i} \}\).

Lemma 1 (Witness for \(\preceq ^{P}\))

\(T \preceq ^{P} U\) if, and only if, there is an injective map \(h : \mathcal {S}(U) \to \mathcal {S}(T)\) (called a witness function) with \(p \leq _{P} q\) if \(h(j, q) = (k, p)\), for all \((j, q) \in \mathcal {S}(U)\).

Proof

Let first \(T \preceq ^{P} U\) hold. We restrict our attention w.l.o.g. to the case \(T \neq U\) as otherwise the claim trivially holds. Then there is a sequence of multisets \(T_{1}, \ldots , T_{n} \in \mathcal {M}_{\text {fin}}(P)\) with \(n > 1\) such that \(T_{1} = T\), \(T_{n} = U\), and for each \(1 \leq i < n\), either \(T_{i} = T_{i+1} \uplus V\) for some \(V \in \mathcal {M}_{\text {fin}}(P)\), or \(T_{i} = T_{i}^{\prime } \uplus \{\!\{p\}\!\}\) and \(T_{i+1} = T_{i}^{\prime } \uplus \{\!\{q\}\!\}\) with \(p \leq _{P} q\). As required in the claim, for each \(1 \leq i < n\) there is a witness \(h_{i} : \mathcal {S}(T_{i + 1}) \to \mathcal {S}(T_{i})\) as follows: If \(T_{i} = T_{i+1} \uplus V\), then we choose \(h_{i} = \text {id}_{{\mathcal {S}(T_{i+1})}}\). If \(T_{i} = T_{i}^{\prime } \uplus \{\!\{p\}\!\}\) and \(T_{i+1} = T_{i}^{\prime } \uplus \{\!\{q\}\!\}\) with \(p \leq _{P} q\), then we choose \(h_{i} = \text {id}_{{\mathcal {S}(T_{i}^{\prime })}} \cup \{ (k, q) \mapsto (j, p) \}\) where \(j = \max \{ l \mid (l, p) \in \mathcal {S}(T_{i}^{\prime }) \} + 1\) and \(k = \max \{ l \mid (l, q) \in \mathcal {S}(T_{i}^{\prime }) \} + 1\). Then \(h_{1} {\circ } {\ldots } {\circ } h_{n-1} : \mathcal {S}(U) \to \mathcal {S}(T)\) is a witness function.

For the converse, we prove that if \(h : \mathcal {S}(U) \to \mathcal {S}(T)\) is a witness function, then \(T \preceq ^{P} U\), by induction on the cardinality of \(\mathcal {S}(U)\). Let \(h : \mathcal {S}(U) \to \mathcal {S}(T)\) be given. If \(|\mathcal {S}(U)| = 0\), then \(U = \emptyset \) and \(T \preceq ^{P} \emptyset = U\). Now let \(|\mathcal {S}(U)| > 0\) and let \((j, q) \in \mathcal {S}(U)\) such that j is maximal. Then \(h(j, q) = (k, p)\) with \(p \leq _{P} q\). Let \(T^{\prime }, U^{\prime } \in \mathcal {M}_{\text {fin}}(P)\) be defined by \(T = T^{\prime } \uplus \{\!\{p\}\!\}\) and \(U = U^{\prime } \uplus \{\!\{q\}\!\}\). To construct a witness function between \(T^{\prime }\) and \(U^{\prime }\) and apply the induction hypothesis, we define \(g : \mathcal {S}(T) \to \mathcal {S}(T^{\prime }) \) by \(g(l, r) = (l, r)\) if \(r \neq p\) or \(l < k\), and \(g(l, p) = (l-1, p)\) if \(l > k\). Essentially, g closes possible “gaps” in the image of h. Then \(\mathcal {S}(U^{\prime }) = \mathcal {S}(U) \setminus \{ (j, q) \}\) and \(h^{\prime } : \mathcal {S}(U^{\prime }) \to \mathcal {S}(T^{\prime })\) defined as \(h^{\prime } = g \circ h\) is a witness function between \(T^{\prime }\) and \(U^{\prime }\). By the induction hypothesis, hence, \(T^{\prime } \preceq ^{P} U^{\prime }\) and thus, by the monotonicity of \({\preceq ^{P}}\) (see Section 3.2), \(T = T^{\prime } \uplus \{\!\{p\}\!\} \preceq ^{P} U^{\prime } \uplus \{\!\{p\}\!\} \preceq ^{P} U^{\prime } \uplus \{\!\{q\}\!\} = U\). □

The witness function can be interpreted as assigning an “inferior” to every element on the right-hand side. To see the antisymmetry of the Smyth-ordering,Footnote 5 assume for a contradiction that there are T and U with both \(T \preceq ^{P} U\) and \(U \preceq ^{P} T\), but \(T \neq U\), and choose one such T with minimal cardinality. Then T has to be non-empty. Let \(f : \mathcal {S}(U) \to \mathcal {S}(T)\) and \(g : \mathcal {S}(T) \to \mathcal {S}(U)\) be witnessing maps for \(T \preceq ^{P} U\) and \(U \preceq ^{P} T\), respectively. Choose an element \((j, q) \in \mathcal {S}(U)\) such that q is minimal w.r.t. \(\leq _{P}\) in U. Then there is an inferior \((k,p) = f(j, q)\) in \(\mathcal {S}(T)\) with \(p \leq _{P} q\). If \(p \neq q\), as \(U \preceq ^{P} T\) holds as well, there would be yet another inferior \(g(k,p) = (j^{\prime }, q^{\prime }) \in \mathcal {S}(U)\) such that \(q^{\prime } \leq _{P} p\) and thus \(q^{\prime } \leq _{P} p <_{P} q\), contradicting the minimality of q in U; thus \(f(j, q) = (k, q)\). Assume, without loss of generality, that j and k are maximal. Remove the corresponding occurrence of q from T and U, obtaining \(T^{\prime }\) and \(U^{\prime }\), respectively. Then \(T^{\prime } \preceq ^{P} U^{\prime }\) and \(U^{\prime } \preceq ^{P} T^{\prime }\) hold as well, since the restricted functions \(f^{\prime } : \mathcal {S}(U^{\prime }) \to \mathcal {S}(T^{\prime })\) with \(f^{\prime }(l,r) = f(l,r)\) and, similarly, \(g^{\prime } : \mathcal {S}(T^{\prime }) \to \mathcal {S}(U^{\prime })\), are witnessing maps. This contradicts the assumed minimality of T. Thus, \(\mathit {PVS}\langle {P}\rangle \) fulfills all axioms of a partial valuation structure and we are ready to show that it is indeed free.
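Lemma 1 also yields a naive decision procedure for \(\preceq ^{P}\): enumerate candidate injective witnesses. The sketch below is a brute-force illustration over `Counter`-represented multisets and an illustrative two-element chain; the MiniZinc predicate mentioned above encodes the same criterion declaratively:

```python
from collections import Counter
from itertools import permutations

def leq(p, q):
    """Illustrative partial order with p1 <= p2."""
    return p == q or (p, q) == ("p1", "p2")

def unfold(T):
    """S(T): one indexed occurrence (j, x) per copy of x in the multiset T."""
    return [(j, x) for x, k in T.items() for j in range(1, k + 1)]

def dominated(T, U):
    """T smyth-leq U iff an injective witness h : S(U) -> S(T) exists with
    h(j, q) = (k, p) implying p <=_P q. Exponential; for illustration only."""
    S_T, S_U = unfold(T), unfold(U)
    for image in permutations(S_T, len(S_U)):
        if all(leq(p, q) for (k, p), (j, q) in zip(image, S_U)):
            return True
    return False

assert dominated(Counter({"p1": 2}), Counter())       # the empty multiset is top
assert dominated(Counter(["p1"]), Counter(["p2"]))    # p1 <= p2 lifts to singletons
# Antisymmetry holds over multisets, unlike over plain sets:
assert not dominated(Counter(["p1"]), Counter(["p1", "p1"]))
assert dominated(Counter(["p1", "p1"]), Counter(["p1"]))
```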

Lemma 2 (Free PVS)

\(\mathit {PVS}\langle {P}\rangle \) is the free PVS over the partial order P.

Proof

Let P be a partial order \((|{P}|, \leq _{P})\) and \(\varphi : P \to \mathit {PO}(M)\) be a monotone function (\(\text {PO}\)-morphism) into the underlying partial order of any PVS M. To show the existence of a lifted variant of \(\varphi \), we define \({\varphi }^{\sharp } : \mathit {PVS}\langle {P}\rangle \to M\) as a \(\text {PVS}\)-morphism by

\({\varphi }^{\sharp }(\{\!\{p_{1}, \dots , p_{k}\}\!\}) = \varphi (p_{1}) \cdot _{M} {\ldots } \cdot _{M} \varphi (p_{k})\) for all \(\{\!\{p_{1}, \dots , p_{k}\}\!\} \in \mathcal {M}_{\text {fin}}(P)\), where \({\varphi }^{\sharp }(\emptyset ) = \varepsilon _{M}\). This is well-defined, i.e., \({\varphi }^{\sharp }\) is indeed a PVS-homomorphism, since \({\varphi }^{\sharp }(T \uplus U) = {\varphi }^{\sharp }(T) \cdot _{M} {\varphi }^{\sharp }(U)\) and \({\varphi }^{\sharp }(\emptyset ) = \varepsilon _{M}\), and, if \(T \leq _{\mathit {PVS}\langle {P}\rangle } U\), then \({\varphi }^{\sharp }(T) \leq _{M} {\varphi }^{\sharp }(U)\): We consider the generating cases for one step of the Smyth-ordering as it is straightforward to consider the extension to sequences \(T_{1}, \ldots , T_{n}\) as done in the proof of the witness function. Assume \(T \leq _{\mathit {PVS}\langle {P}\rangle } U\). Either \(T = T^{\prime } \uplus \{\!\{p\}\!\}\) and \(U = T^{\prime } \uplus \{\!\{q\}\!\}\) with \(p \leq _{P} q\). Then \({\varphi }^{\sharp }(T) = \varphi (p) \cdot _{M} {\varphi }^{\sharp }(T^{\prime })\) and \({\varphi }^{\sharp }(U) = \varphi (q) \cdot _{M} {\varphi }^{\sharp }(T^{\prime })\). And since \(\varphi (p) \leq _{M} \varphi (q)\) due to \(\varphi \) being a \(\text {PO}\)-morphism, we have \({\varphi }^{\sharp }(T) \leq _{M} {\varphi }^{\sharp }(U)\), by monotonicity of \(\cdot _{M}\). Or, it is the case that \(T = T^{\prime } \uplus U\) (\(T^{\prime }\) may be empty) and thus \({\varphi }^{\sharp }(T) = {\varphi }^{\sharp }(T^{\prime }) \cdot _{M} {\varphi }^{\sharp }(U) \leq _{M} {\varphi }^{\sharp }(U)\), by the PVS axiom \(m \cdot n \leq _{M} m\) (and commutativity). Consequently, \({\varphi }^{\sharp }\) is a PVS-homomorphism.

Moreover, \(\varphi = \mathit {PO}({\varphi }^{\sharp }) \circ \eta _{P}\) with \({\varphi }^{\sharp }(\eta _{P}(p)) = {\varphi }^{\sharp }(\{\!\{p\}\!\}) = \varphi (p)\) for all \(p \in |{P}|\); hence the diagram in Fig. 3 commutes.

Finally, \({\varphi }^{\sharp }\) is unique with this property: Assume there were another PVS-homomorphism \(\psi : \mathit {PVS}\langle {P}\rangle \to M\) that satisfies \(\mathit {PO}(\psi ) \circ \eta _{P} = \varphi \). Due to this requirement, we have \(\psi (\{\!\{p\}\!\}) = \varphi (p)\) for every \(p \in |{P}|\). Thus, for \(\psi \), we have \(\psi (\emptyset ) = \varepsilon _{M}\) and

$$\psi(\{\!\{p_{1}, \dots, p_{k}\}\!\}) = \psi(\{\!\{p_{1}\}\!\}) \cdot_{M} {\ldots} \cdot_{M} \psi(\{\!\{p_{k}\}\!\}) = \varphi(p_{1}) \cdot_{M} {\ldots} \cdot_{M} \varphi(p_{k}) = {\varphi}^{\sharp}(\{\!\{p_{1}, \dots, p_{k}\}\!\}) $$

since \(\psi \) is a PVS-homomorphism and by the previous remark. Hence \({\varphi }^{\sharp } = \psi \), as claimed, and \(\mathit {PVS}\langle {P}\rangle \) is indeed the free partial valuation structure over a partial order. □

This concludes our theoretical considerations of the free partial valuation structure. An example of an \(\mathit {SCSP}\) employing the free PVS is depicted in Fig. 5 in Section 3.5. It actually uses soft constraints that directly map to the free PVS (i.e., to \(\mathcal {M}_{\text {fin}}(P)\) rather than to P). For constraint preferences, we only distinguish between \(\emptyset \) and \(\{\!\{c_{i}\}\!\}\) for each soft constraint \(c_{i}\). Both the free PVS and a specialized type for constraint preferences are available in MiniBrass (cf. Section 4.2.2).

3.4 The free c-semiring over a partial valuation structure

As mentioned before, (partial) valuation structures are not the only abstract algebraic framework for soft constraints in the literature. C-semirings constitute a particularly popular choice. They are purely algebraic, requiring a second “additive” operation instead of a partial ordering to form an (upper semi-)lattice. This idempotent, commutative, and associative operation is then used to induce a partial ordering. Moreover, any c-semiring is bounded above and below by two designated constants. We will proceed to show that every c-semiring gives rise to a bounded PVS and, conversely, that every PVS can be extended to a c-semiring by means of another free construction—although not every PVS is a c-semiring, since the additive operation in fact returns a supremum which need not exist in a PVS.

This section therefore extends previous work that examined the similarities between c-semirings and (totally ordered bounded) valuation structures [13]. The authors identified a valuation structure with every totally ordered c-semiring only. For branch-and-bound and similar search algorithms, a partial ordering indeed suffices (see Section 4 or [35, 43]). The main algorithmic advantage of having a second algebraic operation instead of the partial ordering lies in the thereby guaranteed existence of a supremum. This least upper bound can be used for non-serial dynamic programming, i.e., variable elimination. These algorithms may, however, return an unreachable optimal solution degree (e.g., the supremum of all reachable optima). From a practical perspective, this free construction of a c-semiring from a PVS alleviates the need to model in c-semirings in the first place. If a fruitful algorithmic technique for c-semirings (relying on the addition) is discovered, it can also be applied to a PVS when raised to the free c-semiring. We sketch such an application in Section 3.5 but first actually derive the free c-semiring over a PVS.

Formally, a c-semiring [12] \(A = (|{A}|, {\oplus _{A}}, {\otimes _{A}}, \mathbf {0}_{A}, \mathbf {1}_{A})\) is given by an (underlying) set \(|{A}|\), two binary operations \({\oplus _{A}}, {\otimes _{A}} : |{A}| \times |{A}| \to |{A}|\), and two constants \(\mathbf {0}_{A}, \mathbf {1}_{A} \in |{A}|\) such that the following axioms are satisfied:

  • \(\oplus _{A}\) is associative and commutative and has \(\mathbf {1}_{A}\) as annihilator and \(\mathbf {0}_{A}\) as neutral element

  • \(\otimes _{A}\) is associative and commutative, has \(\mathbf {0}_{A}\) as annihilator and \(\mathbf {1}_{A}\) as neutral element

  • \(\otimes _{A}\) distributes over \(\oplus _{A}\)

To preserve this structure, a c-semiring homomorphism \(\varphi : A \to B\) from a c-semiring A to a c-semiring B is given by a map \(\varphi : |{A}| \to |{B}|\) such that for all \(a_{1}, a_{2} \in |{A}|\):

  1. \(\varphi (a_{1} \oplus _{A} a_{2}) = \varphi (a_{1}) \oplus _{B} \varphi (a_{2})\), \(\varphi (a_{1} \otimes _{A} a_{2}) = \varphi (a_{1}) \otimes _{B} \varphi (a_{2})\)

  2. \(\varphi (\mathbf {0}_{A}) = \mathbf {0}_{B}\), \(\varphi (\mathbf {1}_{A}) = \mathbf {1}_{B}\)

Consequently, the category \(\text {cSRng}\) of c-semirings has the c-semirings as objects and the c-semiring homomorphisms as morphisms. Note that in a c-semiring A the operation ⊕A is idempotent:

$$a \oplus_{A} a = (a \otimes_{A} \mathbf{1}_{A}) \oplus_{A} (a \otimes_{A} \mathbf{1}_{A}) = a \otimes_{A} (\mathbf{1}_{A} \oplus_{A} \mathbf{1}_{A}) = a \otimes_{A} \mathbf{1}_{A} = a \text{.} $$

Hence, \(\oplus _{A}\) can be used to induce a partial ordering \(\leq _{A}\) by interpreting it as the least upper bound: \(a \leq _A b \Leftrightarrow a \oplus _A b = b\). Clearly, ≤A is reflexive due to the idempotency, transitive due to associativity, and antisymmetric due to commutativity of \(\oplus _{A}\). With this definition, for all a, b, c ∈|A| it holds that

  1. \(\mathbf {0}_{A} \leq _{A} a \leq _{A} \mathbf {1}_{A}\);

  2. \(a \leq _{A} a \oplus _{A} b\) and \(b \leq _{A} a \oplus _{A} b\);

  3. if \(a \leq _{A} c\) and \(b \leq _{A} c\), then \(a \oplus _{A} b \leq _{A} c\).

In particular, \(a \oplus _{A} b\) is the supremum of a and b with respect to \(\leq _{A}\), and \(\oplus _{A}\) is monotone w.r.t. \(\leq _{A}\) in both arguments, i.e., \(a \leq _{A} a^{\prime }\) and \(b \leq _{A} b^{\prime }\) implies \(a \oplus _{A} b \leq _{A} a^{\prime } \oplus _{A} b^{\prime }\). Additionally, the combination operation \(\otimes _{A}\) is monotone w.r.t. the induced ordering \(\leq _{A}\): if \(a \leq _{A} a^{\prime }\) (i.e., \(a \oplus _{A} a^{\prime } = a^{\prime }\)), then \((a \otimes _{A} b) \oplus _{A} (a^{\prime } \otimes _{A} b) = (a \oplus _{A} a^{\prime }) \otimes _{A} b = a^{\prime } \otimes _{A} b\), i.e., \(a \otimes _{A} b \leq _{A} a^{\prime } \otimes _{A} b\), from which it follows that \(a \leq _{A} a^{\prime }\) and \(b \leq _{A} b^{\prime }\) implies \(a \otimes _{A} b \leq _{A} a^{\prime } \otimes _{A} b^{\prime }\). Furthermore, for all \(a, b \in |{A}|\), it holds that \(a \otimes _{A} b \leq _{A} a\) and \(a \otimes _{A} b \leq _{A} b\), since \((a \otimes _{A} b) \oplus _{A} a = (a \otimes _{A} b) \oplus _{A} (a \otimes _{A} \mathbf {1}_{A}) = a \otimes _{A} (b \oplus _{A} \mathbf {1}_{A}) = a \otimes _{A} \mathbf {1}_{A} = a\).
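These induced-order properties can be checked concretely in a familiar instance, the weighted c-semiring \((\mathbb {N} \cup \{\infty \}, \min , +, \infty , 0)\); the sketch below is only a numeric illustration of the identities above:

```python
INF = float("inf")  # plays the role of 0_A; 1_A is the number 0

def oplus(a, b):
    """Additive operation: minimum of penalties."""
    return min(a, b)

def otimes(a, b):
    """Combination: sum of penalties."""
    return a + b

def leq(a, b):
    """Induced ordering: a <=_A b iff a (+) b == b, i.e. b is the
    numerically smaller (better) penalty."""
    return oplus(a, b) == b

# 0_A = INF is the bottom and 1_A = 0 the top of the induced ordering.
assert leq(INF, 5) and leq(5, 0)
# (+) yields the supremum, and (x) is decreasing: a (x) b <=_A a.
assert leq(3, oplus(3, 2)) and leq(otimes(3, 2), 3)
```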

As a consequence, we can easily convert any c-semiring into a PVS by defining the functor \(\mathit {PVS} : \text {cSRng} \to \text {PVS}\):

$$\begin{array}{@{}rcl@{}} &&\mathit{PVS}(A) = (|{A}|, {\otimes_{A}}, \mathbf{1}_{A}, {\leq_{A}}) \text{,} \\ &&\mathit{PVS}(\varphi : A \to B) = \varphi \text{.} \end{array} $$

Note that \(\mathit {PVS}(A)\) is a bounded PVS with \(\bot _{\mathit {PVS}(A)} = \mathbf {0}_{A}\). This leaves us with the first part of a free construction between the categories \(\text {PVS}\) and \(\text {cSRng}\) (cf. Definition 3). The opposite direction, constructing a c-semiring starting from a PVS, is not as obvious, since the partial order of a PVS need not exhibit the suprema required for the \(\oplus \) operator (they clearly exist in total orders, making the conversion from totally ordered c-semirings to valuation structures more straightforward [13]). For instance, in Fig. 1, we encountered two upper bounds of a pair of valuations that are themselves incomparable.

When allowing partiality, we can always find an “artificial” supremum by collecting all (incomparable) valuations in a set and ordering these sets appropriately. Consider an arbitrary PVS \(M = (|{M}|, \cdot _{M}, \varepsilon _{M}, \leq _{M})\). We write \(\mathcal {I}_{\text {fin}}(M)\) to denote the set of finite sets composed of incomparable elements from \(|{M}|\) (i.e., if \(X \in \mathcal {I}_{\text {fin}}(M)\), then for any \(x \neq y \in X\) we have \(x {\Vert _{M}} y\)) and \(\text {Max}^{{\leq _{M}}}(X)\) to denote the maximal elements of X with respect to \(\leq _{M}\). For instance, if \(|{M}| = \{1, 2, \mathrm {III}, \mathrm {IV}\}\) with \(1 <_{M} 2\) and \(\mathrm {III} <_{M} \mathrm {IV}\), the sets \(\{2, \mathrm {IV}\}\) or \(\{1, \mathrm {III}\}\) are in \(\mathcal {I}_{\text {fin}}(M)\) but \(\{1, \mathrm {III}, \mathrm {IV}\}\) is not, and \(\text {Max}^{\leq _{M}}(|{M}|) = \{2, \mathrm {IV}\}\). We define the binary operations \({\tilde {\cup }_{M}}\) and \({\tilde {\cdot }_{M}}\) over \(\mathcal {I}_{\text {fin}}(M)\) by

$$\begin{array}{@{}rcl@{}} &&I {\tilde{\cup}_{M}} J = \text{Max}^{\leq_{M}} (I \cup J) \text{,}\\ &&I {\tilde{\cdot}_{M}} J = \text{Max}^{\leq_{M}} \{ m \cdot_{M} n \mid m \in I, n \in J \} \text{.} \end{array} $$

Clearly, \({\tilde {\cup }_{M}}\) inherits commutativity from ∪, and is idempotent since \(\text {Max}^{\leq _{M}}(I) = I\) for any set I consisting of already incomparable elements. It is easy to check that it is also associative. Further, \(\{\varepsilon _{M}\}\) is an annihilator for \({\tilde {\cup }_{M}}\) since \(\varepsilon _{M}\) is the greatest element of \(|{M}|\) with respect to ≤M, and \(\emptyset \) is its neutral element.

Similarly, \({\tilde {\cdot }_{M}}\) is obviously commutative since \(\cdot _{M}\) is commutative. Dually to \({\tilde {\cup }_{M}}\), it has \(\{\varepsilon _{M}\}\) as neutral element (since \(\varepsilon _{M}\) is neutral in M) and \(\emptyset \) as annihilator. For the associativity of \({\tilde {\cdot }_{M}}\), we have

$$\begin{array}{@{}rcl@{}} I {\tilde{\cdot}_{M}} (J {\tilde{\cdot}_{M}} K) &=& \text{Max}^{\leq_{M}} \{ m_{I} \cdot_{M} m_{JK} \mid m_{I} \in I,\, m_{JK} \in \text{Max}^{\leq_{M}}\{ m_{J} \cdot_{M} m_{K} \mid m_{J} \in J,\, m_{K} \in K \} \} \\ &=& \text{Max}^{\leq_{M}} \{ m_{I} \cdot_{M} m_{J} \cdot_{M} m_{K} \mid m_{I} \in I,\, m_{J} \in J,\, m_{K} \in K \} \\ &=& \text{Max}^{\leq_{M}} \{ m_{IJ} \cdot_{M} m_{K} \mid m_{IJ} \in \text{Max}^{\leq_{M}}\{ m_{I} \cdot_{M} m_{J} \mid m_{I} \in I,\, m_{J} \in J \},\, m_{K} \in K \} \\ &=& (I {\tilde{\cdot}_{M}} J) {\tilde{\cdot}_{M}} K \text{,} \end{array} $$

since \(\text {Max}^{\leq _{M}}\{ m \cdot _{M} n \mid m \in I, n \in \text {Max}^{\leq _{M}}(X) \} = \text {Max}^{\leq _{M}}\{ m \cdot _{M} n \mid m \in I, n \in X \}\) for all finite sets \(X \subseteq |{M}|\). Finally, \({\tilde {\cdot }_{M}}\) distributes over \({\tilde {\cup }_{M}}\):

$$\begin{array}{@{}rcl@{}} I {\tilde{\cdot}_{M}} (J {\tilde{\cup}_{M}} K) &=& \text{Max}^{\leq_{M}} \{ m_{I} \cdot_{M} m_{JK} \mid m_{I} \in I,\, m_{JK} \in \text{Max}^{\leq_{M}}(J \cup K) \} \\ &=& \text{Max}^{\leq_{M}} \{ m_{I} \cdot_{M} m_{JK} \mid m_{I} \in I,\, m_{JK} \in J \cup K \} \\ &=& \text{Max}^{\leq_{M}}(\{ m_{I} \cdot_{M} m_{J} \mid m_{I} \in I,\, m_{J} \in J \} \cup \{ m_{I} \cdot_{M} m_{K} \mid m_{I} \in I,\, m_{K} \in K \}) \\ &=& \text{Max}^{\leq_{M}}(\text{Max}^{\leq_{M}} \{ m_{I} \cdot_{M} m_{J} \mid m_{I} \in I,\, m_{J} \in J \} \cup \text{Max}^{\leq_{M}} \{ m_{I} \cdot_{M} m_{K} \mid m_{I} \in I,\, m_{K} \in K \}) \\ &=& (I {\tilde{\cdot}_{M}} J) {\tilde{\cup}_{M}} (I {\tilde{\cdot}_{M}} K) \text{,} \end{array} $$

since \(\text {Max}^{\leq _{M}}(I \cup \text {Max}^{\leq _{M}}(X)) = \text {Max}^{\leq _{M}}(I \cup X)\) for all finite \(X \subseteq |{M}|\). Thus, we conclude

Lemma 3

\((\mathcal {I}_{\text {fin}}(M), {\tilde {\cup }_{M}}, {\tilde {\cdot }_{M}}, \emptyset , \{ \varepsilon _{M} \})\) is a c-semiring.

This structure will serve to define the object part of a free functor from \(\text {PVS}\) to \(\text {cSRng}\).
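As an illustration of Lemma 3 (not part of the formal development), the operations \(\tilde {\cup }_{M}\) and \(\tilde {\cdot }_{M}\) can be sketched for the product of two weighted PVS, where antichains are genuinely non-trivial:

```python
def leq(m, n):
    """Componentwise ordering of the product PVS N x N with (0, 0) as top:
    smaller penalty vectors are better."""
    return m[0] >= n[0] and m[1] >= n[1]

def mult(m, n):
    """PVS combination: componentwise addition of penalty vectors."""
    return (m[0] + n[0], m[1] + n[1])

def maxima(X):
    """Max^{<=_M}(X): drop elements strictly dominated by another element."""
    return frozenset(m for m in X
                     if not any(leq(m, n) and m != n for n in X))

def union_tilde(I, J):
    """The artificial supremum (cup)~: maxima of the plain union."""
    return maxima(I | J)

def mult_tilde(I, J):
    """The lifted combination (cdot)~: maxima of all pairwise products."""
    return maxima({mult(m, n) for m in I for n in J})

I = frozenset({(1, 0), (0, 1)})   # an antichain of incomparable penalties
J = frozenset({(2, 0)})
K = frozenset({(0, 2)})
# (cdot)~ distributes over (cup)~, as in the derivation above.
assert mult_tilde(I, union_tilde(J, K)) == union_tilde(mult_tilde(I, J), mult_tilde(I, K))
# {eps_M} = {(0, 0)} is neutral for (cdot)~ and annihilates (cup)~.
assert mult_tilde(I, {(0, 0)}) == I and union_tilde(I, {(0, 0)}) == frozenset({(0, 0)})
```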

At this point, it is worth noting that a similar construction of c-semiring addition and multiplication operations has been introduced by Rollón [50], although starting from a given c-semiring instead of a PVS. She proves that when A is a c-semiring, its so-called frontier algebra \(\mathcal {A} = (\mathcal {I}(A) \setminus \{ \emptyset \}, {\tilde {\oplus }_{A}}, {\tilde {\otimes }_{A}}, \{ \mathbf {0}_{A} \}, \{ \mathbf {1}_{A} \})\) again is a c-semiring, where \(\mathcal {I}(A)\) are (possibly infinite) subsets of \(|{A}|\) containing only pairwise incomparable elements w.r.t. \(\leq _{A}\), and,

$$\begin{array}{@{}rcl@{}} I {\tilde{\oplus}_{A}} J &=& \text{Max}^{\leq_{A}} (I \cup J) \text{,}\\ I {\tilde{\otimes}_{A}} J &=& \text{Max}^{\leq_{A}} \{ i \otimes_{A} j \mid i \in I, j \in J \} \end{array} $$

for all \(I, J \in \mathcal {I}(A) \setminus \{ \emptyset \}\). The underlying set of the frontier algebra thus contains sets of arbitrary cardinality, not only finite sets as in our approach of the free construction. In fact, such infinite sets would correspond to “junk elements” (cf. Appendix A), i.e., they would be unnecessary to have in the carrier set of the free c-semiring since we only have the finitary combination and supremum operation.

In [50], the condition that only non-empty sets have to be considered is missing. The empty set has to be excluded, however, since otherwise \(\emptyset \tilde {\otimes }_{A} \{ \mathbf {0}_{A} \} = \emptyset \), although \(\{ \mathbf {0}_{A} \}\) has to be the annihilator for \(\tilde {\otimes }_{A}\), and \(\emptyset {\tilde {\oplus }_{A}} \{ \mathbf {0}_{A} \} = \{\mathbf {0}_{A} \}\), i.e., \(\emptyset \leq _{\mathcal {A}} \{ \mathbf {0}_{A} \}\) contradicting that \(\{ \mathbf {0}_{A}\}\) has to be the smallest element w.r.t. \(\leq _{\mathcal {A}}\). By contrast, in our approach, we have to consider \(\emptyset \) as well in order to obtain a “fresh” bottom element of the free c-semiring over an arbitrary PVS. If we only applied the construction of a free c-semiring to the sub-category of bounded PVS, we also could exclude \(\emptyset \) and would obtain \(\{ \bot _{M} \}\) as bottom element of the free c-semiring over the bounded PVS M. However, the free PVS over a partial order—our original mission—clearly is not bounded.

To verify that we can design the morphism part of a free functor, it is useful to convince ourselves that the application of the maximum operator in a target structure subsumes the maximum operator in a source structure.

Lemma 4 (Subsumption of maximum)

Let \(\varphi : M \to N\) be a PVS homomorphism. For finite sets \(X \subseteq |{M}|\) , we have \(\text {Max}^{\leq _{N}}(\varphi (\text {Max}^{\leq _{M}}(X))) = \text {Max}^{\leq _{N}}(\varphi (X))\) .

Proof

First, \(\text {Max}^{\leq _{N}}(\varphi (\text {Max}^{\leq _{M}}(X))) \subseteq \text {Max}^{\leq _{N}}(\varphi (X))\), since \(\text {Max}^{\leq _{M}}(X) \subseteq X\) holds which in turn implies \(\varphi (\text {Max}^{\leq _{M}}(X)) \subseteq \varphi (X)\).

To conversely show \(\text {Max}^{\leq _{N}}(\varphi (X)) \subseteq \text {Max}^{\leq _{N}}(\varphi (\text {Max}^{\leq _{M}}(X)))\), it suffices to show that for each \(n \in \varphi (X)\) there is a (weakly dominating) \(n^{\prime } \in \varphi (\text {Max}^{\leq _{M}}(X))\) such that \(n \leq _{N} n^{\prime }\): If \(n \in \varphi (X)\) then \(n = \varphi (m)\) for some \(m \in X\). Either m is maximal, in which case n is obviously in \(\varphi (\text {Max}^{\leq _{M}}(X))\) as well. Otherwise, there is an \(m^{\prime } \in \text {Max}^{\leq _{M}}(X)\) with \(m \leq _{M} m^{\prime }\), hence \(n = \varphi (m) \leq _{N} \varphi (m^{\prime })\), and \(\varphi (m^{\prime }) \in \varphi (\text {Max}^{\leq _{M}}(X))\). □

Finally, we define the functor \(\mathit{cSRng}\langle{-}\rangle : \text{PVS} \to \text{cSRng}\) as

$$\begin{array}{@{}rcl@{}} &&\mathit{cSRng}\langle{M}\rangle = (\mathcal{I}_{\text{fin}}(M), {{\tilde{\cup}_{M}}}, {{\tilde{\cdot}_{M}}}, \emptyset, \{ \varepsilon_{M} \}) \text{,} \\ &&\mathit{cSRng}\langle{\varphi : M \to N}\rangle = \lambda \{ m_{1}, \dots, m_{k} \} \in \mathcal{I}_{\text{fin}}(M) \,.\, \text{Max}^{\leq_{N}} \{ \varphi(m_{1}), \dots, \varphi(m_{k}) \} \text{.} \end{array} $$

We need to check (using Lemma 4) that \(\mathit {cSRng}\langle {\varphi : M \to N}\rangle \) is indeed a c-semiring homomorphism from \(\mathit {cSRng}\langle {M}\rangle \) to \(\mathit {cSRng}\langle {N}\rangle \) for the functor to be well-defined:

$$\begin{array}{@{}rcl@{}} &&\mathit{cSRng}\langle{\varphi}\rangle(\emptyset) = \emptyset \text{,} \quad \mathit{cSRng}\langle{\varphi}\rangle(\{ \varepsilon_{M} \}) = \{ \varphi(\varepsilon_{M}) \} = \{ \varepsilon_{N} \} \text{,}\\ &&\mathit{cSRng}\langle{\varphi}\rangle(I_{1} {\tilde{\cup}_{M}} I_{2}) = \mathit{cSRng}\langle{\varphi}\rangle(\text{Max}^{\leq_{M}}(I_{1} \cup I_{2})) \\ &=&\text{Max}^{\leq_{N}}(\varphi(\text{Max}^{\leq_{M}}(I_{1} \cup I_{2}))) = \text{Max}^{\leq_{N}}(\varphi(I_{1} \cup I_{2})) = \text{Max}^{\leq_{N}}(\varphi(I_{1}) \cup \varphi(I_{2}))\\ &=&\mathit{cSRng}\langle{\varphi}\rangle(I_{1}) {\tilde{\cup}_{N}} \mathit{cSRng}\langle{\varphi}\rangle(I_{2}) \text{,}\\ &&\mathit{cSRng}\langle{\varphi}\rangle(I_{1} {\tilde{\cdot}_{M}} I_{2}) = \mathit{cSRng}\langle{\varphi}\rangle(\text{Max}^{\leq_{M}} \{ m_{1} \cdot_{M} m_{2} \mid m_{1} \in I_{1}, m_{2} \in I_{2} \}) {}\\ &=&\text{Max}^{\leq_{N}}(\varphi(\text{Max}^{\leq_{M}} \{ m_{1} \cdot_{M} m_{2} \mid m_{1} \in I_{1}, m_{2} \in I_{2} \})) \\ &=& \text{Max}^{\leq_{N}} \{ \varphi(m_{1} \cdot_{M} m_{2}) \mid m_{1} \in I_{1}, m_{2} \in I_{2} \} \\ &=&\text{Max}^{\leq_{N}} \{ \varphi(m_{1}) \cdot_{N} \varphi(m_{2}) \mid m_{1} \in I_{1}, m_{2} \in I_{2} \} {}\\ &=&\text{Max}^{\leq_{N}} \{ n_{1} \cdot_{N} n_{2} \mid n_{1} \in \varphi(I_{1}), n_{2} \in \varphi(I_{2}) \} {}\\ &=&\mathit{cSRng}\langle{\varphi}\rangle(I_{1}) {\tilde{\cdot}_{N}} \mathit{cSRng}\langle{\varphi}\rangle(I_{2}) \text{.} \end{array} $$

With these functors from \(\text{PVS}\) to \(\text{cSRng}\) and vice versa defined, we are ready to apply Definition 3 to the problem of finding the free c-semiring over a PVS, as depicted in Fig. 4. As unit morphism, we define \(\eta_{M} : M \to \mathit{PVS}(\mathit{cSRng}\langle{M}\rangle)\) for every PVS M by \(\eta_{M}(m) = \{ m \}\). Now let M be some PVS, A a c-semiring, and \(\varphi : M \to \mathit{PVS}(A)\) be a PVS-homomorphism. Again, we seek a lifting \({\varphi}^{\sharp}\) that “emulates” (and extends) the PVS-homomorphism \(\varphi\) at the c-semiring level, i.e., makes the diagram in Fig. 4 commute by asserting that \(\mathit{PVS}({\varphi}^{\sharp}) \circ \eta_{M} = \varphi\). We define \({\varphi}^{\sharp} : \mathit{cSRng}\langle{M}\rangle \to A\) as a function \({\varphi}^{\sharp} : \mathcal{I}_{\text{fin}}(M) \to |{A}|\) and need to show that it is a c-semiring homomorphism:

$${\varphi}^{\sharp}(\{ m_{1}, \dots, m_{n} \}) = \varphi(m_{1}) \oplus_{A} {\cdots} \oplus_{A} \varphi(m_{n}) $$

for all \(\{ m_{1}, \dots, m_{n} \} \in \mathcal{I}_{\text{fin}}(M)\), where, if \(n = 0\), \(\emptyset\) is mapped to \(\mathbf{0}_{A}\). \({\varphi}^{\sharp}\) is indeed a c-semiring homomorphism, since for the constants, \({\varphi}^{\sharp}(\mathbf{0}_{\mathit{cSRng}\langle{M}\rangle}) = {\varphi}^{\sharp}(\emptyset) = \mathbf{0}_{A}\) and \({\varphi}^{\sharp}(\mathbf{1}_{\mathit{cSRng}\langle{M}\rangle}) = {\varphi}^{\sharp}(\{ \varepsilon_{M} \}) = \varphi(\varepsilon_{M}) = \varepsilon_{\mathit{PVS}(A)} = \mathbf{1}_{A}\). To show that \({\varphi}^{\sharp}\) preserves the operations \({\tilde{\cdot}_{M}}\) and \({\tilde{\cup}_{M}}\), we first note that for each finite set \(\{ m_{1}, \dots, m_{n} \} \subseteq |{M}|\) (not necessarily composed of incomparable elements) it holds that \({\varphi}^{\sharp}(\text{Max}^{\leq_{M}} \{ m_{1}, \ldots, m_{n} \}) = \varphi(m_{1}) \oplus_{A} {\ldots} \oplus_{A} \varphi(m_{n})\): if some dominating \(m_{i} \leq_{M} m_{j}\) exists, then \(\varphi(m_{i}) \leq_{\mathit{PVS}(A)} \varphi(m_{j})\) (since \(\varphi\) is a PVS-homomorphism), hence \(\varphi(m_{i}) \oplus_{A} \varphi(m_{j}) = \varphi(m_{j})\). We can thus “remove” each occurrence of the dominated \(m_{i}\) in \(\varphi(m_{1}) \oplus_{A} {\ldots} \oplus_{A} \varphi(m_{n})\) since its dominator \(m_{j}\) is included in that term. Therefore,

$$\begin{array}{@{}rcl@{}} &&{\varphi}^{\sharp}(\{m_{1}, \ldots, m_{k}\} {\tilde{\cup}_{M}} \{m_{k + 1}, \ldots, m_{n}\}) = {\varphi}^{\sharp}(\text{Max}^{\leq_{M}} \{m_{1}, \ldots, m_{n}\}) \\ &=& \varphi(m_{1}) \oplus_{A} \ldots \oplus_{A} \varphi(m_{n})\\ &=& (\varphi(m_{1}) \oplus_{A} {\ldots} \oplus_{A} \varphi(m_{k})) \oplus_{A} (\varphi(m_{k + 1}) \oplus_{A} {\ldots} \oplus_{A} \varphi(m_{n})) \\ &=&{\varphi}^{\sharp}(\{m_{1}, \ldots, m_{k}\}) \oplus_{A} {\varphi}^{\sharp}(\{m_{k + 1}, \ldots, m_{n}\}) \text{.} \end{array} $$
Fig. 4
figure 4

Diagram of the free c-semiring over a PVS. As with previous free constructions, \(\mathit{cSRng}\langle{M}\rangle\) only identifies and orders elements as absolutely required by the c-semiring axioms—it is again most general

Similarly, for two sets \(I, J \in \mathcal {I}_{\text {fin}}(M)\)

$$\begin{array}{@{}rcl@{}} {\varphi}^{\sharp}(I {\tilde{\cdot}_{M}} J) &=& {\varphi}^{\sharp}(\text{Max}^{\leq_{M}} \{ m_{1} \cdot_{M} m_{2} \mid m_{1} \in I, m_{2} \in J \})\\ &=& \textstyle{\bigoplus}_{A} \{ \varphi(m_{1} \cdot_{M} m_{2}) \mid m_{1} \in I, m_{2} \in J \}\overset{\text{PVS hom.}}{=}\\ && \textstyle{\bigoplus}_{A} \{ \varphi(m_{1}) \cdot_{\mathit{PVS}(A)} \varphi(m_{2}) \mid m_{1} \in I, m_{2} \in J \} \\ &=&\textstyle{\bigoplus}_{A} \{ \varphi(m_{1}) \otimes_{A} \varphi(m_{2}) \mid m_{1} \in I, m_{2} \in J \} \overset{\text{distr.}}{=}\\ &&\textstyle{\bigoplus}_{A} \{ \varphi(m_{1}) \mid m_{1} \in I \} \otimes_{A} {\bigoplus}_{A} \{ \varphi(m_{2}) \mid m_{2} \in J \} = {\varphi}^{\sharp}(I) \otimes_{A} {\varphi}^{\sharp}(J) \text{.} \end{array} $$

Thus, \({\varphi}^{\sharp}\) is a c-semiring homomorphism and additionally, \(\mathit{PVS}({\varphi}^{\sharp})(\eta_{M}(m)) = \varphi(m)\), i.e., the diagram in Fig. 4 commutes, and \({\varphi}^{\sharp}\) is unique with this property (the proof is analogous to that of Lemma 2). We may thus conclude:

Lemma 5

\(\mathit{cSRng}\langle{M}\rangle\) is the free c-semiring over the partial valuation structure M.

From the fact that the composition of two free constructions is a free construction itself [54, Ch. 3], we further know:

Corollary 1

\(\mathit{cSRng}\langle{\mathit{PVS}\langle{P}\rangle}\rangle\) is the free c-semiring over the partial order P.

Therefore, we abbreviate \(\mathit{cSRng}\langle{\mathit{PVS}\langle{P}\rangle}\rangle\) as \(\mathit{cSRng}\langle{P}\rangle\) and obtain a generic, canonical way to embed any partial order P into a c-semiring. More explicitly, this c-semiring has finite sets of incomparable (w.r.t. the Smyth-ordering) multisets composed of elements from \(|{P}|\) as its elements. If, e.g., P is a partial order over {I, II, 1, 2}, then sets of pairwise Smyth-incomparable multisets over these elements are in \(\mathcal{I}_{\text{fin}}(\mathcal{M}_{\text{fin}}(P))\), whereas a set containing a Smyth-dominated multiset is not (cf. Fig. 5 for a similar ordering). In the following section, we revisit this free c-semiring to illustrate the application of \(\oplus\) and the required distributivity with respect to possible solving algorithms.
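To make the shape of these elements concrete, the following Python sketch models multisets as tuples, decides the Smyth-ordering by brute-force search for an injective witness, and realizes \(\oplus\) as Max of the union. The partial order used here (a below both b and c, with b and c incomparable) is a small hypothetical stand-in, not the ordering of Fig. 5.

```python
from itertools import permutations

# Hypothetical partial order P over {'a', 'b', 'c'}: a <= b and a <= c,
# with b and c incomparable.
order = {('a', 'a'), ('b', 'b'), ('c', 'c'), ('a', 'b'), ('a', 'c')}
leq_P = lambda p, q: (p, q) in order

def smyth_leq(T, U):
    """T <= U in the Smyth-ordering: some injective witness maps every
    occurrence u in U to a distinct occurrence t in T with t <=_P u."""
    return len(U) <= len(T) and any(
        all(leq_P(t, u) for t, u in zip(sel, U))
        for sel in permutations(T, len(U)))

def maximal(msets):
    """An element of the free c-semiring is a set of pairwise
    Smyth-incomparable multisets; Max discards dominated ones."""
    return {T for T in msets
            if not any(smyth_leq(T, U) and T != U for U in msets)}

def oplus(I, J):
    """Free c-semiring addition: Max of the union."""
    return maximal(I | J)

# {{a, a}} is Smyth-below {{b}} (b's occurrence maps to one a), so it
# is discarded; {{b}} and {{c}} are incomparable and both survive.
assert oplus({('a', 'a')}, {('b',)}) == {('b',)}
assert oplus({('b',)}, {('c',)}) == {('b',), ('c',)}
```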

Fig. 5
figure 5

The upper part illustrates the rating system \(\mathsf{R}\), its free partial valuation structure \(\mathit{PVS}\langle{\mathsf{R}}\rangle\) and the free c-semiring \(\mathit{cSRng}\langle{\mathsf{R}}\rangle = \mathit{cSRng}\langle{\mathit{PVS}\langle{\mathsf{R}}\rangle}\rangle\). Highlighted elements are introduced by axioms of the respective algebraic structure. The center part presents a SCSP with variables {x,y,z}, domain {0,1} and five (unary or binary) soft constraints that map assignments to \(|{\mathsf{R}}|\). The lower part finds the optimal solution degrees of the SCSP by applying bucket elimination on \(\mathit{cSRng}\langle{\mathsf{R}}\rangle\) with the elimination order 〈x,y,z〉. Note that valuations of soft constraints μ are embedded into \(\mathit{cSRng}\langle{\mathsf{R}}\rangle\) according to (2)

3.5 Adequacy of algebraic structures for soft constraints

The original goal of algebraic abstractions of specific soft constraint formalisms was to provide a common theoretical ground for questions of computational complexity and, perhaps more intensely studied, efficient solving algorithms. The latter include search strategies, dynamic programming techniques, and constraint propagation.

In terms of model expressiveness, we seek a fairly general structure that captures a broad variety of formalisms. In terms of algorithmic efficiency, however, we are inclined to sacrifice generality for additional structure that makes search and propagation more effective. Most algorithmic efforts can roughly be divided into:

  • Classical search algorithms such as branch-and-bound, limited discrepancy search, or large neighborhood search [61] with accompanying search heuristics and efficient bounding techniques such as Russian doll search or mini bucket elimination [43].

  • Soft local consistency and soft global constraints to enhance a search scheme [18, 19]

  • Dynamic programming algorithms (variable elimination, bucket elimination, cluster tree elimination) [10, 11, 21]

Originally, valued constraints and c-semiring-based soft constraints generalized weighted constraints and fuzzy constraints, respectively. While c-semirings additionally allowed for partiality to better represent incomparable decisions, valued constraints put a total ordering first instead of an operator for the supremum.Footnote 6 Totality is beneficial for solving as it reduces search to more well-known scalar optimization tasks with a unique optimal solution degree and allows for more efficient pruning. Soft local consistency techniques with non-idempotent combination operators further require so-called “fair” valuation structures that admit a difference operator \(a \ominus b\)—which is not mandatory for a PVS.

Similarly, the supremum \(\oplus\) presupposed by a c-semiring is put to use in non-serial dynamic programming such as bucket elimination whenever we perform a “projection” operation. Projection means finding the best extension (with a greater variable scope) of a given assignment. If we are dealing with a PVS without a supremum (such as the free PVS), these algorithms are not directly applicable. As a remedy, however, we can still use this family of algorithms if we put the free c-semiring in place instead. Example 2 demonstrates this procedure for bucket elimination. This algorithm proceeds by picking a variable elimination order, leading to “buckets” for each variable x which collect all soft constraints \(\mu\) that have x as the next (not yet eliminated) variable in their scope. Intermediate soft constraints \(\nu\) are generated by taking the union of all variables in a bucket, then calculating the intermediate results (i.e., the combination over all soft constraint valuations in the bucket) for each assignment in the Cartesian product of the domains, and projecting out x (see [21]). One can check that each elimination step is an application of the distributivity law (see Lemma 3). All known limitations regarding time and space which prohibit widespread usage in practice, of course, remain [43].
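The elimination step just described can be sketched in a few lines of Python. The instance below is hypothetical and uses the weighted c-semiring (\(\oplus = \min\), \(\otimes = +\)) rather than the free one, purely to keep the sketch short; the bucket/projection structure is the same for any c-semiring.

```python
# Sketch of bucket elimination over a c-semiring; here the weighted
# c-semiring (oplus = min, otimes = +). Variables, domains, and the
# soft constraints are hypothetical examples.
oplus = min
otimes = lambda a, b: a + b

domains = {'x': [0, 1], 'y': [0, 1], 'z': [0, 1]}
constraints = [                       # (scope, valuation function)
    (('x',), lambda x: 0 if x == 1 else 3),
    (('x', 'y'), lambda x, y: 0 if x != y else 2),
    (('y', 'z'), lambda y, z: 0 if y <= z else 1),
]

def eliminate(var, constraints):
    """One elimination step: combine the bucket of all constraints that
    mention var, then project var out by oplus-ing over its domain."""
    bucket = [c for c in constraints if var in c[0]]
    rest = [c for c in constraints if var not in c[0]]
    scope = tuple(sorted({v for s, _ in bucket for v in s if v != var}))

    def nu(*vals):                    # the intermediate soft constraint
        env = dict(zip(scope, vals))
        best = None
        for d in domains[var]:
            env[var] = d
            total = 0                 # the otimes-identity (cost 0)
            for s, f in bucket:
                total = otimes(total, f(*(env[v] for v in s)))
            best = total if best is None else oplus(best, total)
        return best

    return rest + [(scope, nu)]

cs = constraints
for v in ['x', 'y', 'z']:             # the elimination order
    cs = eliminate(v, cs)
optimum = cs[-1][1]()                 # the remaining nullary constraint
assert optimum == 0                   # reached, e.g., by x=1, y=0, z=0
```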

Example 2

Consider a decision that is made based on some abstract “rating system” \(\mathsf{R}\) (as can be seen in Fig. 5), inspired by, e.g., two executives that make an independent choice, denoted by \(\{1,2\}\) and {I, II}, respectively, where a higher number means a better evaluation. Any “two”, however, is better than any “one”. There is an explicit top element \(\top\) representing maximal satisfaction. There is no unique least upper bound for 1 and I, though. We assume that soft constraints are specified by a map from variable assignments to elements of \(|{\mathsf{R}}|\), as presented in the figure. To consider combinations of individual soft constraint valuations, i.e., to have a proper \(\mathit{SCSP}\), we use the free partial valuation structure \(\mathit{PVS}\langle{\mathsf{R}}\rangle\) to obtain a multiplication. We represent every element r other than \(\top\) as the singleton multiset containing r and let \(\top\) map to the neutral element, the empty multiset. Note how the resulting partial order \(\mathit{PO}(\mathit{PVS}\langle{\mathsf{R}}\rangle)\) over \(\mathcal{M}_{\text{fin}}(|{\mathsf{R}}|)\) is not suprema-closed (center in Fig. 5). To still be able to apply bucket elimination, we embed \(\mathit{PVS}\langle{\mathsf{R}}\rangle\) into its associated free c-semiring \(\mathit{cSRng}\langle{\mathit{PVS}\langle{\mathsf{R}}\rangle}\rangle = \mathit{cSRng}\langle{\mathsf{R}}\rangle\). Consequently, we embed any soft constraint \(\mu\) mapping to \(|{\mathsf{R}}|\) into \(\mathit{cSRng}\langle{\mathsf{R}}\rangle\) as follows:

We then invoke bucket elimination to obtain the optimal solution degree (see [21] for a similar illustration). The algorithm terminates with a solution degree consisting of three incomparable multisets that is clearly not reachable by any individual assignment. However, each of the three components (i.e., each multiset over \(|{\mathsf{R}}|\)) corresponds to one assignment. By appropriate bookkeeping during the elimination process, we find that \(\theta_{1} = \{ \mathrm{x} \mapsto 0, \mathrm{y} \mapsto 0, \mathrm{z} \mapsto 0\}\), \(\theta_{2} = \{ \mathrm{x} \mapsto 0, \mathrm{y} \mapsto 1, \mathrm{z} \mapsto 0\}\), and \(\theta_{3} = \{ \mathrm{x} \mapsto 1, \mathrm{y} \mapsto 1, \mathrm{z} \mapsto 0\}\) map to the respective optimal solution degrees and are thus optimal solutions. The free c-semiring provides enough information for said bookkeeping—another c-semiring returning a supremum of all solution degrees need not do this, in general.

Note that, in fact, we get the set of all PVS-optima as the unique optimal solution degree in the free c-semiring. Clearly, however, enforcing totality or a supremum for the sole sake of better algorithms might counteract a modeler’s intentions: some (in reality incomparable) solutions would then be dominated by others. If we do not rely on explicit soft constraint operations but rather formulate the problem as a conventional constraint optimization problem that is solved by search and propagation (as in branch-and-bound or large neighborhood search), the structure a PVS offers suffices—which makes PVS the appropriate data structure for designing MiniBrass.

4 Implementation

Our considerations up to now have been mostly abstract and mathematical, concerning proper constructions of PVS and their relationship to c-semirings. We now turn to the design of MiniBrassFootnote 7 as an extension of MiniZinc and to how it incorporates existing formalisms from the literature. Indeed, the argumentation in Section 3.5 motivates that PVS are the adequate algebraic structure to encode soft constraint formalisms, such that the theoretical constructions substantiate the MiniBrass language. MiniZinc, on the other hand, offers a well-balanced compromise between expressive power (high-level concepts such as functions and predicates) and broad support by a variety of solvers, including propagation engines such as Gecode or JaCoP but also other paradigms such as MIP or SAT solvers. To smooth the transition between conventional constraint models and soft constraint models, MiniBrass follows many MiniZinc conventions, such as having a “solve” item, independence of statement order, and the notation in general.

MiniBrass is a file-based soft constraint modeling language that revolves around the concept of partial valuation structures. A model (resp., instance) is divided into a (hard) constraint model (see Listing 1) written in conventional MiniZinc, consisting of variable definitions and classical constraints, and a preference model (see Listing 2) which contains PVS type declarations along with instantiations, soft constraint definitions based on the variables in the constraint model, and combinations (Pareto and lexicographic) of instances. MiniBrass separates essential constraints of a problem from its objective because:

  • Existing soft constraint formalisms in the literature (weighted, fuzzy, constraint preferences, …) are available for a preference model using the respective PVS types.

  • Preferences can be elicited and specified using PVS type \(\mathcal {A}\) (perhaps having a non-trivial (multi)set-based order such as the free PVS) which is then transformed to another PVS type \(\mathcal {B}\) that is better supported by existing solvers (cost function networks/integer objectives) using morphisms (see Section 4.3).

  • By exploiting modularity, users can combine several preference structures (perhaps stemming from different agents) at runtime (Pareto or lexicographic).

  • Multiple preference models for the same hard constraint model can co-exist and be selected at runtime depending on other context factors.

Listing 1
figure bn

hello-world.mzn: The conventional constraint model contains all variable definitions and hard constraints. It includes the compiled MiniBrass output (hello-world_o.mzn) which contains generated variables, linking constraints, search procedures relevant to MiniSearch (miniBrass, defined in minibrass.mzn), and the (optional) search annotation pvsSearchHeuristic

Listing 2
figure bo

hello-world.mbr: A preference model of the problem in Fig. 1 with a PVS type for a free PVS and one PVS-instance that also serves as the “solve”-item analogous to MiniZinc. The MiniZinc function embed converts boolean expressions to multisets, i.e., embed(bexpr, idx, len) yields the empty multiset (represented by an array of length len containing only zeros) if bexpr holds and the singleton multiset containing idx (represented by an array of length len with a one at index idx and zeros otherwise) if bexpr does not hold. Note how an edge (p2, p1) hereby indicates that \(p_2 \leq_P p_1\)

Conceptually, the main idea of how to encode a soft constraint problem as a conventional constraint optimization problem has been outlined in [43] after being first described in [46]: For a soft constraint problem \(\mathit{SCSP} = (X, D, C, M, S)\) with PVS \(M = (|{M}|, \cdot_{M}, \varepsilon_{M}, \leq_{M})\), we define the classical constraint model \((X, D, C)\) as usual and for every soft constraint \(\mu_{i} \in S\), we have a constraint along the lines of “\(\mathtt{valuation}[i] = \mathtt{mznExpr}_{i}(X)\)” where \(\mathtt{valuation}\) is an array of variables of type \(|{M}|\) and \(\mathtt{mznExpr}_{i}(X)\) stands for the MiniZinc expression of soft constraint \(\mu_{i}\), based on variables X. Additionally, there is an \(|{M}|\)-variable \(\mathtt{overall}\) holding the overall valuation which is constrained such that “\(\mathtt{overall} = \mathtt{valuation}[1] \cdot_{M} {\dots} \cdot_{M} \mathtt{valuation}[\mathtt{nScs}]\)” where \(\mathtt{nScs}\) refers to the number of soft constraints. The partial ordering \(\leq_{M}\) is used to generate constraints on future solutions such as “\(\mathtt{overall} <_{M} \mathtt{overall}^{\prime}\)” to ask for the next solution \(\mathtt{overall}^{\prime}\) to be better than the current one \(\mathtt{overall}\). Branch-and-bound (see Section 4.5) is based on this predicate.
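To make the scheme concrete, here is a small Python emulation (not MiniZinc; the soft constraints and the weighted-cost PVS are hypothetical examples): one valuation per soft constraint, an overall valuation combined with \(\cdot_{M}\), and the strict-improvement check that branch-and-bound posts after each solution.

```python
# Hypothetical weighted PVS: combination is addition, epsilon is 0,
# and a cost is "better" when it is smaller.
combine = lambda a, b: a + b
strictly_better = lambda a, b: a < b

def evaluate(theta, soft_constraints):
    # valuation[i] plays the role of the generated valuation variables
    valuation = [mu(theta) for mu in soft_constraints]
    overall = 0                       # the neutral element epsilon_M
    for v in valuation:               # overall = valuation[1] . ... . valuation[n]
        overall = combine(overall, v)
    return overall

def branch_and_bound(assignments, soft_constraints):
    """Enumerate assignments, keeping only strict improvements - the
    solver analogue posts a "next solution must be better" constraint."""
    best = None
    for theta in assignments:
        overall = evaluate(theta, soft_constraints)
        if best is None or strictly_better(overall, best):
            best = overall
    return best

assignments = [{'x': 0, 'y': 0}, {'x': 0, 'y': 1}, {'x': 1, 'y': 1}]
soft = [lambda t: 0 if t['x'] == 1 else 2,   # prefer x = 1
        lambda t: 0 if t['y'] == 1 else 1]   # prefer y = 1
assert branch_and_bound(assignments, soft) == 0
```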

Listings 1 and 2 exemplify these ideas using the introductory toy rostering problem covered in Fig. 1. The problem is specified with constraint preferences which are mapped to the free PVS as discussed in Section 3.3. Each soft constraint \(\mu_{i}\) maps to the singleton multiset containing its index i if violated and to the empty multiset otherwise (implemented using a function embed in Listing 2). The corresponding PVS-type FreePVS (see Listing 3) implements multisets of elements from an underlying partial order as solution degrees, multiset union as the combination operation, and the Smyth-ordering as MiniZinc functions and predicates in the file free-pvs-type.mzn. Said partial order P is passed by the parameters maxP (denoting the highest index) and orderingP (the ordering relation as a list of edges) during instantiation.

Multisets are not natively supported by MiniZinc but need to be encoded for solvers. In the free PVS, the set \(\mathcal{M}_{\text{fin}}(P)\) is clearly infinite as we can reach any finite multiset over P by applying combination (i.e., multiset union) often enough. Since solvers operate on finitely many decision variables with finite domains, we never have to use the full range of \(\mathcal{M}_{\text{fin}}(P)\), though, and always operate on a finite subset of it. Put differently, the maximal multiplicity of any element of P is necessarily restricted. To put a meaningful upper bound on the multiplicities, we note that in a \(\mathit{SCSP}\), the overall valuation is given by \({\prod}_{M} \{\mu_{i}(\theta) \mid \mu_{i} \in S\}\). If we can determine the maximal occurrence any P-element has in any individual \(\mu_{i}(\theta)\), say k (maxPerSc in Listing 3), we simply use \(\mathtt{nScs} \cdot k\) as the maximal occurrence for the overall valuation. Taking constraint preferences as an instance, this value is in fact easy to determine as any soft constraint can obviously only be violated once—we exploit this further in Section 4.2.2. With these restrictions, a multiset composed of P-elements with maximal multiplicity maxOccurrences is just encoded as an array[1..maxP] of var 0..maxOccurrences. We discuss the model elements further in the following sections.
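The array encoding can be sketched directly (the parameter values below are hypothetical): a multiset over P = {1, …, maxP} becomes an array of multiplicities, multiset union becomes elementwise addition, and the product of the number of soft constraints and the per-constraint bound caps every overall multiplicity.

```python
maxP, maxPerSc, nScs = 3, 1, 4          # hypothetical instance parameters
maxOccurrences = nScs * maxPerSc        # bound on any overall multiplicity

def mset_union(a, b):
    """Multiset union on the array encoding: elementwise addition."""
    out = [x + y for x, y in zip(a, b)]
    assert all(x <= maxOccurrences for x in out), "multiplicity bound exceeded"
    return out

empty = [0] * maxP       # the neutral element (the empty multiset)
m1 = [1, 0, 0]           # the multiset containing element 1 once
m2 = [0, 0, 1]           # the multiset containing element 3 once
assert mset_union(m1, m2) == [1, 0, 1]
assert mset_union(empty, m1) == m1
```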

Listing 3
figure bu

The type definition used in the hello world example in Listing 2 is found in defs.mbr

The toolchain needed for MiniBrass adds an additional preceding step to the conventional MiniZinc/MiniSearch workflow. Figure 6 illustrates the involved processes: First, the MiniBrass preference model (e.g., Listing 2) is compiled into MiniZinc or MiniSearch code using mbr2mzn. During this process, auxiliary variables taking soft constraint valuations and their aggregations, improvement and not-worsening constraints for branch-and-bound, as well as variable orderings for search heuristics and complex order predicates (in case of Pareto or lexicographic combinations) are generated. Finally, the classical constraint model (e.g., Listing 1) includes the compiled MiniBrass output and is solved by MiniZinc (mzn2fzn) or MiniSearch (minisearch) and a FlatZinc solver.

Fig. 6
figure 6

Toolchain of MiniBrass and its integration with MiniZinc. Blue elements indicate artifacts that have to be created for every new problem instance whereas orange ones refer to reusable (MiniBrass) library items that do not need to be modified by end users

4.1 PVS types and instantiations

Every PVS type definition has to specify the possible solution degrees, ordering predicate and combination operation in MiniZinc. The solution degrees (semantically the underlying set of the PVS) are referred to as the element type E of a PVS type. This can be any MiniZinc base type such as int, float, bool, or subtypes thereof as well as sets or arrays of a base type. For instance, the aforementioned free PVS uses mset which is a MiniBrass keyword abbreviating an array of integers representing a multiset. Weighted constraints or cost function networks map to int or some integer range 0..k, and fuzzy constraints employ the float range 0.0 .. 1.0.

The element type E corresponds to (a subset of) the carrier set \(|{M}|\) of a resulting PVS instance \(M = (|{M}|, \cdot_{M}, \varepsilon_{M}, \leq_{M})\) and prescribes the signature of the ordering \(\leq_{M}\), i.e., \(\leq_{M} {} \subseteq E \times E\). Canonically, every soft constraint \(\mu_{i}\) maps to E and the combination operation has to be a function \(\cdot_{M} : E \times E \to E\), as was shown in Listing 2. However, there are cases where the essential model information is different from an E-element—compare Fig. 5 or our previous toy example where soft constraints map to E but are wrapped by an embedding function. This becomes more evident when we consider PVS that are based on “violated soft constraints” such as, e.g., weighted constraints. Each soft constraint \(\mu_{c} : [X \to D] \to |{M}|\) (parametrized by some conventional constraint c and associated weight \(w_{c}\)) then has the form \(\mu_{c}(\theta) = 0\) if \(\theta \models c\) and \(w_{c}\) otherwise: The essential information here is whether \(\theta \models c\). Wrapping every boolean expression (as done in Listing 2 with the embed function) obviously leads to cluttered and less readable constraint and preference models. To avoid this clutter on the syntactic level, PVS types can be augmented with a soft constraint type S that defines the type of each soft constraint expression (here bool). We semantically define a mapping \(g_{i} : S \to E\) that maps each S-expression of soft constraint \(\mu_{i}\) to an E-element, all of which are combined with \(\cdot_{M}\). However, there are cases when we do not want or need to have \(\mathtt{nScs}\) E-expressions but rather emulate the above combination using an embedding \(S^{n} \to E\)—especially if there are beneficial global constraints involved (see Listing 4). In both cases, we end up with an overall valuation of type E that is ordered using \(\leq_{M}\) during search.
As a consequence, we have the combination operation map several S-expressions to one element in E (where \(S=E\) is of course valid) and have it mimic the \(\cdot _{M}\) operation on E.

Moreover, PVS types may declare parameters that need to be specified during the instantiation of concrete PVS in a preference model. Examples thereof are the maximally allowed weight k in weighted constraints or the preference graph in constraint preferences. The MiniZinc implementation of the combination function and ordering predicate has to be supplied in a separate MiniZinc file that will be included by the compiled MiniZinc output. All presented types are included in the MiniBrass standard library but new definitions can likewise be added with regard to extensibility.

To sum up, for a PVS type parametrized by soft constraint type S and element type E, the ordering predicate, combination, and top element have to implement these interfaces:

figure bv

The ordering of the PVS type parameters must match the order in the PVS type declaration. In a similar way, some PVS types offer generic search heuristics that can be provided (see the keyword pvsSearchHeuristic below). The interface is expected to be:

figure bw

It should generally be noted that is_worse always corresponds to a predicate denoting strict worsening (this is the most common type of predicate used in branch-and-bound). The top element is beneficial for bounding search and having default soft constraints.

Besides the type declarations, there are a few “technical” MiniBrass keywords that are sure to be found in the compiled MiniZinc output (e.g., prefs_o.mzn in Fig. 6) and can thus be accessed from the constraint model (e.g., model.mzn in Fig. 6):

  • topLevelObjective: contains a var E-expression for an atomic top level PVS (the instance specified in the solve item), with element type E; not applicable if the top level PVS is complex (e.g., a lexicographic product). It appears in the output in Listing 1 and could also be a MiniZinc objective: solve minimize topLevelObjective (only if E is scalar).

  • pvsSearchHeuristic: contains an annotation object for the top level PVS that holds a particular variable order (of the generated variables) that depends on the PVS-type(s) involved (see Section 4.2.1). For complex PVS, multiple heuristics are concatenated sequentially.

  • postGetBetter: contains a MiniSearch procedure that is used to post a constraint requiring the next solution to be better than the current one. The generic branch-and-bound procedure pvs_BAB used in Listing 1 (which is defined in pvs_gen_search.mzn and explained in Section 4.5) relies on postGetBetter being written by the compiler.

  • postNotGetWorse: dually, this MiniSearch procedure only requires the next solution not to be dominated (important to find all optima of an instance).

Once PVS types are declared, we can use them for the instantiation of concrete PVS objects. A PVS object stores a specific set of parameters and includes the actual soft constraints mapping to E (or S) as MiniZinc expressions—thereby connecting the constraint and preference model. In addition, the operators pareto and lex can be used to compose complex preference structures from elementary ones.

4.2 Examples of soft constraint formalisms as PVS types

For illustration purposes, we survey the most common soft constraint formalisms (see Section 2) presented as PVS types. Throughout the examples, we assume a simplistic classical constraint model without any actual hard constraints except for the domain restrictions:

figure bt

4.2.1 Integer-valued: weighted CSP or cost function networks

The PVS types for weighted constraints and cost function networks are naturally very similar. The latter are defined as integer-valued soft constraints that map any assignment to some value in the range \([0 {\ldots } k]\) for some parameter k denoting maximal violation and top being 0. Consequently, there is no distinct soft constraint type but just the element type 0..k.

figure by

Combination means adding individual costs (bounded by k) and the ordering relation is the integer greater-than ordering (consistent with the literature, cost minimization is the default):

figure bz
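The semantics of this combination and ordering can be sketched in a few lines (the value of k is an arbitrary example):

```python
k = 20                                   # maximal violation (hypothetical)
combine = lambda a, b: min(a + b, k)     # cost addition, saturating at k
is_worse = lambda a, b: a > b            # a higher cost is worse

assert combine(3, 4) == 7
assert combine(15, 9) == k               # saturates at the bound
assert is_worse(7, 3) and not is_worse(3, 7)
```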

Besides this core type definition, the MiniBrass library offers other utility functions: For better access to native cost function implementations in Toulbar2, there is a global constraint (along with a default decomposition for other solvers) that is handled by Numberjack and properly given to Toulbar2. For instance,

figure ca

ties a cost function’s valuation for variables x and y to a cost variable costVariable, depending on a given cost vector that contains the value for every element in the Cartesian product of the domains of x and y. In a similar spirit, soft global constraints are implemented in MiniBrass. Since the existing soft globals map to a numeric variable, they naturally lead to cost function networks. For instance, a soft variant of alldifferent counts the variables taking the same value as a measure of violation:

figure cb

There are native implementations for soft_alldifferent (e.g., in JaCoP [40]) which can make use of dedicated propagation instead of this provided decomposition—precisely like in conventional global constraints managed by MiniZinc.
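The violation measure behind such a decomposition can be sketched as counting pairs of variables that take the same value (one common choice; the exact measure a given solver or library uses may be defined differently):

```python
def soft_alldifferent_cost(xs):
    """Violation cost: the number of pairs of variables with equal values."""
    return sum(1 for i in range(len(xs))
                 for j in range(i + 1, len(xs))
                 if xs[i] == xs[j])

assert soft_alldifferent_cost([1, 2, 3]) == 0   # all different: no cost
assert soft_alldifferent_cost([1, 1, 2]) == 1   # one equal pair
assert soft_alldifferent_cost([1, 1, 1]) == 3   # three equal pairs
```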

In contrast to cost function networks, weighted constraints are binary—a soft constraint is violated or not, and if so, punished with a weight. This is reflected by using the soft constraint type bool that is mapped to the element type 0..k.

figure cc

Weighted constraints provide the first example of a generic search heuristic annotation that comes with the PVS type. The MiniZinc function getSearchHeuristicWeighted (included in the file weighted_type.mzn) provides a particular variable ordering: the variables containing the highest-weighted possible violation first (called most important first in [56]). That way, search can start by setting the generated satisfaction variables of all soft constraints to true and let propagation take over to possibly find high-quality solutions early:

figure cd

Section 5.3 provides some insight into the effectiveness of the above search heuristic.
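The underlying variable ordering is easy to sketch (the weights below are hypothetical): sort the soft constraints' satisfaction variables by descending weight so that search tries to satisfy the most important ones first.

```python
# Hypothetical soft-constraint weights; "most important first" just
# orders the corresponding satisfaction variables by descending weight.
weights = {'sc1': 5, 'sc2': 1, 'sc3': 3}
most_important_first = sorted(weights, key=weights.get, reverse=True)
assert most_important_first == ['sc1', 'sc3', 'sc2']
```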

The PVS type WeightedCsp has a double usage: as we set the weights’ default value to 1, we get a Max-CSP instance if no weights are supplied. We can add weights (more generally, parameter values tied to every soft constraint) by annotating a soft constraint during instantiation:

figure ce

4.2.2 Comparative: free PVS and constraint preferences

Next, we revisit purely comparative preference structures that operate directly on partial orders (instead of resorting to numeric values) such as the free PVS seen in Listing 2. As mentioned above, its element type is mset[maxOccurrences] of 1..maxP which is syntactic sugar for an array[1..maxP] of var 0..maxOccurrences that represents the overall solution degree. This type is the most general available in MiniBrass and encodes an arbitrary partial order such as the rating system \(\mathsf {R}\) in Fig. 5 as a PVS.

Each soft constraint maps to a multiset with bounded multiplicity, as indicated by the parameters in Listing 3. While the combination (multiset_union) is straightforward by just summing the individual soft constraints’ element multiplicities, implementing a MiniZinc predicate for the Smyth-ordering (cf. Section 3.2) is a bit more involved but showcases MiniBrass’ abilities. In essence, to establish \(T \prec^{P} U\) the key idea is to apply Lemma 1 and have the witness \(h : \mathcal{S}(U) \to \mathcal{S}(T)\) be decided by the solver using local decision variables of the predicate. Recall that \(\mathcal{S}(U)\) refers to the “set of pairs” representation of a multiset. Thus h is defined on pairs, has to be injective, and must satisfy the constraints \(p \leq_{P} q\) whenever \(h(j,q) = (k,p)\). Injectivity is best modeled by an alldifferent-constraint but there is none for pairs. We can mitigate this by defining a one-dimensional witness and applying the bijective Cantor pairing function \(\pi : \mathbb{N}^{2} \to \mathbb{N}\) defined by \(\pi(k_{1},k_{2}) := \frac{1}{2}(k_{1} + k_{2})(k_{1} + k_{2} + 1)+k_{2}\). The resulting one-dimensional array can be constrained to be all different, as usual.

figure cf

At this point, we want to emphasize that end-users (i.e., modelers) do not need to fully understand the implementation of the Smyth-ordering in MiniZinc but only its (rather intuitive) inductive definition to apply it in their models. The above definition is fully encapsulated by the freePVS-type. Note, however, that this predicate relies on local free variables (e.g., witnessOcc), which prohibits its use in a negative or mixed context such as “it must not be the case that the next solution is Smyth-worse than the current” (cf. non-domination search in Section 4.5). This is due to constraint solvers’ focus on existential quantification: negated predicates with local free variables would require support for universal quantification. Moreover, currently only Zinc and MiniZinc support local free variables, as required for the Smyth-ordering, at all [63].

By construction, freePVS is well-suited to turn any partial order, such as the rating system \(\mathsf {R}\), into a PVS. However, for problems specified with constraint preferences (such as Listing 2), freePVS might seem overly general: we only ever observe multisets built from the distinct soft constraints \(\mu _{i}\). Similar to weighted constraints, we can thus use the soft constraint type bool, which relieves us from the embed function in Listing 2 (a satisfied soft constraint \(\mu _{i}\) contributes the empty multiset, a violated one the singleton multiset containing i):

figure cj

Here, msetVal encodes a multiset as an array mapping the underlying set of elements to their cardinalities; e.g., msetVal('[1,0,0]') represents the multiset containing element 1 once, and msetVal('[1,0,2]') the multiset containing element 1 once and element 3 twice. However, we can further improve the encoding of the free PVS for constraint preferences by noting that no element i can occur more than once in any reachable overall valuation. Each such m in \(\mathcal {M}_{\text {fin}}(P)\) can then just be represented by a set of integers, a type that is natively supported, with appropriate global constraints, by MiniZinc and several constraint solvers. Hence, we define a dedicated PVS type ConstraintPreferences with bool as soft constraint type and set of int as element type that, in fact, only operates on a certain subset of the free PVS:

figure cm

More precisely, we use set of 1..nScs, and identify each soft constraint with a number. By channeling the boolean expressions evaluating to false to a set for the combination, we obtain the set of all violated soft constraints (see Listing 4). The Smyth-ordering on sets (cf. Section 3.2) is also implemented as a MiniZinc predicate (is_worse_cr) and is activated by having the useSPD parameter set to true. Alternatively, one may use the so-called transitive-predecessor-ordering (TPD) [39] that defines a more important constraint to dominate a whole set of less important ones:

figure cn

Clearly, \(T \preceq ^{P} U \Rightarrow T \preceq ^{P}_{\!\text {TPD}} U\), i.e., TPD adds ordering relations to SPD. But perhaps more usefully, the above predicate is easier to decide, i.e., no free local variables are involved—hence, TPD is suited for mixed and negative contexts such as non-domination search.

Listing 4
figure co

Using the MiniZinc global link_set_to_booleans to connect reified soft constraints with an overall solution degree denoting the set of violated soft constraints

Of course, an instance of ConstraintPreferences also needs a directed acyclic graph over the soft constraints (the crEdges parameter represents an adjacency list). For convenience, MiniBrass automatically calculates the transitive closure during compilation (turning the DAG into a partial ordering), as an example of a parameter wrapping method. Such methods can be either MiniZinc functions or Java methods for data transformation. They can also ensure correct user input (e.g., validating acyclicity by means of MiniZinc assertions or Java exceptions).

The restriction to set of int also eases the implementation of the Smyth-ordering since, in contrast to the multiset case, the witness is not a function over pairs. As a corollary to Lemma 1, for two sets T and U, \(T \preceq ^{{P}} U\) holds if and only if there exists an injective witness function \(f : U \to T\) such that \(f(p) \leq _{P} p\) for all \(p \in U\). Similar to the multiset case, we enforce (and propagate) the injectivity of f with alldifferent and make sure the witness property is fulfilled.
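As a plain-Python illustration of this corollary (a brute-force check for tiny sets, not the propagated MiniZinc predicate), the witness condition can be decided by searching over injective assignments; the importance order leq_P below is a hypothetical example in which constraint 1 is more important than constraints 2 and 3:

```python
from itertools import permutations

def smyth_leq(T, U, leq):
    """T <=^P U (T is Smyth-below, i.e. at most as good as, U) iff there is
    an injective witness f: U -> T with f(p) <=_P p for every p in U."""
    Tl, Ul = list(T), list(U)
    if len(Ul) > len(Tl):     # no injective map U -> T can exist
        return False
    # try every injective assignment of distinct elements of T to elements of U
    return any(all(leq(fp, p) for fp, p in zip(image, Ul))
               for image in permutations(Tl, len(Ul)))

# Hypothetical order: violating constraint 1 is at most as good as
# violating constraint 2 or constraint 3.
leq_P = lambda p, q: p == q or (p, q) in {(1, 2), (1, 3)}
```

With this order, smyth_leq({1}, {2}, leq_P) holds (the more important violation is worse), while {2} and {3} are incomparable in both directions.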

4.2.3 Real-valued: fuzzy CSP and probabilistic CSP

A third class of soft constraint formalisms is best characterized by the element type being the reals over \([0.0, 1.0]\). Starting with fuzzy constraints, each soft constraint maps to [0.0,1.0], with the combination being defined as the minimum operator; in this case, soft constraint type and element type coincide.

figure cp

The resulting soft constraints of element type 0.0 .. 1.0 could be defined directly as MiniZinc functions, but we added some support in the form of a global constraint included in fuzzy_type.mzn. For instance, consider a soft constraint \(\mu _{1}\) defined over two boolean variables mainCourse and wine: \(\mu _{1} = \{(0,0) \mapsto 1.0,\ (0,1) \mapsto 0.8,\ (1,0) \mapsto 0.3,\ (1,1) \mapsto 0.7\}\). In MiniBrass, this can be written as follows (see Footnote 8):

figure cq
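The fuzzy semantics can be mimicked in a few lines of Python (an illustrative sketch; mu2 is a hypothetical second soft constraint, not part of the model above):

```python
# mu1 over (mainCourse, wine) as defined in the text; mu2 is hypothetical.
mu1 = {(0, 0): 1.0, (0, 1): 0.8, (1, 0): 0.3, (1, 1): 0.7}
mu2 = {(0, 0): 0.5, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 0.9}

def fuzzy_degree(theta):
    """Fuzzy combination: the overall degree is the minimum of all soft
    constraint degrees; the goal is to maximize this value."""
    return min(mu1[theta], mu2[theta])

best = max(mu1, key=fuzzy_degree)   # -> (0, 1) with overall degree 0.8
```

The minimum operator makes the weakest-satisfied constraint decisive: assignment (0, 1) wins because its worst degree (0.8 under mu1) is higher than the worst degree of any other assignment.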

On the other hand, probabilistic constraints bear similarities to both weighted and fuzzy constraints. We use bool as soft constraint type to denote violated constraints and 0.0 .. 1.0 for probabilities as element type. Formally, the objective is \({\prod }_{\mu _{i} : \theta \not \models \mu _{i} } 1-p_{i}\). The “constraint presence” probabilities \(p_{i}\) are, analogously to weights, supplied as parameters.

figure cr

Both fuzzy and probabilistic constraints aim at maximizing the solution degree, such that is_worse_prob(x,y) corresponds to x < y.
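A small Python sketch (with hypothetical presence probabilities p_i, purely for illustration) shows the probabilistic objective and its ordering:

```python
from math import prod

# Hypothetical constraint-presence probabilities p_i.
p = {1: 0.9, 2: 0.5, 3: 0.8}

def prob_degree(violated):
    """Probability that the assignment solves the 'true' problem: the product
    of (1 - p_i) over all violated soft constraints (empty product = 1.0)."""
    return prod(1 - p[i] for i in violated)

def is_worse_prob(x, y):
    return x < y   # maximization: a lower degree is worse
```

Violating a nearly certain constraint (p_1 = 0.9) shrinks the degree far more than violating an unlikely one, which matches the intuition of weighting by presence probability.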

4.3 Morphisms to switch PVS

There are at least two reasons for users to specify their \(\mathit {SCSP}\) using one PVS type but solve the problem using another: If the original PVS admits many incomparable optimal solutions, we might want to totalize the ordering, if only for testing and debugging. More frequently, however, the solvers that have to be used for performance reasons or due to the target software environment do not support the data types required to represent a PVS type. For instance, set-based types for constraint preferences or real-valued domains with suitable global constraints for fuzzy constraints are not universally implemented. A modeler would only accept transforming the \(\mathit {SCSP}\) in a structure-preserving way: at least, existing strict “is better than” decisions in the original ordering must not be contradicted; at most, incomparable solutions may become comparable. This is precisely what PVS-homomorphisms offer.

We saw an example in Fig. 2 where we compared \(\mathit {PVS}\langle {P}\rangle \) and \(\mathit {Weighted}(P)\), i.e., we can calculate a weight for each constraint (by making use of the instance parameters such as the supplied graph) and transform a constraint preferences problem into a weighted CSP instance. In MiniBrass, we first define a morphism

figure cs

using a function that is applied to each original soft constraint expression (here just the identity id) and then transforming a specific PVS instance:

figure ct

By devising similar morphisms for other PVS types, we can exploit the previously mentioned fact that many soft constraint formalisms can be (monotonically) encoded as cost function networks in polynomial time [58], the type for which Toulbar2 offers efficient dedicated algorithms. For instance, a probabilistic PVS with the multiplicative maximization objective \(f(\theta ) = {\prod }_{\mu _{i} : \theta \not \models \mu _{i} } 1-p_{i}\) can be transformed into an additive minimization problem by taking the negative logarithm of f: \(- \log f(\theta ) = {\sum }_{\mu _{i} : \theta \not \models \mu _{i}} - \log (1-p_{i})\), where we can precalculate the \(- \log (1-p_{i})\) terms as weights:

figure cu

This calculation takes place in the Java class ProbWeighting, as indicated by the generated keyword. While the morphism definition is mathematically sound, the implementation has to round the resulting weight terms to the nearest integer.
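The soundness of the transformation is easy to verify numerically. The sketch below uses hypothetical probabilities and an illustrative scaling factor before rounding; the exact scheme of ProbWeighting may differ:

```python
from math import log, prod

p = {1: 0.9, 2: 0.5, 3: 0.8}                 # hypothetical presence probabilities
weights = {i: -log(1 - pi) for i, pi in p.items()}

def mult_objective(violated):
    """Multiplicative objective of the probabilistic PVS (to be maximized)."""
    return prod(1 - p[i] for i in violated)

def add_objective(violated):
    """Additive objective of the weighted PVS (to be minimized)."""
    return sum(weights[i] for i in violated)

# Maximizing f is equivalent to minimizing -log f. For integer-cost backends,
# the weights are scaled and rounded (the factor 1000 is an arbitrary choice).
int_weights = {i: round(1000 * w) for i, w in weights.items()}
```

For any set of violated constraints, add_objective equals the negative logarithm of mult_objective, so the two optimization problems share the same optima up to rounding error.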

4.4 Products of PVS

As mentioned in Section 3.1, we can form composite PVS from elementary ones by means of products. In MiniBrass, the direct (Cartesian) product is denoted by pareto and the lexicographic product by lex. We can combine these two operators and morphisms to form complex PVS. Consider these exemplary use cases:

figure cv
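The two product orderings themselves can be sketched over numeric tuples (each component here uses the integer order; in general the components are arbitrary PVS):

```python
def pareto_leq(a, b):
    """Direct (Cartesian) product: a is at most as good as b iff it is so in
    every component; tuples that disagree across components are incomparable."""
    return all(x <= y for x, y in zip(a, b))

def lex_leq(a, b):
    """Lexicographic product: earlier components take absolute priority.
    Python's built-in tuple comparison is exactly lexicographic."""
    return a <= b
```

For example, (1, 3) and (2, 2) are incomparable under pareto_leq but comparable under lex_leq, which illustrates why lex totalizes more of the ordering than pareto.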

4.5 PVS-based search

With the tools at hand, we are able to define PVS types, instantiate them, and combine and morph them into more complex structures. The overall goal is the PVS passed in the solve-item. To find optimal solutions, MiniBrass relies on classical systematic constraint solving and optimization using propagation and search, as outlined in the beginning of Section 4. The necessary facilities are provided by MiniZinc/MiniSearch and the underlying solvers. If the element type is numeric, the problem can be solved in MiniZinc by minimizing (or maximizing) topLevelObjective. However, the full strength of abstract soft constraint formalisms lies precisely in the presence of partial and product orders. MiniSearch provides blueprints for several of the classical searches that can be customized accordingly.

The first search strategy corresponds to classical branch-and-bound (BAB) search in propagation engines. For every found solution, a constraint is imposed that the next solution has to be strictly better.

figure cw

While this procedure yields optimal solutions, it is not ideal for partially ordered objectives since another optimum does not have to be better than the current solution. Instead, it must not be dominated by any solution seen so far [35]. When solving for a PVS M, we have a set of lower bounds (the valuations of previous solutions) \(L = \{l_{1}, \ldots , l_{m} \} \subseteq |{M}|\) and require that \(\exists l \in L : \mathtt {obj} \leq _{M} l\) must not hold, where \(\mathtt {obj}\) denotes the generated MiniZinc variable(s) holding the overall objective. Put differently, the next solution must be strictly better than, or incomparable to, each of the maxima of L.
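The posted bound can be phrased as a one-line predicate, sketched here in Python over an abstract “worse than or equal” relation leq:

```python
def not_dominated(obj, L, leq):
    """Accept obj only if no earlier degree l in L satisfies obj <=_M l,
    i.e. obj is strictly better than, or incomparable to, every element of L."""
    return not any(leq(obj, l) for l in L)

# For a numeric minimization order, "a is worse than or equal to b" is a >= b:
worse_eq = lambda a, b: a >= b
```

With previous objective values L = [5, 4], a candidate 3 is accepted, while 4 (already seen) and 7 (worse) are both rejected.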

figure cx

There is a caveat to this solution. With the is_worse predicates that PVS types offer, we can also generate a MiniSearch procedure “postNotGetWorse” during compilation. However, we have to negate this predicate, i.e., change its boolean context. This leads to problems if the predicate contains free local variables [63]. We have seen this in Section 4.2.2 for the witness function necessary to decide the Smyth-ordering, which is therefore not compatible with postNotGetWorse. For constraint preferences, we have to resort to the TPD-ordering instead. Since we expect future non-trivial PVS types to rely on local variables, modelers need to be aware of this restriction.

Example 3

Consider the following simplified example to illustrate the difference:

figure cy

We explore \(\mathrm {x}\) in decreasing order. Each assignment to \(\mathrm {x}\) violates precisely one soft constraint. This results in the sequence \(\langle { \{3\}, \{2\}, \{1\} }\rangle \) of solution degrees. \(\{ {3} \}\) and \(\{ {2} \}\) both dominate \(\{ {1} \}\) but are incomparable using the TPD-ordering (and Smyth, too). The reachable optima of this problem are clearly \(\{\{{2}\}, \{{3}\}\}\), but pvs_BAB would stop after the first solution \(\{3\}\) since \(\{2\}\) is incomparable and thus not strictly better. By contrast, pvs_BAB_NonDom returns both optimal solution degrees.
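The example can be replayed in Python (a toy simulation of the two searches over an explicitly given domination relation; with this exploration order the strict variant ends on {3}):

```python
# Solution degrees in exploration order; {1} is dominated by {2} and by {3},
# which are mutually incomparable.
degrees = [frozenset({3}), frozenset({2}), frozenset({1})]
strictly_worse = {(frozenset({1}), frozenset({2})),
                  (frozenset({1}), frozenset({3}))}

def strict_bab(seq):
    """Accept a solution only if it is strictly better than the incumbent."""
    best = None
    for d in seq:
        if best is None or (best, d) in strictly_worse:
            best = d
    return best

def nondom_bab(seq):
    """Accept a solution unless it is dominated by, or equal to, a seen one."""
    front = []
    for d in seq:
        if all(d != l and (d, l) not in strictly_worse for l in front):
            front.append(d)
    return front
```

Here strict_bab finds a single optimum, whereas nondom_bab collects both incomparable optimal degrees {2} and {3} and correctly discards the dominated {1}.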

MiniSearch actually offers much more flexibility in crafting problem-specific searches than just branch-and-bound. For instance, large-neighborhood search for PVS-based models can be designed using its concept of scopes, as described in [49].

figure cz

In a similar way, we can anticipate many variants of search algorithms with postGetBetter or postNotGetWorse. By separating concerns between constraint and preference model, the preference model in MiniBrass can be tested with various searches.

5 Evaluation

To evaluate MiniBrass we decided to model soft constraint problems using the PVS type constraint preferences (see Section 4.2.2). We used MiniZinc benchmark problemsFootnote 9 as the underlying constraint models. These are taken from several editions of the MiniZinc challenge [65]. Optimizing according to constraint preferences requires set-based variables and compatibility with MiniSearch. By applying morphisms as described in Section 4.3, we obtain weighted CSP versions that are compatible with a wider range of solvers.

Alternatively, we could have resorted to the existing cost function networks benchmark library that also offers MiniZinc models for a tabularized encoding.Footnote 10 However, conventional constraint solvers have already been shown to be dominated on these problems by Toulbar2. Moreover, these problems only address one particular PVS. Optimizing according to the Smyth-ordering has not been addressed before. In particular, this ordering introduces some partiality in the models which generally makes the task of finding optima more demanding due to reduced pruning.

The problems were selected according to features that justify an encoding approach (i.e., efficient conventional propagation), feasibility of decompositions for many solvers, and meaningful soft constraint addition. Certainly, there are cost function network problems (e.g., in bioinformatics) that are out of reach for conventional solvers, as [34] demonstrated for Toulbar2. However, we argue that there are many practical cases with relatively few soft constraints and many conventional constraints. Among those, we investigate the following problems:

Soft N-Queens :

is a toy \(\mathit {SCSP}\) that adds three artificial soft constraints with a preference relation over them to a classical N-queens problem (such as, e.g., having a queen in the center of the grid), not to be confused with the M-queens optimization problem.

Photo Placement :

asks to place people close to their friends—in its original version it was already designed to handle preferences but we (morally questionably) allowed for some friends to be more important to stand close to than others.

Talent Scheduling :

aims at scheduling movie scenes including various actors cost-effectively. We augmented the conventional problem with preferences to avoid being simultaneously on set with a rival actor and early/late times for specific scenes.

On-call Rostering :

requires assigning staff members to days in a rostering period, respecting work constraints and unavailabilities. The original formulation already contained preferences for not being on-call for more than two days in a row or not being on-call for a weekend and a consecutive day. We modeled these existing preferences in MiniBrass and added additional ones regarding preferred co-workers.

Multi-Skilled Project Scheduling :

(MSPSP) is a variant of resource-constrained project scheduling and asks to assign a set of tasks to workers such that the required set of skills for a task is provided by its assigned worker. To add soft constraints, we again allowed workers to state with whom they would like to work and which tasks they would like to work on or which ones they would rather avoid.

More detailed information about the changed and added aspects can be found online.Footnote 11 Each problem is tested with three to six instances, totaling 28 benchmark instances. Since most of these problems were already formulated as constraint optimization problems to begin with, we had to deal with two objectives: the original one and the soft constraint objective. We therefore first converted the problems to constraint satisfaction problems by requiring the original objective value to lie within a 10%–15% boundary around the (previously determined) optimal value, and then optimized the soft constraint objective. In MiniBrass, we write this as solve search presolveMin(origObj, 0.15) /\ miniBrass();.

We solved the resulting models (including parameters that will be subject to the respective experiment questions) using branch-and-boundFootnote 12 (cf. Section 4.5) with the classical constraint solvers Gecode 5.0.0 [59], JaCoP 4.4.0 [40], Google OR-Tools 5.1.0 (CP solver) [30], Choco 4.0.3 [36], and G12 1.6.0 [64], as well as with the only competitive cost function network solver Toulbar2 0.9.8 [2], accessed via Numberjack 1.1.0 [32]. Each presented experiment was run on a machine having 4 Intel Xeon CPU 3.20 GHz cores and 14.7 GB RAM on a 64 bit Ubuntu 15.04 with a timeout set to 10 min per instance. Whenever we average relative values (such as a speedup for runtimes) that are normalized to one solver/heuristic, we use the geometric mean as the only valid choice for such data [26]. Concerning statistical tests for runtime comparisons, we used the Wilcoxon signed-rank test (alpha level \(\alpha = 0.05\)) since the measured runtimes showed a heavy-tailed distribution instead of a normal one.
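The choice of the geometric mean matters: arithmetically averaging ratios that are normalized to a baseline can invert conclusions [26], as this small Python check illustrates:

```python
from math import prod

def geometric_mean(ratios):
    """Geometric mean, the valid average for baseline-normalized ratios."""
    return prod(ratios) ** (1.0 / len(ratios))

# Two runs: a solver is twice as fast on one instance and half as fast on
# the other. The geometric mean correctly reports no overall speedup (1.0),
# whereas the arithmetic mean would misleadingly suggest a 1.25x speedup.
speedups = [2.0, 0.5]
```

This is why the evaluation below averages relative runtimes geometrically rather than arithmetically.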

Our primary goal is to demonstrate the feasibility of implementing soft constraint formalisms more generally than a numeric objective at low runtime overheads—a capability that is not shared by any state-of-the-art soft constraint solver. Besides, even in the realm of cost function networks and weighted constraints, it can pay off to use an encoding approach with a conventional constraint solver as opposed to a dedicated soft constraint solver. Furthermore, we examine the effects of our proposed search heuristics for weighted constraints.

5.1 Comparing performance: encoded weighted CSP versus native Toulbar2

If we want to obtain a comparative view of the performance of MiniBrass models, we have to use cost function networks. For one thing, they are the native formalism Toulbar2 supports; for another, the task boils down to minimizing a numeric value in conventional models, which is directly supported by MiniZinc. On the one hand, Toulbar2 can be seen as the only true state-of-the-art alternative to MiniBrass (given that WSimply [6] has no MiniZinc or Numberjack interface, only runs on a 32 bit Linux distribution, and is no longer maintained); on the other hand, it serves as a well-supported backend. Therefore, this evaluation cannot truly be seen in a competitive light, as MiniBrass is a modeling language. Here, the central question is:

Question 1:

How fast and effectively (in terms of finding optima) can WCSP instances be solved by encoding them as COPs versus using a dedicated solver?

Table 1 presents the results for this first question with times and objectives being averaged over all instances for the respective problem, ranked by runtimes.Footnote 13 Values in parentheses denote averaged relative values with respect to the minimum (geometric mean of ratios for time or arithmetic mean for excess penalty violation, resp.) for each instance—as opposed to relative average values. Therefore, the relative overhead does not necessarily correlate with the absolute values (e.g., Toulbar2 versus Gecode on Talent Scheduling). The number of wins indicates how many instances a solver won (i.e., being fastest) within the respective problem. If an instance was not solved at all within the specified time limit, the maximally possible violation for it was assumed.

Table 1 Comparison of solvers’ performance on the weighted CSP representations

We observed a fairly even distribution of solvers performing well with OR-Tools being among the top three on all problems, showing the most reliable contribution of all conventional constraint solvers. In addition to the table, we noted that over all problems, OR-Tools had the lowest average runtime (97.39 s) and the lowest average objective value (6.18), whereas Gecode achieved the most wins (12). Interestingly, Toulbar2 managed to achieve the best (or second best) average runtimes for three problems, excelling in On-call Rostering. However, the memory-intensive decompositions required for MSPSP and Talent Scheduling had Toulbar2 fail during model creation without returning a solution. To conclude, even though Toulbar2 is a strong choice when dealing with cost function networks, there are cases where only an encoding approach succeeded at all (MSPSP)—or substantially faster (Talent Scheduling). With problems modeled in MiniBrass, both options remain.

5.2 Comparing models: Smyth-optimization versus weighted-optimization

Having established that weighted instances can be solved efficiently by conventional constraint solvers, we turn to optimization according to the Smyth-ordering. MiniBrass was explicitly designed for orderings more general than numeric objectives, in particular Smyth as the ordering of the free PVS. We want to quantify how expensive the partiality of an original model is with respect to the totalization obtained by weighting constraints. To solve these models, only Gecode and JaCoP are applicable, as they are both compatible with MiniSearch and support set-based variables to the necessary extent. For these solvers, we compare the running times and objective valuesFootnote 14 for the original Smyth-based model and the (morphed) weighted CSP. Gecode is provided in a native version directly accessed by MiniSearch (see Section 2) and in a FlatZinc-based version, with the latter being more recent than the native one. JaCoP is only available using FlatZinc. Where applicable, i.e., if Toulbar2 solved the instances, we additionally provide its reference values (Toulbar2 is restricted to the weighted version). Here, the central question is:

Question 2:

Is optimizing according to the Smyth-ordering much more expensive than solving a weighted counterpart obtained by a morphism?

Table 2 presents our results answering this question. Note that, for this evaluation, the Smyth-based models have been solved with strict domination BaB since this is the only way the totalized weighted version can operate. We expected the weighted problems to be much easier to solve since possibly stronger pruning and propagation are involved. To our surprise, we noticed that, whereas for most instances (87.8%) the weighted counterpart was indeed easier to solve, there were instances where the constraint preferences version took substantially less time, as in Talent Scheduling and Soft N-Queens. A possible explanation is that optimality can be proved more easily using propagation of the witness function of the Smyth-ordering. Put differently, there could be solutions that are better in terms of weights but not in terms of Smyth, so the search can be pruned earlier. We also notice that, on these instances, Toulbar2 can provide much better performance than the constraint solvers on the weighted counterparts, when applicable. This is mostly due to the fact that Choco and OR-Tools are left out (as opposed to Section 5.1) since they currently do not support set variables. In terms of objective values, even though optimality is proven in most cases, the Smyth and weighted versions yield different values, which is not surprising as, again, a “weight-better” solution need not be “Smyth-better”. Thus, generally lower values are to be expected from the weighted version. The attentive reader will notice that the average objective for the Smyth-based model is in fact lower than for the weighted model in Soft N-Queens solved by the native Gecode solver. The reason is that the solver timed out on one weighted instance at the sub-optimal objective value 4, whereas the Smyth-based variant happened to yield a Smyth-optimal solution that is also weight-optimal, with objective value 2.

Table 2 Comparing the solvers’ performance on a Smyth-based model and the weighted CSP representations

With strict BaB, we obtain at best a single optimal solution. The advantage of using partial orders clearly is having multiple incomparable optima at modeling time, without having to totalize the ordering by weighting. However, searching for a whole set of optima (as done in non-domination BaB) obviously leads to longer runtimes than stopping at the first found optimum (as done in strict BaB). We investigate the differences in Table 3 (recall that TPD has to be used for non-domination search, see Section 4.2.2). On the examined benchmark problems, we observe a slowdown factor between one and five when non-domination leads to longer runtimes. However, in some cases the difference between non-domination and strict BaB was negligible (i.e., MSPSP and Soft N-Queens), even showing a small (not statistically significant) speedup for non-domination search, mostly due to the set of optima actually being small, in which case strict and non-domination BaB converge to similar search trees.

Table 3 Comparison of runtimes between searching for all optima instead of a strict domination improvement

5.3 Comparing search heuristics: most important first versus default

Lastly, with abstract higher-level preference models, we can use generic search heuristics that align with the optimization goals, dependent on the PVS type in use. Here, our simple strategy (shown in Section 4.2.1) is to try to assign true to the boolean variables reifyingFootnote 15 the satisfaction of soft constraints in order of decreasing weight (i.e., importance). We refer to this heuristic as most important first (MIF). Some of the benchmark problems already shipped with a problem-specific variable ordering heuristic; in such cases, MIF prepends the reified satisfaction variables to the existing heuristic. We compare the effects of MIF on various types of searches (strict, non-domination, weighted), problems, and solvers. The central question is:

Question 3:

Can a generic heuristic (MIF) for soft constraint problems speed up the search for optima?

Over all 168 runs across solvers, problem instances, and search types, the MIF heuristic led to a faster runtime in 73 cases (43%), with the average runtime reduced by 6.22 s. Yet, MIF is not uniformly beneficial but affects some solvers more than others; similarly, some problems are more likely to be improved than others. Table 4 presents the results in a more fine-grained fashion, grouping the evaluated data by problems and solvers, respectively. We find that MIF tends to influence performance negatively compared to the built-in default search strategies for OR-Tools, JaCoP, and Toulbar2 in particular, but can lead to tremendous improvements for Choco (cf. Fig. 7a), which showed the only statistically significant difference when grouped by solvers (due to instances of all problems being compared).

Table 4 Runtime difference between models with MIF activated and deactivated

On the other hand, when grouping by problem, On-call Rostering and Talent Scheduling benefited the most from MIF (cf. Fig. 7b), both showing a speedup on average (relative differences), although the difference was not significant for Talent Scheduling: over all instances, MIF produced faster and slower runtimes about equally often, while the average still favors MIF. By contrast, for On-call Rostering, the runtime difference, albeit small on average, was statistically significant since MIF improved the performance in about half of the instances and did no harm in the others. We suspect that for the other problems, either the built-in heuristics were effective enough or MIF led to thrashing behavior when the best solutions still violated many soft constraints. In that case, MIF initiates many “pointless” searches by setting all soft constraints to be satisfied, and propagation fails to prove infeasibility fast enough. For problems such as Photo Placement or MSPSP, the default search strategy is admittedly the better choice. MIF clearly is no silver bullet. However, since activating a search heuristic in MiniBrass only amounts to placing pvsSearchHeuristic in front of the search procedure (cf. Listing 1), it still is an easy first step in tweaking performance. Interestingly, our experiments also reveal that the effectiveness of the heuristic depends more strongly on the problem at hand than on the particular solver.

Fig. 7
figure 7

Average runtimes for instances solved with and without MIF activated for several models (Smyth or weighted). Figures correspond to the data showing differences in Table 4

6 Conclusion and future work

We presented and evaluated MiniBrass, a soft constraint modeling language building on MiniZinc that closes the gap between algebraic soft constraint frameworks and state-of-the-art solvers. We motivated why the concept of free constructions is an appropriate tool to facilitate the transition from partial orders to PVS and from PVS to c-semirings with the least overhead, and provided proofs for these constructions. MiniBrass is capable of expressing a broad variety of soft constraint formalisms from the literature that are subsumed by partial valuation structures. Moreover, it allows designing complex preference structures using product operators and morphisms separately from conventional constraints. Finally, we evaluated MiniBrass on a set of “softened” benchmark problems and found that, on these problems, an encoding approach is competitive with dedicated soft constraint solving, optimizing with the Smyth-ordering is only slightly more expensive than optimizing the weighted counterparts, and the most-important-first heuristic can lead to significant runtime savings.

In the future, we plan to extend MiniBrass to distributed settings where preference aggregation strategies other than pareto or lex combine several agents’ PVS specifications. Moreover, we are developing a graphical interface for MiniBrass to facilitate modeling. We also plan to add further backends, such as WSimply, to broaden the applicability of the algebraic soft constraint modeling language MiniBrass.