1 Introduction

Shape grammar interpreters have been studied for more than forty years and address several areas of design research. Several useful accounts of existing general and purpose-built shape grammar interpreters are readily available in the literature [1,2,3,4]. An updated view of these lists is given below in Table 1, featuring 61 applications that include implementations of general-purpose and special-purpose shape grammar interpreters. Note, however, that this list, like most of its predecessors, includes very few actual shape grammar interpreters (in the general and technical sense of the word), and that many of the references in the list are implementations of very specific grammars for a very specific design or research purpose. This is not an accident; the very core values that the underlying shape grammar formalism has promised, namely the miraculous calculations with shapes, the visual treatment of emergence and ambiguity, and the seamless interface with design workflows, are all still wanting. Surprisingly, the original list of shape grammar interpreters by Gips [1] already accounted for applications that claimed recognition of subshapes and deployment in 2D and 3D space, and yet, even after all these years, general-purpose shape grammar interpreters still seem limited by shape types, types of transformation, complexity of geometry, matching conditions, counting of non-equivalent parts, semantic information, interface design and so forth. Still, the situation is not as grave as it may seem. It is suggested here that, beyond this seemingly long list of technological hurdles, the operation of embedding, that is, the implementation of the mathematical concept of the “part relation” between two shapes (or equivalently, between two drawings, or between a shape and a design), is the single major obstacle to take on.

Table 1 List of shape grammar implementations

The work here focuses exactly on this front, foregrounding the criteria that characterize the underlying machinery for the most important aspect of a shape grammar interpreter implementation, namely, the conventions of matching under which a shape can be a part of a design. These calculations follow the general structure of the calculations for embedding outlined in Stiny [5, 6] and Krishnamurti [7, 8], but are recast here in a slightly different format following in part the lattice of schemata rules outlined in Stiny [9]. This modified structure consists of three matching conditions, ranging from simple queries of determinate matchings of embedded shapes under restricted conditions to determinate and indeterminate matchings of rising complexity without any restrictions, all characterized by the four types of transformations under which each matching occurs. It is further suggested here that the general calculations for these three conditions of embedding (including their twelve subcases for the four types of transformations), along with the calculation of the non-equivalent mappings for each type of embedding, plus the familiar calculations for the characteristic signature of the shape grammar formalism, the maximal representation of shape (for each type of shape), together provide a map of the five families of calculational obstacles that general-purpose shape grammar interpreters face. The work here considers all three embedding conditions cast within the singular rule schema \(x\to y\) [6]. A visual map of the work accomplished in the field, in terms of the current state of embedding and the work ahead, is given at the end. Aspects of interface design and integration into current design workflows are deliberately left aside.

2 Requirements of a Shape Grammar Interpreter

A computer implementation of a shape computation requires the implementation of five distinct processes, all intimately involved in the recognition and replacement of a shape under a given transformation and all encoded in the structure of the shape algebras \({U}_{ij}\) that shape rules are defined in [6]. More specifically, for \(u\), \(v\) and \(W\) shapes, a shape rule \(u\to v\) and the shape \(W\) defined as the current design, the operation that a shape grammar interpreter should process is:

$$ \begin{array}{l} \text{if } f(u) \le W: \\ \quad W = W - f(u) + f(v) \end{array} $$

or, equivalently: a) Encode the shapes \(u\), \(v\) and \(W\) in the smallest number of basic elements that can specify them; b) Inquire whether there is a transformation \(f\) that embeds the shape \(f(u)\) in \(W\), and if yes; c) Subtract the shape \(f(u)\) from \(W\); d) Add the shape \(f(v)\) to \(W-f(u)\); and e) Repeat the above processes for all applicable transformations \(f\) of the shape \(u\) in \(W\). A visual example is shown in Fig. 1 to demonstrate the five procedures outlined above underlying a shape replacement.

Fig. 1 A shape computation. a–c Shapes \(u\), \(v\), and \(W\); d Shape rule \(u\to v\); e Eight matches of the shape \(f(u)\) in \(W\); f Subtractions of the eight instances of the shape \(f(u)\) from \(W\); g Additions of the eight instances of the shape \(f(v)\) to the corresponding eight instances of the shape \(W-f(u)\)

The example illustrated in Fig. 1 features a shape rule applied under an isometry transformation, that is, a transformation that keeps shape and size invariant while varying handedness and position. In this case, there are eight transformations \(f\) that embed the shape \(f(u)\) in the shape \(W\) (that is, make the shape \(f(u)\) part of the shape \(W\)). The implementation of each of these five processes in a shape grammar interpreter brings its own set of problems, some more severe than others. A brief description of each process is given below.

The first process of encoding the shapes \(u\), \(v\) and \(W\) in maximal representation, that is, in the smallest number of basic elements that can specify a shape [6], is to ensure that the shapes have a unique specification so that they can be compared and acted upon. The maximal representation of shape is typically defined in different ways depending on the dimensionality of the shape, that is, 0-, 1-, 2- and 3-dimensions, and its type, that is, line, arc, conic, Bezier, etc., requiring in essence different algorithms for a maximal point representation, maximal line representation, maximal curve representation, maximal plane representation, maximal surface representation, maximal solid representation, and so on [6]. For most shape grammar implementations, the maximal representation of shape is implemented with various approaches and typically, by a combined usage of operations (computer programs) of shape instantiation and shape addition or shape subtraction [10]. However, for more complex geometries, such as curves, surfaces, and solids, it is difficult to obtain the maximal representations of the corresponding elements of the shapes [11] and there is still a large number of shape types to be addressed. A different kind of problem might arise when shapes are perceptually similar but mathematically different and the implementation might seem to fail or otherwise cause confusion to users.
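For shapes made up of straight lines, the idea of a maximal representation can be illustrated with a short sketch. The code below is not taken from any of the interpreters discussed here; it assumes integer endpoint coordinates so that the arithmetic stays exact, and the names `carrier` and `maximal_lines` are hypothetical. Lines that lie on the same carrier and overlap or touch are fused into a single maximal line.

```python
from collections import defaultdict
from math import gcd


def carrier(p, q):
    """Normalized integer coefficients (a, b, c) of the line a*x + b*y = c through p and q."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    c = a * x1 + b * y1
    g = gcd(gcd(abs(a), abs(b)), abs(c))
    a, b, c = a // g, b // g, c // g
    if a < 0 or (a == 0 and b < 0):      # fix the sign so each carrier has one key
        a, b, c = -a, -b, -c
    return a, b, c


def maximal_lines(segments):
    """Fuse collinear segments that overlap or touch into maximal lines."""
    groups = defaultdict(list)
    for p, q in segments:
        groups[carrier(p, q)].append((p, q))

    result = []
    for (a, b, _), segs in groups.items():
        # Position of a point along the carrier (monotone in one direction).
        def t(pt):
            return -b * pt[0] + a * pt[1]

        ordered = sorted((tuple(sorted(s, key=t)) for s in segs), key=lambda s: t(s[0]))
        start, end = ordered[0]
        for lo, hi in ordered[1:]:
            if t(lo) <= t(end):           # overlaps or touches the current line
                if t(hi) > t(end):
                    end = hi
            else:                         # a gap: the current line is maximal
                result.append(tuple(sorted((start, end))))
                start, end = lo, hi
        result.append(tuple(sorted((start, end))))
    return sorted(result)


lines = [((0, 0), (2, 0)), ((1, 0), (3, 0)), ((3, 0), (5, 0)), ((0, 1), (1, 1))]
print(maximal_lines(lines))
```

In this sketch three overlapping or touching lines on the x-axis collapse into one maximal line, while the line on a different carrier is left untouched. Curves, planes and solids would each need their own analogue of `carrier` and of the ordering along it.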

The second process of inquiring whether there is a transformation \(f\) that embeds the shape \(f(u)\) in \(W\) is the most critical, and the most elusive, process for most shape grammar interpreters. The part relation between shapes is typically established by checking the boundaries of shapes [12]. The matching transformation \(f\) is typically a Euclidean transformation but, more generally, it can be a transformation belonging to the Euclidean, affine and projective geometries. Most detrimentally, most of the interpreters adopt a database query [13] to simulate the desired transformation. This method, powerful as it may appear, is severely limited because it assumes that a shape can be decomposed and represented as a finite set of subshapes and therefore violates the fundamental definition of shapes. Additionally, most of these matching calculations in affine and projective spaces accumulate rounding errors so fast that the matching results are often useless.

The third process of subtracting the shape \(f(u)\) from \(W\) is based on the detection of shape boundaries. Chase [15] has listed 13 cases of two input lines with labeled endpoints so that the system can derive the results with three different procedures. The implementation of shape subtraction for lines has been done by Krishnamurti [7, 10] and has been broadly adopted in other interpreters. However, this shape operation is highly related to shape type and the complexity of implementation increases as the dimensionality of shape and the corresponding dimensionalities of space that the shapes are defined in are both increasing too. As above, shape types captured by higher degrees might cause severe precision errors and make the system unstable. For instance, the calculation of descriptors of high degree curves such as Bezier curves can be heavy because the system has to resolve the coefficients of high degree polynomials [16].
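The boundary-based character of subtraction can be seen in a minimal sketch for lines that share a carrier. The code below makes a simplifying assumption: each line is reduced to a numeric interval `(lo, hi)` along the common carrier. The function name `subtract` is hypothetical, and the case analysis is an illustration rather than the procedure of Chase [15] or Krishnamurti [7, 10].

```python
def subtract(w, u):
    """Remove from the intervals of w every part covered by an interval of u."""
    result = []
    for lo, hi in w:
        pieces = [(lo, hi)]
        for a, b in u:
            remaining = []
            for p, q in pieces:
                if b <= p or a >= q:      # disjoint or merely touching: nothing removed
                    remaining.append((p, q))
                    continue
                if a > p:                 # a piece survives before the removed part
                    remaining.append((p, a))
                if b < q:                 # a piece survives after the removed part
                    remaining.append((b, q))
            pieces = remaining
        result.extend(pieces)
    return sorted(result)


print(subtract([(0, 10)], [(2, 4), (6, 12)]))
```

Every surviving piece is delimited by an endpoint of `w` or of `u`, which is why the operation reduces to comparing boundaries. For curves the same comparison takes place in the curve parameter, where the precision problems mentioned above arise.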

The fourth process of adding the shape \(f(v)\) to \(W-f(u)\) is similar to subtracting. As above, Chase [15] has listed out 13 cases of two input lines with labeled endpoints so that the system can derive the results with three different procedures. Similarly, the implementation of shape addition for lines has been done by Krishnamurti [7, 10] and has been broadly adopted in other interpreters as well. The same problems that are encountered in the implementation of the subtraction operation are encountered here too.

The fifth process of repeating the above processes for all applicable transformations of the shape \(f(u)\) in \(W\) is straightforward for all shape grammar interpreters. The recursion can be implemented by re-assigning the result \(W-f(u)+f(v)\) from the previous iteration back to the same variable \(W\) for the next iteration. The formal expression can be written as:

$$ W_{i} = W_{i-1}{-}f_{i-1}\left( u \right) + f_{i-1}\left( v \right) $$

so that the result is:

$$ W_{N} = W_{N-1}{-}f_{N-1}\left( u \right) + f_{N-1}\left( v \right) $$

after \(N\) iterations, where \({f}_{i}()\) represents the applied transformation for the \({i}^{th}\) iteration.
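The recursion can be sketched in the simplest setting, lines on a line, where a shape is a list of intervals and the only transformations considered are translations. This is a toy illustration under those assumptions, not a description of any existing interpreter, and all function names (`maximal`, `is_part`, `subtract`, `apply_rule`, `derive`) are hypothetical.

```python
def maximal(intervals):
    """Fuse overlapping or touching intervals (maximal representation)."""
    out = []
    for lo, hi in sorted(intervals):
        if out and lo <= out[-1][1]:
            out[-1] = (out[-1][0], max(out[-1][1], hi))
        else:
            out.append((lo, hi))
    return out


def is_part(u, w):
    """u <= w: every maximal interval of u lies inside a maximal interval of w."""
    mw = maximal(w)
    return all(any(a <= lo and hi <= b for a, b in mw) for lo, hi in maximal(u))


def subtract(w, u):
    """w - u for shapes given as lists of intervals."""
    pieces = maximal(w)
    for a, b in maximal(u):
        remaining = []
        for p, q in pieces:
            if b <= p or a >= q:
                remaining.append((p, q))
            else:
                if a > p:
                    remaining.append((p, a))
                if b < q:
                    remaining.append((b, q))
        pieces = remaining
    return pieces


def translate(shape, t):
    return [(lo + t, hi + t) for lo, hi in shape]


def apply_rule(w, u, v, t):
    """Return W - f(u) + f(v) for the translation f by t, or None if f(u) is not part of W."""
    fu, fv = translate(u, t), translate(v, t)
    if not is_part(fu, w):
        return None
    return maximal(subtract(w, fu) + fv)


def derive(w, u, v, translations):
    """Re-assign the result of each rule application to W, as in the recursion above."""
    for t in translations:
        result = apply_rule(w, u, v, t)
        if result is not None:
            w = result
    return w


print(derive([(0, 10)], [(0, 2)], [(0, 1)], [0, 4, 8]))
```

Each pass through the loop in `derive` corresponds to one step \(W_{i}\) of the recursion. Transformations whose image of \(u\) is no longer part of the current design are skipped.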

3 Calculating Embedding

Shape matching in computer-aided design (CAD) systems is enabled by a database query requesting the retrieval of shapes from a CAD database. Surprisingly, shape matching under a given Euclidean, affine or linear transformation (visual matching), the most characteristic part of the shape grammar formalism, is entirely absent from current CAD systems. It is argued here that the conditions under which these visual matchings can occur, and the calculations to implement them, are the most important requirements for shape grammar interpreters to process rule applications and the single obstacle to merging shape grammar interpreters with generative CAD modelers.

The first condition to specify is the transformations themselves: for a shape \(u\), the shape \(f(u)\) to be embedded in a shape \(W\) can be modeled by four types of transformations: a) isometry transformations including translations, rotations, and reflections; b) similarity transformations including isometry transformations, scale transformations and their combinations; c) affine transformations including similarity, stretch, compress, and shear transformations and their combinations; and d) linear or projective transformations including affine transformations, one-point, and two-point perspective transformations and their combinations. The rising hierarchy of the matching transformations \(f\) is given in Fig. 2 for a shape in the form of a capital K.

Fig. 2 Types of linear transformations. a Identity; b Translation; c Rotation; d Reflection; e Scale; f Stretch; g Shear; h Stretch and shear; i One-point perspective; k Two-point perspective
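The hierarchy of the four types of transformations can be made concrete with planar transformations written as \(3\times 3\) homogeneous matrices. The sketch below assumes that the matrix is invertible and acts on column vectors \((x, y, 1)\). The function name `classify` and the tolerance parameter are hypothetical. It reports the most restrictive of the four types that contains a given matrix.

```python
def classify(m, tol=1e-9):
    """Return the most restrictive transformation type containing the 3x3 matrix m.

    Assumes m is invertible and acts on column vectors (x, y, 1).
    """
    (a, b, _), (c, d, _), (g, h, k) = m
    if abs(g) > tol or abs(h) > tol:
        return "projective"               # a perspective row is present
    a, b, c, d = a / k, b / k, c / k, d / k
    # For the linear part L = [[a, b], [c, d]], L^T L = s^2 * I holds
    # exactly when the transformation is a similarity with scale s.
    dot = a * b + c * d
    n1 = a * a + c * c
    n2 = b * b + d * d
    if abs(dot) > tol or abs(n1 - n2) > tol:
        return "affine"                   # stretch or shear is present
    if abs(n1 - 1.0) > tol:
        return "similarity"               # uniform scale other than 1
    return "isometry"                     # translation, rotation or reflection


rotation = [[0, -1, 3], [1, 0, 4], [0, 0, 1]]
shear = [[1, 1, 0], [0, 1, 0], [0, 0, 1]]
print(classify(rotation), classify(shear))
```

Each type includes the ones before it, so an isometry is also a similarity, an affine and a projective transformation. The function only reports the innermost type.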

The calculations for the transformations \(f\) of a shape \(u\) so that the shape \(f(u)\) can be embedded in a shape \(W\) involve the following five processes:

  1) Encode the shapes \(u\) and \(W\) in their maximal representation;
  2) Calculate the determinate match of embedded shapes \(f(u)\) whose boundaries are all recorded in the dataset of \(W\) (restricted embedding);
  3) Calculate the determinate match of embedded shapes \(f(u)\) whose boundaries are not recorded in the dataset of \(W\) (unrestricted embedding);
  4) Resolve the indeterminate matching for embedded shapes whose boundaries are not recorded in the dataset of \(W\) that in addition can be embedded into \(W\) in infinite ways (indeterminate embedding);
  5) Count all non-equivalent matchings of the Left-Hand Side (LHS) and the Right-Hand Side (RHS) of the rule.

A brief description of the processes and conventions pertaining to the calculation of the maximal shape representation is given in the previous section. Here the focus is on the calculation of the transformations \(f\) that embed the shape \(f(u)\) in a shape \(W\). A pictorial description of the types and instances of visual matching follows below.

3.1 Restricted Embedding

The calculation of the determinate embedding of shapes whose boundaries are defined in the target shapes (restricted embedding) has been viewed as the most important criterion for the calculation of the inverse transformations because the ways digital tools represent shapes are discrete and object-based [3, 17]. An example of a query of a restricted embedding is shown below in Fig. 3. Note that the shape \(f(u)\) has boundary points that are all well defined as registration points in the shape \(W\).

Fig. 3 An example of a restricted embedding of a subshape \(f(u)\). All boundary points of the shape \(f(u)\) are registration points in the shape \(W\). a Shape \(u\); b Shape \(W\); c Registered objects; d Eight transformations \(f\) embedding the shapes \(f(u)\) in \(W\)
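A restricted query of this kind can be sketched with two registration points. The code below is an illustration under stated assumptions and not the \(3\times 3\) matrix method used by the interpreters discussed later: points are written as complex numbers, a similarity is written as \(f(z)=az+b\) (or \(f(z)=a\bar{z}+b\) when handedness is reversed), and a candidate is accepted only if every transformed line is literally recorded in the dataset of \(W\). All names are hypothetical.

```python
from itertools import permutations


def key(z):
    """Round a complex point so that it can be looked up in a set."""
    return (round(z.real, 9), round(z.imag, 9))


def edge(p, q):
    """Unordered endpoint pair identifying a line recorded in the dataset."""
    return frozenset((key(p), key(q)))


def similarity(z1, z2, w1, w2, mirrored):
    """Return the similarity f with f(z1) = w1 and f(z2) = w2."""
    conj = (lambda z: z.conjugate()) if mirrored else (lambda z: z)
    a = (w2 - w1) / (conj(z2) - conj(z1))
    b = w1 - a * conj(z1)
    return lambda z: a * conj(z) + b


def restricted_matches(u, w):
    """Enumerate similarities f for which every line of f(u) is recorded in w."""
    dataset = {edge(p, q) for p, q in w}
    points = sorted({z for seg in w for z in seg}, key=key)   # registration points
    z1, z2 = u[0]                                             # two points of u
    found = []
    for w1, w2 in permutations(points, 2):                    # ordered pairs in W
        for mirrored in (False, True):
            f = similarity(z1, z2, w1, w2, mirrored)
            image = frozenset(edge(f(p), f(q)) for p, q in u)
            if image <= dataset:
                found.append((w1, w2, mirrored, image))
    return found


square = [(0j, 1 + 0j), (1 + 0j, 1 + 1j), (1 + 1j, 1j), (1j, 0j)]
corner = [(0j, 1 + 0j), (0j, 1j)]
matches = restricted_matches(corner, square)
print(len(matches), len({m[3] for m in matches}))
```

For the corner of two unit lines and the unit square used here, the search returns eight transformations that produce four distinct images. Because the test is a dataset lookup, a line of \(f(u)\) that lies on a line of \(W\) without sharing its endpoints is never found, which is exactly the restriction described above.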

3.2 Unrestricted Embedding

The calculation of the determinate embedding of shapes whose boundaries are not recorded in the dataset of \(W\) (unrestricted embedding) is more involved and limited progress has been recorded on this front. An example of a query of an unrestricted embedding is shown below in Fig. 4. The query consists of a composite line of three segments showcasing two vertices that are registered in the dataset of the two squares and two that are not.

Fig. 4 An example of an unrestricted embedding of a shape \(f(u)\). The boundary points of the shape \(f(u)\) are not registration points in the shape \(W\). a Shape \(u\); b Shape \(W\); c Registered objects; d Eight matchings of the shapes \(f(u)\) in \(W\)
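The difference from a dataset lookup can be sketched for straight lines as a geometric part test. The code below assumes integer coordinates and assumes that \(W\) is already given in maximal representation; the function names are hypothetical. A line of \(f(u)\) is accepted when both of its endpoints lie on one maximal line of \(W\), whether or not those endpoints are recorded anywhere.

```python
def cross(o, p, q):
    """Twice the signed area of the triangle o, p, q (zero when collinear)."""
    return (p[0] - o[0]) * (q[1] - o[1]) - (p[1] - o[1]) * (q[0] - o[0])


def on_segment(p, a, b):
    """True if point p lies on the closed segment from a to b."""
    return (cross(a, b, p) == 0
            and min(a[0], b[0]) <= p[0] <= max(a[0], b[0])
            and min(a[1], b[1]) <= p[1] <= max(a[1], b[1]))


def segment_is_part(seg, maximal_line):
    """A segment is part of a maximal line when both of its endpoints lie on it."""
    a, b = maximal_line
    return on_segment(seg[0], a, b) and on_segment(seg[1], a, b)


def is_part(shape, maximal_w):
    """f(u) <= W, assuming maximal_w is the maximal representation of W."""
    return all(any(segment_is_part(s, m) for m in maximal_w) for s in shape)


square = [((0, 0), (4, 0)), ((4, 0), (4, 4)), ((4, 4), (0, 4)), ((0, 4), (0, 0))]
query = [((1, 0), (3, 0))]          # endpoints are not vertices of the square

print(query[0] in square)           # the dataset lookup fails
print(is_part(query, square))       # the part relation holds
```

The test only decides whether a given \(f(u)\) is part of \(W\). Finding the candidate transformations when the boundaries of \(f(u)\) are not registered is the harder problem on which limited progress has been recorded.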

3.3 Indeterminate Embedding

The calculation of the indeterminate embedding of shapes whose boundaries are not recorded in the dataset of \(W\) can become involved too [5,6,7,8]. For such cases of shape matching, the system should be able to detect the indeterminate condition and offer processes to resolve the infinite possible matches. An example of an indeterminate query is shown in Fig. 5. The LHS shape “k” can be matched in infinite ways under a similarity transformation.

Fig. 5 An example of an indeterminate embedding of a shape \(f(u)\). a Shape \(u\); b Shape \(W\); c Registered objects; d Eight infinite families of similarity transformations \(f\) embedding the shapes \(f(u)\) in \(W\)

3.4 Counting Non-equivalent Embeddings

Finally, the counting of all non-equivalent embeddings of the LHS and the RHS of the shape rule completes the requirements for the calculations of the embedding operation and its interface with the transformations under which a shape rule applies. An example of a calculation of non-equivalent matchings of the LHS and the RHS of the shape rule is shown in Fig. 6. In this example, the eight matches of the LHS shape are reduced to four non-equivalent matches, which are expanded again to eight matches of the RHS shape.

Fig. 6 An example of counting the non-equivalent embeddings and replacements of a shape \(f(u)\) by the shape \(f(v)\). a Shape \(u\); b Shape \(v\); c Shape \(W\); d Registered objects; e Shape rule \(u\to v\); f Four embeddings of the LHS in \(W\); g Eight embeddings of the RHS in \(W\)
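One way to picture the reduction is to treat two transformations as equivalent whenever they produce the same image of the shape. The sketch below makes that assumption and uses the eight isometries that fix the origin as candidate transformations; the shapes and the names `SYMMETRIES`, `image` and `non_equivalent` are hypothetical and do not reproduce the shapes of Fig. 6.

```python
# The eight isometries fixing the origin: four rotations and four reflections.
SYMMETRIES = [
    lambda x, y: (x, y), lambda x, y: (-y, x),
    lambda x, y: (-x, -y), lambda x, y: (y, -x),
    lambda x, y: (x, -y), lambda x, y: (-x, y),
    lambda x, y: (y, x), lambda x, y: (-y, -x),
]


def image(shape, f):
    """The transformed shape as a set of unordered endpoint pairs."""
    return frozenset(frozenset((f(*p), f(*q))) for p, q in shape)


def non_equivalent(shape, transformations):
    """Keep one transformation for each distinct image of the shape."""
    seen = {}
    for f in transformations:
        seen.setdefault(image(shape, f), f)
    return list(seen.values())


symmetric_corner = [((0, 0), (1, 0)), ((0, 0), (0, 1))]    # one mirror symmetry
asymmetric_corner = [((0, 0), (2, 0)), ((0, 0), (0, 1))]   # no symmetry

print(len(non_equivalent(symmetric_corner, SYMMETRIES)))
print(len(non_equivalent(asymmetric_corner, SYMMETRIES)))
```

A left-hand side with one mirror symmetry keeps four of the eight transformations, while a right-hand side without symmetry keeps all eight. This mirrors the pattern of four LHS matches expanding to eight RHS matches, since the count has to be made separately for each side of the rule.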

4 Three Systems

A sketch of the increasing complexity underlying the implementation of the modules required for the calculation of shape recognition and replacement, along with the modules required for the calculation of the various implementations of the mathematical concept of the part relation between two shapes, is given in Table 2. Significantly, the arrangement of these modules in three successive sets provides a common framework of rule-based computation that foregrounds the similarities and differences between generative geometric modelers, set grammar interpreters and shape grammar interpreters, respectively.

Table 2 Requirements of a shape grammar interpreter

The two modules given in the first part of Table 2, the rule editor and the rule compiler, provide the underlying functionality for the implementation of generative systems (rule-based systems) irrespective of the actual symbols, strings, shapes and so on, involved in the computation [17]. A symbolic rule editor and a symbolic rule compiler can be extended to a shape rule editor and a shape rule compiler by implementing the five modules in the second part of Table 2.

The five modules given in the second part of Table 2 are commonly found in most geometric modelers (CAD systems). These modules include geometric modeling functions to allow instantiation of shapes, modification of shapes and database query. The interpreters based on the integration of these five modules with the computational framework of the rule editor and the rule compiler provide rule-based systems for symbolic shapes and are typically classified as generative geometric modelers, see for example, Cellular Automata [18], L-system [19], CityEngine [20], and several more.

The five modules in the third part of Table 2 outline the fundamentals of an advanced shape query system that makes a symbolic generative modeler a shape grammar system. The major area of this part of the table, and the least populated region of the whole table, focuses on the implementation of the mathematical concept of the part operation (\(\le \)) for shapes in the shape grammar interpreter, including the three subcategories of matching and the four transformations under which the matching is enabled. Note that these modules will be different for different types of shapes because the implementation of the maximal representation of various types of geometries (lines, arcs, etc.), their embedding conditions, the definitions of addition and subtraction in terms of their parts, and even the transformations required for different dimensions, often require radically different solutions. Clearly, a general account of the state of the art of shape grammar interpreters requires different accounts of the implementation of distinct types of shapes, for example, lines, conics, Bezier curves, NURBS, and so forth in the algebra \({U}_{12}\), and other types of shape in different algebras too. A brief discussion of the current state of general-purpose shape grammar implementations of these five modules for shapes made up of lines in the algebra \({U}_{12}\) is given below.

4.1 Case Studies of Shape Grammar Interpreters of Lines in \({U}_{12}\)

The maximal representation of shapes consisting of lines has been successfully implemented in SGI by implementing the line addition operation (Boolean union for lines), and most of the interpreters have followed this method; see, for example, SGS [15], GEdit [13, 14], ShaDe [21] and Shape Machine [22]. Some interpreters, such as GRAPE [23] and SortAl GI [24], have adopted a graph representation of maximal lines and use hypergraphs [25] to simulate the procedure. Still others, such as Curve-based SGI [26] and QI [16], use algorithms to achieve the maximal representation for Bezier curves, and in doing so they solve the problem of maximal representation for lines because straight lines can be viewed as degree-one Bezier curves.

The restricted embedding of shapes consisting of lines has been successfully implemented by adopting the \(3\times 3\) matrix algorithm in SGI, in particular for Euclidean and affine transformations. SGI uses the \(3\times 3\) matrix method to derive the possible transformations that can make \(f(u)\le W\) true with two given registration points, and the \(3\times 3\) matrix algorithm is adopted by most of the interpreters, such as SGS, GEdit, Curve-based SGI, QI, SGIRF [27], ShaDe and Shape Machine. Significantly, two registration points, that is, points registered in the database of the design, provide enough information to calculate transformations up to the Euclidean transformations, and SGS, Curve-based SGI, Shape Machine and other interpreters have adopted three registration points to extend the calculation to affine transformations. Still, three points are not enough to calculate the complete range of all linear transformations: the new transformations that are added to the list, the one-point and two-point perspectivities, require four distinguishable points and a \(9\times 9\) matrix. Despite the seemingly straightforward extension of the approach to this new domain, the \(9\times 9\) matrix carries a heavy computational load and suffers from severe precision or rounding errors. Some interpreters, such as SortAl GI and GRAPE, rely on the data description of shapes to simulate the transformations. Shape Machine uses a non-numerical algorithm to derive the linear transformation and successfully avoids these issues. Significantly, the interpreters that provide maximal representation of shapes and restricted embedding are classified as set grammar interpreters [3, 17, 28], following the theoretical distinction between set grammars and shape grammars [29].

The unrestricted embedding foregrounds the main difference between the set grammar interpreters and the shape grammar interpreters, as the productions of the former (and of the generative geometric modelers at large) are symbolic and thus indifferent to the richness of shape recognition and the open-ended calculations with shape rules. The unrestricted embedding of shapes consisting of lines has been partially implemented in SGI, GEdit and SGIRF. SGI provides a partial foundation for the unrestricted embedding. GEdit and SGIRF include projection intersections as registration points to achieve a partial unrestricted embedding, but only for a limited range of shapes. Shape Machine appears to succeed on this front, integrating and extending the existing algorithms of SGI and GEdit to offer a general solution of this type of matching for lines.

The indeterminate embedding of shapes consisting of lines covers the cases in which the matching results remain indeterminate until users provide more information to consolidate them. The indeterminate embedding of shapes consisting of lines has been partially implemented in GRAPE and SortAl GI by specifying floating endpoints, but only for a specific class of shapes, including the K-shape. Shape Machine appears to be the only interpreter able to deal with the indeterminacy of rules by detecting the special cases and offering a structure for users to pass the parameters to the system.

The enumeration of non-equivalent matchings of shapes consisting of lines for all three types of embedding has been implemented in various ways, using diverse approaches pertinent to the representation of the shape and the type of embedding. GEdit uses diagonal vectors to manage the matching results and remove the visually equivalent results. GRAPE eliminates the equivalent results by checking the symmetry of the graph representations. SortAl GI uses a predefined description of shapes to prevent the system from counting equivalent matches. Curve-based SGI, QI, ShaDe and Shape Machine remove the equivalent matches by checking the pictorial equivalency [5] between the matches.

5 Discussion

The review of the shape grammar interpreters through the lens provided in this work showed the profound complexities that are involved in the implementation of the part operator “\(\le \)” (embedding) and the different ways in which this operator can be implemented. Unrestricted and indeterminate embedding are two of the hallmarks of the shape grammar formalism, and the only general-purpose shape grammar interpreter that appears to tackle this problem successfully for shapes consisting of lines is the Shape Machine. Shapes consisting of lines, arcs and their combinations, the subject matter of Euclidean geometry and an expressive space for design, also appear to have been successfully tackled in Shape Machine so far for all Euclidean transformations, along with some promising work on conics in affine and projective geometries [30].

Future development of shape grammar interpreters will be highly related to the types of geometry they will support. The implementation of the corresponding maximal representations and Boolean operations for different kinds of geometries requires the descriptors of the corresponding geometries as their underlying structures. Still, finding the geometry descriptors can be a difficult task because there are many different types of shapes: lines, arcs, conics, Bezier curves, and so on. Defining the range of geometry types that are commonly used in a design process might help reduce the complexity of the implementation. Another possibility is to use rational elements to approximate complex geometries. For instance, a NURBS curve might be hard to model with a descriptor, but it can be approximately decomposed into a composition of arcs. By adopting this concept, a complex geometry might be decomposed into a composition of rational elements such as lines, arcs, conic elements, planes, spheres, ellipsoids and so forth. Along with the increased complexity of the geometries, management of performance will be one of the main tasks in the future. A major bottleneck of performance will surely be related to the calculations involved in the various procedures of embedding outlined above. The complexity of embedding follows the number of registration points of the current design \(W\). The \(3\times 3\) matrix method allows the system to use two registration points to determine a Euclidean transformation; thus, the complexity of the embedding under a Euclidean transformation is \(O({n}^{2})\), where \(n\) is the number of registration points of \(W\). For an affine transformation, the complexity increases to \(O({n}^{3})\) because three registration points are required. For a linear transformation, which requires four registration points, the complexity increases to \(O({n}^{4})\).
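The growth of the search space can be illustrated with a short count. The sketch assumes an exhaustive search in which a fixed tuple of points of the left-hand side is mapped onto every ordered selection of \(k\) registration points of \(W\), with \(k\) equal to 2, 3 or 4 as stated above. The dictionary `POINTS_REQUIRED` and the function `candidate_count` are hypothetical names.

```python
from math import perm

# Number of registration points needed to fix a transformation of each type,
# following the counts given in the text.
POINTS_REQUIRED = {"euclidean": 2, "affine": 3, "linear": 4}


def candidate_count(n, kind):
    """Ordered selections of registration points to test in a design with n points."""
    return perm(n, POINTS_REQUIRED[kind])


for kind in POINTS_REQUIRED:
    print(kind, candidate_count(10, kind), candidate_count(100, kind))
```

With ten registration points the three counts are 90, 720 and 5040. The count \(n(n-1)\cdots (n-k+1)\) grows as \(O({n}^{k})\), which is the source of the exponents quoted above.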

A second trajectory for the future development of shape grammar interpreters will be highly related to parametric shape representations. The interpreters reviewed in this paper are mostly based on transformational geometry because geometric transformations provide a precise recognition match. Because a shape grammar interpreter is a production system, the accuracy of subshape matching is important: the system has to guarantee that the computational tasks can be precisely executed. Thus, parametric shape grammar interpreters will require a certain level of accuracy to make sure that the parametric computation process is precise too. A graph representation of shape cannot guarantee this accuracy in advance because it is too abstract to guarantee the uniqueness of shapes. For example, the k-shape in Fig. 7 cannot be matched through a linear transformation: the connection of its boundaries makes a concave quadrilateral, and there is no geometric transformation that can match a convex quadrilateral to a concave quadrilateral. The graph representation could not help either, because it would not be able to distinguish between a k-shape and a ψ-shape. As in shape grammar interpreters, a unique representation of a parametric shape is required to implement a parametric shape grammar interpreter [31].

Fig. 7 Parametric deformation of the k-shape

The expansion of the range of geometry descriptors for various types of shapes in different dimensions and the unique representation of parametric shape are two of the possible directions for the implementation of shape grammar interpreters. Additional directions pertaining to the design of interfaces for these systems and their seamless integration with current and future modes of practice provide a bright future for their development.