Introduction

Computational methods play a crucial role in synthetic biology, providing tools that improve the design, analysis, and construction of synthetic biological systems, and of multicellular synthetic systems in particular [1, 2]. These systems implement intricate functions by distributing genetic constructs among different cells [3]. This approach exploits intra- and intercellular interactions within the cell population and distributes the metabolic burden, amplifying system responsiveness. However, the complexity of these synthetic designs leads to intricate interactions with the host organism, which diminish predictability and controllability [4, 5]. In this context, synthetic morphogenesis applications pose unique challenges, as they strive to govern cellular self-organization, which relies heavily on the spatial relationships and interactions among cells [6].

Computational methods have a key role in the analysis [7, 8], modeling [9, 10], design [1] and optimization [11,12,13] of complex biological processes. In particular, for analyzing and predicting the dynamics of multicellular synthetic systems, computational tools must offer instruments for modeling and simulation, accounting for multiple spatial and temporal scales [14]. Computational modeling languages serve as powerful tools in this domain. They must expressively represent the target systems while integrating knowledge from diverse sources [15], thus enhancing our understanding of the Design-Build-Test-Learn (DBTL) cycle [16]. Furthermore, to facilitate interdisciplinary collaboration, these languages must support collaborative development, reproducibility, and knowledge sharing [17, 18].

In computational biology, Domain-Specific Languages (DSLs) serve specific applications (see Sect. “Related work”). For instance, the Systems Biology Markup Language (SBML) specializes in biochemical networks [19], NeuroML focuses on the structure and function of neural systems [20], and the Simulation Experiment Description Markup Language (SED-ML) handles procedures for running computational simulations [21]. While these languages excel within their applications, their limited scope and interoperability [22] hamper integration into the multi-level models needed for multicellular synthetic biology. The Infobiotics Language (IBL) addresses interoperability by consolidating modeling, verification, and compilation into a single file, streamlining in silico synthetic biology processes and ensuring compatibility with the Synthetic Biology Open Language (SBOL) and SBML frameworks [22]. However, IBL lacks support for describing multicellular synthetic designs and expressing spatial aspects crucial for synthetic morphogenesis applications [6].

Models of multicellular, spatial biological systems can utilize low-level modeling formalisms [23,24,25] or multi-level hybrid models that combine different formalisms across multiple scales [14]. Unfortunately, these powerful tools are primarily accessible to expert users, limiting their availability to experimental synthetic biologists.

This paper introduces the Biology System Description Language (BiSDL), a computational language for spatial, multicellular synthetic designs that can be directly compiled into simulatable, low-level models to explore system behavior. BiSDL aims to balance simplicity and intuitive usage for broad accessibility, while its expressive power enables the description of biological complexity in multicellular synthetic systems. Building on preliminary work [26], BiSDL supports flexible abstraction, allowing non-experts to reuse high-level descriptions and experts to manipulate or create low-level models. Additionally, BiSDL supports modularity and composition, facilitating the creation and usage of libraries for knowledge exchange, integration, and reuse in the multicellular synthetic biology DBTL cycle. In this work, the low-level models generated from BiSDL descriptions are based on the Nets-Within-Nets (NWN) formalism [27], chosen for its capability in multi-level and spatial modeling of complex biological processes [25, 28]. Nevertheless, the language is general enough to be integrated with other low-level formalisms.

BiSDL closely mirrors the natural language used within the biological domain. The compiler manages the gap between this high-level biological semantics and the low-level NWN formalism syntax, reducing the need for advanced modeling skills and knowledge of the low-level formalism. While its current implementation requires basic programming and modeling skills, the language could be integrated with a dedicated Graphical User Interface (GUI), which would considerably broaden the user base. BiSDL aims to simplify data exchange in bioinformatics, offering a high-level approach that abstracts away the complexity found in standards like SBOL [29, 30] and SBML [31]. Unlike tools such as pySBOL [32] and libSBML [33], which still rely on complex XML-like syntax, BiSDL is versatile enough to be translated into other formalisms, such as Petri Nets (PN).

The paper is organized as follows. Sect. “Related work” summarizes existing computational languages for synthetic biology. Sect. “Methods” details the design, syntax, and semantics of BiSDL and its compilation into low-level models using the NWN formalism. Sect. “Results and discussion” showcases BiSDL capabilities through three case studies on multicellular synthetic systems. Finally, Sect. “Conclusions” summarizes the contributions, highlights open challenges, and outlines future developments.

Related work

The scientific landscape of model description languages for systems and synthetic biology is rich and complex.

Figure 1 organizes them by expressivity of biological semantics and generality in modeling different biological levels or domains. Each language links to the biological levels it targets (either molecular pathways, cells, multicellular systems, or a combination thereof) and the level of flexibility the language has in generalizing to different biological domains or mechanisms (see Legend on the right).

Fig. 1. Comparison of Model Description Languages in Systems Biology over expressivity (broadness and depth of described models) and generality (broadness of modeling target, scope, and domain), providing further details on the biological levels covered and the modeling flexibility supported.

The COmputational Modelling in BIology NEtwork (COMBINE) initiative coordinates the development of interoperable and non-overlapping standard languages covering different aspects of biological systems [34,35,36]. COMBINE DSLs provide intermediary layers between the user and low-level modeling formalisms. They rely on XML for model description and compile into Ordinary Differential Equations (ODE) models, making this mathematical modeling formalism accessible to non-expert users. Some of the COMBINE DSLs specialize in intracellular pathways, such as BioPAX [37], and processes, such as Systems Biology Graphical Notation (SBGN) [38], SBML [31] and CellML [39]. SED-ML [21, 40] exclusively aims at managing simulations of system behavior. NeuroML [20] tackles different biological aspects simultaneously, including spatiality, and supports simulation management, yet specializes in neuronal systems only. Among COMBINE standards, SBOL [29, 30] targets in silico synthetic genetic designs, yet is limited to genetic circuits and does not cover any other biological aspect. Existing standards such as SBOL [29, 30], CellML [39], SED-ML [21, 40], and SBML [31] offer valuable frameworks for data exchange, albeit with limitations such as verbosity and complexity. While tools like pySBOL [32] and libSBML [33] provide programmatic access to these standards, they still require users to navigate XML-like syntax. BiSDL allows users to focus solely on high-level concepts, abstracting from the implementation details. The proposed compiler, which translates BiSDL into PN, showcases the versatility of the language and demonstrates the potential for translation into other languages and formalisms.

Besides COMBINE standards, several computational languages for systems and synthetic biology exist [41]. Antimony is a text-based definition language that directly converts to the SBML standard and is employed in Tellurium, a modeling environment for systems and synthetic biology [42]. The Cell Programming Language (gro) [43] is a language for simulating colony growth and cell-cell communication in synthetic microbial consortia. It handles the spatiality and mobility of bacterial cells, internal genetic regulations, and mutual communications. Eugene [44] specifies synthetic biological parts, devices, and systems, mainly focusing on genetic constructs and their expression. Genetic Engineering of Cells (GEC) [45] centers on logical interactions between proteins and genes. GEC programs can be compiled into sequences of standard biological parts for laboratory applications. Genomic Unified Behavior Specification (GUBS) [46] focuses on the cell’s behavior as the central entity with a rule-based, declarative, and discrete modeling formalism. gro and GUBS can model the interaction between cells. gro also supports the representation of the spatial organization in a system, but it is limited to bacterial cells. Moreover, both languages require programming skills, so neither is easily accessible to non-expert users. Eugene and GEC focus only on genetic circuits or simple molecular interaction networks for representing and exchanging reusable genetic designs through functional modules, such as Standard Parts [47]. Even when combined in more complex structures, such modules only partly capture the complexity and hierarchy of interdependent regulations and the role of spatiality in biological multicellular designs. IBL [22] is a DSL for synthetic biology that manages several computational aspects in a single specification, overcoming interoperability issues and ensuring seamless compatibility with SBOL and SBML frameworks. Yet, it currently supports neither multicellular synthetic systems nor spatial aspects.

To overcome the limitations of existing solutions in biological scope, expressivity of multicellular and spatial aspects, and accessibility, BiSDL provides high-level descriptions of intra- and inter-cellular mechanisms over spatial grids, and their direct compilation into low-level simulatable models for the exploration of system behavior.

Methods

To help synthetic biologists create models of multicellular synthetic systems, BiSDL is designed for users with varying computational skills, making biological knowledge readable and writable. BiSDL stands at an abstraction level parallel to the biological concepts used in experimental science, bridging the gap between these concepts and the more intricate low-level models. BiSDL descriptions combine user-friendly biological semantics with the capacity to capture system complexity. Its development is centered on domain-specific terminology and the ability to compile into NWN simulation models, discussed in Sect. 3.4. The language syntax covers process hierarchies, spatial relations, and cellular interactions at the intercellular and intracellular levels. BiSDL itself supports the description of the system, while BiSDL models can be compiled into complex NWN models for the simulation and analysis of system behavior.

Biological perspectives and levels of abstraction

The BiSDL syntax supports describing spatial and multi-level biological concepts through multiple domains and abstraction levels, as illustrated in the Y Chart reported in Fig. 2.

Fig. 2. A scheme of BiSDL domains and abstraction levels inspired by the VHDL Y-Chart [48]. The upper panel (BiSDL) shows the (A) Structural, (B) Behavioral, and (C) Spatial domains and the corresponding high abstraction levels (II, III, and IV) in BiSDL descriptions. The lower panel (NWN) illustrates the three domains at the low abstraction level relative to the Nets-Within-Nets formalism (I).

Inspired by the Very High-Speed Integrated Circuit Hardware Description Language (VHDL) [48], the BiSDL Y Chart adapts the three description domains defined for VHDL (i.e., Behavioral, Structural, and Physical) to the biological semantics.

The Structural domain, illustrated in Fig. 2 (A - STRUCTURAL, top right), covers the architecture of biological structures such as transcriptional machinery, protein complexes, or synthetic genetic constructs within a host. The Behavioral domain (Fig. 2, B - BEHAVIORAL, top left) describes the dynamic functioning, interactions, and transformations of biological elements, encompassing processes like gene transcription, diffusion, and protein degradation. Lastly, the Spatial domain (Fig. 2, C - SPATIAL, center left) outlines the spatial substrate influencing interactions among the elements composing the system (e.g., the spatial organization of a group of cells).

BiSDL describes each domain at four different abstraction levels, spanning from general biological concepts (high abstraction) to the elements of the NWN modeling formalism (low abstraction).

Level IV (Fig. 2, circle IV) describes a Biosystem comprising multiple composite motifs within the structural domain. This corresponds, for instance, to a Bioprocess made up of multiple bioprocesses in the behavioral domain and a Biocompartment defined by multiple spatial grids in the spatial domain. Level III (Fig. 2, circle III) elucidates Composite motifs, i.e., combinations of building blocks representing complex biological structures in the structural domain. These correspond to Subprocesses that emerge from interlaced base functions in the behavioral domain, necessitating Spatial grids in the spatial domain to model the underlying spatial relations. Level II (Fig. 2, circle II) defines Building blocks, capturing fundamental biological concepts in the structural domain. These correspond to Base functions in the behavioral domain and simple Local relations in the spatial domain. Level I (Fig. 2, circle I) describes NWN formalism elements combined in low-level models of the system.

When describing composite systems (e.g., a biological tissue), BiSDL covers all domains: structural, spatial, and behavioral (as shown in Fig. 2.A-C). These descriptions can include cells, extracellular structures, spatial arrangements, and the processes involved, requiring elements from each of the BiSDL domains. However, simpler descriptions may focus on a single domain, such as the transcription process of a gene concentrating on the behavioral domain. Simple MODULE definitions can be combined to form more complex descriptions. The syntax and semantics of a set of composable BiSDL descriptions are detailed in Additional file 1 BiSDL Modules Library—Section 1 as an example of a BiSDL library.

Syntax and semantics

All BiSDL descriptions respect the template shown in Algorithm 1 based on a hierarchy of MODULE, SCOPE and PROCESS constructs. They start with naming the MODULE (line 1). A MODULE encapsulates the complete description of a biological system, encompassing structural, behavioral, and spatial aspects. This includes detailing groups of cells, their spatial arrangement on a two-dimensional grid, intracellular processes, and the spatial diffusion mechanisms facilitating intercellular communications. Modules are self-contained and serve as the fundamental units for reusing and composing existing descriptions.

Each MODULE consists of a set of SCOPE declarations with defined identifiers and spatial coordinates (lines 3–12 and 13), which describe the relevant biological compartments within the modeled system, and a set of DIFFUSION mechanisms (lines 14–15), which model the diffusion of signals among them. The SCOPE declarations may incorporate additional communication methods, such as PARACRINE_SIGNAL (line 10) and JUXTACRINE_SIGNAL (lines 11–12), describing intercellular communication either through diffusible signals (paracrine) or direct contact (juxtacrine). Integer timescales can represent any rational ratio between the paces of different models in the provided discrete-time simulator. The TIMESCALE of a module (line 2) sets the base pace of the system dynamics relative to the unitary step of the discrete-time simulator, whose own TIMESCALE is 1. For instance, a model with a TIMESCALE of N evolves by one step every N simulator steps, i.e., it is slower than the base time step by a factor of N.

Each SCOPE contains a set of biological PROCESS instantiations with explicit identifiers (lines 4–8 and 9). They comprise base functions like transcription, translation, and degradation.

The TIMESCALE of a process (line 6) is an integer multiplier of the MODULE timescale, determining the relative speed at which the process occurs compared to the base module pace. Processes with different timescales proceed at a relative speed given by the ratio of their respective timescales. For instance, if TIMESCALE is 2 for PROCESS p1 and 5 for PROCESS p2, p1 advances once every 2 module steps and p2 once every 5, so their relative speed is 5/2 (i.e., p1 evolves 2.5 times faster than p2). Different PROCESS instances can connect over the same elements: for example, one process might produce a molecule that regulates a base function in another process. BiSDL emphasizes ease of description: each SCOPE can reuse a PROCESS from another SCOPE simply by declaring a PROCESS with the same <process_id>.
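The timescale arithmetic can be checked with a short Python sketch. It assumes, as a simplification, that a process advances exactly once every (MODULE TIMESCALE × PROCESS TIMESCALE) simulator steps; the function names are illustrative and are not part of BiSDL or of its simulator.

```python
from fractions import Fraction


def effective_period(module_timescale: int, process_timescale: int) -> int:
    """Simulator steps between two consecutive updates of a process.

    Assumption: the process timescale multiplies the module timescale.
    """
    return module_timescale * process_timescale


def updates_in(n_steps: int, period: int) -> int:
    """Number of updates performed in n_steps by a process with this period."""
    return n_steps // period


module_ts = 1                             # base pace of the MODULE
p1 = effective_period(module_ts, 2)       # PROCESS p1, TIMESCALE 2
p2 = effective_period(module_ts, 5)       # PROCESS p2, TIMESCALE 5

relative_speed = Fraction(p2, p1)         # how much faster p1 is than p2
print(relative_speed, float(relative_speed))        # 5/2 2.5
print(updates_in(100, p1), updates_in(100, p2))     # 50 20
```

Over 100 simulator steps, p1 updates 50 times and p2 updates 20 times, which reproduces the 5/2 ratio stated in the text.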

Algorithm 1. BiSDL general template. Each MODULE organizes around a set of SCOPE definitions. Each SCOPE contains a set of PROCESS instances describing the behavior of entities in the MODULE and a set of SIGNAL declarations describing communication mechanisms among entities. DIFFUSION mechanisms support communication among SCOPE constructs.

As a simple example, Algorithm 2 provides the BiSDL description of the chemical reaction by which two \(H_{2}\) molecules react with one \(O_{2}\) molecule to form two \(H_{2}O\) molecules.

Algorithm 2. BiSDL description of the chemical reaction by which two \(H_{2}\) molecules react with one \(O_{2}\) molecule to form two \(H_{2}O\) molecules.

The MODULE, whose base TIMESCALE is 1, consists of a biological compartment at coordinates (0, 0) within the spatial grid (SCOPE s, line 3). The SCOPE contains a single PROCESS named reaction, whose TIMESCALE multiplier is 1. Here, the entities describing molecular hydrogen (H2_molecule) and oxygen (O2_molecule) transform into water (2*H2O_molecule, line 6). The multipliers for H2_molecule and H2O_molecule specify the proportions in which the molecules combine; when no multiplier is indicated, as for O2_molecule, a unitary value is implied.
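The implicit unitary multiplier can be illustrated with a small parsing sketch in Python. The helper parse_term is hypothetical and is not the BiSDL compiler; it only assumes the textual form multiplier*entity that appears in the examples of this paper.

```python
def parse_term(term: str) -> tuple[int, str]:
    """Split a term such as '2*H2O_molecule' into (multiplier, entity).

    When no multiplier is written, a unitary value is assumed.
    """
    if "*" in term:
        multiplier, name = term.split("*", 1)
        return int(multiplier.strip()), name.strip()
    return 1, term.strip()


# Reactants and products of 2 H2 + O2 -> 2 H2O, written as entity terms.
reactants = [parse_term(t) for t in ["2*H2_molecule", "O2_molecule"]]
products = [parse_term(t) for t in ["2*H2O_molecule"]]

print(reactants)  # [(2, 'H2_molecule'), (1, 'O2_molecule')]
print(products)   # [(2, 'H2O_molecule')]
```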

BiSDL constructs

This work proposes a library of BiSDL constructs (see Additional file 1, Section 1) to exemplify the semantic capabilities of the language, showing its expressiveness and closeness to biological semantics. To foster standardization in model description languages, all proposed BiSDL constructs follow the Systems Biology Ontology (SBO) [49] and fall into the following four subcategories (of the seven provided by the standard):

  • Physical entity representation:

    • Material entities identify the functional entities (i.e., SCOPE, CELL, and the base types GENE, MRNA, PROTEIN, COMPLEX, MOLECULE);

    • Functional entities identify the function they perform (i.e., PARACRINE_SIGNAL, JUXTACRINE_SIGNAL, and DIFFUSION).

  • Participant role:

    • identifies the role played by an entity in a modeled process (i.e., INDUCERS, INHIBITORS, ACTIVATORS);

  • Occurring entity representation:

    • identifies processual relationships involving physical entities (i.e., TRANSCRIPTION, TRANSLATION, DEGRADATION, PROTEIN_COMPLEX_FORMATION, ENZYMATIC_REACTION, CUSTOM_PROCESS);

  • System description parameter:

    • provides quantitative descriptions of biological processes (i.e., TIMESCALEs and the multipliers of physical entities).

From BiSDL descriptions to NWN models

BiSDL supports the analysis of system dynamics through simulation. To this end, BiSDL descriptions are compiled into low-level models based on the NWN formalism.

Compilation of NWN models

NWN extend the PN formalism to support hierarchy, encapsulation, and selective communication [9, 24, 27], which makes them suitable to model complex biological processes [25, 28], consistent with the design goals of BiSDL (Sect. 3.1). PN are bipartite graphs where nodes can be either places or transitions. Places represent the states that the modeled resources can assume. Transitions model the creation, consumption, or transformation of resources in, from, or across places. Tokens model discrete units of resources in different states. Each transition has rules regulating its enabling and activation, depending on the availability of tokens in its input places. Directed arcs link places and transitions to form the desired network architectures. When a transition fires, it consumes the required tokens from its input places and creates tokens in its output places.
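The enabling and firing rules can be sketched in a few lines of plain Python. This is a minimal illustration with indistinguishable tokens and weighted arcs, not the nwn-snakes implementation; the place names reuse the water-formation reaction of Algorithm 2, and the helper names are illustrative.

```python
from collections import Counter


def enabled(marking, inputs):
    """A transition is enabled when every input place holds enough tokens."""
    return all(marking[place] >= weight for place, weight in inputs.items())


def fire(marking, inputs, outputs):
    """Consume tokens from the input places and create tokens in the output places."""
    if not enabled(marking, inputs):
        raise ValueError("transition is not enabled")
    new_marking = Counter(marking)          # leave the original marking untouched
    for place, weight in inputs.items():
        new_marking[place] -= weight
    for place, weight in outputs.items():
        new_marking[place] += weight
    return new_marking


# One transition modeling 2 H2 + O2 -> 2 H2O.
marking = Counter({"H2": 4, "O2": 1, "H2O": 0})
inputs = {"H2": 2, "O2": 1}
outputs = {"H2O": 2}

after = fire(marking, inputs, outputs)
print(dict(after))             # {'H2': 2, 'O2': 0, 'H2O': 2}
print(enabled(after, inputs))  # False: no O2 token is left
```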

Fig. 3. A representation of the NWN formalism. Tokens in these PN can be instances of PN, thus implementing a hierarchy of encapsulated levels. Channels can interlock two transitions from different nets, allowing the exchange of tokens and information.

The NWN formalism is a high-level PN formalism supporting the features of other high-level PN: tokens of different types, and timed and stochastic delays associated with transitions. NWN introduce an additional type of token named Net Token, i.e., a token that embeds another instance of a PN. With this type of token, NWN support a hierarchical organization in which each layer relies on the same formalism (see Fig. 3). This characteristic introduces the Object-Oriented Programming (OOP) paradigm within the PN formalism. Therefore, NWN models express encapsulation and selective communication, which makes it easy to represent biological compartmentalization and the semi-permeability of biological membranes. Nets at different levels in the hierarchy evolve independently and optionally communicate through synchronous channels that interlock transitions from different nets, synchronizing their activation upon satisfaction of enabling conditions.
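The idea of a Net Token and of a synchronous channel can be sketched as follows. This is a deliberately reduced illustration in plain Python, not the nwn-snakes API: the Net class and the synchronized_uptake function are hypothetical, and the inner half of the channel is assumed to be always enabled.

```python
from dataclasses import dataclass, field


@dataclass
class Net:
    """A minimal net holding plain tokens and, optionally, nested nets."""
    name: str
    marking: dict = field(default_factory=dict)     # place -> number of plain tokens
    net_tokens: dict = field(default_factory=dict)  # place -> list of nested Net instances


def synchronized_uptake(system: Net, cell: Net, signal: str) -> bool:
    """Move one signal token from the system net into a nested cell net.

    The two halves (consume outside, produce inside) fire together or not at all,
    mimicking two transitions interlocked by a synchronous channel.
    """
    if system.marking.get(signal, 0) < 1:
        return False                                 # outer half not enabled
    system.marking[signal] -= 1
    cell.marking[signal] = cell.marking.get(signal, 0) + 1
    return True


cell = Net("cell", marking={"AHL": 0})
system = Net("system", marking={"AHL": 2}, net_tokens={"cells": [cell]})

results = [synchronized_uptake(system, cell, "AHL") for _ in range(3)]
print(results)                                      # [True, True, False]
print(system.marking["AHL"], cell.marking["AHL"])   # 0 2
```

The cell net is itself a token of the system net, so the membrane-like selectivity is expressed by which channels exist between the two levels.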

NWN have heightened expressivity compared to other modeling formalisms. While Boolean models offer only binary node states, high-level PN can convey intricate information regarding system resources and processes. Similarly, while ODE represent uniform compartments with continuous values, NWN accommodate both discrete and continuous quantities. Unlike many existing approaches that primarily focus on intracellular mechanisms, multi-level NWN allow for the modeling of both intracellular and supra-cellular information, enabling a broader scope of representation and effective analysis of complex biological systems [23]. However, the increased expressivity of NWN comes at the cost of greater model complexity and higher computational demands for simulation algorithms, a common trade-off in computational modeling [50]. The decision to employ NWN for demonstrating BiSDL stems from the desire to highlight its full expressive capacity in generating complex models. BiSDL may also support compilation into other low-level formalisms, generating models at different points along this trade-off.

To support NWN, BiSDL compilation generates models implemented with nwn-snakes, a customized version of the SNAKES library [51]. SNAKES is an efficient Python library for the design and simulation of PN [51]. The nwn-snakes library presented in this work extends SNAKES to handle multi-scale models, ensuring consistency across the hierarchical levels in the model. nwn-snakes provides constructs to express the hierarchy of temporal and spatial scales. Every BiSDL MODULE in the compiled model is represented by a Module class with an individual timescale, which in turn inherits from the PetriNet class implemented in nwn-snakes. A prototype BiSDL compiler (bisdl2snakes.py) generates Python Module classes implementing nwn-snakes models from BiSDL descriptions. Detailed instructions on compiler use are available in the BiSDL GitHub public repository (see Availability of data and materials).

The spatial hierarchy underlying BiSDL descriptions (see Sect. 3.2) is translated into a low-level model based on a system of nested spatial grids represented by PN. In this model, the places model sub-portions of space, allowing the representation of multiple spatial scales. Each place in a spatial grid can host, as a net token, another spatial grid, ensuring semantic consistency across different spatial scales. This work provides consistent semantics for two-level hierarchies, which support the intended modeling of multicellular systems where both intra- and intercellular mechanisms are described. nwn-snakes handles the marking evolution of the two levels synchronously: if the marking evolves on one level, the other level mirrors the same change. Additional file 1 (Section 2) reports how nwn-snakes supports NWN modeling and describes the mapping between BiSDL building blocks and NWN.

BiSDL supports high interpretability of the generated NWN models in two ways. Firstly, the compilation of BiSDL constructs labels the resulting low-level constructs with the high-level specific parameters. For instance, the construct PROTEIN_COMPLEX_FORMATION(3*LuxR_protein, 3*AHL_molecule, 3*LuxR_AHL_complex) generates NWN constructs containing the product name: LuxR_AHL_complex. Secondly, in BiSDL, any construct can be wrapped into a process, and the process is assigned a custom name. This feature supports the direct reuse of processes, and of the constructs wrapped in them, by leveraging the process name. Algorithm 4 exemplifies this mechanism: the PROCESS defined in lines 18–21 encapsulates TRANSCRIPTION, TRANSLATION, and DEGRADATION constructs, and is named CD19_production. The same process is reused (by name) in line 37. Indeed, the SCOPE defined in lines 36–40 (5 lines of code) reuses processes defined only once for the first SCOPE, which spans 35 lines of code (1–35). To avoid ambiguity during compilation, each time the same process is used again in the BiSDL description, a new set of low-level elements is generated and named by appending a progressive number. In the same example from Algorithm 4, the first instance of PROCESS CD19_production (lines 18–21) is assigned the name CD19_production_process_0 in the NWN model. The second instance (line 37) is internally assigned the name CD19_production_process_1, and so on.
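The progressive naming scheme can be sketched as a counter keyed by process name. The ProcessNamer class is hypothetical and does not reproduce the bisdl2snakes.py code; in particular, keeping one counter per process name (rather than a single global counter) is an assumption consistent with the CD19_production example.

```python
from collections import defaultdict


class ProcessNamer:
    """Generate unique low-level names for repeated uses of the same process."""

    def __init__(self):
        self._counts = defaultdict(int)   # process name -> instances seen so far

    def next_name(self, process_id: str) -> str:
        index = self._counts[process_id]
        self._counts[process_id] += 1
        return f"{process_id}_process_{index}"


namer = ProcessNamer()
first = namer.next_name("CD19_production")    # first SCOPE defines the process
second = namer.next_name("CD19_production")   # second SCOPE reuses it by name

print(first)   # CD19_production_process_0
print(second)  # CD19_production_process_1
```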

Simulation of NWN models

The exploration of systems dynamics relies on the simulation of nwn-snakes models compiled from BiSDL descriptions with the nwn-petrisim simulator. The simulator is designed to be simple and easy to use, thus requiring minimal coding as reported in Listing 1.

Listing 1. nwn-petrisim instantiation

The simulator is instantiated at line 2. The argument m=test_module represents the instance of the top-level net to be simulated. The arguments draw_nets=False and mode='exploration' control the generation of visual output, preventing the creation of images of the net architectures and allowing evolution plots to adapt to the generated output value ranges.

The simulation, executed in line 3, is discrete, with nstep=100 determining the number of simulation steps. The simulator analyzes the stochastic evolution of the system. Additionally, it allows simulating the system’s response to external stimuli applied to the model as outlined in Listing 2. The simulation comprises a loop running for the specified number of steps (n_steps). Within this loop, at every n steps, the simulator adjusts the marking of the place that models the stimulus within the network. This adjustment involves adding a random number of black tokens, ranging from 0 to r. Subsequently, this modified marking is applied to the simulated model before proceeding to the next simulation step. The values of n and r control the intensity of the administered stimulus.

Listing 2. nwn-petrisim stimuli administration
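The stimulus loop described in the text can be approximated with the following self-contained Python sketch. It does not use the nwn-petrisim API: the function run_with_stimulus, the dictionary-based marking, and the toy decay_step dynamics are all assumptions made for illustration. Only the logic (every n steps, add between 0 and r tokens to the stimulus place, then advance the model) follows the description.

```python
import random


def run_with_stimulus(marking, step, stimulus_place, n_steps, n, r, seed=None):
    """Advance a model for n_steps, adding 0..r stimulus tokens every n steps."""
    rng = random.Random(seed)
    marking = dict(marking)                   # do not mutate the caller's marking
    history = []
    for t in range(n_steps):
        if t % n == 0:                        # stimulus administration step
            added = rng.randint(0, r)         # inclusive on both ends
            marking[stimulus_place] = marking.get(stimulus_place, 0) + added
        marking = step(marking)               # one simulation step of the model
        history.append(dict(marking))
    return history


def decay_step(marking):
    """Toy dynamics: each step removes one stimulus token, if any is present."""
    updated = dict(marking)
    updated["stimulus"] = max(0, updated.get("stimulus", 0) - 1)
    return updated


initial = {"stimulus": 0}
history = run_with_stimulus(initial, decay_step, "stimulus",
                            n_steps=20, n=5, r=3, seed=0)
print(len(history))
print([h["stimulus"] for h in history])
```

Smaller n and larger r yield a stronger stimulus, matching the role the text assigns to these two values.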

In the stochastic simulation supported by nwn-petrisim, conflicts among transitions competing for tokens are managed by randomizing the order of transition enabling and firing events. All transitions have a user-defined firing probability p, set to 0.6 by default. Furthermore, for each firing event, the set of tokens consumed by the transition is randomly chosen from those available in the input place. These random selections prevent the systematic exclusion of specific transitions from firing.
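The conflict-resolution strategy can be sketched in plain Python as follows. This is not the nwn-petrisim code: stochastic_step and the two competing transitions are hypothetical, and tokens are treated as indistinguishable, so the random choice of which tokens to consume is omitted.

```python
import random


def stochastic_step(marking, transitions, p=0.6, rng=None):
    """Visit transitions in random order; each enabled one fires with probability p."""
    if rng is None:
        rng = random.Random()
    marking = dict(marking)
    order = list(transitions)
    rng.shuffle(order)                        # randomized ordering resolves conflicts
    fired = []
    for name in order:
        inputs, outputs = transitions[name]
        is_enabled = all(marking.get(pl, 0) >= w for pl, w in inputs.items())
        if is_enabled and rng.random() < p:
            for pl, w in inputs.items():
                marking[pl] -= w
            for pl, w in outputs.items():
                marking[pl] = marking.get(pl, 0) + w
            fired.append(name)
    return marking, fired


# Two transitions compete for the single mRNA token.
transitions = {
    "degrade": ({"mRNA": 1}, {}),
    "export": ({"mRNA": 1}, {"mRNA_out": 1}),
}

winners = set()
for seed in range(200):
    _, fired = stochastic_step({"mRNA": 1}, transitions, rng=random.Random(seed))
    winners.update(fired)
print(sorted(winners))
```

Because the visiting order changes from step to step, neither transition is systematically starved, while the enabling check guarantees that the single token is never consumed twice.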

Results and discussion

BiSDL allows synthetic biologists to quickly model and design multicellular synthetic systems and to simulate their behavior. Synthetic biology aimed first to develop essential genetic constructs to control specific intracellular processes, then to combine such essential elements into complex circuits within or across cells [52]. Complex circuits enable a broader range of controllable behaviors, yet have the drawback of metabolic burden and unknown interactions with the host. To address these constraints, synthetic biology has shifted focus towards designs based on multicellular networks [53], where splitting the overall construct across different cells facilitates integration into host cells and limits their metabolic burden. Construct parts interact via intercellular communication, and the desired behavior emerges from the interaction between the different cells. Multicellular synthetic designs must consider complex interactions between the construct and the host cells. The results demonstrate the capability of BiSDL for (1) model description and (2) exploration of system behavior over three case studies of multicellular synthetic designs: a bacterial consortium (see Sect. 4.1), a synthetic morphogen system (see Sect. 4.2), and a conjugative plasmid transfer (see Sect. 4.3).

Case study 1—bacterial consortium

The first case study focuses on implementing gene expression control across different bacterial cells. To achieve this, a synthetic biologist can exploit gene expression regulation across cells operated by the lactose repressor protein (LacI).

Initially, the biologist must identify a reliable source of knowledge regarding synthetic designs that realize the desired behavior. The Registry of Standard Biological Parts holds a collection of predefined genetic constructs with known functionality [54]. These constructs can serve as templates for the DBTL process. Selecting and combining these parts makes it possible to design a bacterial consortium where the overall genetic device enforces LacI-operated gene expression regulation across cells. This consortium comprises two cell types. Controller cells establish baseline 3-oxohexanoyl-homoserine lactone (3OC6HSL) production, inhibited by a reference signal: LacI administration (Fig. 4, top panel A). Conversely, Target cells initiate Green Fluorescent Protein (GFP) reporter signal production only when receiving the acyl-homoserine lactone (AHL) molecular signal, which, in this system, is 3OC6HSL (Fig. 4, bottom panel B).

Various Standard Biological Parts contribute to the design. Part:BBa_C0012 (LacI protein) serves as the reference signal, inhibiting the Lac-repressible promoter. Part:BBa_I13202 (3OC6HSL Sender Controlled by Lac Repressible Promoter), combined with S-adenosylmethionine (SAM) and an acylated acyl carrier protein (ACP) substrate, synthesizes 3OC6HSL [55] in the Controller cell. Part:BBa_E0040 (GFP), together with Part:BBa_T9001 (Producer Controlled by 3OC6HSL Receiver Device), completes the design with the inducible reporter gene expression in the Target cell. The GFP levels serve as the readout signal.

Fig. 4. The multicellular bacterial consortium synthetic design considered for the first case study. This consortium comprises two cell types. (A) Controller cells establish baseline 3OC6HSL production, inhibited by a reference signal (LacI administration); (B) Target cells initiate GFP reporter signal production only when receiving the AHL molecular signal, which, in this system, is 3OC6HSL.

The synthetic biologist who leverages BiSDL to describe the synthetic bacterial consortium should model the synthetic construct split across Controller and Target cells. Moreover, the model must include the biological interactions and mechanisms involved, such as transcriptional processes (gene expression), activation and inhibition of gene expression, protein production and degradation, enzymatic reactions, and inter-cellular signaling. Algorithm 3 provides a BiSDL description of the synthetic bacterial consortium that considers all of these relevant aspects.

Algorithm 3
figure e

BiSDL description of the bacterial consortium.

The illustrated bacterialConsortium MODULE contains two SCOPE statements: one for the Controller cell (producer) (Algorithm 3, lines 3–20); another one for the Target cell (sensor) (Algorithm 3, lines 21–34). Each SCOPE definition includes its name and two-dimensional coordinates on the spatial grid underlying the model (Algorithm 3, lines 3 and 21). Each SCOPE contains a single PROCESS representing the fundamental biological functionality of each cell: AHL_production for the producer and GFP_production for the sensor. The TIMESCALE at the top level (Algorithm 3, line 2) indicates the base pace for the bacterialConsortium. On the other hand, the TIMESCALE of each PROCESS (Algorithm 3, lines 5 and 22) indicates the process slowdown factor related to the base pace: AHL_production evolves at half the base pace and GFP_production evolves at one-third of the base pace. The bacterialConsortium also contains declarations of the DIFFUSION processes that set up bidirectional connections between the two SCOPE constructs (producer and sensor) and the diffusion of AHL_molecule across them (Algorithm 3, lines 33–37).
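The relation between the base pace and the per-process TIMESCALE can be sketched in a few lines of Python. This is only an illustration under a stated assumption: a slowdown factor k is read here as "the process advances once every k base steps". The function name firing_steps and the chosen horizon of 12 base steps are hypothetical and are not part of BiSDL or nwn-snakes.

```python
def firing_steps(base_steps: int, slowdown: int) -> list[int]:
    """Return the base-clock steps at which a process with this slowdown advances.

    Assumption: a slowdown factor k means one process step every k base steps.
    """
    if slowdown < 1:
        raise ValueError("slowdown must be >= 1")
    return [t for t in range(1, base_steps + 1) if t % slowdown == 0]


# Hypothetical factors matching the prose: 2 -> half pace, 3 -> one-third pace.
ahl_steps = firing_steps(12, 2)   # AHL_production
gfp_steps = firing_steps(12, 3)   # GFP_production
print(len(ahl_steps), len(gfp_steps))  # 6 4
```

Over 12 base steps, the half-pace process advances 6 times and the one-third-pace process advances 4 times, which is the ratio described in the text.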

The BiSDL supports a very compact description of the system, using approximately 25% of the lines of code required by the low-level nwn-snakes model: 50 lines of code (see Algorithm 3) versus 203 lines of code in the compiled nwn-snakes Python model file (based on the files in the public GitHub repository, see Availability of data and materials).

To compile the BiSDL description into a nwn-snakes model, the synthetic biologist uses the BiSDL compiler (see Sect. 3.4) to generate a nwn-snakes file that contains all the NWN models required by the BiSDL description. Visualization of the NWN models relies on the GraphViz visualization tool [56], provided by SNAKES as a plugin. For this use case, the NWN description includes a top-level net, where two places contain one net token each, and a bottom-level net where these net tokens lie.

Figure 5 visualizes the top-level NWN model, where the places that contain net tokens correspond to the two BiSDL SCOPE statements. The producer place holds the AHL_production net token (Fig. 6), while the sensor place holds the GFP_production net token (Fig. 7). Several transitions connect the two places, allowing the bidirectional diffusion of AHL_molecule colored tokens across them and the net tokens they contain during simulation, thanks to the nwn-snakes synchronization and communication capabilities (see Sect. 3.4.1).

Fig. 5
figure 5

The top-level bacterial_consortium net architecture. The places that contain net tokens correspond to the two BiSDL SCOPE statements. The producer place holds the AHL_production net token, while the sensor place holds the GFP_production net token. Several transitions connect the two places, allowing the bidirectional diffusion of AHL_molecule colored tokens across them and the net tokens they contain during simulation, thanks to the nwn-snakes synchronization and communication capabilities
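The two-level organization described above can be mimicked with plain Python data structures. The sketch below does not use the nwn-snakes API: each top-level place is reduced to a multiset of colored tokens plus the name of the net token it holds, and the helper diffuse is a hypothetical stand-in for a diffusion transition.

```python
from collections import Counter

# Top-level places named after the two SCOPE statements in the text.
# Each place stores colored tokens and the name of the net token it holds.
top_level = {
    "producer": {"tokens": Counter({"AHL_molecule": 4}), "net_token": "AHL_production"},
    "sensor": {"tokens": Counter(), "net_token": "GFP_production"},
}


def diffuse(net, src, dst, color, amount=1):
    """Fire a diffusion transition: move `amount` tokens of `color` from src to dst.

    Returns False, leaving the marking unchanged, when the transition is not enabled.
    """
    if net[src]["tokens"][color] < amount:
        return False
    net[src]["tokens"][color] -= amount
    net[dst]["tokens"][color] += amount
    return True


# Bidirectional diffusion: three tokens move forward, one moves back.
diffuse(top_level, "producer", "sensor", "AHL_molecule", 3)
diffuse(top_level, "sensor", "producer", "AHL_molecule", 1)
print(top_level["producer"]["tokens"]["AHL_molecule"],
      top_level["sensor"]["tokens"]["AHL_molecule"])  # 2 2
```

The initial marking of four AHL_molecule tokens is arbitrary and chosen only to make both directions of the exchange visible.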

Figures 6 and 7 visualize the producer and sensor net tokens, respectively. In these PNs, places model genes, transcripts, proteins, and molecules, while transitions model processes involving them, such as transcription, translation, degradation, and enzymatic reactions. Black tokens model discrete quantities of resources in each place and are represented by the dot symbol.

Fig. 6
figure 6

The producer bacterial_consortium net token architecture. Places model genes, transcripts, proteins, and molecules, while transitions model processes involving them, such as transcription, translation, degradation, and enzymatic reactions. Black tokens model discrete quantities of resources in each place and are represented by the dot symbol

Fig. 7
figure 7

The sensor bacterial_consortium net token architecture. Places model genes, transcripts, proteins, and molecules, while transitions model processes involving them, such as transcription, translation, degradation, and enzymatic reactions. Black tokens model discrete quantities of resources in each place and are represented by the dot symbol
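To make the place/transition reading of the net tokens concrete, the following minimal sketch implements the standard Petri net firing rule with black tokens stored as integer counts. It is a generic illustration, not the nwn-snakes implementation; the class, the place names, and the arc weights are chosen for the example only.

```python
class PetriNet:
    """Place/transition net with black tokens stored as integer counts."""

    def __init__(self, marking):
        self.marking = dict(marking)
        self.transitions = {}

    def add_transition(self, name, inputs, outputs):
        """Register a transition with weighted input and output arcs."""
        self.transitions[name] = (dict(inputs), dict(outputs))

    def enabled(self, name):
        """A transition is enabled when every input place holds enough tokens."""
        inputs, _ = self.transitions[name]
        return all(self.marking.get(p, 0) >= n for p, n in inputs.items())

    def fire(self, name):
        """Consume tokens from input places and produce tokens in output places."""
        if not self.enabled(name):
            raise ValueError(f"{name} is not enabled")
        inputs, outputs = self.transitions[name]
        for place, n in inputs.items():
            self.marking[place] -= n
        for place, n in outputs.items():
            self.marking[place] = self.marking.get(place, 0) + n


net = PetriNet({"gene": 1, "mRNA": 0, "protein": 0})
# The gene and the transcript act as catalysts: they are consumed and returned.
net.add_transition("transcription", {"gene": 1}, {"gene": 1, "mRNA": 1})
net.add_transition("translation", {"mRNA": 1}, {"mRNA": 1, "protein": 1})
net.add_transition("mRNA_degradation", {"mRNA": 1}, {})

for step in ["transcription", "translation", "translation", "mRNA_degradation"]:
    net.fire(step)
print(net.marking)  # {'gene': 1, 'mRNA': 0, 'protein': 2}
```

After the transcript is degraded, translation is no longer enabled, which is how the absence of tokens in a place blocks the downstream process.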

The simulation of BiSDL-compiled nwn-snakes models shows that LacI levels control GFP_protein levels, consistently with the expected behaviour under the following conditions:

  • noLacI: the absence of LacI administration;

  • lowLacI: constant and low LacI administration (n=3 and r=3);

  • highLacI: constant and high LacI administration (n=3 and r=10).

Values of n and r determine the intensity of stimulus administration (see Sect. 3.4.2).

Fig. 8
figure 8

Marking evolution of the LacI signal mediators (Lux1_protein and AHL_molecule) and Target (GFP_reporter_protein) in the bacterial consortium after three LacI administration schemes. (A) noLacI does not interfere with (D) AHL_molecule levels, (G) inducing transcription of high GFP_reporter_protein readout signals. (B) lowLacI (E) hampers AHL_molecule levels, resulting in (H) the transcription of lower GFP_reporter_protein. (C) highLacI (F) almost shuts down AHL_molecule levels, (I) suppressing the transcription of high GFP_reporter_protein readout signals almost completely. The results show consistency with the expected system behavior, an inverse relation between LacI stimulus and readout signal levels

Figure 8 presents the marking evolution of the LacI signal mediators (Lux1_protein and AHL_molecule) and Target (GFP_reporter_protein) in the bacterial consortium after the three considered LacI administration schemes. noLacI (Fig. 8, top left panel A) does not interfere with AHL_molecule levels (Fig. 8, middle left panel D), inducing transcription of high GFP_reporter_protein readout signals (Fig. 8, bottom left panel G). lowLacI (Fig. 8, top central panel B) hampers AHL_molecule levels (Fig. 8, middle central panel E), resulting in the transcription of lower GFP_reporter_protein (Fig. 8, bottom central panel H). highLacI (Fig. 8, top right panel C) almost shuts down AHL_molecule levels (Fig. 8, middle right panel F), suppressing the transcription of high GFP_reporter_protein readout signals almost completely (Fig. 8, bottom right panel I). The results show consistency with the expected system behavior, an inverse relation between LacI stimulus and readout signal levels.
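The inverse relation between the LacI stimulus and the readout can be reproduced qualitatively with a toy token-counting model. The sketch below is not the compiled nwn-snakes model and does not reproduce the data of Fig. 8: the baseline production of 12 tokens per step, the halving diffusion rule, and the dose values 0, 3, and 10 are all hypothetical, and the doses are not the n and r parameters of Sect. 3.4.2.

```python
def simulate(laci_per_step: int, steps: int = 30) -> int:
    """Return the final GFP token count for a constant LacI dose (toy model)."""
    ahl_producer = ahl_sensor = gfp = 0
    for _ in range(steps):
        # Controller cell: baseline AHL production, reduced by LacI repression.
        ahl_producer += max(0, 12 - laci_per_step)
        # Diffusion: half of the AHL in the producer moves to the sensor.
        moved = ahl_producer // 2
        ahl_producer -= moved
        ahl_sensor += moved
        # Target cell: each AHL token received yields one GFP token.
        gfp += ahl_sensor
        ahl_sensor = 0
    return gfp


no_laci, low_laci, high_laci = simulate(0), simulate(3), simulate(10)
print(no_laci > low_laci > high_laci > 0)  # True
```

Higher LacI doses leave fewer AHL tokens to diffuse, so the GFP count drops monotonically while remaining above zero, mirroring the "almost shuts down" behaviour described for highLacI.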

Case study 2—RGB synthetic morphogen system

The second case study implements a synthetic morphogen system where the spatial interactions and organization of the cells sustain the emergence of a spatial pattern of red, green, and blue (RGB) fluorescent markers. In developmental processes, morphogens transmit positional signals to cells, diffusing from a source to create concentration gradients. Cells interpret these gradients using diverse signaling mechanisms, including paracrine signaling (short-range) and juxtacrine communication (cell-to-cell). Synthetic biology offers the potential to manipulate these mechanisms, enabling control over spatial arrangement and functional features in synthetic morphogenetic designs. This second case study illustrates how BiSDL descriptions express the essential dynamics underlying a multicellular synthetic design accounting for the role of spatial organization and neighborhood relations among cells.

Fig. 9
figure 9

The multicellular synthetic circuit mediated by cell-cell communication makes the RGB pattern emerge. (A) The Sender cells (Cells A) constitutively express blue fluorescent protein (BFP), CD19 ligand, and the anti-GFP synthetic Notch (synNotch) receptor, which drives expression of a low amount of E-cadherin (\(Ecad_{lo}\)) fused with a (D) mCherry reporter for visualization. (B) Receiver cells (Cells B) inducibly express E-cadherin (\(Ecad_{hi}\)) and a modified form of GFP working as a synNotch ligand on the cell membrane (\(GFP_{lig}\)). (C) \(GFP_{lig}\) serves as both a fluorescent reporter and a ligand for a secondary synNotch receptor with the cognate anti-GFP binding domain. (E) Cells A have low adherence, blue fluorescence, and inducible red fluorescence; (F) Cells B have inducible green fluorescence and high adherence. Adapted from [57]

This time, rather than relying on Standard Parts (refer to Sect. 4.1), the objective is to replicate designs found in the scientific literature, such as the modular synNotch system outlined in [57]. This system provides a platform for engineering orthogonal juxtacrine signaling, which functions independently of natural cellular communication pathways. It enables specific and controlled cell interactions, featuring an extracellular recognition domain, the Notch core regulatory domain, and an intracellular transcriptional domain. With the incorporation of fluorescent markers, the synNotch system proves to be a valuable tool for engineering multicellular synthetic systems.

This second case study comprises a three-layer multicellular circuit in which the Receiver cells (Cells B, Fig. 9, top right panel B) inducibly express E-cadherin (\(Ecad_{hi}\)) and a modified form of GFP working as a synNotch ligand on the cell membrane (\(GFP_{lig}\)). Furthermore, \(GFP_{lig}\) serves as both a fluorescent reporter and a ligand for a secondary synNotch receptor with the cognate anti-GFP binding domain (Fig. 9, central right panel C). The Sender cells (Cells A, Fig. 9, top left panel A) constitutively express BFP, CD19 ligand, and the anti-GFP synNotch receptor, which drives expression of a low amount of E-cadherin (\(Ecad_{lo}\)) fused with a mCherry reporter for visualization (Fig. 9, central left panel D). Thus, Cells A have low adherence, blue fluorescence, and inducible red fluorescence (Fig. 9, bottom left panel E), while Cells B have inducible green fluorescence and high adherence (Fig. 9, bottom right panel F).

Fig. 10
figure 10

The RGB synthetic morphogenesis process for the second case study. (A) Cells start as a disorganized aggregate; (B) CD19 in Cells A activates anti-CD19 synNotch in Cells B, inducing the expression of a high level of E-cadherin (\(Ecad_{hi}\)) and \(GFP_{lig}\); (C) Cells B aggregate and form a compact group in the middle of the aggregate; (D) The \(GFP_{lig}\) on Cells B activates the anti-GFP synNotch receptors on Cells A in direct contact with the central group, inducing \(Ecad_{lo}\) and the mCherry reporter, (E) making a spatially organized pattern of cells emerge in a synthetic morphogenetic pattern with three concentric layers: a green internal core (Cells B expressing \(Ecad_{hi}\) and \(GFP_{lig}\)) with high cell-cell adhesion, an outer layer of blue cells (Cells A expressing BFP), and a population of red cells in the middle layer (Cells A expressing \(Ecad_{lo}\) and mCherry). Adapted from [57]

Cells start as a disorganized aggregate (Fig. 10, bottom left panel A), and CD19 in Cells A activates anti-CD19 synNotch in Cells B, inducing the expression of a high level of E-cadherin (\(Ecad_{hi}\)) and \(GFP_{lig}\) (Fig. 10, top left panel B). Cells B thus aggregate and form a compact group in the middle of the aggregate (Fig. 10, bottom central panel C). The \(GFP_{lig}\) on Cells B activates the anti-GFP synNotch receptors on Cells A in direct contact with the central group, inducing \(Ecad_{lo}\) and the mCherry reporter (Fig. 10, top right panel D), and making a spatially organized pattern of cells emerge in a synthetic morphogenetic pattern (Fig. 10, bottom right panel E) with three concentric layers: a green internal core (Cells B expressing \(Ecad_{hi}\) and \(GFP_{lig}\)) with high cell-cell adhesion, an outer layer of blue cells (Cells A expressing BFP), and a population of red cells in the middle layer (Cells A expressing \(Ecad_{lo}\) and mCherry).

Algorithm 4
figure f

BiSDL description of the RGB synthetic morphogen system.

Algorithm 4 illustrates the BiSDL description of the RGB synthetic morphogen system, where a central cell induces patterning in its first- and second-degree neighbors. Vertical dots indicate sections describing the SCOPEs of red and blue cells, which have the same structure as those presented but different spatial coordinates. The complete description and the visualization of the compiled nwn-snakes PN models are not included due to their size, but they are available from the BiSDL GitHub public repository (see Availability of data and materials).

BiSDL again supports a compact description of the system composed of 165 lines of code, approximately 9% of the corresponding compiled nwn-snakes model file, having 1807 lines of code (based on the files in the public GitHub repository, see Availability of data and materials).
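The two size ratios quoted for the first and second case studies follow directly from the reported line counts. The short check below recomputes them; the function name compression_ratio is introduced only for this illustration.

```python
def compression_ratio(bisdl_lines: int, compiled_lines: int) -> float:
    """Size of the BiSDL description as a percentage of the compiled model file."""
    return 100.0 * bisdl_lines / compiled_lines


# Line counts as reported in the text for the two case studies.
print(round(compression_ratio(50, 203), 1))    # 24.6
print(round(compression_ratio(165, 1807), 1))  # 9.1
```

These values round to the "approximately 25%" and "approximately 9%" figures given for the bacterial consortium and the RGB synthetic morphogen system, respectively.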

Fig. 11
figure 11

Evolution of the three fluorescent marker levels (GFP, BFP and mCherry) in each cell on the two-dimensional spatial grid along the simulation of the nwn-snakes Python models compiled from the BiSDL description. (A) At first (t = 10), the central cell slightly affects only one of its neighbors of the first degree, while other cells keep producing the BFP signal; (B-E) the central cell engages all of its neighbors of the first degree, inducing the expression of mCherry in them, whose intensity evolves throughout the simulation (from t=20 to t=50); (F) all first-degree neighbors of the central cell express high levels of mCherry (t=60)

Results recapitulate the emergence of a simplified version of one of the synthetic morphogenetic patterns presented in [57]. Figure 11 depicts the evolving intensity of the fluorescent markers in each cell during a simulation of the nwn-snakes Python models compiled from the BiSDL description (see Algorithm 4). At first (t = 10), the central cell slightly affects only one of its neighbors of the first degree, while other cells keep producing the BFP signal (Fig. 11, top left panel A); the central cell then engages all of its neighbors of the first degree, inducing the expression of mCherry in them, whose intensity evolves throughout the simulation (from t=20 to t=50, Fig. 11, top central panel B, top right panel C, bottom left panel D, bottom central panel E); at t=60 all first-degree neighbors of the central cell express high levels of mCherry (Fig. 11, bottom right panel F). On the contrary, the simulated deletion of \(GFP_{lig}\) results in a stable pattern with a central, colorless cell of type B and all its neighbors of the first and second degree constitutively expressing BFP (data not shown).
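The end state of the pattern, including the \(GFP_{lig}\) deletion control, can be summarized by a small grid computation. The sketch below is a static illustration rather than a simulation of the compiled models. It assumes a 5x5 grid and treats the eight surrounding cells as first-degree neighbors; both choices, as well as the function names, are hypothetical.

```python
from collections import Counter


def chebyshev(a, b):
    """Grid distance under which the 8 surrounding cells are first-degree neighbors."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))


def rgb_pattern(size=5, gfp_lig=True):
    """Return {(x, y): dominant marker} for a grid with one central Cell B."""
    center = (size // 2, size // 2)
    pattern = {}
    for x in range(size):
        for y in range(size):
            degree = chebyshev((x, y), center)
            if degree == 0:
                # Central Cell B: green only if it can express GFP_lig.
                pattern[(x, y)] = "GFP" if gfp_lig else "none"
            elif degree == 1 and gfp_lig:
                # Cells A in direct contact with GFP_lig switch on mCherry
                # (they also keep BFP; only the induced marker is recorded).
                pattern[(x, y)] = "mCherry"
            else:
                # Remaining Cells A show only constitutive BFP.
                pattern[(x, y)] = "BFP"
    return pattern


wild_type = Counter(rgb_pattern().values())
deletion = Counter(rgb_pattern(gfp_lig=False).values())
print(wild_type["GFP"], wild_type["mCherry"], wild_type["BFP"])  # 1 8 16
print(deletion["none"], deletion["BFP"])  # 1 24
```

With the ligand present, the single green cell is surrounded by a red ring and an outer blue layer; without it, the central cell is colorless and every other cell stays blue, as described for the deletion experiment.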

Case study 3—conjugative plasmid transfer

Plasmids are crucial in disseminating antibiotic resistance, virulence genes, and various adaptive traits within bacterial communities through horizontal gene transfer [58]. The third case study examines the transfer of conjugative antibiotic resistance (R) plasmids between bacterial cells, mirroring the fundamental conjugation mechanism presented in [59]. As depicted in Fig. 12, the plasmid transfer process [60] initiates with a Donor cell harboring a conjugative R plasmid (Fig. 12, top left). The Donor extends a Pilus, a proteinaceous protrusion [61] encoded by the R plasmid, to establish contact with a compatible recipient cell, referred to as the Transconjugant cell [59] (Fig. 12, top right). Upon contact, the Pilus retracts, bringing the cells into proximity and forming a conjugation bridge. This bridge facilitates the transfer of one of the plasmid DNA strands in a linearized form from the Donor to the Transconjugant (Fig. 12, center). Subsequently, both cells harbor a single-stranded copy of the R plasmid. Finally, through circularization and DNA synthesis, the Donor and Transconjugant cells complete the second strand for their respective R plasmid copies. Consequently, both cells become capable of further disseminating the R plasmid, effectively functioning as Donor cells (Fig. 12, bottom), thereby facilitating the propagation of antibiotic resistance.

Fig. 12
figure 12

The mechanism considered for the third case study is antibiotic resistance (R) plasmid transfer across bacterial cells, recapitulating the basic conjugation mechanism modeled in [59]. The plasmid transfer mechanism [60] begins with a Donor cell that carries a conjugative R plasmid (top left). The Donor extends a Pilus, a proteinaceous protrusion [61] to contact a compatible receiver cell, named the Transconjugant cell (top right). In this example, the Pilus is encoded in the R plasmid. Upon contact, the Pilus retracts, pulling the cells together and establishing a conjugation bridge. This bridge enables the transfer of one of the plasmid DNA strands in linearized form from the Donor to the Transconjugant (center). At this point, both cells hold a single-stranded copy of the R plasmid. Finally, both cells build the second strand for their respective R plasmid copies through circularization and DNA synthesis. Thus, both Donor and Transconjugant cells are equipped to disseminate the R plasmid further, making them both Donor cells (bottom) enacting antibiotic resistance propagation

Algorithm 5 provides a BiSDL description of the conjugative R plasmid transfer from the Donor to the Transconjugant cell.

Algorithm 5
figure g

BiSDL description of the conjugative R plasmid transfer from the Donor to the Transconjugant cell.

The illustrated plasmidTransfer MODULE contains three SCOPE statements: one for the Donor cell (donor, Algorithm 5, lines 3–15); another one for the Transconjugant cell (transconjugant, Algorithm 5, lines 21–32); and a third one for the Pilus proteinaceous structure mediating the conjugation process (pilus, Algorithm 5, lines 16–20). This shows the flexibility of the SCOPE construct in supporting the modeler in expressing different types of biological compartments. JUXTACRINE_SIGNAL mechanisms connect the SCOPE statements from the donor to the transconjugant through the pilus, recapitulating the R plasmid transfer process.

Figure 13 visualizes the top-level NWN model for this use case, having a place for each of the three BiSDL SCOPE statements, holding the donor net token (Fig. 14), the transconjugant net token (Fig. 15) and the pilus net token (Fig. 16), respectively.

Fig. 13
figure 13

The top-level plasmid_transfer net architecture. The places that contain net tokens correspond to the three BiSDL SCOPE statements, holding the donor net token (Fig. 14), the transconjugant net token (Fig. 15) and the pilus net token (Fig. 16), respectively

Fig. 14
figure 14

The donor plasmid_transfer net token architecture. Places model genes, transcripts, proteins, and molecules, while transitions model processes involving them, such as transcription, translation, degradation, and enzymatic reactions. Black tokens model discrete quantities of resources in each place and are represented by the dot symbol

Fig. 15
figure 15

The transconjugant plasmid_transfer net token architecture. Places model genes, transcripts, proteins, and molecules, while transitions model processes involving them, such as transcription, translation, degradation, and enzymatic reactions. Black tokens model discrete quantities of resources in each place and are represented by the dot symbol

Fig. 16
figure 16

The Pilus plasmid_transfer net token architecture. Places model genes, transcripts, proteins, and molecules, while transitions model processes involving them, such as transcription, translation, degradation, and enzymatic reactions. The black tokens model discrete quantities of resources in each place and are represented by the dot symbol

The simulation of BiSDL-compiled nwn-snakes models demonstrates that transferring R plasmids from the Donor to the Transconjugant cells aligns with anticipated behavior. Throughout the simulation, the Donor cell consistently retains one R plasmid (Fig. 17, top left), while the Transconjugant cell initially lacks any R plasmids (Fig. 17, top right). The R plasmid within the Donor encodes the Pilus protein, which initiates the conjugation process upon translation. The Pilus mediates two R plasmid transfer events, illustrated by linearized single-strand R plasmids within it (Fig. 17, center), resulting in two subsequent increases in R plasmid copies within the Transconjugant cell (Fig. 17, top right). R plasmids encode the R protein, conferring antibiotic resistance. The R protein is consistently present in the Donor cell due to the R plasmid (Fig. 17, bottom left). Conversely, in the Transconjugant cell, R protein levels only rise above zero after the initial R plasmid transfer event (Fig. 17, bottom right). These findings indicate that the simulation accurately reproduces the conjugative plasmid transfer process, leading to R protein-mediated antibiotic resistance propagation.

Fig. 17
figure 17

Simulation of BiSDL-compiled nwn-snakes models shows that the transfer of R plasmids from the Donor to the Transconjugant cells is consistent with the expected behavior. The Donor cell holds one R plasmid throughout the simulation (top left), while the Transconjugant cell starts with none (top right). The R plasmid in the Donor encodes for the Pilus protein, which initiates the conjugation process as soon as it is translated. The Pilus mediates two R plasmid transfer events, depicted as the presence of a linearized single-strand R plasmid within it (center), resulting in two subsequent increases of R plasmid copies in the Transconjugant cell (top right). R plasmids encode for R protein, which provides antibiotic resistance. In the Donor cell, the R protein is present from the start due to the presence of the R plasmid (bottom left). Conversely, R protein levels rise above zero in the Transconjugant cell only after the first R plasmid transfer event (bottom right). These results show that simulation recapitulates the plasmid-transfer conjugation process, causing the acquisition of R protein-mediated antibiotic resistance
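The qualitative behaviour summarized above can be traced with a toy three-compartment model. The sketch below is not the compiled nwn-snakes model: the compartments are plain dictionaries, the two transfer times (steps 3 and 6) are hypothetical, and one unit of R protein per step is produced in any cell that holds at least one R plasmid.

```python
def simulate_transfer(steps=10, transfer_steps=(3, 6)):
    """Trace R plasmid and R protein counts across donor, pilus, and transconjugant."""
    donor = {"R_plasmid": 1, "R_protein": 0}
    pilus = {"ss_R_plasmid": 0}
    transconjugant = {"R_plasmid": 0, "R_protein": 0}
    history = []
    for t in range(steps):
        # Delivery: a strand held by the pilus reaches the transconjugant.
        if pilus["ss_R_plasmid"] > 0:
            pilus["ss_R_plasmid"] -= 1
            transconjugant["R_plasmid"] += 1
        # Loading: the donor passes one strand to the pilus and keeps its plasmid.
        if t in transfer_steps:
            pilus["ss_R_plasmid"] += 1
        # Expression: any cell holding an R plasmid produces R protein.
        for cell in (donor, transconjugant):
            if cell["R_plasmid"] > 0:
                cell["R_protein"] += 1
        history.append((dict(donor), dict(pilus), dict(transconjugant)))
    return history


history = simulate_transfer()
final_donor, _, final_transconjugant = history[-1]
print(final_donor["R_plasmid"], final_transconjugant["R_plasmid"])  # 1 2
```

In this trace the donor keeps a single plasmid, the transconjugant gains one copy per transfer event, and its R protein count stays at zero until the step after the first strand is loaded into the pilus.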

Conclusions

In conclusion, the BiSDL framework represents a significant advancement in synthetic biology modeling. The core aim of BiSDL is to merge the detailed expressive capabilities of computational models with the user-friendliness of high-level languages, providing a tool that is both powerful and more accessible than a low-level language. In fact, BiSDL requires only basic modeling and programming skills compared to the advanced modeling and programming skills required by low-level languages. The hierarchical and modular structure of BiSDL is ideal for capturing the inherent complexity of biological systems and constructing reusable and adaptable components.

Through the development of a prototype BiSDL compiler, models can be compiled into low-level descriptions that encapsulate the spatial, hierarchical, and dynamic behaviors of biological entities, using the NWN approach to handle biological complexity. As BiSDL closely mirrors the domain-specific language used within the biological domain, the compiler closes the gap between high-level biological semantics and NWN low-level formalism syntax. Generated models rely on the nwn-snakes library, an extension of the SNAKES library [51] to support the NWN formalism. Results indicate that BiSDL can dramatically simplify complex model descriptions, significantly reducing the code needed to represent sophisticated systems. This is evidenced by the bacterial consortium, RGB and conjugative plasmid transfer case studies.

The accompanying nwn-petrisim simulator has been developed to reproduce and investigate behaviors of systems modeled in BiSDL, confirming that the BiSDL-compiled models accurately represent the expected behaviors of the systems studied.

Future developments may integrate into BiSDL a tool for formal verification [62], inherently supported by PN as a low-level formalism. This would allow the validation of complex models by rigorously checking for correctness according to specific criteria. Several integrable tools exist for PN formal verification, including TINA (Time Petri Net Analyzer) [63], ABCD plus Neco for SNAKES models [64] and GreatSPN [65], which provides a user-friendly visual interface, in alignment with BiSDL’s aim for broader accessibility.

The proposed BiSDL constructs exemplify the expressiveness of the language, and its closeness to the biological semantics. Future developments include gradually extending the syntax of the language with additional constructs.

Future plans also include enhancing BiSDL usability by developing a visual language interface and refining the user interface to simplify combining or editing modules. While, in its current implementation, BiSDL still requires basic programming and modeling skills, a graphical interface would pave the way to broadening the user base to people with no programming skills.

Considering the importance of adhering to the Findability, Accessibility, Interoperability, and Reusability (FAIR) guidelines for data management and stewardship, adopting and enforcing a standardized naming convention in BiSDL is a paramount goal for the future. In this regard, the current BiSDL framework is poised for advancements in two critical areas. Firstly, the versatility of its source-to-source compiler, which currently facilitates the translation of BiSDL into NWN models, could be expanded to support additional formalisms and standards, including those under the COMBINE initiative. Secondly, future work may integrate an ONTOLOGY_ID field into the language within the Metadata representation SBO category to encourage adherence to standards and foster model exchange.