Royal Society Publishing

Autonomous Boolean modelling of developmental gene regulatory networks

Xianrui Cheng, Mengyang Sun, Joshua E. S. Socolar


During early embryonic development, a network of regulatory interactions among genes dynamically determines a pattern of differentiated tissues. We show that important timing information associated with the interactions can be faithfully represented in autonomous Boolean models in which binary variables representing expression levels are updated in continuous time, and that such models can provide a direct insight into features that are difficult to extract from ordinary differential equation (ODE) models. As an application, we model the experimentally well-studied network controlling fly body segmentation. The Boolean model successfully generates the patterns formed in normal and genetically perturbed fly embryos, permits the derivation of constraints on the time delay parameters, clarifies the logic associated with different ODE parameter sets and provides a platform for studying connectivity and robustness in parameter space. By elucidating the role of regulatory time delays in pattern formation, the results suggest new types of experimental measurements in early embryonic development.

1. Introduction

In animal embryos, gene expression patterns define cell fate territories that initiate the development of the adult body plan. The topology and logic of the underlying gene regulatory network (GRN) are generally understood to be the key factors in determining which genes will be expressed. Topology and logic alone, however, do not tell the whole story. Gene expression patterns can depend on the order in which signals from different sources arrive at each gene, and therefore on the times required by the processes represented by network links.

Our ultimate goal is to model complex developmental gene networks like the one governing endomesoderm specification in the sea urchin, where cell fate decisions are discrete and deterministic, and several decades of qualitative biological study have established the utility of modelling genes as either ‘on’ or ‘off’ [1]. We therefore propose an autonomous Boolean network (ABN) modelling approach in which real-valued time delays are assigned to links in a regulatory network with a given topology and logic. Although molecular concentrations in this model are binary variables, the effects of fluctuations can be included as stochastic variations about the nominal time delays. The formalism is closely related to the approach developed by Thomas and coworkers [2] and to the Boolean delay equations studied by Ghil et al. [3], but differs in some technical respects (see the electronic supplementary material).

Our ABN models are a useful compromise between Boolean approaches that rely on external specifications of the timing of updates, as used by Albert & Othmer [4], and models designed to incorporate more detailed information about the effects of activation at intermediate levels, such as the implementation of the Thomas framework by Sanchez et al. [5] or the timed automaton models of [6]. An assumption we share with the Thomas approach is that the dynamical details of hybrid models in which continuous variables are governed by discrete switches [7,8] are not essential for explaining the biological processes of interest.

In this paper, we first present the ABN framework in mathematical terms and describe the biological meaning of its time delay parameters, then apply it to a non-trivial example—the Drosophila segment polarity gene regulatory system. We take as a starting point the network of the ordinary differential equation (ODE) model developed by Ingolia [9]. Taking the ODE model as a proxy for the real system, we find that a straightforward method for extracting time delay parameters yields ABNs with final states and transients that approximate the dynamics well, provided that certain conditions are satisfied by the ODE.

Considering the ABN in itself, we study its behaviour both with and without stochastic fluctuations in the time delays. We address three features of the 26-dimensional parameter space domain that produces agreement with experimentally observed wild-type and mutant expression patterns: (i) we derive a non-trivial set of sufficient conditions for reaching the desired final state from the relevant initial condition; (ii) we present numerical evidence strongly suggesting that the domain is connected, but non-convex; and (iii) we develop a quantitative measure of the robustness of the transient dynamics and show that the fitness landscape has wide plateaus rather than isolated peaks, as parameters are varied. Points (ii) and (iii) suggest that robust mutational pathways are available to this system for exploring the full range of working parameter sets.

Many of the interactions among segment polarity genes have been experimentally characterized and several models of the relevant regulatory network have been developed. ODE models have successfully reproduced expression patterns of wg, en and hh on a two-dimensional lattice of hexagonal cells [9,10]. Other studies employed synchronous and asynchronous Boolean models on one-dimensional lattices of cells [4,11], multi-valued logical models with time delays [5] or reductions to a small set of ODEs representing functional modules [12]. Our ABN approach retains much of the simplicity and transparency of Boolean models while incorporating a biologically plausible scheme for determining the order of updates. Our results for the Drosophila model highlight the utility of framing a model in terms of binary variables and experimentally observable time delay parameters, and suggests that extensions to developmental systems of greater complexity will be useful.

2. Autonomous Boolean network modelling of developmental gene regulatory networks

Autonomous Boolean modelling has been applied to yeast cell cycle [13] and electronic circuits [14,15]. Here we show how to implement the approach for developmental GRNs, which requires the introduction of processor nodes for more faithful representation of processes involving multiple regulators, and the representation of cell–cell signalling in multicellular systems undergoing cell division.

2.1. The basic formalism

An ABN is a set of vertices (or nodes) and directed edges (or links) together with a set of timing parameters. Each node i carries a binary variable Embedded Image representing a physical quantity, a Boolean function fi, and two times Embedded Image representing minimum durations for generating a response in xi. The link from input i to output j is denoted eij, and we refer to nodes i and j as the ‘source’ and ‘target’, respectively. Associated with eij are two time delay parameters Embedded Image, with Embedded Image and Embedded Image. For each target j, there is a set of input links Embedded Image with Embedded Image distinct sources i, and fj is a Boolean function with inputs Embedded Image.

The link eij represents a chain of events through which xi may influence xj. Embedded Image and Embedded Image may be thought of as the times required for signals to traverse the link. Embedded Image is the delay between the instant xi updates from 0 to 1 and the instant this update affects node j; Embedded Image is the corresponding delay when xi switches from 1 to 0.

The time series xj(t) is developed in chronological order as follows. At time t, we first computeEmbedded Image 2.1where Embedded Image and each tk is a time infinitesimally later than the latest origination time of the signals transmitted from node ik that have arrived at node j before time t. Note that it is the origination time from node ik that enters the equation rather than the arrival time at node j. After a given source switches, it may switch back quickly enough that the trailing signal would reach the target before the leading one, in which case the first signal would never affect xj. Physically, this corresponds to a case in which the trailing signal annihilates the leading one before it reaches the target.

Next we adjust the recent history of xj to account for the fact that a target cannot respond to counteracting signals received in sufficiently rapid succession. Let s be the latest time before t that xj switched from 0 to 1 (or 1 to 0). If a signal arrives at time t that causes xj to switch back from 1 to 0 (or 0 to 1) and Embedded Image (or Embedded Image), then Embedded Image is set to 0 (or 1) for the entire interval Embedded Image and we say that the short pulse (or dip) is rejected [13,14]. (This differs from the handling of short pulses in the Thomas framework; see the electronic supplementary material.) Note that the restrictions Embedded Image and Embedded Image ensure that the adjustment to xj(t) can be made in time to remove any signals generated by the switch at time s before they reach their targets.

The ABN dynamics can be simulated by an event-driven computer code. A time-ordered queue containing signals and arrival times at all targets is maintained, and the target receiving the next earliest signal is updated according to its logic function. Stochastic fluctuations are modelled by adding a random increment to each event time as it is added to the queue [2,13].

For present purposes, we neglect the dependence of the time delay on the recent history of the source and target; i.e. we do not attempt to model the fact that a node that has just recently turned off may turn on more quickly than one that has been off for a long time. In some physical systems, such as unclocked digital electronic circuits, the time delay parameters can be measured with high precision [14] and history-dependent effects can be observed. In such cases, the dynamics of the network may be chaotic [15]. For our networks, however, short pulse rejection (SPR) will eventually lead to a fixed point or strictly periodic oscillations (in the absence of stochastic fluctuations).

As for any model with explicit time delays, the initial condition for a simulation must specify the state of the system over a time interval equal to the largest time delay. For an ABN, the initial condition is a specification Embedded Image for each node i, where τ is larger than any of the time delay parameters Embedded Image, along with predefined switches in the interval Embedded Image. At t = 0, equation (2.1) is applied with tk set to Embedded Image for all inputs that have not sent a signal reaching xj before t = 0.

2.2. Modelling biomolecule abundances and time delays

In an ABN model of a GRN, each xi represents the abundance of an mRNA or protein. If the concentration of the molecular species i is below its functional threshold, then xi = 0; otherwise xi = 1. The regulatory mechanism that determines the expression of species i is encoded in the Boolean function fi. Regulatory influences of species i on others are represented by directed links eij, where j runs over the targets of species i. The two distinct time delays associated with each link account for the different sets of biomolecular processes involved when the source is turned on or off. As an example, consider the simple network of figure 1a, consisting of mRNAs encoding three transcription factors. The delay Embedded Image accounts for the time between the instant the concentration of mRNA i rises above its functional threshold and the instant mRNA j rises above its functional threshold.

Figure 1.

Time delays and cell–cell communication in the autonomous Boolean modelling framework. (a) Time delays in the autonomous Boolean model account for distinct biological processes. Each rectangle represents a set of processes that is assigned a time delay. See text for details. (b) Illustration of the insertion of a processor node, Embedded Image, for separating delays describing different processes. (c) The autonomous Boolean modelling framework accounts for cell–cell communication by connecting copies of identical GRNs in different cells. Each rectangle represents a cell, and links between neighbouring cells represent the effects of signalling molecules.

For an intracellular link, two sets of processes contribute to Embedded Image. The first set Embedded Image is associated with the dynamics of factor i. It includes the activation of translation as the mRNA concentration crosses its functional threshold, translation of i into its protein and post-translational processing of the protein to form the active transcription factor. The second set Embedded Image is associated with factor j and accounts for the synthesis of the mRNA, which includes binding of the transcription factor i to the cis-regulatory element of gene j, activation of transcription of mRNA j and accumulation of mRNA j molecules. For a factor that can be activated by more than one source, we assume the synthesis times are the same.

Similarly, Embedded Image accounts for two sets of processes: Embedded Image includes the deactivation of translation of i as the mRNA concentration drops below its functional threshold and degradation of the corresponding transcription factor; and Embedded Image includes dissociation of transcription factors from the cis-element of j, deactivation of transcription of j and degradation of existing j molecules. Let T(s) denote the time it takes to complete a set of processes s, then the time delays associated with eij areEmbedded Image 2.2andEmbedded Image 2.3

Note that for two targets of node i, Embedded Image and Embedded Image (and similarly Embedded Image and Embedded Image) generally have different values. Embedded Image and Embedded Image are clearly different, as they correspond to the transcription of different genes. In addition, Embedded Image and Embedded Image may differ because the concentrations of i required to activate j and k may not be the same.

Cases in which the source has a single target and the target has a single source, such as some protein-to-protein and RNA-to-protein links, may be modelled using just one delay parameter for turning on and one for turning off. For cases where two or more sources regulate the same target, we introduce a processor node to account for the fact that a time delay from one source may depend on the state of the target if the target can be activated by other inputs. Consider again, for example, the link eij of the transcriptional network in figure 1a. If i updates from 0 to 1 at t = 0, then the delay Embedded Image may take two possible values, depending on the state of j. If Embedded Image at Embedded Image, then Embedded Image. However, if xj = 1 at Embedded Image owing to activation by another source, the synthesis time delay Embedded Image is no longer needed. To account for this effect while still assigning each link single-valued on and off delays, we insert a processor node Embedded Image. Embedded Image splits each original time delay into its two parts Embedded Image and Embedded Image (figure 1b), each is only associated with either the source or target node. The processor node for a transcription factor can be thought of as representing the state of the cis-regulatory element of its gene. Another example showing the necessity of introducing a processor node is given in the electronic supplementary material, §B.

For an intercellular (cross-regulatory) interaction, the relevant signal in the target cell may be produced by any of the neighbouring sources. To account for the changes in signalling induced by cell division (see §2.3), the source protein is explicitly represented as a node P in the source cell, a processor x is added in the target cell, and both the on and off time delays between P and the processor are set to zero. The processor's logic function Embedded Image integrates the inputs from all adjacent cells and sends a single output to the target of the signalling pathway (which may be a processor). For present purposes, we take Embedded Image to be an OR function on all of its inputs.

Each processor is treated exactly as any other node in the ABN. Insertion of a processor preserves the complexity of the network logic, as Embedded Image becomes a trivial activating link and Embedded Image is precisely the original fj. We note that additional processors could be introduced to account for more complex features of the time delays. For example, a node y that responds to xi OR xj may turn on more rapidly if both inputs are turned on together. To model this, one could introduce a processor that responds to xi AND xj, and assign a smaller time delay for it to activate y. For present purposes, we neglect such complications.

2.3. Multiple cells and cell division

A multicellular system is modelled as a large network in which intracellular networks are connected by links representing cell–cell signalling molecules. In modelling a single organism, all cells are assumed to carry the same intracellular network and be receptive to the same intercellular signals [4,5,9,10,12] (figure 1c). Two cells are connected by a signalling link if they are physically close enough for the signal from one to influence the other.

To model cell division, we assume that a progenitor cell gives rise to two identical daughter cells, each inheriting the dynamical state of the progenitor at the instant of division. The positions of all cells are then recomputed based on some externally specified morphological model. All previously existing signalling links are deleted from the full ABN, and new signalling links are established between adjacent cells in the new configuration. For each new signalling link, an update signal for the target indicating the state of the source is added to the event queue at the time of the division, corresponding to the zero time delay for the contact interaction between a signalling protein (P) and its receptor (the processor).

3. Constructing the Drosophila segment polarity model

During Drosophila embryogenesis, maternal effect genes (such as bicoid and nanos) establish gradients that initiate the specification of anterior–posterior axis (A–P axis) of the embryo. Following these maternal cues, gap, pair-rule and segment polarity genes act sequentially to establish and refine a periodic, striped pattern of expression along the A–P axis. While the gap and pair-rule genes are under direct transcriptional control in the syncytial embryo, regulation of the segment polarity genes (such as en, wg and hh) expressed after cellularization of the blastoderm relies on intercellular signalling. In the stabilization phase of epidermal patterning, the segment polarity signal genes hh and wg are expressed in single-cell-wide stripes, and they encode secreted molecules that diffuse only about one cell diameter [16,17]. These stripes partition the embryo into repeating units along the A–P axis known as parasegments. The width of each parasegment grows as the cells continue to divide, but the juxtaposed wg and en/hh stripes each remain a single cell in width [9,18]. We take as a starting point the ODE model of Ingolia for segment polarity stabilization in the Drosophila system [9].

3.1. Network topology

Figure 2a shows the topology of the Ingolia network [9]. We first simplify it by collapsing each linear chain in the ODE network to a single link in the ABN, absorbing the effects of the intermediate links into the time delay parameters. For example, the links from en to EN, and from EN to ci are collapsed into a single link from en to ci. An unspecified variable driving the expression of ci, slp and hh, which is implicit in the ODE model, is represented by node B in figure 2b. Initializing the ODE with the concentration [ci] = 0, for example, is equivalent to assuming that B turns on at t = 0.

Figure 2.

Adapting the ODE topology to autonomous Boolean topology. (a) The topology of the ODE model of the segment polarity network, taken from Ingolia [9]. (b) The segment polarity network topology for the ABN model after removing unnecessary nodes and accounting for the effective logic of the CI–CN–HH system. (c) An illustration of the insertion of a processor node. This processor node Embedded Image may be interpreted as representing a cis-regulatory element. (d) The full segment polarity network topology for the ABN model.

Further modification is needed to account for the regulation of the proteins CI and CN. In the cell, CN is produced by proteolytic cleavage of CI, a reaction inhibited by HH signalling [19]. Without expression of HH in a neighbouring cell, a given cell will express a low concentration of CI and high concentration of CN, but the presence of HH in a neighbour will result in high CI and low CN. While in the ODE models, it is possible to have a steady state with low CI and high CN, a direct activating link from CI to CN in a Boolean model cannot account for a state with CI ‘OFF’ (xCI = 0) and CN ‘ON’ (xCN = 1). To correctly represent the regulatory relationship in the ABN, both ci in the cell at hand and hh in a neighbouring cell are linked to CI through an AND function, and the CI–CN link is replaced by an activating link from ci to CN together with a repressing link from neighbouring hh to CN. Figure 2b shows the ABN topology at this stage in the analysis.

We next introduce processors for nodes with multiple inputs. Links directed towards a given mRNA or protein now point to its processor node, which relays the regulatory decision to the node it controls via a single link (figure 2c). Each node in figure 2b requires a processor except for B, which has no inputs, and HH, WG, whose incoming links represent only protein synthesis. Table 1 gives our nomenclature for the processor nodes and abbreviated notations used below for other nodes. Intercellular signalling processors Embedded Image and Embedded Image are introduced for each cell to integrate WG and HH signals from its neighbours. The topology of our adaptation (figure 2d) is consistent with the biology and expresses the same causal structure the ODE topology of figure 2a.

View this table:
Table 1.

Nomenclature of nodes and their corresponding processors in the Drosophila segment polarity ABN model.

3.2. Boolean logic functions

The Boolean logic functions assigned to each node are summarized in table 2. Our choices are determined as follows:

  • — Each mRNA or protein node that has only one input copies the state of its input.

  • — We assume that wg or hh signal from any single neighbour is sufficient to activate the signal transduction pathways in a target cell; therefore, both Embedded Image and Embedded Image have OR functions.

  • — To turn Embedded Image ON requires the presence of ci and also the HH signal from a neighbour to avoid cleavage into CN. On the other hand, to turn Embedded Image ON requires both ci and the absence of neighbouring HH [19,20].

  • — Embedded Image, Embedded Image, Embedded Image and Embedded Image each have one activating and one repressing input. Experimental evidence suggests that the repressor input dominates the activator when both are present. For Embedded Image and Embedded Image, lack of slp results in ectopic expression of en/hh [21]. For Embedded Image and Embedded Image, ectopic expression of en suppresses slp and ci [22,23].

  • — Finally, both neighbouring HH and local WG are required to activate wg [24]. Hence in our ABN model, both CI and WG are required to activate Embedded Image and CN represses wg transcription [20], leading to the Embedded Image logic function in table 2.

View this table:
Table 2.

The set of logic functions used in the ABN model.

3.3. Spatial structure and cell division

The cells in the model are hexagons arranged in a 2-by-4 lattice with periodic boundary conditions imposed on both the rows and columns of figure 3a [9]. Because both wg and hh signals are short-ranged during the developmental stages [16,25], signalling occurs only between cells in direct contact [4,9,10]. We assume for simplicity that all cells divide synchronously. When a cell divides, the two daughters inherit all of the mother's past records and any future updates that are scheduled. In our simulations, we set the units of time such that cell division occurs at t = 1, corresponding to the cell cycle time after cellularization of about 1 h.

Figure 3.

Cell division in the segment polarity network model. (a) The spatial arrangement of the cell starts from a 2-by-4 lattice with both horizontal and vertical periodic boundaries. (b) At cell division, all cells divide synchronously, each becoming two daughter cells: column 1 gives rise to 1a and 1b, column 2 gives rise to 2a and 2b, and so forth. If wg is expressed in column 2 just prior to the division, it will be expressed in the resulting columns 2a and 2b immediately after the division. (c) In wild-type Drosophila embryos, the expression of wg requires a hh signal from a neighbouring cell; so cells not touching the hh expressing cells will eventually lose wg.

3.4. Time delays and short pulse rejection

Our segment polarity model has 26 independent time delay parameters. We assume that they are within the ranges indicated in table 3, representing generic rates of the relevant physical processes [26] relative to the typical cell cycle time. All other time delays correspond to relatively fast processes, such as transcription factor binding to cis-regulatory elements and signal transduction [27,28], and we assume that they are fixed at a very low value (10−4).

View this table:
Table 3.

The segment polarity ABN model time delay assignment and interpretation. Time delay ranges were estimated from earlier studies [19,26–28].

We note that there is some in vivo experimental evidence for a long half-life (75 min) for conversion of CI to CN [19]. Incorporating this value into our model without generating cases where CI and CN are on simultaneously in some cells, which appears to be inconsistent with experiments [19,10], requires either (i) assuming that the activation time for CI from ci is greater than 75 min or (ii) assuming that the functional threshold for CI is substantially lower than that for CN. For the present study, we choose the latter option, as the former yields solutions only for extreme values of other time delay parameters. We note, however, that all of the successful ODE model [9] parameter sets we examined correspond to short CI–CN conversion times, and we have checked that using a short time in our model does not substantively alter the results.

3.5. Initial conditions

Pair-rule genes generate initial transcripts of wg and en in columns 2 and 3, respectively [16] (see figure 3a for column labels). In addition, the pair-rule gene slp is expressed prior to the developmental stages we study and maintains its expression throughout our time window in columns 1 and 4 [21]. In correspondence with the initial conditions employed in the ODE models, we assume that the segment polarity proteins are not active until, at t = 0, they reach some critical concentration and begin to affect their targets, and that the en, wg and slp transcripts preloaded by pair-rule genes decay in the absence of endogenous production. The initial conditions for the ABN model therefore specify that en, wg and slp are on for Embedded Image in the appropriate columns and that Embedded Image, Embedded Image, and Embedded Image turn off at t = 0 (due to deactivation of pair-rule regulators not explicitly included in the model). To match the initial conditions of the ODE model, B is assumed to switch from off to on t = 0 as well. The dynamics then proceeds as explained above.

3.6. Limitations of the autonomous Boolean network approximation

An ABN model is clearly a simplification of the full dynamics of the real system. One test of its suitability for modelling systems with continuously variable concentrations is to check for whether it can reproduce the attractors and transients of a relevant ODE system. By integrating the Ingolia ODEs using Mathematica code based on Ingeneue [10], we identified several parameter sets that produce the desired target pattern from a given initial condition. We then binarized the ODE time series using thresholds derived from the ODE Hill function parameters [29] and asked whether a set of ABN time delays can be found such that the ABN trajectory matches the binarized ODE time series.

In the Drosophila model, each node in the network only switches once or twice before the fixed point is reached and it is possible to get very good ABN approximations by adjusting each relevant time delay to fit the one transition that it governs. A better test, however, is to ask whether the ABN time delays can be found without relying on fitting the entire transient, using instead only local information about each link, then to see whether the ABN accounts well for the full transient and correct final pattern.

To compute the time delays associated with the link eij, we considered the input links to both i and j. Let Embedded Image be the set of sources for i and Embedded Image be the set of sources for j not including i itself, and let Embedded Image be the ODE variable corresponding to Embedded Image for each node h. We determined the time delay Embedded Image by holding all Embedded Image and Embedded Image at their saturated values that yield Embedded Image and Embedded Image, then switching one of the Embedded Image instantaneously so that Embedded Image rises above its threshold value, which causes Embedded Image to rise above its threshold. Embedded Image is taken to be the time between these two crossings. For the equations used by Ingolia, the result does not depend on which Embedded Image is taken to be the step function (see the electronic supplementary material for further details).

We found that this procedure is sufficient to capture the ODE dynamics in the Drosophila network if three conditions are met. First, the ODE must not have a threshold parameter so low that a target crosses its threshold before its source crosses its own threshold, in which case, the extracted time delay is negative. Some sets of ODE parameters classified by Ingolia as working (see the electronic supplementary material, §B.2) do have this property. Second, the success of the ODE must not depend on a variable that rises above the threshold required to turn on one of its targets but then plateaus before getting high enough to turn on another target. Such a case would require assigning three distinct values to the variable in a discrete model [2], or adding an auxiliary binary node that carries the relevant information. Third, the ODE must not depend on a variable with a very low threshold barely crossing the threshold before decaying back below it. In such a case, the time delay extracted by beginning from the saturation value is too large, which can lead the ABN to the wrong fixed point. If the transition in question occurs in only one context, the ABN parameter can simply be adjusted to give a good account of the dynamics. If however, the same variable does decay from its saturation value at a different time or different cell, or in a perturbation experiment, then there may be a conflict that cannot be resolved within the ABN without introducing an auxiliary node.

Figure 4 shows an example of a match between the ABN and ODE dynamics for a case that meets the three conditions. The key point is that the ABN model can clarify the mechanism by which the system achieves its goal. For a system in which positive time delays can be measured by isolating individual local motifs, the ABN model can either account for the dynamics or indicate a need for modelling of ternary effects or sensitivity to low activation levels.

Figure 4.

Boolean dynamics match closely the time series of the corresponding ODE model. Each panel corresponds to the designated gene and a given column. The labels 2a, 2b, 3a and 3b refer to the column numbers in figure 3. Dashed lines indicate thresholds for binarizing the ODE concentration levels.

4. Analysis of the autonomous Boolean network Drosophila model

By treating the ABN model directly, without reference to the ODE, we assessed its ability to reproduce the experimentally observed segment polarity gene expression patterns described in previous studies. For the network of figure 2d with the initial conditions defined earlier, we randomly sampled 104 parameter sets. The 26 time delays were sampled from uniform distributions within ranges shown in table 3. Time delays on links representing fast processes, such as protein binding, were fixed at a very low value (10−4) in all parameter sets. The SPR thresholds of node i were set toEmbedded Image 4.1where V is the set of all target nodes of i, and l = 1 for rejecting positive pulses and l = 2 for rejecting inverted pulses (dips). Of the 104 sampled parameter sets, 3652 produced the target patterns of en, wg, ci, hh, CI and CN discussed in von Dassow et al. [10] in the absence of cell division. All 3652 sets also produced the correct pattern of en, wg and hh discussed in Ingolia [9] when cell proliferation was incorporated. Furthermore, without cell division, a comparison could be made with the en, wg and hh mutants studied by Albert & Othmer [4]. For all eight nodes common to our model and [4] (wg, WG, en, hh, HH, ci, CI or CIA, and CN or CIR), all 3652 of our working sets produced the same final patterns as [4]. In addition, for many parameter sets that captured the en mutant pattern, hh was expressed transiently, consistent with the phenotype described in Tabata et al. [30].

4.1. Timing constraints

The ABN framework allows for analytical determination of some regions in parameters space that lead to successful pattern formation. For example, given the Boolean logic functions specified in table 2 and initial conditions specified in §3.5, we derived a set of sufficient conditions on the time delays for guaranteeing the production of the desired final pattern. We use the abbreviated notation from table 1. Deriving all of the necessary constraints is a complex problem that is beyond our present scope.

Independent of timing considerations, the Boolean logic of the network (table 2) ensures that the desired final state (figure 3c) of the full multicellular system is a fixed point of the dynamics if it is reached with no pending events in the queue. We seek conditions on the time delays that ensure the transient generated from the given initial conditions leads to this fixed point. The conditions take the form of inequalities that ensure an ordering of events that leads to the correct pattern. One such constraint isEmbedded Image 4.2which ensures that the repression signal from slp arrives in time to prevent neighbouring WG from turning ON en.

The full set of conditions, together with a proof that they are sufficient, is given in the electronic supplementary material. The strategy of the proof is to identify positive feedback loops that act as bistable switches to control the fixed point pattern, and then determine conditions that force each loop to hold its proper state (figure 5). These are the loops identified also by Ma et al. [12] as constituting the fundamental structure of the network [12]. For each steady-state of each loop, the configuration will hold for all time if (i) certain external inputs to the nodes in the loop are held constant at appropriate values and (ii) there are no pending updates propagating on any links in the loop. Straightforward inspection reveals that when condition (ii) is met, the following configurations are stable for the enslp loop consisting of Embedded Image and the cross-cellular wgen loop consisting of Embedded Image.

  • L1: In the enslp loop, Embedded Image and the input B is always ON.

  • L2: In the wgen loop, all nodes are ON except s and Embedded Image, and Embedded Image is always ON.

  • L3: In the wgen loop, all nodes are OFF except s and Embedded Image, and B is always ON.

Figure 5.

Final pattern of expression in four columns surrounding the wg–en boundary. In each column, all the cells are depicted as being collapsed into a single cell. Ovals, squares and diamonds represent mRNA, proteins and processors, respectively. Filled colours indicate elements that are turned on. Element B is always on by definition. Dashed lines indicate signalling links between two adjacent cells in the same column, which have identical expression patterns. Black links show interactions required for establishing the final pattern. Thick links indicate self-sustaining feedback loops that are useful for finding sufficient conditions on the time delay parameters.

To guarantee the correct final pattern, it suffices to ensure that the initial dynamics drive the loops into the correct steady states in appropriate columns, which can be done through the straightforward (but tediously technical) reasoning presented in the electronic supplementary material.

4.2. Connectedness of the space of successful parameter sets

From an evolutionary perspective, it is important to know whether the domain of working parameter sets divides into disconnected regions, which would then represent distinct strategies for hitting the target pattern and suggest that viable mutations would not explore the full space. We note that Dayarian et al. [29] have considered a related issue, the characterization of the domain of parameter sets that can hold the final pattern without regard to the question of whether the pattern would be reached from the given initial conditions. Here we focus explicitly on the set of time delay parameters that produce a successful transient, given that the network logic is capable of holding the final pattern.

To test for the presence of such disconnected regions, we randomly chose 100 of the 3652 working sets and for each pair checked the pattern of en, wg and hh produced (with cell division) for 48 evenly spaced points along their interpolating line segment in parameter space. For some pairs, some of the 48 points failed to converge the correct target pattern, indicating that the space of working parameter sets is not convex.

We then constructed a graph in which each node corresponds to one of the 100 working parameters sets, and an edge is placed between all pairs of nodes for which all 48 interpolating points produced the correct pattern. Figure 6a shows the adjacency matrix for this graph. A red pixel indicates an existing edge; a black pixel indicates that at least one point on the interpolating line segment failed to produce the correct pattern. The fact that there are continuous paths of red pixels connecting the left edge to the bottom right corner reveals immediately that the graph consists of a single connected component. This result strongly suggests that the full space of working parameter sets forms a connected region, and therefore that the full set can be explored via sequences of small mutations.

Figure 6.

Geometry of the space of working parameter sets and robustness of the segment polarity network. (a) Adjacency matrix for 100 parameter sets randomly selected from the total 3652 working sets. Red pixels indicate that all 48 interpolating points on the edge connecting the two parameter sets produce the correct pattern, while black indicates at least one failure. (b) The fraction R of runs that produce successful pattern formation as a function of noise amplitude α. Two thousand independent simulation runs are used to compute R for each data point. (c) Details of upper left corner of (b) showing the threshold R* = 0.99 for developmental success. (d) Distribution of robustness (Embedded Image) values for the 3652 working parameters sets. (e,f) Variation of the robustness over two triangular regions in parameter space interpolating between three working parameter sets. Each point represents a parameter set and the colour indicates the robustness Embedded Image using the scale of (d), with grey indicating parameter sets that do not produce the correct final pattern even for α = 0.

4.3. Robustness of the Drosophila segment polarity network

To characterize the robustness of a given model, we introduced noise of amplitude α by setting the time delay for a particular update event toEmbedded Image 4.3where the random variable Embedded Image is drawn from a uniform distribution on the interval Embedded Image. For every update occurring after those specified by the initial conditions, a freshly sampled Embedded Image determines Embedded Image for each update triggered by the current one. In order to ensure consistency with equation (4.1), all SPR thresholds are reduced by a factor of (1 − α).

For a parameter set that produces the correct pattern for α = 0, non-zero α may lead to the wrong pattern or to extended oscillations in a given run. We define R(α) as the fraction of independent runs that produce a fixed point with the correct pattern. R(α) typically decreases monotonically with increasing α, as shown in figure 6b. In one of the 10 cases we examined, however, R increases substantially with increasing α after an initial rapid drop. The mechanism underlying this unusual behaviour is not yet understood. We define Embedded Image as the largest α for which R(α) exceeds a threshold value Embedded Image, with higher values indicating that the patterning is more robust against timing fluctuations.

It has been estimated that the normal detection rate of mutations causing visible phenotypes is about 2.5 × 10−4 in Drosophila melanogaster [31]. For our study, we assumed conservative Embedded Image as a condition for developmental success. For each parameter set, we search for Embedded Image using an algorithm that assumes Embedded Image is a monotonically decreasing function, consistent with the behaviour displayed in figure 6c. Figure 6d shows the distribution of Embedded Image values for our 3652 working parameter sets. The most robust cases score as high as Embedded Image.

The nature of variations in Embedded Image across the space of working parameter sets is illustrated in figure 6e,f. Each figure shows a lattice of interpolating points in a triangle defined by the three working sets corresponding to points A, B, and C. Colours match the scheme displayed in figure 6d. The robustness varies smoothly over the regions where it is non-zero; the grey protrusion confirms that the set of working parameters is not convex. More extensive data presented in the electronic supplementary material shows that figure 6f is typical in that points of high robustness lie on extended plateaus. Thus, the evolutionary exploration of the space can proceed without necessarily traversing regions of low robustness.

5. Discussion

The theoretical relevance of time delays in regulatory networks was first emphasized by Thomas [32]. Our analysis suggests that the important timing information can be adequately represented using binary variables.

We have also shown that the essential features of a working ODE model can be represented by ABN time delay parameters. In other words, the working ODE models are those in which the implicit time delays satisfy constraints that are explicitly expressible within the ABN framework.

Noting that the use of ODE models in this context represents a highly phenomenological approach to begin with, we suggest that for the purposes of understanding developmental processes controlled by regulatory circuitry (as opposed to external gradients), the Boolean model is just as true to the biology and more amenable to conceptual analysis. Moreover, the biochemical kinetic parameters entering ODE or Gillespie models are often difficult to measure in vivo; the time delays appearing in the ABN may be more easily accessible experimentally. While the time delays for each link are not necessarily easier to measure directly, databases that integrate in vivo spatio-temporal gene expression dynamics provide relevant information. For example, the sea urchin endomesoderm network database, represented in BioTapestry [1,33], allows us to see the times when a gene changes state and when its targets change state. Time-resolved expression datasets are also available for other systems showing genes that switch ON and OFF in response to their regulators [34,35].

We have explored several geometric features of the space of time delay parameters that lead to the correct pattern. The different working parameter sets identified within our model can be thought of as individuals in a population of embryos. While they all develop normally, genetic variations lead to slight differences in the rates at which different alleles perform the function of a given gene. The fact that the space is connected means that it is possible to evolve continuously from any working set to any other without losing function. Connectivity in this sense is complementary to the connectivity in the space of network architectures studied by Cotterell & Sharpe [36]. In their case, a given architecture may exhibit different behaviours for different parameter values. One assumes that it is possible to pass from one parameter set to another for a fixed architecture and then asks whether the different architectures are connected. We focus instead on the connectivity of working parameter sets within a fixed architecture.


We thank David McClay and Sam Gong for useful conversations. This work was supported by NIH grant no. P50-GM081883 and NSF grant no. DMS-10-68602.

  • Received July 18, 2012.
  • Accepted September 8, 2012.


View Abstract