We present a formal calculus, termed the chemtainer calculus, able to capture the complexity of compartmentalized reaction systems such as populations of possibly nested vesicular compartments. Compartments contain molecular cargo as well as surface markers in the form of DNA single strands. These markers serve as compartment addresses and allow for their targeted transport and fusion, thereby enabling reactions of previously separated chemicals. The overall system organization allows for the set-up of programmable chemistry in microfluidic or other automated environments. We introduce a simple sequential programming language whose instructions are motivated by state-of-the-art microfluidic technology. Our approach integrates electronic control, chemical computing and material production in a unified formal framework that is able to mimic the integrated computational and constructive capabilities of the subcellular matrix. We provide a non-deterministic semantics of our programming language that enables us to analytically derive the computational and constructive power of our machinery. This semantics is used to derive the sets of all constructable chemicals and supermolecular structures that emerge from different underlying instruction sets. Because our proofs are constructive, they can be used to automatically infer control programs for the construction of target structures from a limited set of resource molecules. Finally, we present an example of our framework from the area of oligosaccharide synthesis.
Living systems are unique in that they integrate molecular recognition and information processing with material production on the molecular scale. The predominant locus of this integration is the cytoplasm, where a multitude of biochemical compounds are highly organized in vesicular compartments that co-locate reactants of desired reactions, whereas separating those of undesired reactions. Surface markers on these compartments are used for vesicular trafficking, as well as vesicle budding and fusion, thereby allowing for the fine-tuned control of biochemical reaction cascades [1–3].
The desire to employ this complex molecular organization in next-generation chemical synthesis has led to various studies on supermolecular compartments as nanoscale ‘bioreactors’ [4–7]. Several pathways for vesicle fusion [8–10] have been suggested. In particular, Hadorn et al. [11–13] employ short single-stranded DNA tags for the specific interaction of various reaction compartments. In the European Commission-funded project MATCHIT [14–17], we are working on creating an artificial cellular matrix that seamlessly integrates information processing and material production in much the same way as its biological counterpart: the MATCHIT framework employs DNA-addressable bioreactors (termed chemtainers in the following), to mimic the topological organization of the cytoplasm. DNA tags open up for DNA computing operations akin to the ‘key–lock’ information processing mechanism found in biological protein–protein interactions. This form of molecular information processing is governed by autonomous chemical reaction kinetics and allows for tight integration of chemical production and information processing. In particular, we here employ the DNA join-and-fork gates to implement Boolean computation .
Whereas natural cells ultimately employ genomic information to regulate the resulting material production network, we envision programmable electronic control by embedding chemtainers into mechanoelectronic microsystems [19,20]. In such devices, dedicated microfluidic channels can be designed for vesicle generation , DNA tag insertion, tag and chemtainer trafficking , specific or unspecific fusion , vesicle rapture and encapsulation . Possibly paired with real-time feedback, this allows for control of chemical reaction cascades at the molecular level. This interplay of autonomous molecular computation and external electronic chemtainer manipulation enables the programmatic set-up of complex reaction cascades that allow for automated chemical production of a desired target molecule from a limited set of chemical resources.
Here, we propose a formal calculus to describe system states and transitions in an abstract representation of the artificial cellular matrix. We call this the chemtainer calculus. In essence, the chemtainer calculus allows us to describe the organization and manipulation of chemical compounds down to the molecular level in possibly nested, addressable reaction compartments. Sets, or more precisely, multisets of such compartmentalized reagents are situated at locations, e.g. at positions of a microfluidic machine. Several calculi for compartmentalized reaction systems have been proposed previously [25–27]. Our syntax for nested and tagged compartments follows relatively closely the one from brane calculi . Those calculi offer transitions for compartment transformations, such as fusion and splitting as well as molecular reactions. Transitions are integral components of the system state, and the calculus employs a process algebra to deduce the set of possible transitions from the current state. In the chemtainer calculus, we additionally define an explicit transition system that operates on states externally. This organization better reflects the difference between the chemical state transitions and external mechanoelectronic control.
The calculus was designed for the architecture described in reference . The tool chain for compilation of high-level directives of the chemtainer calculus into eventual electrode configurations to control the mechanoelectronic microfluidic hardware is presented in reference . Yet, the general framework discussed here is not tailored towards one specific technology. For example, instead of microfluidic channel segments, locations could equally denote test tube arrays, or wells in a high-throughput chip.
In this article, we first design a syntax that allows us to express the rather complex system states—or rather system arrangements—that we encounter in the artificial cellular matrix: located multisets of tagged and possibly nested chemtainers that carry cargo. Figure 1 schematically depicts an example of a possible state. We then introduce transitions between states that model possible changes of the physical state. Some of these transitions, e.g. DNA hybridization, capture autonomous chemistry, whereas other transitions are thought to be induced by control operations of the artificial cellular matrix. We proceed by defining a simple programming language that consists of sequences of parametric instructions that induce transitions on the system state. In §3, we discuss how the underlying instruction set of the chemtainer calculus affects the set of constructable objects, and we give our main result (theorem 3.3) that a set of eight instructions is sufficient to construct any well-formed target state. Because we use a non-deterministic semantics, our proofs demonstrate only the possibility, not the probability of creating a certain desired state. In §4, we apply our framework to the area of chemical manufacturing, where we present an algorithm for synthesis of branched oligomer structures by means of controlled chemtainer fusion and DNA computing.
2. The chemtainer calculus
2.1. System state
Objects of the calculus are molecules, address tags and chemtainers. Chemtainers represent compact microreactors, such as vesicles, oil droplets in water or water droplets in oil, DNA nanocages, etc. They can be decorated with address tags, and can hold chemical content and even other chemtainers within them. Here, we do not distinguish between different chemtainer types, but we could imagine a type system for chemtainers to specify their physical properties. All components of the system are situated at specified, discrete locations.
Our notion of space is rather simplistic: we assume a fixed set of locations; each component of the system state is situated at a certain location; we will introduce transitions that allow collocated components to interact (while objects at different locations may not interact), and we will introduce transitions that induce transport from one location to another by means of state transitions. Note that we currently do not introduce a specific topology on the set of locations (e.g. to move from one location to another, one might need to cross a third one), but such an extension is possible.
For countable index sets , and , let denote the set of locations, some set of molecules, and a set of DNA tags. We take if tags are explicitly given. Note that might intersect with or not.
The following context-free tree grammar Gs for system states formalizes the above verbal interpretation: 2.1 2.2 2.3with the start symbol . Here, the vertical broken bar is a metasymbol that indicates syntactic choice. We denote the empty state by the symbol . The binary operator denotes localization, where xi is a location identifier; the binary operator denotes composition of locations, whereas denotes composition within locations; 0 denotes the empty local state; * is the Kleene operator and signifies no or arbitrarily many repetitions of its argument. We write 2.4and 2.5with the empty tag to explicitly list tag and gate sets.
Following convention, we denote chemtainers by half-moon parentheses that enclose the chemtainer content with address tags associated with the left parenthesis [27,29]. Finally, the relation denotes DNA gates, which will be explained later. We write , , and for the grammars with the start symbols , , and , respectively.
To generate a state with the grammar , being a non-terminal, one recursively applies the above production rules starting from until the resulting state tree contains no more non-terminal symbols. The language is the set of all possible states that can be generated from the start symbol . In what follows, , , and denote arbitrary states of the respective languages.
Informally, we interpret states of to signify the following: the global system state is a composition of local states, where each local state has a location identifier xi and an associated local state description. The latter are compositions of molecules mj, gates qk as well as chemtainers with content P and gate set q*; gate sets, in turn, are compositions of gates qk as well as individual tags sk. See figure 1 for an example of a global state.
In order to conform with this interpretation, we introduce the following structural congruence relation (an equivalence relation with additional axioms that guarantee substitutivity of equals in context) over , , and : 2.6 2.7 2.8 2.9 2.10 2.11 2.12 2.13 2.14 2.15 2.16 2.17 2.18 2.19 2.20 2.21 2.22 2.23 2.24 2.25 2.26 2.27
Thus, states that belong to the same equivalence class of ≡ represent the same physical state, even though they might be syntactically different. Equations (2.6)–(2.18) induce monoidal structures on , , and , where ‘:’ distributes over ‘ + ’, and equations (2.19)–(2.27) guarantee that we can substitute equals in any context.
We introduce some notational shortcuts. We write , and we introduce the notation 2.28which emphasizes the relation to mulitsets. We also allow ourselves to drop the explicit notation of empty locations by defining 2.29With these definitions, the example state depicted in figure 1 can be written as 2.30
We are now ready to introduce a transition system that codifies the possible outcome of both autonomous chemical reactions as well as externally induced operations that manipulate the system state. Autonomous reactions, in turn, are either arbitrary chemical reactions among molecules, which we refer to as application chemistry, or the working of DNA computing operations.
2.2.1. Application chemistry
Reactions are just transitions of the form 2.31where and are multisets of reactants and products with stoichiometric coefficients νi and μi.
We ensure that chemical reactions can occur among any co-located reactants that are not separated by chemtainer walls 2.32 2.33 2.34 2.35
We extend structural congruence to transitions by defining. 2.36and 2.37
In its ground form, the chemtainer calculus does not offer any transitions of the form (2.31). Instead, the user of the calculus is assumed to provide a set of autonomous transitions as axioms.
Note that there is no particular need to restrict to be finite. Our calculus can be applied without adaption to an infinite set of molecules, including polymers or branched structures—an example of which will be presented in §4. We also emphasize that we explicitly allow and to intersect. If they do, tags can occur in the reactant and product sides of chemical reactions, such that tags can be altered chemically.
2.2.2. DNA gate transitions
Here, DNA computation is implemented by join-and-fork gates  that irreversibly release a given set of output strands once they have bound a set of inputs , written . Note that does not physically contain the strands . Rather it means that the gate accepts those strands as input.
If a gate is co-located with all its input tags, it can release its output tag 2.38
We have to ensure that these transitions can occur between two co-located complementary tags in any context, unless they are separated by a chemtainer boundary. This is allowed by introducing the following inference rules 2.39and 2.40
Examples of gate transitions are depicted in figure 2. Owing to the inferences (2.32)–(2.37), it is guaranteed that DNA computing operations perform in any context.
2.2.3. Induced transitions
We now introduce transitions that model the controlled manipulation of states through operations of the artificial cellular matrix. We introduce eight such operations, responsible for feeding and moving of chemtainers and tags, controlled fusion of chemtainers, encapsulation of material into chemtainers, chemtainer bursting and flushing of content. Our exact transitions are motivated by the functionalities of the underlying microfluidics architecture , but they also serve as a guideline as to how alternative transitions in other hardwares can be codified. The impact of the exact instruction set on the constructive capabilities of the resulting calculus will be discussed in §3.
Induced transitions are of the form where I is a parametrized name. We first give the formal definition of these transitions (schematics are shown in figure 3) and will afterwards comment on their particular choice. In §2.3, we will introduce a small programming language based on these operations. 2.41 2.42 2.43 2.44 2.45 2.46 2.47 2.48 2.49Transitions (2.41)–(2.49) operate strictly on the top level of the state, meaning that we do not allow, for example, tagging or fusion of chemtainers that reside within another chemtainer. To ensure this, we simply do not provide inference rules that would allow us to derive such transitions. However, the above transitions are allowed to operate in any context through the following inference 2.50We now comment on the individual choice of operations and their transitions. feed allows one to feed chemtainers that are equipped with a certain number of molecules of a single molecular species into a location. This is the only means to provide elements of . Similarly, feed_tag allows one to feed a certain number of identical tags into a location. tag has been modelled to unspecifically attach some available tag to some container at a given location. No means are given to specify which tag or which chemtainer is transformed in this operation. The reason for this choice is that the actual linking of tags to a chemtainer involves linker molecules (e.g. biotin and streptavidin) that are common to all chemtainers and tags independent of the actual tag sequence or chemtainer type.
We provide a flush command to counteract the effect of feeding, where flushing simply removes content at a location. The operation is most easily implemented by literally flushing the material out of the system.
The move command allows one to move a specific tag, or a chemtainer decorated with a specific tag from one location to another. Notably, this allows for content separation: for example, the state will transition into the state in response to the operation . In a certain interpretation, this can be understood as chemtainer docking, where the locations correspond to volumes of bulk fluid that themselves move along a microfluidic channel decorated with the complementary tag (here τ). Assuming that both chemtainers are initially at location x which is decorated with τ, the complementarily tagged chemtainer is allowed to temporarily hybridize to the channel wall, whereas the bulk volume—and thus the locations—physically move along the channel. Non-matching content at the logical location x will remain at that location, although its physical position has changed. Matching content, however, will be at a different location, e.g. y and can be released into the corresponding bulk.
The operations fuse and wrap are more specific to the artificial cellular matrix and enable fusion and (vesicle) encapsulation. fuse induces unspecific fusion of co-located chemtainers, leading to mixing of both chemtainer surface and content. Besides unspecific fusion, tag-specific fusion of vesicles has been achieved experimentally [30,31]. A version of the chemtainer calculus featuring such tag-specific fusion is presented in . wrap enables encapsulation of material into vesicles, e.g. using interface transfer: it is assumed that the material at location x resides in an aqueous phase which is first immersed into an oil phase where it forms surfactant-coated water droplets—providing the inner layer of the future vesicle. Next, the water droplet passes a surfactant covered oil–water interface which provides the outer layer of the vesicle membrane. If the original state at x contains chemtainers, these are equally transferred into the interior of the new chemtainer, leading to nested chemtainers . Microfluidic implementations of this protocol are presented in . We will later analyse the power of this transition system with and without the wrap operation.
Complementing encapsulation, we provide a burst operation that releases the content of a chemtainer. We envision that burst does not fully dissolve the chemtainer, but rather ruptures the chemtainer wall temporarily, for example by means of a heat or salt shock. After bursting, the chemtainer will reconstitute itself and remain in the system, available for further processing.
In order to capture obvious implementation constraints of the physical machine, some of the operations are restricted to subsets of , e.g. feed(x) might require . It can be shown that the constructive power of the calculus is not affected by this limitation, as long as move, tag and burst are allowed to operate on the entire set . Likewise, we could constrain the move command such that the target location has to be in a certain neighbourhood set of the source location in order to capture, e.g. the channel topology of microfluidic devices.
We wish to clarify one point that might be counterintuitive. Assume, for example, the state . What will be the action to the operation move(σ, x, y)? Intuitively, we might expect that the system transitions into the state y : 10σ, meaning that all instances of σ have been moved. However, x : 10σ is structurally equivalent to owing to the distributive relation (2.18). To this state, we can apply the transition equating S = x : 9σ, which results in . Thus, move, and by similarity all other transitions, will operate only on a single instance. There is a simple way out, however: in order to move all 10 instances in the example above, we can simply execute the move operation 10 times in sequence. Our reason for this semantics is that it allows for consistent assignment of stochastic rates in a potential stochastic semantics [34,35].
In §2.2, we have informally given parametric names to transitions, such as move(s, x, y), tag(x), a.s.o. We next interpret parametric names as elemental operations of a programming language that can be used to operate the artificial cellular matrix. We do this by first defining another formal language, the formal language of chemtainer programs, and then expressing the semantics of this language by means of state transitions, the latter employing the language of chemtainer states.
As with states, we define programs by a recursive grammar, which we initially keep as simple as possible 2.51A program π can be either the empty program , or a (parametric) instruction followed by another program (the remainder), with instructions being separated by semicolons. Instructions I can be any of the formerly introduced parametric names feed(xa, mj, ν), feed_tag(xa, qk, ν), move(sk, xa, xb), tag(xa), wrap(xa, xb), fuse(xa), burst(xa) and flush(xa) with and . Each instruction I has a set of associated transitions, and we write I : S → S′ to denote any such associated transition (as was already done in the previous part).
Strictly, the concatenation operator ‘;’ is only defined to concatenate single instructions with programs. In order to concatenate arbitrary programs π and π’, we introduce an additional operator ‘;’ for program concatenation through the following recursive definition 2.52and 2.53Because both concatenation operators have different domains, it is clear from the context whether the concatenation refers to single instructions or programs. Therefore, we will drop the syntactic differentiation and use the symbol ‘;’ for both operators.
A program π together with a system state S is expressed by the tuple and we write to denote that the program π can transform the state S into the state S′, called a result of π. Results are defined by the following structural operational semantics : 2.54 2.55 2.56In words, given a program that starts with instruction I with some associated transition from S′ to S″, the second rule allows us to transform the system into S″, thereby reducing the original program to whatever remains after execution of I. If program execution encounters an instruction with no associated transition that is applicable with the current state S′ (e.g. tag is asked to operate on an empty cell), the third rule allows us to skip the instruction without modifying the system state. This latter rule primarily addresses erroneous conditions and ensures that a program does not get stuck, simply because the respective molecules are not present at some point of the execution. Using non-deterministic semantics, rule (2.56) might even be used when rule (2.55) is applicable, thereby skipping instructions of the program sequence. This could be remedied with a stochastic semantics that assigns an arbitrarily small transition rate to rule (2.56).
We can now distinguish between autonomous and induced transitions, by extending the semantics with the rules 2.57and 2.58where S → S’ and S’ → S’’ is an autonomous transition inferred through (2.35). These rules allow autonomous transitions to occur at any time during program execution without affecting the program.
We extend structural congruence to program application using the inference rule 2.59
Lemmas 2.1 and 2.2, proved in the electronic supplementary material, allow for uncomplicated composition of instructions.
Program application is not context sensitive: if the application of program π on the initial state S can lead to the result S′, the application on the composition can lead to the result , containing S′ as a substate. is an invariant of π: 2.60
Programs can be concatenated, and the start configuration of the second program will be the result of the first program: 2.61
2.4. Extension: control flow directives and parallel execution
We could easily define control flow directives known from standard programming languages, such as branched execution and loops. This is particularly interesting in the context of feedback control. Let 2.62be a conditional over the system state. For example, f could measure whether a fluorescent signal at a certain location exceeds a certain threshold. We could then extend our grammar to 2.63to allow for conditional loops, where the semantics of the while statement are given by 2.64 2.65As evidenced with these definitions, there is nothing particular about control flow directives in the chemtainer calculus compared with other imperative languages. In order to introduce more elaborate control flows, the reader can simply apply standard text book procedures.
To make the above definitions more familiar, we employ the chemtainer calculus in order to decorate chemtainers with addresses and use those to mix the contents of two chemtainers.
The left column shows which parametric instruction is executed, and the right column shows the effect of its associated transition. Read from top to bottom, the left column therefore simply gives a program that can construct some desired state denoted on the right. 2.67
To use this sequence later on in programs, we define it as the parametric macro resource(x, σ, P). Now, we use this macro to create two chemtainers and fuse them 2.68
Keep in mind that the right column only shows a possible response of the system state to the program on the left, not the necessary response. In general, because of the ambiguity of transitions, the system could have transitioned into alternative states in response to the same program.
3. Instruction sets and their constructive power
The instructions given in equations (2.41)–(2.49) serve as examples of possible operations, rather than the final specification of an artificial cellular matrix. Potential real-world implementations might involve only a subset of the operations mentioned, or might enable other means of chemtainer manipulation. It then becomes interesting to determine whether different instruction sets are equivalent in what they can construct or whether the set of constructable states differs.
In order to approach the problem formally, we define the constructive closure of the language as the set of states that can be constructed from the empty state through some program . Formally 3.1Similarly, we define the reset closure as the set of states that can be transformed back into the empty set. Formally 3.2Of primary interest is of course the case where the machine can be reset from any state. In this case, one can formally transform any initial state into any target state, thereby programming sequences of states: if and , the concatenation π; π’ will transform S into S’: . This result, however, is mostly of theoretical interest, as an interim reset of the machine might not be desired when programming such state sequences. In a practical application, one would define a distance measure between states and ensure that the transition path length of programs is minimized, thereby preferring for example a single move instruction over a sequence of flush;feed_tag;move instructions with equal outcome.
We consider the following three incremental instruction sets 3.3 3.4 3.5
Let , and be the sets of programs over the respective instruction set. We will show the following relations between the corresponding constructive and reset closures: 3.6
If and only if all locations are flushable, i.e. if , the right-most chain of relations simplifies to 3.7
Diagram (3.6) can be decomposed into several theorems, for which we need to define the nesting level of a state S or local state P as the following 3.8 3.9 3.10 3.11 3.12 3.13 3.14 3.15
With this definition, we can state the relations of diagram (3.6) more precisely:
The constructive closure of is the set of states that do not contain nested chemtainers nor free floating molecules other than tags: 3.16
The constructive closure of is the set of states that do not contain free floating molecules other than tags: 3.17
The constructive closure of is the entire set of states: 3.18
The reset closure of all instruction sets is limited to states that do not contain free floating molecules other than tags in non-flushable locations:
Proofs for all theorems are given in the electronic supplementary material. Note that these proofs are constructive: for any given target structure we can automatically generate a program that will construct S. This observation forms the core of a compiler that is able to generate a sequence of transitions for building a given target structure.
Theorems 3.1–3.4 assume that the entire set of molecules can be fed as resources. We now address the case where only a subset of compounds can be directly provided by feed and where chemical reactions are employed to produce compounds outside .
For some reaction with we can employ the program 3.20to initiate the chemical production of , where m3, m4 are not necessarily elements of . Applying this procedure repeatedly allows us to obtain successively bigger subsets of . However, without means for content separation, the set of producible states will be constrained by the stoichiometries of the reactions. If the artificial cellular matrix would offer content separation, e.g. by means of electrophoresis , we could encode this with the induced transition 3.21
where the exact partitioning of chemtainer content into P and P’ would depend on its electrophoretic mobility. Continuing the above program, we could now derive 3.22
The tagged chemtainers are ready to be used as input for subsequent reactions. This recovers the constructive power of the chemtainer calculus, provided that all elements of can be constructed by some chain of reactions. However, in order to maintain a well-defined mapping from DNA tags to chemtainer content, the separate operation would need to assert the correct redistribution of surface tags along with the content separation such that products can be unambiguously identified by their tags.
4. Programmable reaction cascades
Here, we demonstrate how our calculus (even with the minimal instruction set Imin) can be used to integrate chemical production with molecular computation in order to control programmable reaction cascades. This example is motivated by and closely mimics the synthesis of oligosaccharides in the Golgi apparatus [1,2]. Oligosaccharides are branched, heterogeneous polymers composed of typically 5–10 individual sugar monomers such as mannose, galactose and glucose. This diverse class of biochemicals is involved in various physiological processes pertaining, e.g. to cell–cell recognition, intra- and intercellular trafficking and metabolic modulation . However, their combinatorial richness poses a challenge for chemical oligosaccharide synthesis based on conventional chemical manufacturing techniques .
Biological oligosaccharide synthesis proceeds in the Golgi apparatus by adding individual monomers one unit at a time to specific binding sites of the growing oligomer. Monomers are attached to enzymes that promote the specific binding reactions 4.1
Here, Pi denotes an intermediate oligomer, to which monomer Mi is added at the site j. If binding sites are unique, a given enzyme–monomer complex contains all the information required to build the specific product Pi + 1 from Pi. Prior to these polymerization steps, monomers have to be attached to the respective enzymes 4.2
Chemical one-pot synthesis of a given target structure is challenging, because repetition of bindings sites in the oligomer structure can lead to undesired side products. The number of potential side products can be controlled, however, by forcing some reaction steps to occur sequentially while others are allowed to proceed in parallel . Weyland et al.  present an algorithm that identifies such optimal reaction cascades. For example, assume that the structure shown in figure 4 can be produced with the reaction cascade 4.3 4.4 4.5 4.6
where it has to be ensured that reactions (4.3)–(4.5) occur in isolation and prior to reaction (4.6).
Our strategy here is to employ the chemtainers in order to control the encounter of reactants and hence the order of reactions by encapsulating reactants within chemtainers. Fusion of chemtainers co-locates desired reactants and triggers their polymerization. Simultaneously, DNA gate computation on the chemtainer surface will reflect the change of chemtainer content after reaction. The same DNA computation can ensure that reactions are started if and only if other reactions have been performed beforehand. We use one location x0 for processing and one location xS for storing intermediate products.
We start by preparing chemtainers with enzyme complexes using the program given in equation (3.20). For example 4.7creates the state . See figure 5 for a graphical representation of these steps. We use this algorithm repeatedly to set up the machine in the following state 4.8With this mapping from chemical compounds to DNA tags, we translate the reaction cascade (4.3) and (4.6) into a set of DNA gates 4.9 4.10 4.11 4.12
To carry out reaction (4.3), we feed the gate (4.9) at location x0. We then progressively move chemtainers from xS to x0, initiate their fusion and transport the intermediate product P0 back to xS (see figure 6 for a schematic representation): 4.13
The flush instruction cleans the processing location in the case that the execution of an instruction was skipped because of rule (2.56). This prevents the subsequent fuse operations from operating on chemtainers that where intended to be kept separate.
Repeating the above algorithm with gate (4.10) and tags as well as gate (4.11) and tags produces chemtainers with the intermediate products P0, P1 and P2 tagged with κ, λ and μ, respectively. Eventually, the above algorithm with gate (4.12) and tags κ, λ, μ produces the chemtainer which can readily be moved to some output location.
Note that the above program operates over a finite set of resources, tags and locations, in order to build a compound from a potentially unlimited universe of target molecules, and the mapping between tags and compounds is established only within the controlling program. This demonstrates again the universality of the programmatic synthesis approach.
We have formally introduced the chemtainer calculus which is capable of capturing system states and transitions of an artificial cytoplasm. The chemtainer calculus allows us to describe the organization and manipulation of chemical compounds in possibly nested, addressable bioreactors. Elemental operations of a simple programming language are proposed that allow for the programmatic control of chemtainer transitions. A constructive proof is given that a programming language based on eight elemental instructions is capable of constructing any possible system configuration that can be expressed in the grammar of the chemtainer calculus. This proof can serve as a core component of a compiler for automated program inference. We have presented how our machinery can be used to compile and execute complex chemical synthesis protocols and have given an example from oligosaccharide synthesis which still poses a challenging task for conventional chemical manufacturing techniques. This framework for programmable chemical synthesis demonstrates how molecular computing and chemical production can be integrated in much the same way as in biological systems, for example in the Golgi apparatus.
One might object that chemical synthesis in chemtainers works the same as sequential mixings of reaction solutions in test tubes, which is a conventional methodology. What then is the added values of chemtainers? First, we point out that microfluidics in general (even without employing chemtainers), allows for the programmed set-up of reaction cascades in small volumes. This gives a general advantage over setting up reactions in test tubes either manually or even by liquid handling robots: as pointed out by Füchslin et al. , small spatial dimensions offer the possibility for rapid transport of intermediate compounds among different reaction environments. The accompanying reduction in processing times can be of crucial advantage in synthesis steps where intermediate compounds are only stable for short periods of time. Consequently, compounds not viable in present processing may become interesting candidates for, e.g. catalysis in miniaturized systems. Employing vesicles as reaction compartments allows for the use of even smaller reaction volumes on the femtolitre scale. Second, DNA-addressable reaction compartments can sometimes avoid the need for distinct reaction environments, in the sense that all reactants can be provided simultaneously in the same ‘pot’, but individual reactions are controlled by DNA-mediated vesicle fusion. This ‘automated assembly’ is programmed not in the microfluidic control, but in the DNA tagging of vesicles. Although this has not been exemplified in this paper, we can envision synthesis pathways where DNA tags participate in the reaction cascades, e.g. as aptamers. Folding the DNA tag set with the set of chemicals would allow chemical reactions to directly ‘report’ on their progress, such that, e.g. a DNA tag operation is triggered only after a chemical compound has been consumed. DNA-tagged reaction compartments further allow for the extraction and recovery of unreacted compounds by their unaltered DNA tag, and the extraction of reaction products by their altered DNA tag. Third, encapsulation of compounds into vesicular reaction compartments offers additional advantages for chemical manufacturing in microfluidics. Compounds do not contaminate microfluidic channels as they are physically contained in vesicles. Vesicles can expose a unified physical ‘interface’ (in terms of friction, buoyancy, charge density, etc.) for microfluidic control independent of their content. By altering the composition of membrane molecules, these properties can be altered vastly independently of the vesicle content.
In all our derivations, we have made ample use of non-deterministic semantics: we have proved that there exists a program that is able to induce desired transitions, and is thus able to construct a desired state. We have not taken into account the likelihood of those transitions—especially with respect to possible but undesired side reactions. As this is an important issue in the area of programming chemistry, we have carefully designed our transition system with an extension towards stochastic semantics in mind . Application of these known techniques to the chemtainer calculus would be the subject of future work.
The research leading to these results has received funding from the Danish National Research Foundation through the Center for Fundamental Living technologies (FLinT) and from the European Community's Seventh Framework Programme (FP7/2007–2013) under grant agreements no. 249032 (MATCHIT) and no. 318671 (MICReagents).
Steen Rasmussen, Andrew Phillips, Rasmus Petersen, Rudolf Füchslin, Ehud Shapiro and members of the MatchIT consortium are acknowledged for helpful discussions.
- Received October 25, 2013.
- Accepted March 3, 2014.
- © 2014 The Author(s) Published by the Royal Society. All rights reserved.