## Abstract

The mapping between biological genotypes and phenotypes plays an important role in evolution, and understanding the properties of this mapping is crucial to determine the outcome of evolutionary processes. One of the most striking properties observed in several genotype–phenotype (GP) maps is the positive correlation between the robustness and evolvability of phenotypes. This implies that a phenotype can be strongly robust against mutations and at the same time evolvable to a diverse range of alternative phenotypes. Here, we examine the causes for this positive correlation by introducing two analytically tractable GP map models that follow the principles of real biological GP maps. The first model is based on gene-like GP maps, reflecting the way in which genetic sequences are organized into protein-coding genes, and the second one is based on the GP map of RNA secondary structure. For both models, we find that a positive correlation between phenotype robustness and evolvability only emerges if mutations at one sequence position can have non-local effects on the sequence constraints at another position. This highlights that non-local effects of mutations are closely related to the coexistence of robustness and evolvability in phenotypes, and are likely to be an important feature of many biological GP maps.

## 1. Introduction

Biological examples of genotype–phenotype (GP) maps are extremely wide-ranging. For example, a GP map can refer to the mapping between DNA sequences and their encoded proteins or between RNA sequences and their folded spatial structures. Even GP maps of networks have been considered, such as that of gene regulatory [1] or metabolic networks [2]. The properties of GP maps have been shown to strongly influence evolutionary outcomes [3,4], particularly in systems without direct interactions between genotypes, or epigenetic effects. In general, studying the properties of real biological GP maps is a challenging task due to the extraordinary complexity and size of these maps, which means that only small parts can be captured experimentally. Progress has been made by focusing on abstract or computational models of GP maps, which can be studied exhaustively. One of them is the map between RNA sequences and their secondary structure, a tractable abstraction of the final spatial structure. There exist computational models [5,6] that can predict the secondary structure of small RNA sequences, thereby allowing full coverage of the GP map. Another example is the map between amino acid sequences and their folded tertiary structure predicted by the HP model [7,8], a lattice model that approximates amino acids as being either hydrophobic (H) or polar (P). An even more abstract model is the so-called polyomino model [9,10], for which the map between different tile sets and the resulting assembled structures has been studied [11] as a representative of the formation of protein complexes (quaternary structure) by individual proteins.

The analysis of these GP maps [11–13] has shown that they share several properties which are likely to be universal for all GP maps. All maps show redundancy, meaning that there are many more genotypes than phenotypes and that different genotypes can map to the same phenotype. In addition, the maps show a skewed distribution for the number of genotypes per phenotype, also referred to as phenotype bias, meaning that a few phenotypes are mapped by a large number of genotypes, while many of the phenotypes are mapped by just a few genotypes. A further property shared by the maps is so-called shape space covering, which means that, from any genotype of the GP map, a large fraction of all phenotypes can be accessed through just a few mutation steps. This property was first described for RNA secondary structures by Fontana *et al.* [14].

A number of studies have discussed the concepts of robustness and evolvability in the context of GP maps. For the RNA secondary structure [11,15] and polyomino [11] GP maps, it was found that robustness and evolvability are negatively correlated for genotypes, whereas they are positively correlated for phenotypes. This positive correlation has also been found for the GP map of transcription factor binding sites [16], which can be both robust and evolvable. In addition, several studies [11,13,17] have shown that the robustness of a phenotype scales logarithmically with the number of genotypes mapping to that phenotype.

Quantitative definitions of robustness and evolvability have been put forward by Wagner [15] in the context of the RNA secondary structure GP map. For a genotype, the robustness is defined as the fraction of mutations that do not change the phenotype of that genotype, and the evolvability as the number of distinct alternative phenotypes that are accessible through a single mutation from that genotype. For a phenotype, the robustness is defined as the average over the robustness values of all genotypes mapping to that phenotype, and the evolvability as the number of distinct alternative phenotypes that are accessible through a single mutation from any of the genotypes mapping to that phenotype. The negative correlation found for genotypes is intuitive as a genotype cannot be both robust and evolvable. The positive correlation found for phenotypes is less intuitive, but crucial for evolution as it is an advantage if a phenotype on the one hand is robust against mutations but on the other hand also open to a large number of alternative phenotypes.

The space of all genotypes can be represented as a network in which each node corresponds to a genotype and each link to a single mutation step such that genotypes differing by a single mutation are connected. Since Maynard Smith and Eigen, this representation has been referred to as the protein space [18] or sequence space [19]. The subset of genotypes mapping to the same phenotype is often referred to as a neutral network since mutations between genotypes in this network do not change the phenotype. Wagner [15] states two causes for a phenotype to have a high robustness and evolvability. First, a large connected neutral network as this implies a high connectivity among genotypes with the same phenotype and thus a high robustness. Second, the fact that different genotypes sampled from a neutral network have different alternative phenotypes in their neighbourhood as this implies a high evolvability if a large neutral network spans the space of genotypes.

Recently, an analytically tractable GP map model has been introduced that exhibits all the introduced properties found for GP maps [20]. This so-called Fibonacci model defines the set of genotypes as all possible binary sequences of fixed length. The phenotype of a genotype is defined as the part of the genotype sequence up to and including the first stop codon, which is a pre-defined sequence—in this case two consecutive ones, i.e. ‘11’. This divides a genotype sequence into a constrained and unconstrained part as mutations of the sequence positions preceding the stop codon change the phenotype, whereas mutations of the sequence positions following the stop codon have no effect on the phenotype. The authors state that it is this organization into constrained and unconstrained parts that gives rise to the observed properties. In particular, the authors show that the positive correlation between phenotype robustness and evolvability results from the mutability of the stop codon as this can shift the boundary between the constrained and unconstrained part.

In this article, we further examine the possible causes for the positive correlation between phenotype robustness and evolvability. We do this by introducing two analytically tractable models. The first model follows the principles of gene-like GP maps and can be seen as a generalization of the Fibonacci model. The second model follows the principles of sequence-to-structure GP maps like that of RNA secondary structure. By using reference models in both cases, we show that the positive correlation between phenotype robustness and evolvability depends on the possibility that a mutation in one sequence position changes the sequence constraints at another position—or more concisely, that mutations have non-local effects on sequence constraints.

## 2. Gene-like GP map model

The gene-like GP map model is based on the organization of genetic sequences into protein-coding genes (genotypes) that are transcribed and translated into proteins (a form of phenotype).

The model works with an alphabet consisting of *k* > 2 letters, of which we choose one to be the stop codon, leaving *k* − 1 neutral letters. As the genotypes, we define all possible sequences of length *L* built from the letters of the alphabet, giving *k*^{L} genotypes in total. In analogy to the Fibonacci model [20], we define the phenotype of a genotype as the part of the genotype sequence up to and including the first stop codon (reading from left to right). We also refer to this part of the genotype sequence as the coding sequence. All genotypes that do not include a stop codon are assigned to a single so-called ‘undefined’ phenotype.

Within the genotype sequences, all letters are mutable, allowing changes between phenotypes. In order to determine the effect of a mutable stop codon, we consider a gene-like reference model that is identical to the gene-like model but that considers the first stop codon in a genotype sequence as rigid and allows no mutations of it. Both models are summarized in figure 1.

In the following, we will analyse and compare the key properties of the GP maps of both models. Figure 2 shows plots of these properties determined by an exhaustive enumeration of the GP maps.

### 2.1. Genotype robustness and evolvability

Using the definitions introduced by Wagner, genotype robustness and evolvability slightly differ for the gene-like model and its reference model.

For the gene-like model, consider a genotype with coding sequence length *l*, its robustness *r*_{g}(*l*) is given by
2.1as there are (*k* − 1)*L* total possible one-point mutations of which the (*k* − 1)(*L* − *l*) ones of the letters following the first stop codon do not change the phenotype. The gene-like reference model allows no mutations of the first stop codon. Thus, there are only (*k* − 1)(*L* − 1) possible one-point mutations in total and the robustness of a genotype is given by
2.2The robustness of genotypes with an undefined phenotype is not covered by these equations. Their robustness is in both cases given by *r*_{g,undef} = *r*^{ref}_{g,undef} = (*k* − 2)/(*k* − 1).

For both models, all (*k* − 1)(*l* − 1) one-point mutations of the letters preceding the first stop codon lead to distinct alternative phenotypes. In addition, only for the gene-like model, all *k* − 1 one-point mutations of the stop codon either also lead to distinct alternative phenotypes if there is a second stop codon in the genotype sequence or they all lead to the single undefined phenotype if there is no second stop codon. Hence, for the gene-like model, the evolvability *e*_{g}(*l*) of a genotype with coding sequence length *l* is given by
2.3By contrast, for the gene-like reference model
2.4Genotypes with undefined phenotype have in both models an evolvability of *e*_{g,undef} = *e*^{ref}_{g,undef} = *L*.

Expressing the genotype evolvability as a function of the robustness gives, for the gene-like model,
2.5The distinction between the cases of a second or no second stop codon in the genotype sequence explains the two separate lines found in figure 2*a*(top). The corresponding expression for the gene-like reference model is
2.6

For both models, there is a negative correlation between the genotype evolvability and robustness in agreement with the results found for the Fibonacci model [20] and other GP maps [11,15].

### 2.2. Phenotype robustness and evolvability

Consider a phenotype with coding sequence length *l*. For both models, all genotypes mapping to that phenotype have the same robustness. Hence, by definition, the robustness *r*_{p}(*l*) of that phenotype is identical to the corresponding genotype robustness
2.7and
2.8The same holds for the undefined phenotype: its robustness is given by *r*_{p,undef} = *r*_{g,undef} = *r*^{ref}_{p,undef} = *r*^{ref}_{g,undef} = (*k* − 2)/(*k* − 1).

By contrast, there is a fundamental difference in the phenotype evolvability for both models. For the gene-like reference model, from each genotype mapping to a particular phenotype, the same set of distinct alternative phenotypes is accessible through one-point mutations of the letters preceding the first stop codon. For this reason, the evolvability *e*^{ref}_{p}(*l*) of a phenotype with coding sequence length *l* is identical to the corresponding genotype evolvability
2.9This also holds for the gene-like model. However, due to the mutable first stop codon, the undefined phenotype or further *k* − 1 alternative phenotypes can be accessed from any genotype through one-point mutations of the first stop codon, depending on whether there is a second stop codon. Since the genotypes mapping to a phenotype differ in the non-coding sequences following the first stop codon, this adds an additional set of distinct alternative phenotypes that are accessible from that phenotype. Their number is given by one (undefined phenotype) plus *k* − 1 times the number of different letter sequences ending with a stop codon of length up to *L* − *l*, i.e. the number of distinct coding sequences placed in the current non-coding parts of the genotypes mapping to the phenotype. Thus, for the gene-like model, the evolvability of a phenotype with coding sequence *l* is given by
2.10For both models, the evolvability of the undefined phenotype is given by .

Expressed as a function of the phenotype robustness, the phenotype evolvability for the gene-like model reads
2.11Its derivative with respect to the robustness is
2.12This expression is larger than zero, i.e. there is a positive correlation between the phenotype evolvability and robustness, if *r*_{p} > *r*^{c}_{p} with
2.13As it is *r*^{c}_{p} < 1/*L*, the positive correlation holds for all robustness values except those corresponding to phenotypes with coding sequence length *l* = *L* as *r*_{p}(*l* = *L*) = 0. By contrast, for the gene-like reference model, expressing the phenotype evolvability as a function of the robustness gives
2.14This expression is identical to the corresponding one derived for the genotype quantities and so there is also a negative correlation between the phenotype evolvability and robustness for all robustness values.

The positive correlation found for the gene-like model agrees with the results found for the Fibonacci model [20] and qualitatively with those found for other GP maps [11,15]. The fundamental difference between the results for the gene-like model and its reference model confirms what has been found for the Fibonacci model [20], namely the importance of a mutable stop codon, which contributes the term in the expression for the evolvability that leads to the positive correlation. A mutable stop codon allows a part of the genotype sequence to become coding that was not coding before. As genotypes mapping to the same phenotype have different non-coding sequences following the first stop codon, this can cause the sets of accessible alternative phenotypes from these genotypes to differ, which is not the case for the gene-like reference model. In consequence, the smaller the coding sequence of a phenotype and thus the higher its robustness, the larger the number of genotypes with different non-coding sequences that map to it and thus the more significant the increase in evolvability.

### 2.3. Phenotype frequency, rank and robustness

Despite the stark difference in the correlation between phenotype robustness and evolvability, both gene-like models agree in several other properties mentioned in the Introduction.

One of them is the skewed distribution of the number of genotypes per phenotype, which is commonly displayed as a phenotype frequency–rank plot. By the frequency of a phenotype, we simply refer to the number of genotypes mapping to it. As both models are based on the same assignment of genotypes to phenotypes, the frequency *f*(*l*) of a phenotype with coding sequence *l* is in both cases identically given by
2.15where we use the fact that all genotypes that map to a particular phenotype comprise the same coding sequence of length *l* but differ in the *L* − *l* sites following the first stop codon, which can be occupied by any letter of the alphabet. Phenotypes with the same coding sequence length have the same frequency. Thus, the number of phenotypes *C*(*l*) with a given frequency *f*(*l*) is for both models identical to the number of phenotypes (coding sequences) of length *l*
2.16where we use the fact that the *l* − 1 sites preceding the stop codon in a coding sequence of length *l* can be occupied by any of the *k* − 1 neutral letters. By the rank of a phenotype, we refer to a ranking of the phenotypes according to their frequency such that the phenotype with highest frequency has rank 1. As the frequency decreases with the coding sequence length, it follows for the rank *r*(*l*) of a phenotype with coding sequence length *l* that
2.17For simplicity, we neglect the undefined phenotype which has a frequency of *f*_{undef} = (*k* − 1)^{L}. Expressed as a function of the rank, the frequency reads
2.18For large rank values, i.e. *r*≫1, this function behaves as a power law
2.19with *a* = *k*^{L−1−logk−1(k−2)} and exponent *α* = log_{k−1}(*k*). This highly skewed distribution is in accordance with what has been found for other GP maps [11,15]. In particular, the power law behaviour agrees with the results found for the Fibonacci model [20] and for model 1 in [21].

Both gene-like models also agree in the logarithmic scaling of the phenotype robustness with the frequency 2.20and 2.21The minor difference results from the different expressions for the phenotype robustness due to the consideration of a mutable first stop codon or not. Again, this logarithmic scaling is in agreement with what has been found for the Fibonacci model [20] and other GP maps [11,13,17].

## 3. RNA-like GP map model

In contrast to the gene-like GP map model, which considers concrete letter sequences as the phenotypes, the RNA-like model defines phenotypes as the structures formed by bonds between sequence positions of the genotypes, similar to the RNA secondary structure GP map.

For the model, we again consider an alphabet consisting of *k* > 2 letters and allocate a special role to one of the letters. Here, we choose one letter to be the ‘coding’ letter, while the other *k* − 1 letters are labelled as ‘non-coding’. As for the gene-like model, we define the genotypes as all possible sequences of length *L* built from the letters of this alphabet. However, as the phenotype of a genotype, we now define the structure that results by linking pairs of furthest-separated coding letters within the genotype sequence, proceeding from the outside to the inside to ensure a planar structure, i.e. such that there are no overlaps between links. Note that the positions of the linked coding letters suffice as a unique representation of the phenotype. Genotypes that do not form any links are again assigned to a single undefined phenotype.

In order to analyse the effect of the links, we again consider a reference model. It works with the same genotypes as the RNA-like model, but defines the phenotypes simply by the position of the coding letters within the genotype sequences. Both models are summarized in figure 3.

As before, we will analyse and compare the key properties of the GP maps of both models in the following. Figure 4 shows plots of these properties determined by an exhaustive enumeration of the GP maps.

### 3.1. Genotype robustness and evolvability

For the RNA-like model, the robustness of a genotype depends on the number of coding letters within the genotype sequence, which we denote by *i*. Consider a genotype with *i* coding letters. Each of the *L* − *i* non-coding letters within the sequence can be mutated to any of the other non-coding letters without changing the phenotype, giving (*k* − 2)(*L* − *i*) neutral one-point mutations. If *i* is odd, the innermost coding letter will stay unlinked in the phenotype. This letter can be mutated to any of the non-coding letters without changing the phenotype, giving further *k* − 1 neutral one-point mutations. If *i* is even, the robustness of the genotype depends also on the number of non-coding letters between the innermost pair of coding letters, which we denote by *j*. Each of these *j* non-coding letters can also be mutated to the coding letter without changing the phenotype, because this additional coding letter will stay unlinked. This gives further *j* neutral one-point mutations. Thus, it follows for the genotype robustness
3.1Genotypes with no or only one coding letter lead to the undefined phenotype, and their robustness is given by *r*_{g,undef} = 1 and *r*_{g,undef} = ((*k* − 2)(*L* − 1) + (*k* − 1))/((*k* − 1)*L*), respectively.

For the RNA-like reference model, the robustness of a genotype depends solely on the number of coding letters *i* and no distinction between cases is necessary. As for the RNA-like model, the phenotype does not change if one of the non-coding letters within the sequence is mutated to any of the other non-coding letters. However, all mutations from a non-coding letter to the coding one or vice versa change the phenotype. Hence, the robustness of a genotype with *i* coding letters is given by
3.2The robustness of genotypes mapping to the undefined phenotype, i.e. those with no coding letter, is given by *r*^{ref}_{g,undef} = (*k* − 2)/(*k* − 1).

For the RNA-like model, we also need to consider even and odd *i* separately when calculating the genotype evolvability. If *i* is odd, mutating any of the coding letters within the sequence to a non-coding one or vice versa changes the link structure and so the phenotype. The only exception is the one currently unlinked coding letter, as mutating this letter to a non-coding one has no effect on the phenotype. Thus, in total, *L* − 1 distinct alternative phenotypes are accessible through one-point mutations. If *i* is even, again all mutations from a coding letter to a non-coding one or vice versa change the phenotype, but now apart from the *j* non-coding letters between the innermost pair of coding letters because mutating one of them to the coding letter has no effect on the phenotype. In addition, mutating one of the two innermost coding letters to a non-coding letter has the same effect, namely it will lead to the removal of the innermost link. Hence, in total, *L* − *j* − 1 distinct alternative phenotypes are accessible through one-point mutations. Bringing everything together, the genotype evolvability is given by
3.3Genotypes with undefined phenotype, i.e. those with no or only one coding letter, have an evolvability of *e*_{g,undef} = 0 and *e*_{g,undef} = *L* − 1, respectively. In general, as both the genotype robustness and evolvability depend on *i* and *j*, it is not possible to express the evolvability as a function solely of the robustness. Nevertheless, a relationship between both quantities can be determined. As can be seen from (3.3), for all odd values of *i*, the evolvability is constant and uncorrelated from the robustness. In figure 4*a*(top), this is marked by a horizontal line, on which every second data point corresponds to an odd value of *i*. By contrast, for each fixed even value of *i*, there is a negative correlation between evolvability and robustness as the robustness increases with *j* (see (3.1)), while the evolvability decreases with *j* (see (3.3)), i.e. the larger the gap between the innermost pair of coding letters, the more robust and less evolvable a genotype. These negative correlations are marked by the lines with negative slope in figure 4*a*(top). Each of these lines of data points corresponds to a fixed even value of *i*∈{2, 4, …, 2⌊*L*/2⌋} and the respective values of *j*∈{0, 1, …, *L* − *i*}. As the robustness increases with decreasing *i*, the lines correspond to decreasing values of *i* from left to right. In addition, since the number of allowed values of *j* increases with decreasing *i*, lower evolvability values are reached from left to right, i.e. with increasing robustness. In the limit of large alphabet sizes, , the genotype robustness loses its dependence on *j* (see (3.1)) and the lines of data points with negative slope would converge to vertical lines. In consequence, the negative correlation for each fixed even value of *i* vanishes. But it still holds that lower evolvability values are reached with increasing robustness.

By contrast, for the RNA-like reference model, all genotypes have the same evolvability independent of the number of coding letters *i*,
3.4as all mutations from a coding letter to a non-coding one or vice versa change the position of the coding sites and so the phenotype, giving *L* distinct alternative phenotypes accessible through one-point mutations. This also holds for the genotypes belonging to the undefined phenotype, which have an evolvability of *e*^{ref}_{g,undef} = *L*. There is no correlation between the genotype evolvability and robustness in the RNA-like reference model.

This significant difference between the RNA-like model and its reference model shows that for the RNA-like GP map model the consideration of links between sequence positions of the genotypes seems to be crucial for the emergence of negative correlations between the genotype robustness and evolvability as it gives genotypes the possibility to be both more robust and less evolvable. The reason why we do not find a negative correlation for the RNA-like reference model but for the gene-like reference model is that for the RNA-like reference model only the position of the coding letters matters, whereas for the gene-like reference model the concrete letters in a coding sequence matter for the phenotype definition. This means that both the robustness and evolvability of genotypes depend on the length of the coding sequence, resulting in a negative correlation between the two.

### 3.2. Phenotype robustness and evolvability

For the RNA-like model, the robustness of a phenotype depends on two variables, namely the number of linked sites and the number of unlinked sites between the innermost pair of linked sites, which we denote by *i* and *j*, in analogy to the characterization of genotypes. A phenotype with *i* linked sites (even number) and *j* sites between the innermost linked pair results either from genotypes with *i* coding letters (even number) that are placed at the same positions as the linked sites, there are in total (*k* − 1)^{L−i} of such genotypes, or from genotypes with *i* + 1 coding letters (odd number) that are also placed at the same positions as the linked sites except for the innermost coding letter that can be placed at any of the *j* sites between the two second-innermost coding letters; there are in total *j*(*k* − 1)^{L−i−1} of such genotypes. For this reason, the phenotype robustness is given by averaging over the genotype robustness values of these two genotype kinds, weighted by the actual number of the genotypes
3.5In a similar way, the robustness of the undefined phenotype, originating from genotypes with no or only one coding letter, is given by
3.6

For the RNA-like reference model, all genotypes belonging to a phenotype with *i* coding sites have the same robustness and so the phenotype robustness is identical to the corresponding genotype robustness
3.7The same holds for the robustness of the undefined phenotype, i.e. *r*^{ref}_{p,undef} = *r*^{ref}_{g,undef} = (*k* − 2)/(*k* − 1).

For the RNA-like model, the evolvability of a phenotype with *i* linked sites and *j* sites between the innermost pair of linked sites is given by
3.8
3.9From all genotypes with *i* coding letters (even number) mapping to that phenotype, the sets of accessible alternative phenotypes are the same. Hence, they just contribute one time their genotype evolvability to the phenotype evolvability (first term in (3.8)). From all genotypes with *i* + 1 coding letters (odd number) mapping to that phenotype, the sets of accessible alternative phenotypes vary across the genotypes, but they are the same for genotypes which have the innermost coding letter positioned at the same site. As there are *j* allowed sites for the innermost coding letter, these genotypes, in principle, could contribute *j* times their genotype evolvability to the phenotype evolvability (second term in (3.8)). The sets of accessible alternative phenotypes from genotypes with *i* and those with *i* + 1 coding letters are distinct. However, not yet all considered accessible alternative phenotypes from genotypes with *i* + 1 coding letters are distinct and the second term needs to be corrected by the number of multiple counted phenotypes (third term in (3.8)). The form of this correction term can be explained in the following way. First, consider the set of distinct alternative phenotypes accessible from the genotypes for which the innermost coding letter is positioned at the ‘first’ site between the second-innermost pair of coding letters. Then, consider the genotypes for which this innermost coding letter is positioned at the ‘second’ site. Through a mutation of the non-coding letter at the ‘first’ site, an alternative phenotype occurs by linking the new pair of coding letters at the ‘first’ and ‘second’ site. However, this particular phenotype has already been accounted for by the genotypes for which the innermost coding letter is at the ‘first’ site (accessible there through a mutation of the non-coding letter at the ‘second’ site). In a similar way, for genotypes for which the unlinked coding letter is at the ‘third’ site, two accessible alternative phenotypes have already been accounted for, and so on. Summing up gives the third term. The evolvability of the undefined phenotype is given by . The phenotype evolvability depends only on *j*. But since the phenotype robustness depends on *i* and *j*, the phenotype evolvability cannot be expressed as a function of the robustness alone. However, a correlation between both quantities can be determined for fixed values of *i*∈{2, 4, …, 2⌊*L*/2⌋}. As can be derived from (3.5), the phenotype robustness increases with *j* for each fixed value of *i*. Also the phenotype evolvability increases with *j* for each fixed value of *i* since it holds that
3.10as *j*∈{0, 1, …, *L* − *i*} and so max(*j*) = *L* − 2. In consequence, for each fixed value of *i*, there is a positive correlation between the phenotype evolvability and robustness. In figure 4*b*(top), these positive correlations are marked by the lines with positive slope. Each of these lines of data points corresponds to a fixed value of *i*∈{2, 4, …, 2⌊*L*/2⌋} and the respective values of *j*∈{0, 1, …, *L* − *i*}. Similar to what has been found for the genotype quantities, the phenotype robustness increases with decreasing *i* and so the lines correspond to decreasing values of *i* from left to right. As a smaller *i* implies a larger value range of *j*, higher evolvability values are reached from left to right, i.e. with increasing robustness. In the limit of large alphabet sizes, , the phenotype robustness loses its dependence on *j* (see (3.5)) and the lines of data points with positive slope would converge to vertical lines. In consequence, the positive correlation for each fixed value of *i* vanishes. But it still holds that higher evolvability values are reached with increasing robustness.

By contrast, for the RNA-like reference model, from each genotype belonging to a particular phenotype with *i* coding sites, the same set of distinct alternative phenotypes is accessible and the phenotype evolvability is identical to the genotype evolvability
3.11The same holds for the undefined phenotype: its evolvability is given by *e*^{ref}_{p,undef} = *e*^{ref}_{g,undef} = *L*. Hence, there is no correlation between the phenotype evolvability and robustness in the RNA-like reference model.

This significant difference between the RNA-like model and its reference model shows that the positive correlations between the phenotype robustness and evolvability depend on relationships between sequence positions of the genotypes. If a mutation occurs at one site of the genotype sequence it can affect the role of a letter at a different site. This becomes important when the innermost coding letter is unlinked in the case of an odd number of coding letters. Mutating another letter from non-coding to coding or vice versa can ‘switch on’ this innermost coding letter and link it such that it has an effect on the phenotype. As this innermost coding letter can be positioned at different sites within the genotype sequences of genotypes mapping to the same phenotype, this can cause the sets of accessible alternative phenotypes from genotypes with the same phenotype to differ. The larger the gap between the innermost linked pair in a phenotype and thus the higher its robustness, the more significant this effect and the higher the increase in evolvability.

### 3.3. Phenotype frequency, rank and robustness

Similar to what has been found for the gene-like model and its reference, the RNA-like model and its reference qualitatively agree on other properties of the GP maps such as the shape of the phenotype frequency–rank plot and the logarithmic scaling of the phenotype robustness with the frequency. Here, we give just a qualitative description of the results shown in figure 4*c* and figure 4*d*; a detailed analytical description can be found in the appendix A.

Instead of a pure power law behaviour, the phenotype frequency–rank plots for both RNA-like models show a strong cut-off for large rank values (see figure 4*c*). This shape agrees with the one found for the frequency–rank plot of RNA secondary structure of the same length [11], thereby highlighting the agreement of our simple model with real sequence-to-structure GP maps. The reason for this shape can be explained in the following way. For the RNA-like reference model, the phenotype frequency depends only on the number of coding sites within the phenotype sequence, and decreases exponentially with it. Thus, the number of phenotypes with a given frequency is identical to the number of phenotypes with the corresponding number of coding sites, which, in turn, is identical to the number of different ways to place these coding sites across the phenotype sequence of fixed length (mathematically described by a binominal coefficient). For a large number of coding sites, corresponding to a low phenotype frequency, the number of different ways to place the coding sites is highly limited and so is the number of phenotypes with low frequency, thereby explaining the observed cut-off in the frequency–rank plot. For the RNA-like model, the frequency of a phenotype depends not only on the number of linked sites but also on the size of the gap between the innermost linked pair. For this reason, a much higher number of data points appears in the frequency–rank plot (see figure 4*c*(top)). Both frequency–rank plots show a similar overall shape because, for the RNA-like model, the phenotype frequency as well as the phenotype rank depend in a quite similar way on the number of linked sites as both quantities depend on the number of coding sites in the RNA-like reference model.

The exponential scaling of the phenotype frequency with the number of linked sites (RNA-like model) or the number of coding sites (RNA-like reference model) also causes the logarithmic scaling of the phenotype robustness with the frequency observed for both models (see figure 4*d*).

## 4. Discussion and conclusion

We introduced two simple and analytically tractable GP map models, a gene-like model that follows the principles of gene expression and an RNA-like model that is analogous to sequence-to-structure GP maps like that of RNA secondary structure. Both models reproduce several of the properties found for other GP maps, such as a skewed distribution of the number of genotypes per phenotype, a negative correlation between genotype robustness and evolvability, a positive correlation between phenotype robustness and evolvability and a logarithmic scaling of the phenotype robustness with the frequency. By introducing a reference model for each of the models, we were able to isolate the characteristics of the models that are responsible for the positive correlation between phenotype robustness and evolvability. For the gene-like model this is the mutable first stop codon, and for the RNA-like model this is the existence of links between sites. At first sight, both of these characteristics may seem very different, but they have a similar effect. Both mean that the mutation of a letter at one site of the sequence can affect the role of another letter elsewhere in the sequence, possibly a great distance away. In the gene-like model a mutation of the first stop codon in a sequence can ‘switch on’ another stop codon in the sequence, and in the RNA-like model a mutation from a non-coding to the coding letter or vice versa can ‘switch on’ (‘switch off’) another coding letter that is currently unlinked (linked). In general, these characteristics can allow genotypes that map to the same phenotype to access different sets of alternative phenotypes through mutations. This is compatible with Wagner's findings [15] that the sets of accessible alternative phenotypes from genotypes sampled from the same neutral network mostly differ, and his suggestion that this is one of the causes of the positive correlation between phenotype robustness and evolvability. For the RNA-like model, the dependence between sequence positions also leads to the negative correlations between genotype robustness and evolvability, which are not found in the RNA-like reference model.

The identified non-local mutation effects resemble the widely studied phenomenon of epistasis, but also differ from the conventional notion of epistasis in that a mutation in one location is not necessarily linked to a mutation in another specific sequence position, but instead can change sequence constraints at a range of different sequence positions elsewhere. The mutation of a stop codon for instance would normally not be regarded as an epistatic effect.

For the Fibonacci GP map model [20], it has been proposed that it is the organization of the sequences into constrained and unconstrained parts that determines the fundamental properties of GP maps. This can be further supported by our introduced models. For both the gene-like and RNA-like model, we have found that the respective main and reference models agree in properties such as the shape of the frequency–rank plot and the logarithmic scaling of the phenotype robustness with the frequency. This shows that, for these properties, the additional features of a mutable stop codon or links are not relevant and they are likely to be determined solely by the organization of the sequences into constrained and unconstrained parts. For the gene-like and gene-like reference models, it is the first stop codon that divides a genotype sequence into a constrained and an unconstrained part. For the RNA-like model, the linked sites can be seen as constrained and the unlinked sites as largely unconstrained. This is because we restrict ourselves to an alphabet with only one coding letter but at least two non-coding letters such that the letter of an unlinked site can be mutated without changing the phenotype, while it always changes if the letter of a linked site is mutated. In a similar way, this holds for the RNA-like reference model, for which the coding sites can be seen as constrained and the non-coding sites as largely unconstrained.

In [21], it has been investigated how the organization of sequences into constrained and unconstrained parts affects the form of the neutral network size distribution. This distribution ranges from a pure power law distribution if the constrained and unconstrained sites occupy fixed positions at each end of the sequence, respectively, to a log-normal distribution if the constrained sites can occupy any position in the sequence. We did not explicitly consider the distribution of the neutral network sizes, but the phenotype frequency–rank function, which is related to it. Here, we find a pure power law behaviour for the gene-like model, for which the sequences are just split into one constrained and one unconstrained part, and a strong deviation from a power law behaviour for the RNA-like model, for which the constrained (linked) sites can occupy any position in the sequence. Both are in accordance with the findings in [21].

To summarize, we introduce two GP map models that exhibit the same properties observed in previous analytical and computational GP map models, but that allow us to examine the specific requirements for a positive correlation between phenotype robustness and evolvability, one of the most striking properties of GP maps. One might argue that these models are very abstract, and far removed from real biological GP maps. We have considered a number of more advanced models, such as, in the gene-like case, models that include a start codon or even multiple start and stop codons. For the RNA-like case, we also considered models that allow only links between specific letter pairs. Apart from losing the analytical tractability, these alternative, more complex models did not give any significant new insights as all of the observed GP map properties can be observed in the simpler models we have introduced here. It therefore seems that non-local effects of mutations on sequence constraints are indeed likely to be an important feature in biological GP maps.

## Data accessibility

This article has no additional data.

## Authors' contributions

M.W. and S.E.A. designed the analysis, M.W. carried out the analysis, and M.W. and S.E.A. wrote the manuscript.

## Competing interests

We declare we have no competing interests.

## Funding

M.W. was supported by the EPSRC and the Gatsby Charitable Foundation. S.E.A. was supported by the Royal Society and the Gatsby Charitable Foundation.

## Appendix A. Further analytical description of the RNA-like model and its reference model

**A.1. Phenotype frequency**

For the RNA-like model, the frequency of a phenotype depends on the number of linked sites, denoted by *i*, and the number of unlinked sites between the innermost pair of linked sites, denoted by *j*. As mentioned above, a phenotype with *i* linked sites (even number) and *j* sites between the innermost linked pair results either from genotypes with *i* coding letters (even number) that are placed at the same positions as the linked sites, there are in total (*k* − 1)^{L−i} of such genotypes, or from genotypes with *i* + 1 coding letters (odd number) that are also placed at the same positions as the linked sites except for the innermost coding letter that can be placed at any of the *j* sites between the two second-innermost coding letters; there are in total *j*(*k* − 1)^{L−i−1} of such genotypes. Thus, the phenotype frequency is given by
A 1The frequency of the undefined phenotype, which results from genotypes with no or only one coding letter, is given by *f*_{undef} = (*k* − 1)^{L} + *L*(*k* − 1)^{L−1}.

For the RNA-like reference model, the frequency of a phenotype depends solely on the number of coding sites, denoted by *i*. A phenotype with *i* coding sites simply results from all genotypes with *i* coding letters placed at these sites and so its frequency is given by
A 2The frequency of the undefined phenotype is given by *f*^{ref}_{undef} = (*k* − 1)^{L}.

**A.2. Phenotype frequency versus rank**

For the RNA-like model, in general, the total number of phenotypes with a given frequency *f*(*i*, *j*) is not simply identical to the number of phenotypes with *i* linked sites and *j* sites between the innermost pair of linked sites as there might be phenotypes characterized by different values (*i*′, *j*′)≠(*i*, *j*) that also have the same frequency *f*(*i*′, *j*′) = *f*(*i*, *j*). In consequence, it is not possible to find a simple analytical expression for the total number of phenotypes with a given frequency *f*(*i*, *j*), which would be useful to calculate the phenotype rank. However, it is at least possible to find an analytical expression for the number of phenotypes characterized by the same values of *i* and *j*, which definitely have the same frequency *f*(*i*, *j*). We denote this number by
A 3This is identical to the number of ways to arrange *i* linked sites across a sequence of length *L* under the constraint that the innermost linked pair is separated by *j* sites. In order to derive this expression (first equality), we make use of an analogy. Imagine a ‘test sequence’ consisting of *i* linked sites and *j* unlinked sites representing the innermost gap, i.e. *i* + *j* sites in total. Now, by analogy, the number of ways to arrange *i* linked sites across a sequence of length *L* under the constraint that the innermost linked pair is separated by *j* sites is given by the number of ways to fill up the ‘test sequence’ with *L* − (*i* + *j*) further unlinked sites such that a sequence of length *L* results while *i* and *j* remain unchanged. For the first additional site, there are (*i* + *j* + 1) − (*j* + 1) allowed slots available in the sequence. In total, there are (*i* + *j* + 1) slots available, but adding to the (*j* + 1) slots in the innermost gap would change *j*. For the second additional site, there are (*i* + *j* + 1) − (*j* + 1) + 1 allowed slots, and so on. Thus, the total number of ways is given by the product of these factors over the *L* − (*i* + *j*) unlinked sites that have to be added, divided by (*L* − (*i* + *j*))! as the concrete order of these added sites does not matter.

Now, the rank of a particular phenotype is given by the sum of over all values of *i* and *j* that correspond to phenotype frequencies that are higher than the one of this considered phenotype. However, in general, there exists no simple analytical expression for the phenotype rank since the phenotype frequency does not decrease in a simple sortable order with the values of *i* and *j* and since there may be phenotypes characterized by different values of *i* and *j* but with the same frequency, as mentioned before.

For the RNA-like reference model, the phenotype frequency depends solely on the number of coding sites *i*. Thus, the number of phenotypes with a given frequency *f*(*i*) is identical to the number of phenotypes with *i* coding sites, which is given by a binominal coefficient
A 4As the frequency decreases with increasing *i*, it follows for the rank of a phenotype with *i* coding sites that
A 5For simplicity, we neglect the undefined phenotype. Owing to the sum over binominal coefficients, there exists no simplified expression for this equation and so no analytical expression for the frequency as a function of the rank. However, it is possible to determine the value pairs (*r*(*i*), *f*(*i*)) for all values of *i*∈{1, 2, …, *L*}, which agree with the data points shown in figure 4*c*(bottom). As mentioned above, the observed cut-off in the frequency–rank plot for large rank values results from the fact that the frequency decreases exponentially with *i* for all values of *i*, while the rank only marginally increases with *i* for large values of *i*.

For the RNA-like model, as can be seen from the formulae, the phenotype frequency as well as the number of phenotypes characterized by the same values of *i* and *j* depend in a quite similar way on *i* as the corresponding quantities do for the RNA-like reference model. In addition, for the phenotype frequency, this dependence on *i* dominates over the additional dependence on *j*. Thus, the frequency–rank plot (see figure 4*c*(top)) shows a similar overall shape to that for the RNA-like reference model. Because of the additional dependence on *j*, the data points appear in a group-like pattern, whereby each group roughly corresponds to one fixed value of *i* and the respective values of *j*.

**A.3. Phenotype robustness versus frequency**

For the RNA-like model, it is not possible to express the phenotype robustness as a function solely of the phenotype frequency because both the robustness and frequency depend on *i* and *j*. However, it is possible to determine a dependence between the robustness and frequency by assuming either *i* or *j* to be fixed. Solving the phenotype frequency (A 1) for *j*
A 6and inserting it into the expression for the phenotype robustness (3.5), gives the robustness as a function of the frequency and *i*
A 7In consequence, for fixed *i*, the robustness scales with the frequency according to *r*_{p}(*f*) = *α* − *β*/*f* with *α* = constant and *β* = constant. By contrast, solving the phenotype frequency (A 1) for *i*
A 8and inserting it into (3.5), gives the robustness as a function of the frequency and *j*
A 9Hence, for fixed *j*, the robustness scales with the logarithm of the frequency according to *r*_{p}(*f*) = *α*′log_{k−1}(*f*) + *β*′ with *α*′ = constant and *β*′ = constant.

For the phenotype robustness as well as the phenotype frequency, the dependence on *i* dominates over the dependence on *j* (see original expressions for both quantities: (3.5) and (A 1)). Thus, when expressed as a function of each other, overall, the second derived scaling of the phenotype robustness with the logarithm of the frequency will dominate. This is reflected by the plot in figure 4*d*(top), which shows a group-like pattern of data points. Each group of data points roughly corresponds to a fixed value of *i*∈{2, 4, …, 2⌊*L*/2⌋} and the respective values of *j*∈{0, 1, …, *L* − *i*}. Overall, the groups of data points lie on a line in the semi-log plot, i.e. behave according to (A 9) for fixed *j*, while the single points within each group behave according to (A 7) for fixed *i*.

For the RNA-like reference model, it is possible to express the robustness as a function solely of the frequency A 10The robustness scales linearly with the logarithm of the frequency, which is the same overall behaviour as found for the RNA-like model.

- Received August 25, 2017.
- Accepted December 7, 2017.

- © 2018 The Author(s)

Published by the Royal Society. All rights reserved.