## Abstract

Many biological networks tend to have a high modularity structural property and the dynamic characteristic of high robustness against perturbations. However, the relationship between modularity and robustness is not well understood. To investigate this relationship, we examined real signalling networks and conducted simulations using a random Boolean network model. As a result, we first observed that the network robustness is negatively correlated with the network modularity. In particular, this negative correlation becomes more apparent as the network density becomes sparser. Even more interesting is that, the negative relationship between the network robustness and the network modularity occurs mainly because nodes in the same module with the perturbed node tend to be more sensitive to the perturbation than those in other modules. This result implies that dynamically similar nodes tend to be located in the same module of a network. To support this, we show that a pair of genes associated with the same disease or a pair of functionally similar genes is likely to belong to the same module in a human signalling network.

## 1. Introduction

Genes and their regulatory interactions form a large-scale cellular signalling network and a multitude of studies have examined their structural characteristics. One such characteristic is the *network modularity*, which is the tendency for a network to be divisible into subsets of nodes called modules within which interactions are much denser than interactions between modules [1–3]*.* Meanwhile, experiments have shown that the signalling network is considerably robust against various external and internal mutations. It is often assumed that such a high robustness is owing to some structural characteristics harboured in the signalling networks, and therefore researchers have investigated the role of modularity for network robustness. However, this relationship has yet to be fully elucidated and findings regarding various kinds of biological networks have been inconsistent. For example, it was shown that modular stabilizing in protein–protein interaction networks can be recombined to create highly robust chimeric proteins in evolution [4], whereas it was presented that robustness against structural perturbation and modularity are not significantly correlated [5]. It is also argued that gene regulatory networks (GRNs) are typically modular and this enhances the mutational robustness by reducing pleiotropy or allowing for the malfunctioning of one module without producing failure in other modules [6–8]. By contrast, modularity reduces robustness against mutation in metabolic networks [9] and no significant correlation was present between structural modularity and mutational robustness in evolved developmental GRNs [10]. Taken together, these findings make it difficult to reason the relationship between network modularity and mutational robustness in case of signalling networks, and thus a more comprehensive investigation is needed.

To unravel the mystery, we performed an investigation into both real and simulated signalling networks based on the Boolean network model [11–14]. As a result, we found that the network robustness is negatively related to the network modularity. In addition, this negative relationship became more definite as the network density grew sparser, which is interesting given that real human signalling networks are considerably sparse [15,16]. In other words, the human signalling networks have an interesting characteristic such that the robustness against perturbations can be efficiently controlled by the modularity property. Moreover, we found that the negative relationship between the network robustness and the modularity occurs mainly because the nodes in the same module with the node that is subjected to a perturbation are likely to be more sensitive to the perturbation than those in the other modules. This suggests that dynamically or functionally similar nodes tend to be located in the same module of a network. We supported this finding by showing that a pair of genes associated with the same disease or a pair of functionally similar genes is likely to belong to the same module in a human signalling network.

## 2. Material and methods

### 2.1. Network modularity and density

Given a network represented by a simple directed graph *G*(*V*, *A*), where *V* and *A* are the set of nodes and the set of interactions, respectively, we employ the modularity measure introduced in a previous study [17]. A partition *P* = {*V*_{1}, *V*_{2},…,*V _{M}*} of

*V*is a set of non-empty and non-overlapping subsets of

*V*, which covers

*V*(i.e.

*V*∩

_{i}*V*= Ø and ). Then, the directed modularity of the partition

_{j}*M*(

*P*) is defined as follows: 2.1where is the number of interactions whose both starting and ending nodes are included in module

*V*, or are the numbers of interactions whose starting or ending nodes only are included in module

_{i}*V*, respectively, and

_{i}*w*is the total number of interactions in the network. Then, the modularity of the network is

*M*(

*G*) = max

*(*

_{P}M*P*). To obtain

*M*(

*G*), we employed the optimization algorithm proposed in a previous study [3]. In addition, the density can be defined on

*G*(

*V*,

*A*) as

*|A|/*(

*|V|*× (

*|V| −*1)), and thus it is in the range of [0, 1].

### 2.2. Boolean network dynamics

The Boolean network model has been employed to evaluate robustness of signalling networks and has been intensively used to investigate the dynamics of various biological networks [13,18]. In the model, each state *s*(*t*) = (*v*_{1}(*t*), *v*_{2}(*t*),…,*v _{n}*(

*t*)) at time

*t*transits to the next state

*s*(

*t*+ 1) according to the set of update rules

*F*= {

*f*

_{1},

*f*

_{2},…,

*f*}, i.e.

_{n}*s*(

*t*+ 1) =

*F*(

*s*(

*t*)), where we randomly select either a logical conjunction or disjunction for

*f*with a uniform probability distribution. For example, if a Boolean variable

_{i}*v*has a positive relationship with

*v*

_{1}, a negative relationship with

*v*

_{2}and a positive relationship with

*v*

_{3}, then the conjunction and disjunction update rules are and , respectively. In the case of the conjunction, the value of

*v*at time

*t*+ 1 is 1 only if the values of

*v*

_{1},

*v*

_{2}and

*v*

_{3}at time

*t*are 1, 0 and 1, respectively. Owing to these update rules, the generated Boolean network will operate in the ordered regime. The network eventually converges to a fixed state or a limit-cycle attractor. We denote that the attractor converged starting from state

*s*(

*t*) by 〈

*s*(

*t*)〉. The network is called robust against the mutation at

*v*if 〈

_{i}*s*〉 is equal to where indicates the state perturbation of

*s*subjected to

*v*. This concept to measure robustness has been widely used [19–21]. More specifically, the robustness of a network,

_{i}*γ*(

*G*), is defined as follows: 2.2where

*S*is a set of whole states (i.e. |

*S*| = 2

*), and*

^{n}*I*(·) is an indicator function. Because |

*S*| is a very large number, we used a sample subset with instead of

*S*to calculate

*γ*(

*G*). With the partition

*P*= {

*V*

_{1},

*V*

_{2},…,

*V*}, in-module and out-module robustness,

_{M}*γ*

_{in}(

*G*) and

*γ*

_{out}(

*G*), are defined as follows: 2.3and 2.4where represents a projection operator to extract the partial attractor of a given subset

*V*⊆

_{i}*V*from an attractor 〈

*s*〉 and

*H*(〈

*s*〉, 〈

*s′*〉) denotes a similarity measure between two attractors 〈

*s*〉 and 〈

*s′*〉. More specifically, given

^{1}and (1 ≤

*l*≤

*l′*is assumed without generality),

*H*(〈

*s*〉, 〈

*s′*〉) is defined as follows: 2.5where

*h*is the Hamming distance (i.e. the number of different bits between two binary sequence) and

*K*is the size of the states. Figure 1 shows an illustrative example to compute the attractor similarity. Therefore, in-module robustness represents how well the module subjected to a mutation sustains the local robustness. When there is only one module in the network, the in-module robustness is similar to the network robustness

*γ*(

*G*).

## 3. Results

### 3.1. Modularity and robustness of the signalling networks

We first analysed two large-scale cell signalling networks:^{2} the human signal transduction network (HSN) and the canonical cell signalling network (STKE). HSN is composed of 5443 genes and their 37 663 interactions after removing neutral^{3} interactions from the original dataset (version 4) downloaded from www.bri.nrc.ca/wang [22], and STKE consists of 754 proteins and 1624 interactions after removing all neutral interactions from the integrated network by using the 51 canonical pathways (not specific to any organism or cell type) obtained from http://stke.sciencemag.org/ (see electronic supplementary material, table S1). We examined the modularity (*M*(*G*)) and the robustness (*γ*(*G*)) of the real signalling networks (see Material and methods for definitions of *M*(*G*) and *γ*(*G*)), and compared them with those of the random networks. To this end, we generated two groups of 100 random Boolean networks, ‘Random(STKE)’ and ‘Random(HSN)’, by shuffling interactions of STKE and HSN while preserving a degree distribution [23], respectively. We plotted the network modularity and robustness value distributions (figure 2). As shown in figure 2*a*, the modularity values of STKE and HSN are 0.732 and 0.547, respectively, which are significantly larger than the modularity values of the random networks (both *p* < 0.0001). On the other hand, the robustness values of STKE and HSN are significantly smaller than those of the random networks (figure 2*b*; both *p* < 0.0001). This observation raises the question of whether modularity is correlated with robustness in a network.

### 3.2. The relationship between the modularity and the robustness

Inspired by the observations in figure 2, we investigated whether network robustness was correlated with network modularity. In particular, we noted that the robustness of Random(HSN) is larger than that of Random(STKE) (*p* < 0.0001), whereas the modularity of Random(HSN) is smaller than that of Random(STKE) (*p* < 0.0001). More specifically, we hypothesized that the network robustness is negatively correlated with the network modularity. To reveal such a relationship between the network modularity and the robustness, we performed extensive simulations with the random scale-free Boolean networks generated by using the Barabási–Albert network-growth model^{4} [24] (figure 3). Specifically, we generated random Boolean networks with various modularity (figure 3*a*) or density (figure 3*b*) values and computed the robustness and the modularity of each network. As shown in figure 3*a*, the network robustness is negatively correlated with the network modularity. In other words, the network robustness decreases as the network modularity increases. In addition, we observed that results of real networks, i.e. STKE and HSN, are very close to the trend line of the random Boolean networks. This observation indicates that the negative relationship between modularity and robustness is conserved in real signalling networks. Furthermore, network robustness has been shown to be dependent on network density [25,26]. In this regard, we examined the change in the correlation coefficient between *M*(*G*) and *γ*(*G*) according to the change in network density (figure 3*b*) (see Material and methods for the definition of the network density). Specifically, the random Boolean networks were sorted by network density in ascending order and grouped into eight categories of the same size (see electronic supplementary material, table S2 for details). As shown in figure 3*b*, we found that the negative correlation coefficient increases in magnitude as group density decreases. In other words, the negative relationship between *M*(*G*) and *γ*(*G*) is most definite when the network is less dense. We note that the densities of real signalling networks are very small. (Densities of STKE and HSN are 0.0028 and 0.0013, respectively, and both of them correspond to the first group in figure 3*b*.) Taken together, these findings indicate that the signalling network represented by a sparse scale-free network shows a distinguished characteristic such that its robustness can be efficiently controlled by its modularity in a negative-correlated way.

### 3.3. Module-bounded robustness in highly modularized networks

To get a deeper insight into the negative relationship between robustness and modularity, we investigated whether there were any particular spatial characteristics of the non-robust nodes in a highly modularized network (see Material and methods for the definition of a non-robust node). A highly modularized network tends to form a highly clustered subset of nodes, and this trend may make the network less robust against mutation. Therefore, we subdivided the whole network robustness into the robustness measured either inside or outside the module: in-module (*γ*_{in}(*G*)) and out-module (*γ*_{out}(*G*)) robustness (see Material and methods). We examined the relationship of network modularity to in-module and to out-module robustness over the random Boolean networks used earlier (figure 4). As shown in figure 4*a,b*, the in-module robustness is clearly negatively correlated with the network modularity, whereas the out-module robustness is not significantly related to the network modularity. We examined the change in the correlation coefficients of *M*(*G*) with *γ*_{in}(*G*) and of *M*(*G*) with *γ*_{out}(*G*) against the network density. As shown in figure 4*c*, we sorted the random Boolean networks by network density in ascending order and grouped them into eight categories of the same size. The trend for in-module robustness is very similar to that of network robustness shown in figure 3*b*. That is to say, the negative correlation coefficient of *M*(*G*) with *γ*_{in}(*G*) decreased in magnitude as the network group density increased (*p* < 0.05), and therefore the negative correlation between *γ*_{in}(*G*) and *M*(*G*) was most definite in the most sparse network group. Conversely, there was no similar trend with network density for the correlation between *M*(*G*) and *γ*_{out}(*G*). This indicates that the in-module robustness is positively correlated with the network robustness (figure 4*d*), whereas the out-module robustness is not positively related to the network robustness (figure 4*e*). The correlation between the mutation modules and the network suggests that the relationship between network robustness and modularity is mainly caused by the relationship between in-module robustness and network modularity. In other words, the effect of a mutation tends to be limited to inside the module, which is where the mutation occurs. As shown in figure 4*f*, the strong contribution of *γ*_{in}(*G*) to *γ*(*G*) is sustained irrespective of the network density. From these observations, we can conclude that all the nodes affected by a mutation are likely to belong to the same module, based on Boolean dynamics.

### 3.4. Modularization of functionally similar genes

The findings illustrated in figure 4 imply that the node that is subjected to perturbations and the induced non-robust nodes are likely to be located in the same module. To find relevant evidence, we employed two datasets about functional similarity among genes to examine whether functionally similar genes tend to be located in a same module or not. Firstly, we investigated the distribution of genes associated with a particular disease in the HSN, considering that two genes related to the same disease are functionally similar to each other. We found a total of 20 modules in the HSN. Combined with the list of disease genes published in a previous study [27], we found 286 diseases caused by at least two genes and eventually 6849 same-disease pairs of genes (see electronic supplementary material, table S3 for details). Figure 5 shows whether each pair of disease genes is included in the same module. In the figure, a node represents a disease gene, and an edge (*v*,*w*) represents that the pair of genes, *v* and *w*, are associated with the same disease. In addition, the solid or the dashed shape of an edge (*v*,*w*) denotes that the two genes *v* and *w*, are included in the same or different modules, respectively (see electronic supplementary material, table S4 for details). The numbers of solid and dashed edges were 1457 and 5356, respectively. In other words, the probability that a pair of same-disease genes exists in the same module is significantly higher than the expectation at random (*p*-value < 0.0001; the random expectation was 0.1 calculated by , where *M*, *n* and *k _{i}* are the number of modules in the HSN, the number of disease genes and the number of disease genes in module

*i*, respectively). This means that disease-shared gene pairs tend to be located in the same module. In addition, many genes that are incident to solid edges only as C1QA, C1QB, C1S, C2, C3, C4A, C4B, C6, C7, C8B and C9 are observed. Most of these genes are known to be relevant to regulation of immune system process. On the other hand, few genes exist that are incident to the dashed edges only. In particular, GCK and MEN1 are connected by dashed edges of many different colours. In other words, the genes are not included in the same module with any same-disease gene while they are associated with many different complex diseases. Considering that mutations subject to MEN1 and GCK can cause a rare hereditary endocrine cancer [28] and maturity-onset diabetes [29,30], respectively, this implies that the dynamic behaviour of these genes cannot be easily analysed.

For other evidence, we investigated the distribution of functionally similar genes/proteins in the signalling network. In a previous study [31], a functional linkage network was constructed to quantify the likelihood of function associations between a pair of genes by using a common set of 25 564 protein encoding genes that was integrated from various gene data sources. After integrating the functional linkage network with the HSN, we found 195 160 gene pairs commonly included in those two databases. We sorted them by functional linkage weight in ascending order and classified them into eight groups of almost the same size (see electronic supplementary material, table S5 for details). For each group, we counted the proportion of gene pairs belonging to the same module in the HSN (figure 6). The proportion of gene pairs in the same module and the functional linkage degree are significantly and positively correlated. This means that, in general, gene pairs that are highly likely to be in the same function are also likely to be in the same module and vice versa.

## 4. Conclusion

By investigating both simulated and real cellular signalling networks, we demonstrated that network modularity and robustness against mutation are negatively correlated. In addition, we determined that the correlation was stronger for networks that were less dense. This result is particularly meaningful because real signalling networks are relatively sparse. This implies that the human signalling network has an interesting characteristic such that the robustness against perturbations can be efficiently controlled by the modularity property and that can be an explanatory factor for the modularized disease genes in the human signalling network. Also interesting is that the negative relationship between network robustness and network modularity exists mainly because nodes in the same module are more sensitive (less robust) to a perturbation on another node in that module than to perturbations in other modules. This implies that dynamically similar nodes tend to be located in the same module of a network. We also demonstrated that a pair of genes associated with the same disease or a pair of functionally similar genes is likely to belong to the same module in a human signalling network. In summary, the high modularity in a biological network can be considered a design principle that makes the network effectively sensitive to perturbation. However, it must be noted that modularity is one of many design principles that controls robustness.

A substantial number of studies have shown that network robustness against mutation is correlated to other structural characteristics, for example feedback loops [21]. Therefore, analysing the effects of the collective structural properties, including modularity and other aspects of network robustness, will hold promise as a future area of study within this field.

## Funding statement

This work was supported by the 2012 Research Fund of University of Ulsan.

## Acknowledgement

None declared.

## Footnotes

↵1 It means that the attractor 〈

*s*〉 is represented by a sequence of state transitions*s*_{0}→*s*_{1}→ … →*s*_{l−}_{1}.↵2 As the signalling networks consist of a huge connected component including most nodes and a number of small connected components, we considered only the largest connected component for analysis.

↵3 In the human signalling network, the neutral interactions represent physical interactions, and thus they do not necessarily need to be considered in analysing the dynamics.

↵4 In this study, Barabási–Albert network-growth model was used so as to generate random networks with a scale-free property inducing a few hub nodes and many other nodes, which is observed in real signalling networks.

- Received August 20, 2013.
- Accepted August 29, 2013.

- © 2013 The Author(s) Published by the Royal Society. All rights reserved.