## Abstract

Most large cities are spanned by more than one transportation system. These different modes of transport have usually been studied separately; it is, however, important to understand the impact on urban systems of coupling different modes, and we report in this paper an empirical analysis of the coupling between the street network and the subway for the two large metropolitan areas of London and New York. We observe a similar behaviour for network quantities related to quickest paths, suggesting the existence of generic mechanisms operating beyond the local peculiarities of the specific cities studied. An analysis of the betweenness centrality distribution shows that the introduction of underground networks operates as a decentralizing force, creating congestion in places located at the end of underground lines. Also, we find that increasing the speed of subways is not always beneficial and may lead to unwanted uneven spatial distributions of accessibility. In fact, for London, but not for New York, there is an optimal subway speed in terms of global congestion. These results show that it is crucial to consider the full, multimodal, multilayer network aspects of transportation systems in order to understand the behaviour of cities and to avoid possible negative side-effects of urban planning decisions.

## 1. Introduction

In the last decade, urban transportation networks have been extensively studied by means of spatial networks [1] in order to understand aspects of urban systems and their evolution. These studies comprise the morphology of street networks [2–17], their evolution [18–21] and their relationships with socio-economic indicators [22,23]. In parallel, there are also studies on the structure of subway networks [24–26], their evolution [27] and their robustness [28,29]. However, these networks are *not* independent, and the important result in [30] showed that the coupling between networks can be critical and can affect the global behaviour of a system. It is in this context that multilayer (or multiplex) networks [31–34] are studied and provide a convenient conceptual framework. A few recent studies considered the impact of the multilayer structure on various general processes [35,36], and specifically in the case of transportation networks [37,38]. Multilayer networks offer a good theoretical framework for understanding how interconnected transportation networks shape cities and how they may affect their operation. Moreover, given the increasing interest in urban systems, an empirical study of the impact of the multilayer structure of transport systems on mobility appears both crucial and timely.

In this study, we consider the mutually connected underground and street networks in the large metropolitan areas of Greater London and New York City, and explore how their coupling affects their global properties. In particular, we analyse the effect of varying the subway speed and show that increasing it can lead to unexpected counter-effects. Our analysis focuses on three main network features and findings: (i) the behaviour of quickest paths at the city scale, (ii) the local outreach and the urban horizon, and (iii) the spatial distribution of betweenness centrality (BC). It is important to stress that studies on urban transportation networks have important implications for urban policies and private investment, and, in general, play an important role in the urban planning chain. In fact, inter-modality transportation efficiency and simulations have been extensively studied in the transportation engineering literature [39], where the typical supply–demand approach prevails but where the analysis of topological properties of networks is almost wholly neglected and where the different transportation modes are often treated separately. One goal in this study is to shift the focus onto this topological coupling aspect of transportation network design: we show this to be extremely relevant, and suggest that the multilayer network view of these systems should be integrated into elaborated models of urban planning.

## 2. Data and network construction

Using data from Open Street Map (http://www.openstreetmap.org/ (accessed on 8 December 2014)), we construct both the street and the subway networks for London (UK) and New York City (USA). We downloaded the street and underground data in geo-referenced vector format; they contain detailed street and rail-track networks, including train depots and double tracks. (The geographical extent of these networks was chosen to include the full underground systems and the surrounding street networks.) A series of automatic and manual topological cleaning operations was then needed in order to extract consistent and usable graphs. The size and geography of the two cities are clearly different, as can be seen in figure 1*a,b*.

We thus obtained the weighted graph of the connected street network in its ‘primal’ representation, with nodes being street junctions, edges representing the street segments connecting them, and weights given by the street length. Similarly, we obtained the connected underground network, with nodes representing underground stations, links connecting successive stations on the same line, and weights given by the length of the line segment. From a theoretical point of view, the interdependent or multilayer [31] network, *G*_{multi}, is defined as the union of these two networks. Here, we consider subway stations and road intersections to be different nodes. Underground stations are accessible from more than one access point on the street, but for the sake of simplicity we construct the multilayer network by connecting each underground station to its closest street junction only (a simplification that would not change the structure of quickest paths). In order to create the adjacency lists, we used a combination of Python scripts, ArcMap geo-tools (ArcGIS 10.2) and *ad hoc* manual corrections. The tools were set to remove link redundancies, to correct the topology of the networks and to create the proximity matrix between street nodes (street junctions) and underground nodes (stations). The scripts were corroborated by a full check of the data and by corrections of the topology in editing sessions in the ArcMap environment. The various statistical measures were computed in Python using the NetworkX library, and the maps were produced using ArcGIS v. 10.2.
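The coupling step described above (each station linked to its closest street junction) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the coordinates, the node names and the helper `attach_stations` are hypothetical, planar coordinates in metres are assumed, and the code is not the authors' actual GIS-based pipeline.

```python
import math

def attach_stations(junctions, stations):
    """Link each station to its nearest street junction (Euclidean distance).

    junctions, stations: dicts {node_id: (x, y)}, planar coordinates in metres
    (an assumption of this sketch).
    Returns a list of interlayer links (station_id, junction_id, distance).
    """
    links = []
    for s_id, (sx, sy) in stations.items():
        # Closest junction to this station
        j_id = min(
            junctions,
            key=lambda j: math.hypot(junctions[j][0] - sx, junctions[j][1] - sy),
        )
        jx, jy = junctions[j_id]
        links.append((s_id, j_id, math.hypot(jx - sx, jy - sy)))
    return links

# Hypothetical toy data: three junctions on a line, two stations nearby
junctions = {"j0": (0.0, 0.0), "j1": (100.0, 0.0), "j2": (200.0, 0.0)}
stations = {"u0": (10.0, 5.0), "u1": (190.0, -5.0)}
interlayer_links = attach_stations(junctions, stations)
```

For city-scale data, a brute-force search over all junctions would be slow; a spatial index would be the natural replacement, but that choice is outside what the text specifies.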

## 3. The generic nature of quickest paths

New York is composed of two large and almost disconnected components, with the underground system covering a similar spatial extent and carving up the different boroughs. London instead presents, at a large scale, a typical radiocentric urban structure, with the underground system connecting satellite districts and peripheries to the urban core. Differences in both size and geography between these cities are also reflected in the basic network descriptors shown in table 1. For both cities, the (spatial) diameter of the multiplex is essentially dominated by the street network. We also observe that the topological diameter of the multiplex is lower than that of the street layer, thanks to the subway structure allowing for topologically shorter paths. The efficiency of the subway is, however, also due to its speed, which is in general larger than that of overground modes such as private cars, taxis or buses. In order to reflect this, we introduce a parameter that describes the ratio of speeds in the two systems, similar to the theoretical analysis proposed in [37]. This parameter *β* measures the travel cost in time units associated with the underground links: the number of time units taken to traverse an underground link of length *l* metres is *βl*, whereas a street link of the same length costs *l* time units, so that travel on the underground is 1/*β* times faster than on the street network. Thus, a smaller *β* corresponds to a faster underground relative to the street network. The introduction of this parameter allows us to study the properties of the multilayer system as a function of underground speed. Naively, one could expect that the system as a whole will be more efficient for faster subways, but we will show here that this is not always the case and that in some cases we can observe an optimal value for *β*. Finally, *β* can be measured empirically, and the value we obtain for New York is slightly larger than that for London.
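To make the role of *β* concrete, here is a minimal sketch of a quickest-path computation on a toy multilayer network in which street links cost their length and underground links cost *β* times their length. The toy network, the function names `build_multilayer` and `quickest_costs`, and the zero-cost access links are assumptions made for illustration; the paper's own computations used NetworkX.

```python
import heapq

def build_multilayer(street_edges, underground_edges, access_edges, beta):
    """Undirected adjacency list {u: [(v, cost), ...]} of the coupled network."""
    adj = {}

    def add(u, v, cost):
        adj.setdefault(u, []).append((v, cost))
        adj.setdefault(v, []).append((u, cost))

    for u, v, length in street_edges:
        add(u, v, length)            # street cost = length
    for u, v, length in underground_edges:
        add(u, v, beta * length)     # underground cost = beta * length
    for u, v, cost in access_edges:
        add(u, v, cost)              # station-junction link (assumed cost)
    return adj

def quickest_costs(adj, source):
    """Dijkstra: quickest-path cost from source to every reachable node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                 # stale heap entry
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Hypothetical toy city: a 400 m street chain and one 400 m underground link
street = [("j0", "j1", 100.0), ("j1", "j2", 100.0),
          ("j2", "j3", 100.0), ("j3", "j4", 100.0)]
underground = [("u0", "u4", 400.0)]
access = [("u0", "j0", 0.0), ("u4", "j4", 0.0)]  # inter-modal cost neglected

cost_by_beta = {}
for beta in (1.0, 0.5, 0.1):
    adj = build_multilayer(street, underground, access, beta)
    cost_by_beta[beta] = quickest_costs(adj, "j0")["j4"]
```

In this toy example the end-to-end cost scales with *β*, while trips between intermediate junctions that are far from any station are unaffected, which is the kind of spatial unevenness the text goes on to discuss.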

We denote by $\tau_{s}(i,j)$ the travel cost (i.e. the number of time units) of the quickest path between street nodes *i* and *j* in the street network, and by $\tau_{m}(i,j)$ the cost of the quickest path between *i* and *j* in the multilayer network (i.e. a path that can traverse *both* street *and* underground links). The normalized quantity

$$\tilde{\tau}_{m} = \frac{\tau_{m} - \langle \tau_{m} \rangle}{\sqrt{\langle \tau_{m}^{2} \rangle - \langle \tau_{m} \rangle^{2}}} \tag{3.1}$$

follows a distribution that is roughly constant for *β* larger than 0.2–0.3, as shown in figure 1*c*, demonstrating that the effect of *β* is essentially contained in the average and variance of $\tau_{m}$. This is a rather surprising result, given that the two cities display many geographical and structural differences. The cost between nodes *i* and *j* can be written as

$$\tau_{m}(i,j) = \sum_{e \in P(i,j)} \tau(e) \tag{3.2}$$

where the sum is over all links *e* that belong to the quickest path *P*(*i*, *j*) and where $\tau(e)$ is the cost on this link. (We neglect inter-modal change costs in this simple argument.) If the path is long enough, and if the random variables $\tau(e)$ do not display long-range correlations and are not broadly distributed, the central limit theorem applies and the distribution of $\tau_{m}(i,j)$ is Gaussian in a certain range. Deviations are observed for small values of *β*, because the path durations become very heterogeneous depending on the proximity of their origin or destination to subway stations. In this respect, a very high relative subway velocity enhances spatial differences in the city and may lead to an uneven distribution of accessibility, a fact that will be confirmed below with the local outreach analysis.

We also compute the average ratio between the travel costs from *i* to other street nodes through the multilayer network and through the street network, defined as

$$\left\langle \frac{\tau_{m}}{\tau_{s}} \right\rangle_{i} = \frac{1}{N_{s} - 1} \sum_{j \neq i} \frac{\tau_{m}(i,j)}{\tau_{s}(i,j)} \tag{3.3}$$

where *N*_{s} is the number of street nodes. Since a multilayer path is never slower than the corresponding street-only path, this ratio is at most one, and the smaller it is, the larger the effect of the underground on travel costs. We see in figure 1*d* that typical values are of the order of 0.5 for both cities and that the effect of *β* is rather weak: a decrease from *β* = 1 to 0.5 leads to a decrease in the average ratio of the order of 20%. In addition, it seems that the effect of subways in London is less important than in New York, which is probably due to the lesser extent of the subway in the Greater London area.
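The two summary quantities discussed above, the standardized travel cost and the average multilayer-to-street cost ratio, can be computed from lists of quickest-path costs as in the following sketch. The cost values are invented for illustration, and the function names `normalise` and `average_ratio` are hypothetical; the standardization (subtract the mean, divide by the standard deviation) is the reading suggested by the text, not a confirmed formula.

```python
import statistics

def normalise(costs):
    """Standardize costs: subtract the mean and divide by the (population) std."""
    mean = statistics.mean(costs)
    sd = statistics.pstdev(costs)
    return [(c - mean) / sd for c in costs]

def average_ratio(multi_costs, street_costs):
    """Mean of multilayer/street cost ratios over destinations j != i."""
    ratios = [m / s for m, s in zip(multi_costs, street_costs)]
    return sum(ratios) / len(ratios)

# Hypothetical costs from one origin i to four destinations
street_costs = [100.0, 200.0, 300.0, 400.0]   # street network only
multi_costs = [100.0, 100.0, 150.0, 200.0]    # street + underground

z = normalise(multi_costs)
ratio = average_ratio(multi_costs, street_costs)
```

Comparing histograms of the standardized costs for several values of *β* is then a direct way to check whether the distribution collapses onto a single curve.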

A central quantity for describing the importance of inter-modality is given by

$$\lambda_{ij} = \frac{\sigma_{ij}^{\mathrm{coupled}}}{\sigma_{ij}} \tag{3.4}$$

where $\sigma_{ij}$ is the total number of shortest paths between *i* and *j* (using either one or two networks), and $\sigma_{ij}^{\mathrm{coupled}}$ the number of paths using edges of both networks at least once. It characterizes the importance of multi-modality for the path from *i* to *j*. If we sum over all possible destination nodes *j*, we can quantify the added value of the interlayer coupling to the reachability of nodes, and obtain the interdependency [37] of a street node *i*, defined as

$$\lambda_{i} = \frac{1}{N_{s} - 1} \sum_{j \neq i} \lambda_{ij} \tag{3.5}$$

where *N*_{s} is the number of street nodes. (Note that a similar measure has been used in the transportation design literature under the name of *inter-modal connectivity* [39].) In order to understand the effect of scale on the interdependence, we also define the interdependence profile as

$$\lambda(d) = \frac{1}{N(d)} \sum_{i,j \,:\, d_{ij} = d} \lambda_{ij} \tag{3.6}$$

where $d_{ij}$ is the Euclidean distance between *i* and *j* and *N*(*d*) is the number of pairs of nodes at Euclidean distance *d*. In figure 2*a*, we show the average interdependence among all street nodes as a function of *β* and, in figure 2*b*, the resulting interdependence profile.
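One way to obtain the path counts entering the interdependence is a Dijkstra search that counts quickest paths and, in parallel, those quickest paths that never use an underground edge. The sketch below makes several assumptions that are not in the text: costs are strictly positive, a path counts as multimodal as soon as it uses at least one underground edge (its endpoints being street nodes), and the toy network, access costs and function names are hypothetical.

```python
import heapq
import math

def count_quickest_paths(adj, source):
    """Dijkstra with path counting (strictly positive costs assumed).

    adj: {u: [(v, cost, is_underground), ...]}
    Returns (total, street_only): number of quickest paths to each node and
    number of those quickest paths that avoid every underground edge.
    """
    dist = {source: 0.0}
    total = {source: 1}
    street_only = {source: 1}
    done = set()
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        for v, w, is_underground in adj[u]:
            if v in done:
                continue
            nd = d + w
            plain = 0 if is_underground else street_only[u]
            old = dist.get(v, math.inf)
            if math.isclose(nd, old):       # another quickest path to v
                total[v] += total[u]
                street_only[v] += plain
            elif nd < old:                  # strictly quicker path found
                dist[v] = nd
                total[v] = total[u]
                street_only[v] = plain
                heapq.heappush(heap, (nd, v))
    return total, street_only

def interdependence(adj, i, street_nodes):
    """Average fraction of quickest paths from i that use the underground."""
    total, street_only = count_quickest_paths(adj, i)
    others = [j for j in street_nodes if j != i]
    return sum((total[j] - street_only[j]) / total[j] for j in others) / len(others)

def undirected(edges):
    adj = {}
    for u, v, cost, is_underground in edges:
        adj.setdefault(u, []).append((v, cost, is_underground))
        adj.setdefault(v, []).append((u, cost, is_underground))
    return adj

beta = 0.5
toy = undirected([
    ("j0", "j1", 100.0, False), ("j1", "j2", 100.0, False),  # street
    ("u0", "j0", 10.0, False), ("u2", "j2", 10.0, False),    # access (assumed cost)
    ("u0", "u2", beta * 200.0, True),                        # underground
])
lam_j0 = interdependence(toy, "j0", ["j0", "j1", "j2"])
```

Here the trip from `j0` to `j2` is quickest via the underground while the trip to `j1` stays on the street, so half of the quickest paths from `j0` are multimodal.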

We see from these figures that, in both cities, the existence of the underground has a very large impact. For example, for *β* = 0.8 we obtain *λ* around 0.7, meaning that even when the underground is only 1.25 times faster than the street network, about 70% of the quickest paths already go through the underground. A slight decrease in *β* for *β* close to one thus has a large impact on the structure of the quickest paths, while for smaller values of *β*, improving the subway speed does not bring a significant improvement of the quickest paths. In both cities, there is a sharp increase in *λ* at small Euclidean distances, meaning that even for relatively short trips it is worth ‘hopping on’ to the underground. (Note that we neglect here waiting, walking and connecting times, which can be significant [38].) The slope of the interdependence profile at small distance *d* increases as *β* decreases, suggesting that a slight increase in the underground speed could make the networks highly interdependent even at very small scales.

Both cities therefore display a remarkably similar behaviour over all these interdependency-related quantities (in particular, see figure 2*b*), suggesting here again a possible common behaviour for multiplex transportation networks in cities. While further studies are needed to substantiate a claim of ‘universality’, our results point to the possible existence of some kind of statistical law of large numbers that applies to quickest paths in multiplex urban transportation networks.

We note that it is not trivial that the central limit theorem applies here, and it does not mean that the network topology is irrelevant. The fact that we can sum a large number of quantities, which are essentially uncorrelated (a necessary condition for the central limit theorem to apply) comes from the specific structure of these transportation systems (spatial constraints for example certainly play an important role). In addition, more complex quantities (such as the interdependence for example) also display a high level of similarity for the two cities, a fact that cannot at this stage be simply related to a central limit theorem. These different results point to the potentially useful fact that actually few parameters seem to govern the behaviour of these quantities, which could lead to many useful simplifications in more elaborated models that contain a large number of parameters.

## 4. Local outreach and the urban spatial horizon

The presence of a transportation mode such as a subway affects the overall performance of a city in terms of efficiency of transport and the accessibility of certain locations, but also has an important impact on how pairs of locations are connected. In order to measure this effect, we define the *spatial outreach* of a street node *i* as the average Euclidean distance from *i* to all other street nodes that are reachable within a given travel cost, *τ*:

$$L_{\tau}(i) = \frac{1}{N_{i}(\tau)} \sum_{j \,:\, \tau_{m}(i,j) \leq \tau} d_{ij} \tag{4.1}$$

where $d_{ij}$ is the Euclidean distance between node *i* and *j*, $\tau_{m}(i,j)$ is the cost of the quickest path between them on the multilayer network, and $N_{i}(\tau)$ is the number of nodes reachable from *i* on the multilayer network within a given travel cost *τ*. In figure 3, we show the average local outreach $\langle L_{\tau} \rangle$ as a function of the travel cost threshold *τ*, which displays a nonlinear behaviour due to the different speeds achievable in the two transportation modes. This provides support for a general effect that is already known: for longer trips, faster transportation modes are used (see, for example, [38] for the UK case).
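The outreach definition above translates directly into code once the quickest-path costs from a node are known. The sketch below assumes those costs are already available as a dictionary; the coordinates, cost values and the function name `spatial_outreach` are hypothetical, and excluding node *i* itself from the average is a choice made here, not something the text specifies.

```python
import math

def spatial_outreach(i, costs_from_i, coords, tau):
    """Average Euclidean distance from i to nodes reachable within cost tau.

    costs_from_i: {j: quickest-path cost from i to j on the multilayer network}
    coords: {node: (x, y)} planar coordinates in metres (assumed)
    """
    reachable = [j for j, c in costs_from_i.items() if j != i and c <= tau]
    if not reachable:
        return 0.0
    xi, yi = coords[i]
    return sum(
        math.hypot(coords[j][0] - xi, coords[j][1] - yi) for j in reachable
    ) / len(reachable)

# Hypothetical data: node "d" is far away but cheap to reach (e.g. by subway)
coords = {"a": (0.0, 0.0), "b": (300.0, 0.0), "c": (0.0, 400.0), "d": (3000.0, 4000.0)}
costs_from_a = {"a": 0.0, "b": 300.0, "c": 400.0, "d": 900.0}

outreach_500 = spatial_outreach("a", costs_from_a, coords, tau=500.0)
outreach_1000 = spatial_outreach("a", costs_from_a, coords, tau=1000.0)
```

The jump in outreach once the distant node becomes reachable illustrates the nonlinear dependence on *τ* that arises when a faster mode comes into play.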

For New York, the unit of time is given by the average car speed on the street network which is 15.6 km h^{−1} (e.g. [40]). Rescaling *τ* by this velocity, we then obtain an effective maximum speed (for *β* = 0.1) of 30 km h^{−1} (close to the 28 km h^{−1} discussed in [41]). For London, the same calculation with an average car speed of 16 km h^{−1} (see Transport for London, http://www.tfl.gov.uk/ (accessed on 8 December 2014)) yields an effective maximum speed of 21 km h^{−1}. (This difference in speeds is due to the areas considered, as New York is almost entirely covered by the underground network.)

As shown in figure 4*a*,*b*,*d*,*e*, as *β* decreases, the nodes having a high local outreach are concentrated close to underground stations, where the underground is the most accessible, and the graph consisting of high-outreach nodes (red nodes on the map) becomes less fragmented. In other words, as the underground becomes faster, a continuous area of high-outreach nodes emerges (the *commutable zone*) in the city centre and around the nodes of the underground network, implying that a person can travel from this area to faraway places (large Euclidean distance) at a small travel cost *τ*. The location of this highly accessible zone shifts from a dispersed configuration (as in figure 4*d*) to a centralized one (as in figure 4*a*), which shows a centralization effect due to the accessibility provided by the underground. The dispersion of the local outreach, measured by its Gini coefficient, also displays a very interesting result. Indeed, in figure 4*e*,*f* we see that for both cities, for *β* > 0.5, the accessibility is distributed almost uniformly among all the places in the cities, while for smaller *β* (faster underground) the shift to an uneven distribution of accessibility is clear. This result suggests that transportation policies that focus on increasing the speed of a single travel mode may lead to undesirable spatial heterogeneity in the accessibility of different locations.
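For reference, the Gini coefficient used above as a measure of dispersion can be computed with the standard sorted-values formula. The sketch below is a generic implementation for non-negative values, not the authors' code, and the sample outreach values are invented.

```python
def gini(values):
    """Gini coefficient of non-negative values (0 = perfectly even).

    Uses G = 2 * sum_k(k * x_k) / (n * sum(x)) - (n + 1) / n,
    with x_k sorted in increasing order and k = 1..n.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum((k + 1) * x for k, x in enumerate(xs))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n

# Hypothetical outreach values for four nodes
even = gini([1.0, 1.0, 1.0, 1.0])          # all nodes equally accessible
concentrated = gini([0.0, 0.0, 0.0, 1.0])  # a single node holds everything
```

For *n* values the maximum attainable with this estimator is (*n* − 1)/*n*, reached when a single node holds all of the quantity.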

We show in figure 5 the probability that the outreach is larger than a certain fraction *αL* of the size of the city, and we observe the existence of a threshold *α*_{c}. The existence of a threshold less than one means that, for given values of *β* and *τ*, there is a maximal value *L*_{m} for the outreach. We can estimate the value of *L*_{m} by using a simple argument: the maximum value is reached when the path is ‘essentially’ made on the quickest transportation mode, the subway. This transportation mode has a velocity given by *v*/*β*, and the probability that a station is within reach (in a circle of radius *d*_{0} corresponding to the typical walking distance to reach the subway) is

$$p \simeq \pi d_{0}^{2} \rho_{u} \tag{4.2}$$

where $\rho_{u} = N_{u}/A$ is the density of subway stations (*A* = *L*^{2} is the area of the city and *N*_{u} is the number of subway stations). The maximal outreach is then given by

$$L_{m} \sim p \, \frac{v}{\beta} \, \tau \tag{4.3}$$

and *α*_{c} is then given by

$$\alpha_{c} = \frac{L_{m}}{L} \sim \frac{\tau}{\beta} \, \frac{\pi d_{0}^{2} v N_{u}}{L^{3}} \tag{4.4}$$

This last equation shows in particular that the quantity *α*_{c} should increase linearly with *τ*/*β*, with a constant of proportionality depending on the geometry of the city, and we observe that this scaling is in agreement with simulations (figure 5*c*,*d*). In particular, we see that the slopes for London and New York are different: the ratio of the constant pre-factors is about 10, suggesting that the subway system in London is more efficient in terms of the outreach, which can be used as a measure of the ‘urban horizon’.

## 5. The geography and distribution of urban centrality

The BC [42] is one of the important quantities in complex networks, and in street networks in particular [8]. It quantifies the importance of a node as being the amount of traffic going through it, assuming uniform demand where the traffic between all pairs of nodes is the same. This quantity is very relevant in urban systems: in particular, it is correlated with the locations of shops and other micro-economic activity [22,23], urban growth [19,20] and land-use intensity [43].

In the case of car traffic and congestion, the absence of detailed traffic models or mobility data leads us to use the BC in order to identify the *potentially* congested locations and the effects of spatial structure on the shortest path structure. Even if we know that the assumptions used in the BC calculation can lead to some inaccuracies [44], it is the simplest proxy that contains some level of information about real traffic. We thus explore in this section the spatial distribution of BC in the street network and how it is affected by the underground system. The BC of a street node in the street network is defined as
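$$\mathrm{BC}_{s}(v) = \sum_{i \neq j \neq v} \frac{\sigma^{s}_{ij}(v)}{\sigma^{s}_{ij}} \tag{5.1}$$

where $\sigma^{s}_{ij}$ is the number of quickest paths between *i* and *j* in the street network, of which $\sigma^{s}_{ij}(v)$ go through street node *v*. Similarly, we define the BC of a street node in the multilayer network as

$$\mathrm{BC}_{m}(v) = \sum_{i \neq j \neq v} \frac{\sigma^{m}_{ij}(v)}{\sigma^{m}_{ij}} \tag{5.2}$$

where $\sigma^{m}_{ij}$ is the number of quickest paths between *i* and *j* in the multilayer network, of which $\sigma^{m}_{ij}(v)$ go through street node *v*.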
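The BC defined above can be computed with Brandes' algorithm, which is also what NetworkX's `betweenness_centrality` function implements (with a `weight` argument for weighted graphs). The self-contained sketch below is a plain-Python version for small weighted undirected graphs; it assumes strictly positive costs, counts each unordered pair once, and is run on hypothetical toy graphs rather than on the networks of the paper.

```python
import heapq
import math

def betweenness(adj):
    """Unnormalized BC via Brandes' algorithm (weighted, undirected).

    adj: {u: [(v, cost), ...]} with every undirected edge listed both ways.
    """
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        # Single-source quickest paths with path counts and predecessors
        dist = {s: 0.0}
        sigma = dict.fromkeys(adj, 0)
        sigma[s] = 1
        preds = {v: [] for v in adj}
        order, done = [], set()
        heap = [(0.0, s)]
        while heap:
            d, u = heapq.heappop(heap)
            if u in done:
                continue
            done.add(u)
            order.append(u)
            for v, w in adj[u]:
                if v in done:
                    continue
                nd = d + w
                old = dist.get(v, math.inf)
                if math.isclose(nd, old):      # equally quick path
                    sigma[v] += sigma[u]
                    preds[v].append(u)
                elif nd < old:                 # strictly quicker path
                    dist[v] = nd
                    sigma[v] = sigma[u]
                    preds[v] = [u]
                    heapq.heappush(heap, (nd, v))
        # Back-propagate pair dependencies in order of decreasing distance
        delta = dict.fromkeys(adj, 0.0)
        for x in reversed(order):
            for u in preds[x]:
                delta[u] += sigma[u] / sigma[x] * (1.0 + delta[x])
            if x != s:
                bc[x] += delta[x]
    # Each unordered pair was visited from both endpoints
    return {v: b / 2.0 for v, b in bc.items()}

# Hypothetical toy graphs with unit costs
path = {"a": [("b", 1.0)], "b": [("a", 1.0), ("c", 1.0)], "c": [("b", 1.0)]}
square = {
    "a": [("b", 1.0), ("d", 1.0)], "b": [("a", 1.0), ("c", 1.0)],
    "c": [("b", 1.0), ("d", 1.0)], "d": [("c", 1.0), ("a", 1.0)],
}
bc_path = betweenness(path)
bc_square = betweenness(square)
```

Running the same function on a street-only adjacency list and on the corresponding multilayer adjacency list (with underground costs multiplied by *β*) gives the two quantities compared in this section.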

We can then observe how the parameter *β* impacts the mobility distribution and the geography of potentially congested areas. The maps in figure 6*a–d* show the BC spatial distribution for both cities computed on streets for *β* = 1 (*a*,*b*) and *β* = 0.1 (*b*,*c*). These maps clearly display a dramatic change in the spatial distribution of central places when introducing an underground system, shifting congestion from internal street routes and bridges to inter-modal places located at the terminal points of the underground networks, which presumably are used as entry/exit gates for suburban flows to reach core urban areas. Remarkably enough, in both cities, these places are located in urban areas that do not overlap with the underground system, thus possibly creating congestion in unexpected places. In other words, the introduction of underground networks operates as a decentralizing force, creating congestion in places located at the ends of underground lines and not, for example, in the city centre, as one might expect from classical results on rewiring processes for chain or lattice networks [1], in which BC is correlated with the distance to the gravitational centre. The statistical dispersion of BC can be measured by its Gini coefficient, which also suggests that congested places always become more critical in the system as *β* decreases. In fact, as shown in figure 6, the Gini coefficient of BC increases as the underground becomes more efficient (faster, decreasing *β*), meaning that a larger fraction of quickest paths use it; the BC distribution then becomes less homogeneous, making the system more fragmented and less resilient.

Examining the BC Gini as a function of *β* and the interdependency *λ* in London (figure 7), we observe a non-trivial optimal value of *β* for which flows are the most homogeneously distributed across street junctions. In New York (figure 7*b*), however, there seems to be room for both small *β* and small congestion, and the absence of a non-trivial optimum suggests (as discussed theoretically in [37]) that, surprisingly, New York has a more marked monocentric aspect than London. In other words, the congestion in central places in New York is so large that introducing an efficient subway system is always better, even if it creates congestion at other points. Remarkably, these results on the BC and on the existence of an optimal point are thus in agreement with a recent theoretical model of coupled transportation networks, where, depending on the distribution of trip targets, two regimes were observed: one in which the optimal coupling is trivially the maximum, and another where a non-trivial optimal coupling exists [37].

## 6. Discussion

We have considered the effect of the coupling between two transportation layers on various quantities and we can summarize our results as follows. For quantities relating to quickest paths (interdependency, average quickest path duration), we observe a remarkable similarity between the two cities considered, suggesting the possibility of a universal behaviour requiring further study. This universality might originate in the fact that the quickest path can be seen as a sum of random variables, which leads to some sort of central limit theorem. This seems to be the case for the probability distribution of the quickest path time duration, which (once normalized) is a universal function for a reasonable range of subway speeds. More involved quantities such as the local outreach and the urban horizon also display a simple common behaviour that cannot be recovered using a back-of-the-envelope argument about the quickest path. This possible universality suggests that few parameters govern the behaviour of quickest paths, which could lead to many useful simplifications in more elaborate models that contain a large number of parameters. More data on more cities are, however, needed in order to validate this idea of universality and to understand its origin.

We also studied the impact of the coupling of layers on the spatial distribution of the BC. We observe results in agreement with previous theoretical findings, with in particular the existence for London of an optimal subway velocity in terms of congestion. These results on spatial distribution of centralities can also be understood in the framework of another study showing how decentralizing housing in London leaves room for commercial activities [45]. Even if the direct causal link between land-use change and transportation network evolution is still not clear [46], our results seem to go in that direction. It would be interesting to confront our results with realistic models used by the transportation community, in particular when the subway speed is modified. More generally, we believe that more empirical studies are needed in order to better understand the complex coupling between land-use and the structure of multimodal networks.

It thus seems clear that it is important to consider full multimodal, multilayer network aspects in order to understand the behaviour of an urban transport system—and thus to understand the effects of transport on other features of interest. Even if these studies are still very theoretical, they show convincingly that reasoning with only one transportation mode can be extremely misleading, and that policymakers cannot limit themselves to a single aspect of an urban system without risking making decisions that are locally correct but globally wrong.

## Data accessibility

Original network data are available on Open Street Map (http://www.openstreetmap.org/ (accessed on 8 December 2014)). The cleaned and topologically corrected networks are also freely available (doi:10.6084/m9.figshare.1317306; direct link to data: http://figshare.com/s/56135c96bcd311e4b49106ec4b8d1f61 (date of creation: 25 February 2015)). They include ArcMap shapefiles (.shp) containing the street and underground networks with their adjacency lists. These files can be opened on any GIS platform.

## Authors' contributions

S.S. wrote the computational scripts and performed the computations on the multiplex networks. E.S. prepared the data and produced the maps. All authors designed the research and wrote the paper.

## Competing interests

The authors declare no competing financial interests.

## Funding

M.B. acknowledges funding from the European Commission FET-Proactive project PLEXMATH (grant no. 317614). S.S. thanks the James S. McDonnell Foundation 21st Century Science Initiative—Complex Systems Scholar Award (grant no. 220020315) and the Scottish Informatics and Computer Science Alliance for financial support.

## Acknowledgements

E.S. thanks Bilal Farooq, Riccardo Scarinci, Michel Bierlaire, Sergio Porta and Luis Bettencourt for their suggestions at various stages of the research. E.S. and S.S. thank Sergio Porta for hosting us at his laboratory in Glasgow at the early phase of the project. M.B. thanks Riccardo Gallotti for discussions.

- Received July 21, 2015.
- Accepted August 28, 2015.

- © 2015 The Author(s)

Published by the Royal Society. All rights reserved.