## Abstract

Despite the long history of modelling human mobility, we continue to lack a highly accurate approach with low data requirements for predicting mobility patterns in cities. Here, we present a population-weighted opportunities model without any adjustable parameters to capture the underlying driving force accounting for human mobility patterns at the city scale. We use various mobility data collected from a number of cities with different characteristics to demonstrate the predictive power of our model. We find that insofar as the spatial distribution of population is available, our model offers universal prediction of mobility patterns in good agreement with real observations, including distance distribution, destination travel constraints and flux. By contrast, the models that succeed in modelling mobility patterns in countries are not applicable in cities, which suggests that there is a diversity of human mobility at different spatial scales. Our model has potential applications in many fields relevant to mobility behaviour in cities, without relying on previous mobility measurements.

## 1. Introduction

Predicting human mobility patterns is not only a fundamental problem in geography and spatial economics [1], but it also has many practical applications in urban planning [2], traffic engineering [3,4], infectious disease epidemiology [5–7], emergency management [8–10] and location-based service [11]. Since the 1940s, many trip distribution models [12–18] have been presented to address this challenging problem, among which the gravity model is the prevailing framework [13]. Despite its wide use in predicting mobility patterns at different spatial scales [19–22], the gravity model relies on specific parameters fitted from systematic collections of traffic data. If previous mobility measurements are lacking, the gravity model is not applicable. A similar limitation exists in all trip distribution models that rely on context-specific parameters, such as the intervening opportunity model [12], the random utility model [14] and others.

Quite recently, the introduction of the radiation model [15] has provided a new insight into the long history of modelling population movement. The model is based on a solid theoretical foundation and can precisely reproduce observed mobility patterns ranging from long-term migrations to intercounty commutes. Surprisingly, the model needs only the spatial distribution of population as an input, without any adjustable parameters. Nevertheless, some evidence has demonstrated that the radiation model may not be applicable to predicting human mobility at the city scale [23,24]. Understanding mobility patterns in cities is of paramount importance in the sense that cities are the foci of disease propagation, traffic congestion and pollution [6,25], partly resulting from human movement. These problems can be resolved through developing more efficient transportation systems and optimizing traffic management strategies, all of which depend on our ability to predict human travel patterns in cities [26]. Despite the success of the radiation model in countries, we continue to lack an explicit and comprehensive understanding of the underlying mechanism accounting for the observed mobility patterns in cities. We argue that this is mainly attributable to the relatively high mobility of residents in cities compared with larger scales, such as travelling among counties. Inside cities, especially metropolises, highly developed traffic systems allow residents to travel relatively long distances to locations with more opportunities and attraction. In this sense, the models that are quite successful in reproducing mobility patterns at large spatial scales fail at the city scale. Yet, revealing the underlying driving force and restrictions for such mobility to predict mobility patterns in cities remains an outstanding problem.

In this paper, we develop a population-weighted opportunities (PWO) model without any adjustable parameters as an alternative to the radiation model to predict human mobility patterns in a variety of cities. Insofar as the distribution of population in different cities is available, our model offers universal prediction of human mobility patterns in several cities as quantified by some key measurements, including distance distribution, destination travel constraints and flux. By contrast, the models that succeed in predicting mobility patterns at large spatial scales, such as countries, are inappropriate at the city scale because of the underestimation of human mobility. Our approach suggests the diversity of human mobility at different spatial scales, deepening our understanding of human mobility behaviours.

## 2. Results

### 2.1. Population-weighted opportunities model

The model is derived from a stochastic decision-making process of an individual's destination selection. Before an individual selects a destination, she/he will weigh the benefit of each location's opportunities. The more opportunities a location has, the higher the benefit it offers and the higher the chance of it being chosen. Although the number of a location's opportunities is difficult to measure directly, it can be reflected by its population. Insofar as the population distribution is available, it is reasonable to assume that the number of opportunities at a location is proportional to its population, analogous to the assumption of the radiation model [15].

In contrast to the radiation model's assumption that individuals tend to select the nearest locations with relatively larger benefits, we enlarge the possible chosen area of individuals to include the whole city regarding the relatively high mobility at the city scale. As shown in figure 1, our assumption leads to much better prediction than that of the radiation model. Nevertheless, the possibility of travel in the observed data still decays as the distance between the origin and destination increases. Such decay, as predicted by different models and common in real observations, results from the reduction of attraction associated with a type of cost. For example, the gravity model [13] assumes that the attraction of a destination, i.e. its opportunities, is reduced according to a function of the distance from the origin. However, the distance function inevitably includes at least one parameter. To capture the mobility behaviours and avoid adjustable parameters, we simply assume that the attraction of a destination is inversely proportional to the population $S_{ji}$ in the circle centred at the destination with radius $r_{ij}$ (the distance between the origin $i$ and destination $j$, as illustrated in the inset of figure 1), minus a finite-size correction $1/M$, i.e.

$$A_j = o_j\left(\frac{1}{S_{ji}} - \frac{1}{M}\right), \tag{2.1}$$

where $A_j$ is the relative attraction of destination $j$ to travellers at origin $i$, $o_j$ is the total opportunities of destination $j$ and $M$ is the total population in the city. Further, assuming that the probability of travel from $i$ to $j$ is proportional to the attraction of $j$ and recalling the assumption that the number of opportunities $o_j$ is proportional to the population $m_j$, we have the travel from $i$ to $j$ as

$$T_{ij} = T_i\,\frac{m_j\left(\dfrac{1}{S_{ji}} - \dfrac{1}{M}\right)}{\sum_{k\neq i}^{N} m_k\left(\dfrac{1}{S_{ki}} - \dfrac{1}{M}\right)}, \tag{2.2}$$

where $T_i$ is the number of trips departing from $i$ and $N$ is the number of locations in the city.

The presented model reflects the effect of competition for opportunities among potential destinations: for a traveller at origin *i* travelling to a potential destination *j*, more population between *i* and *j* will induce stronger competition for limited opportunities, so that the probability of being offered opportunities will be lower. In this regard, it is reasonable to assume that the attraction of a destination for a traveller is the destination's opportunities inversely weighted by population between the destination and the origin. We therefore name our model the PWO model. We then demonstrate the universal predictability of mobility patterns in cities via the PWO model through a variety of real travel data in several cities.
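As an illustration, the PWO prediction can be sketched in a few lines of Python. This is a minimal sketch rather than the implementation used for the results: the function name `pwo_fluxes`, the use of zone centres with Euclidean distances, and the choice of a closed circle (so that the population count around the destination includes both the destination and the origin on its boundary) are assumptions made here for illustration.

```python
import math

def pwo_fluxes(coords, pop, trips_out):
    """Return the matrix T[i][j] of trips predicted by the PWO model.

    coords    -- list of (x, y) zone centres in any consistent length unit
    pop       -- list of zone populations m_j (assumed positive)
    trips_out -- list of trips T_i departing from each zone
    """
    n = len(pop)
    total = sum(pop)  # M, the total population of the city
    dist = [[math.dist(coords[a], coords[b]) for b in range(n)] for a in range(n)]

    def circle_population(j, i):
        # S_ji: population within distance r_ij of destination j.
        # Assumption: the circle is closed, so it contains j and i.
        r = dist[j][i]
        return sum(pop[k] for k in range(n) if dist[j][k] <= r)

    flux = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Relative attraction of every destination j != i:
        # opportunities (taken as m_j) weighted by 1/S_ji - 1/M.
        attraction = [
            0.0 if j == i
            else pop[j] * (1.0 / circle_population(j, i) - 1.0 / total)
            for j in range(n)
        ]
        norm = sum(attraction)
        if norm > 0:  # leave the row at zero if no destination attracts
            for j in range(n):
                flux[i][j] = trips_out[i] * attraction[j] / norm
    return flux

# Toy example: four zones on a line, 1 unit apart.
example = pwo_fluxes([(0, 0), (1, 0), (2, 0), (3, 0)],
                     [10, 20, 30, 40],
                     [100, 200, 300, 400])
```

Because the model is origin-constrained, each row of the returned matrix sums to the corresponding $T_i$ whenever at least one destination has positive attraction. The nested loops cost roughly $N^3$ operations, which is acceptable for a sketch but would need sorting or spatial indexing for thousands of zones.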

### 2.2. Predicting mobility patterns

To validate the PWO model by comparison with the performance of the radiation model (see details in §4.3), we employ human daily travel data from four cities collected by GPS, mobile phone and traditional household surveys (see details in §4.1 and 4.2).

Figure 1 exemplifies travel from a downtown and a suburban location in Beijing predicted in an intuitive manner by the PWO model and the radiation model in comparison with real data. It shows that the radiation model underestimates the travel areas in both cases, whereas the travel patterns resulting from our model are quite consistent with empirical evidence, demonstrating the relatively higher mobility in cities than at larger spatial scales where the radiation model succeeds, such as countries.

We systematically investigate the travel distance distribution obtained by both models based on real data. Travel distance distribution is an important statistical property to capture human mobility behaviours [27–29] and reflect a city's economic efficiency [1]. We find that, as shown in figure 2, the distributions of travel distance predicted by the PWO model are in good agreement with the real distributions. By contrast, the radiation model underestimates long-distance (longer than approx. 2 km) travel in all cases. This implies that the assumption of the radiation model is inappropriate at the city scale by precluding individuals from choosing relatively long journeys to find better locations with more opportunities. The success of the PWO model in predicting real travel distance distributions in cities provides strong evidence for the validity of its basic assumptions.

We next explore the probability of travel towards a location with population *m*, say, *P*_{dest}(*m*), for both observed data and the predictive models. *P*_{dest}(*m*) is a key quantity for measuring the accuracy of origin-constrained mobility models (the radiation model and PWO model used here are both origin-constrained), because origin-constrained models cannot ensure the agreement between modelled travel to a location and real travel to the same location [3]. In figure 3, we can see that our model equally or better predicts empirical observations compared with the radiation model.

A more detailed measure of a model's ability to predict mobility patterns can be implemented in terms of the travel fluxes between all pairs of locations produced by a model in comparison with real observations, as has been used in [15]. As shown in figure 4, we find that, except in the case of Abidjan, the average fluxes predicted by the radiation model deviate from the real fluxes, whereas the results from the PWO model are in reasonable agreement with real observations.

Note that the boxplot method used here does not allow an explicit comparison to distinguish the performance of the two models. For example, in figure 4*e*, although the results deviate from the empirical data significantly, the boxes are still coloured green, suggesting the need for an alternative statistical method. Thus, we use the Sørensen similarity index [31] (see details in §4.4) to quantify the degree of similarity with real observations and offer a better comparison. We have also applied both models to six European cities and another four US cities to make a more comprehensive comparison (details are available in the electronic supplementary material, §§S1 and S2). The results are shown in figure 5. For all studied cases, our model outperforms the radiation model and exhibits relatively high index values of approximately 0.7, indicating that the PWO model captures the underlying mechanism that drives human movement in cities.

## 3. Discussion

We developed a PWO model as an alternative to the radiation model to reproduce and predict mobility behaviours in cities with different sizes, economic levels and cultural backgrounds. Our model needs only the spatial distribution of population as an input, without any adjustable parameters. The mobility patterns resulting from the model are in good agreement with real data with respect to travel distance distribution, destination travel constraints and flux, suggesting that the model captures the fundamental mechanisms governing human daily travel behaviours at the city scale.

The radiation model, despite having the advantage of being parameter-free and performing well at large spatial scales, cannot offer satisfactory predictions of mobility patterns at the city scale. The problem lies in the underestimation of the relatively high mobility at the city scale. In particular, the radiation model assumes that limited mobility prevents people from selecting a farther location with more opportunities to gain more benefits than a nearby location. This assumption is reasonable at the intercity scale, but inappropriate in cities. The PWO model overcomes this problem by assuming that the attraction of a potential destination is inversely proportional to the population in the circle centred at that destination with radius equal to its distance from the origin, which reflects competition for opportunities in the whole city. Insofar as only population distribution is available, our model presently offers the best prediction of mobility patterns at the city scale, significantly deepening our understanding of human mobility in cities and demonstrating the universal predictability of mobility patterns at the city scale.

We have also compared the PWO model with three classical parametrized models: the gravity model [13], the intervening opportunity model [12] and the rank-based model [17] (see details in the electronic supplementary material, §S3). Although in rare cases the parametrized models can yield better predictive accuracy than the PWO model, their parameter-dependence nature limits their scope to the cases with particular previous systematic mobility measurements and relatively stable mobility patterns. The PWO model, without such limitations, has much more predictive power. By exploring the relationship between all the previous models and our PWO model (see details in the electronic supplementary material, §S4), we find that although these models have different hypotheses, they share an underlying mechanism: the probability that an individual selects a location to travel is decreased along with the increment of some prohibitive factors (distance or population). The key difference lies in that the gravity model, the intervening opportunity model and the rank-based model need adjustable parameters to quantify the decrement, whereas the decrement is naturally determined by population distribution in the radiation model and the PWO model.

It is noteworthy that despite the advantages of the PWO model in predicting mobility patterns at the city scale, the predictability could be improved further. The travel matrices established by the model have approximately 70% in common with the real data (figure 5). Although this accuracy can suffice for the requirements in many areas of application, for example, in urban planning and epidemic modelling [32], it is still below the average upper limit of the predictability of human mobility [26]. In principle, the PWO model is essentially a type of aggregate travel model [3] based on the collective behaviours of groups of similar travellers, whereas the diversity of real individuals' behaviours [33,34] is in contrast to the assumptions of aggregate models, accounting for their inaccuracy in reproducing and predicting movement patterns. Microscopic mobility models, such as agent-based models [35–39], may offer better prediction of mobility patterns as an alternative but suffer from much higher computational complexity. Therefore, an efficient macroscopic mobility model taking the diversity of individual behaviours into account would be worth pursuing in the future to further deepen our understanding of human mobility.

## 4. Material and methods

### 4.1. Datasets

1. *Beijing taxi passengers*. This dataset contains the travel records of taxi passengers in Beijing over one week [40]. When a passenger gets into or out of a taxi, the coordinates and time are recorded automatically by a GPS-based device installed in the taxi. From the dataset, we extract 1 070 198 taxi passenger travel records. Some evidence indicates that in Beijing, the average travel distance of taxi passengers is similar to the commuting distance [41], and the spatial distribution of taxi passengers is similar to that of populations [42]. Thus, the taxi passengers’ data can capture the travel pattern of urban residents to some extent, although taxi passengers only constitute a small subset of the population in a city.

2. *Shenzhen taxi passengers*. The Shenzhen taxi passenger tracker data have the same data format as that of Beijing. The dataset records 2 338 576 trips by taxi passengers in 13 798 taxis in Shenzhen from 18 April 2011 to 26 April 2011.

3. *Abidjan mobile phone users*. The dataset contains 607 167 mobile phone users' movements between 381 cell phone antennas in Abidjan, the biggest city of Ivory Coast, during a two-week observation period [43]. Each movement record contains the coordinates (longitude and latitude) of the origin and destination. The dataset is based on the anonymized call detail records (CDRs) of phone calls and SMS exchanges between five million of Orange Company's customers in Ivory Coast. To protect customers' privacy, the customer identifications have been anonymized by Orange Company.

4. *Chicago travel tracker survey*. The Chicago travel tracker survey was conducted by the Chicago Metropolitan Agency for Planning during 2007 and 2008 and provides a detailed travel inventory for each member of 10 552 households in the greater Chicago area. The survey data are available online at http://www.cmap.illinois.gov/travel-tracker-survey. Because some participants provided 1-day travel records but others provided 2 days, to maintain consistency, we only extracted the first-day travel records from the dataset. The extracted data include 87 041 trips, each of which includes coordinates of the trip's origin and destination.

### 4.2. Data pre-processing

The raw travel data of the four cities contain latitude and longitude coordinates of each traveller's origin and destination. The raw data cannot be used directly in mobility models. Instead, we use coarse-grained travel data obtained by partitioning a city into a number of zones, each of which corresponds to a location in the literature [3]. Because of the absence of natural partitions in cities (in contrast to states or counties), we simply partition all cities into equal-area square zones, each of which is of dimension 1 × 1 km. Figure 6 shows the zone partition results and the number of zones in the four cities. We assign an origin (or destination) zone ID to each trip if its origin (or destination) falls into the range of that zone. Then, we can accumulate the total number $T_i$ of trips departing from an arbitrary zone $i$, and the total number $T_{ij}$ of trips from zone $i$ to zone $j$. In general, the number of trips departing from a zone is proportional to the population of the zone [15]. The spatial distributions of population density estimated from travel data in the four cities are shown in figure 6.
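The coarse-graining step can be sketched as follows. This is only an illustration under stated assumptions: the function names `zone_of` and `aggregate_trips`, the equirectangular projection around a reference corner `(lat0, lon0)`, and the Earth radius constant are choices made here, not details taken from the original processing.

```python
import math
from collections import Counter

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, used for the local projection

def zone_of(lat, lon, lat0, lon0, cell_km=1.0):
    """Map a coordinate to the (column, row) index of its square zone.

    Assumption: an equirectangular approximation around the reference
    corner (lat0, lon0), which is adequate over the extent of a city.
    """
    x = EARTH_RADIUS_KM * math.radians(lon - lon0) * math.cos(math.radians(lat0))
    y = EARTH_RADIUS_KM * math.radians(lat - lat0)
    return (math.floor(x / cell_km), math.floor(y / cell_km))

def aggregate_trips(trips, lat0, lon0, cell_km=1.0):
    """Count trips per origin zone (T_i) and per zone pair (T_ij).

    trips -- iterable of (origin_lat, origin_lon, dest_lat, dest_lon)
    """
    t_i = Counter()
    t_ij = Counter()
    for o_lat, o_lon, d_lat, d_lon in trips:
        origin = zone_of(o_lat, o_lon, lat0, lon0, cell_km)
        dest = zone_of(d_lat, d_lon, lat0, lon0, cell_km)
        t_i[origin] += 1
        t_ij[(origin, dest)] += 1
    return t_i, t_ij
```

Under the proportionality assumption stated above, the counts in `t_i` can then serve as a proxy for zone population when no census data are available.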

### 4.3. The radiation model

The radiation model [15] is a parameter-free model to predict travel fluxes among different locations based on population distribution:

$$T_{ij} = T_i\,\frac{m_i m_j}{(m_i + s_{ij})(m_i + m_j + s_{ij})}, \tag{4.1}$$

where $T_{ij}$ is the number of trips from location $i$ to location $j$, $T_i$ is the total number of trips departing from location $i$, $m_i$ is the population at location $i$, $m_j$ is the population at location $j$, and $s_{ij}$ is the total population in the circle of radius $r_{ij}$ centred at location $i$ (excluding the origin $i$ and destination $j$).
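For comparison, the radiation model can be sketched in the same style. Again this is a hedged illustration: the function name `radiation_fluxes`, the Euclidean distances between zone centres and the treatment of the circle as closed are assumptions made here.

```python
import math

def radiation_fluxes(coords, pop, trips_out):
    """Return the matrix T[i][j] of trips predicted by the radiation model.

    coords    -- list of (x, y) zone centres in any consistent length unit
    pop       -- list of zone populations m_i (assumed positive)
    trips_out -- list of trips T_i departing from each zone
    """
    n = len(pop)
    dist = [[math.dist(coords[a], coords[b]) for b in range(n)] for a in range(n)]
    flux = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            r = dist[i][j]
            # s_ij: population within r_ij of the origin i,
            # excluding the origin and the destination themselves.
            s_ij = sum(pop[k] for k in range(n)
                       if k != i and k != j and dist[i][k] <= r)
            flux[i][j] = (trips_out[i] * pop[i] * pop[j]
                          / ((pop[i] + s_ij) * (pop[i] + pop[j] + s_ij)))
    return flux
```

In a finite set of zones with no ties in distance, the terms telescope and each row of this matrix sums to $T_i(1 - m_i/M)$ rather than $T_i$, where $M$ is the total population, so a rescaling of each row may be needed if the totals are required to match exactly.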

### 4.4. Sørensen similarity index

The Sørensen similarity index is a statistical tool for identifying the similarity between two samples. It has been widely used for dealing with ecological community data [31]. Lenormand *et al*. [18] used a modified version of the index to measure whether real fluxes are correctly reproduced (on average) by mobility prediction models, defined as

$$\mathrm{SSI} = \frac{1}{N^{2}}\sum_{i}^{N}\sum_{j}^{N}\frac{2\min\left(T'_{ij},\,T_{ij}\right)}{T'_{ij} + T_{ij}}, \tag{4.2}$$

where $T'_{ij}$ is the number of trips from location $i$ to $j$ predicted by models, $T_{ij}$ is the observed number of trips and $N$ is the number of locations. Obviously, if each $T'_{ij}$ is equal to $T_{ij}$, the index is 1; if all $T'_{ij}$s are far from the real values, the index is close to 0.
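The index can be computed with a short function such as the one below. It is a sketch under one explicit assumption that the definition leaves open: a pair of locations with zero predicted and zero observed trips is counted as perfect agreement (a contribution of 1), so that two identical matrices yield an index of exactly 1. The function name `sorensen_index` is hypothetical.

```python
def sorensen_index(predicted, observed):
    """Average pairwise Sørensen similarity between two N x N trip matrices.

    predicted -- matrix of modelled trips between locations
    observed  -- matrix of observed trips between the same locations
    """
    n = len(observed)
    total = 0.0
    for i in range(n):
        for j in range(n):
            denom = predicted[i][j] + observed[i][j]
            if denom > 0:
                total += 2.0 * min(predicted[i][j], observed[i][j]) / denom
            else:
                # Assumption: both values are zero, treated as agreement.
                total += 1.0
    return total / (n * n)
```

Other conventions for empty pairs, such as skipping them and dividing by the number of non-empty pairs, would give different values on sparse matrices, so the convention should be stated whenever index values are compared.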

## Data accessibility

Data used in this work can be downloaded from http://sss.bnu.edu.cn/%7Ewenxuw/data%5fset.htm.

## Funding statement

This work was supported by NSFC grant nos. 61304177 and 61174150, Doctoral Fund of Ministry of Education (20110003110027) and partly supported by the opening foundation of Institute of Information Economy, Hangzhou Normal University (PD12001003002004).

## Acknowledgements

We acknowledge the organizers of the D4D Challenge for permitting us to use the Abidjan mobile phone dataset. X-Y.Y. thanks Prof. Ke Xu and Xiao Liang for their enthusiastic sharing of Beijing taxi GPS data.

- Received July 26, 2014.
- Accepted August 22, 2014.

- © 2014 The Author(s) Published by the Royal Society. All rights reserved.