## Abstract

We present an exhaustive study of the rank-distribution of city-population and population-dynamics of the 50 Spanish provinces (more than 8000 municipalities) in a time-window of 15 years (1996–2010). We exhibit compelling evidence regarding how well the MaxEnt principle describes the equilibrium distributions. We show that the microscopic dynamics that governs population growth is the deciding factor that originates the observed macroscopic distributions. The connection between microscopic dynamics and macroscopic distributions is unravelled via MaxEnt.

## 1. Introduction

### 1.1. Patterns in social systems

Orderliness, reflected in either scaling properties [1] or power laws [2–6], is encountered in different frameworks involving social groups. A salient example is Zipf's Law [7,8], a power law with exponent −2 for the density distribution function that is observed in describing urban agglomerations [9–11] and firm sizes all over the world [12]. This kind of ‘order’ has received immense attention in the literature. Pertinent regularities have been found in other scenarios as well, ranging from scientific collaboration networks [13] and total number of cites by physics journals [14] to Internet traffic [15,16] or Linux package links [17]. Another special regularity has also been found in the density distribution function of the number of votes in Brazilian elections [18]: a power law whose exponent is −1 (this kind of distribution is the origin of the so-called First-Digit Law, or Benford's Law, as shown in Weisstein [19]). This behaviour is also found in Hernando *et al.* [20] for the city size-distribution of the province of Huelva (Spain), and in the distribution of the number of votes in the 2008 Spanish General Elections. Undoubtedly, some aspects of human behaviour reflect a kind of ‘universality’.

Common to all these disparate systems is the lack of a characteristic size, length or frequency for the observable of interest, which makes it scale-invariant. In our previous work [20], we have introduced an information-theoretic technique based upon the minimization of Fisher's information (MFI) measure [21] that allows for the formulation of a ‘thermodynamics’ for scale-invariant systems. The methodology establishes an analogy between such systems and physical gases that, in turn, shows that the two special power laws mentioned in the preceding paragraph lead to a set of relationships formally identical to those pertaining to the equilibrium states of a scale-invariant non-interacting system, the scale-free ideal gas (SFIG). The difference between the two distributions is thereby attributed to different boundary conditions on the SFIG.

Many social systems, however, are not easily included in the two previous descriptions. Inspired by opinion dynamics models [18,22,23], we described in a previous work [24] a numerical process that reproduces the shapes of the empirical city-population distributions. The model is based on a competitive cluster growth process inside a scale-free ideal network (a scale-free network in which degree distribution is described as an SFIG). The finite size of the network introduces competition, and thus correlations in the cluster sizes. The larger the competitiveness, the larger the deviation from the SFIG distribution. The equivalence with the workings of physical gases was shown to be compelling indeed. However, some aspects of the concomitant problems defied full understanding, because an analytical prescription for the classification of the size-distribution of social groups was missing.

### 1.2. Entropy measure of information

*In order to do look for such analytical understanding, we will appeal here to a variational approach centred on information theory*. On the basis of ideas by Hernando *et al.* [25] we deal with the well-established MaxEnt technique [26–29] by including, within the associated Lagrangian, information regarding equations of motion, i.e. ‘dynamical’ information of a kind that goes beyond the customary one, based upon expectation values.

We have demonstrated that proportional growth and hyper-exponential growth (also known as *q*-growth where *q* is a real value that parametrizes the growth dynamics) can both be described in these terms, accurately predicting the equilibrium size-distribution for such systems [25]. This is tantamount to giving a dynamical interpretation to the ‘information cost’ introduced in Baek *et al.* [30]. Our approach relates the microscopic dynamics of the system elements with the observed macroscopic distribution. We remark that the introduction of dynamical information results in a net decrease of the entropy. Take for example geometric Brownian motion. In the case of a pure diffusive process, the final state covers all space with equiprobability, while in the geometric Brownian process the final state does not *uniformly* cover all space. Of course, a lower value of the entropy ensues then. See in this vein Corominas-Murtra & Solé [8], where Zipf's Law has been shown to be the asymptotic solution for a wide range of entropy losses. The important concept of entropy loss is lucidly described in Harremoes & Topsoe [31,32]. See also Gabaix [11] and Zanette & Manrubia [33] for interesting related work.

### 1.3. Aim of this work

Our main present purpose is to apply what we learned in Hernando *et al.* [25] to empirical social systems, thus demonstrating the applicability of MaxEnt to collective human behaviour, and the potential use of an explicit *social thermodynamics*. The applicability of this theoretical procedure will allow us to reduce the large number of microscopic degrees of freedom to a few macroscopical parameters. In this work we test two different tableaus for the migration dynamics underlying the empirical city population:

Proportional growth with a constraint in the total-population. The city populations evolve as geometrical Brownian walkers but total population remains constant.

*q*-growth with total-population constraint plus proportional drift. This more sophisticated evolution uses*q*-metric Brownian walkers [25] to reproduce migration patterns and adds some drift of proportional nature to simulate the growth of the total population.

We fit each microscopic dynamical model to the empirical data on city-populations, and compare the MaxEnt prediction for macroscopic size-distribution with the actual one. *If compatibility is encountered between the microscopic and macroscopic measure*, *our theoretical description should be correct.*

This work is organized as follows. In §2 we describe the empirical dataset used here. In §3, we revisit the theoretical MaxEnt approach to be followed for tableau (i) and (ii). In §4, we present the results, first for the proportional growth approximation and then for the *q*-growth with drift. We analyse the results for a numerical simulation with random walkers as the control case, and apply our methodology to the empirical data for the Spanish provinces. The ensuing results clearly show the usefulness of the theoretical method. In §5, we discuss some noteworthy features of the results and their possible uses in other contexts. A summary is given in §6 and, finally, some technical aspects are the subject of appendix A.

## 2. Empirical dataset

The raw data are obtained from the Spanish state institute INE [34] and cover the years in the 1996–2010 (with the exception of 1997) period. It encompasses 8116 municipalities (the smallest Spanish administrative unit) distributed within 50 provinces (the building blocks of the autonomous communities). The autonomous cities of Ceuta and Melilla are not included. We use provinces and municipalities as the closest representatives of the ideal of a closed system's fundamental elements. However, some arbitrariness remains in the data, because (i) there are many municipalities that actually possess more than one population-centre, (ii) there are provinces that contain more than a single socioeconomic cluster, and (iii) there are economic regions that extend beyond the border of a province. Those facts introduce systematic errors into the data, but we retain enough accuracy for the purposes of this work.

The data cover a total Spanish population of 39 106 917 inhabitants in 1996, reaching 47 254 510 in 2010. For this last year, the largest municipality (Madrid) covers 3 273 049 people and the smallest one just five persons (Illán de Vacas, Toledo). The total population of each province ranges between 6 458 684 inhabitants (Madrid) and 95 258 (Soria), and the number of municipalities, between 371 (Burgos) and 34 (Las Palmas). Figure 1 shows the spatial distribution of the Spanish municipalities.

The number of municipalities in each province is not large enough to build up an accurate macroscopic size-density distribution. Accordingly, we have systematically worked with rank-distributions (RDs) instead. RDs are usually built from the vector , where *x*_{i} is the population of the *i*th municipality and *n* is the total number of municipalities in that province. We assign rank numbers ranging from 1 to *n*, from the largest (max(**x**) → *r* = 1) to the lowest in population (min(**x**) → *r* = n). However, in order to compare with theoretical, continuous RDs, we have found it to be more accurate to assign ‘middle-point’ values from *r* = 0.5 to *r* = *n*−0.5, instead. We show some examples of RD for different provinces in figures 2 and 4 (the RDs for all the 50 provinces can be found in the electronic supplementary material). For our microscopic dynamical analysis, we generate, for each province and for each time-period, the pair , with the logarithm of the population *u*_{i} correlated with the relative population increment . They are computed as
2.1
where *t*_{1} and *t*_{2} are consecutive years for which data are available. We show in the insets of figures 2 and 4 the corresponding dynamics for each province. To unravel the relationship between these dynamic pictures and city-population RDs is the main purpose of this work.

## 3. Theoretical approach

### 3.1. Basics

Following the methodology described in Hernando *et al.* [25], let us define our relevant quantities. They are

*N*, the total population of a province,*n*, the total number of municipalities into which the population is apportioned,*x*_{i}(*t*), the population of the*i*th municipality at time*t*,, the time-derivative of

*x*_{i}(*t*),*a priori*knowledge of the dynamics at hand, i.e. where*g*(*x*) describes the functional dependence of the dynamics in the population*x*,*x*_{0}, the minimum possible population (at least*x*_{0}= 1), and*p*_{X}(*x*), the number of cities with exactly an*x*–population.

Considering the continuous limit of the distribution *p*_{X}(*x*), the conservation of both *n* and *N* guarantees
3.1

The variables *x*_{i}(*t*) and describe the explicit microscopic state of the system, and they are those involved in the dynamical equation. On the other hand, quantities as *N*, *n*, *x*_{0} or the distribution *p*_{X}(*x*) can be obtained from macroscopic observations. The theoretical approach that follows is our attempt to describe macroscopically the state of the system univocally using the latter macro-observables. To this aim, we introduce in the MaxEnt principle the dynamical information of the equation of motion. We do that in the fashion of Hernando *et al.* [25], i.e. turning on MaxEnt and conjecturing, à la Jaynes, that in dynamical equilibrium the appropriate distribution *p*_{X}(*x*)d*x* is the one that maximizes Shannon's entropy *S* subject to the above constraints. One thereby concocts the variational problem
3.2
with * μ* and

*the corresponding Lagrange multipliers. The functional form of the dynamics*

*λ**g*(

*x*) affects the way Shannon's entropy is measured [25]. The equilibrium city size-distributions

*p*

_{X}(

*x*) are determined univocally by

*g*(

*x*),

*x*

_{0},

*N*and

*n*, as will be seen next for two different guesses for

*g*(

*x*): proportional growth (

*g*(

*x*) ∝

*x*) and

*q*-growth (

*g*(

*x*) ∝

*x*

^{q}).

### 3.2. MaxEnt for proportional growth dynamics

Hernando *et al.* [24] showed that city population expansion can be modelled by means of cluster growth in complex networks. Such a process (diffusion) in networks generally starts (i) using a node as a seed, (ii) its first neighbours being added to the cluster in the first iteration, (iii) the neighbours of those neighbours afterwards and so on. Mathematically, if *Δ**t* is the interval of time for each iteration and *x*(*t*) is the population of the cluster at the time *t*, we can write
3.3
where *k*_{ij}(*t*) is the number of neighbours of the *j*th node accepted to the cluster per unit time at time *t*. Because it is our intention to pass to the continuum, we appeal to the central limit theorem so as to generalize this equation to
3.4
where *k*_{i}(*t*) is a mean value of accepted neighbours per unit time at time *t* and *σ*_{k} its standard deviation. For large enough values of the population the last term—related to *finite size effects*—can be neglected; so we write for the continuous limit
3.5
where the dot represents the time derivative. This equation describes the so-called proportional growth, whose generalization to a random process leads to the so-called geometrical Brownian motion. A system governed by these dynamics exhibits *scale-invariance* since a scale transformation in the fashion *x*'(*t*) = *cx*(*t*), where *c* is an arbitrary constant, leaves invariant the dynamical equation. Equation (3.5) is linearized by introduction of the new variable *u*_{i} = log(*x*_{i}/*x*_{0}), which leads to . Considering that *k* is a Wiener coefficient (following a normal distribution with covariance ), the Shannon measure is expressed in terms of *u* as
3.6
where p(*u*)d*u* = *p*_{X}(*x*)d*x*. With our two conservation rules (equation (3.1)) written in *u*-terms, the MaxEnt variational problem becomes
3.7
where we have used the definition . The solution is the equilibrium density , that one now recasts in terms of the observable *x* getting
3.8

From the conservation rules, we can easily obtain for the pertinent constraints the values
3.9
with the so-called incomplete Gamma function. In this way, the 2*n* microscopic degrees of freedom (each *x*_{i} and pair) are reduced to four parameters (*n*, *N*, *x*_{0} and * Λ*) related between them via this last equation. The shape of the macroscopic distribution is then determined by a single value

*N*/

*nx*

_{0}.

The concomitant cumulative function *P*_{X}(*x*) reads
3.10
and the associated RD—obtained from the inversion of the cumulative one—becomes
3.11
where *r* is the (continuous) rank from 0 to *n*, and denotes the inverse function of * Γ*: . We derive from here what we call the

*Gamma Scaling Law*3.12 obtaining a ‘scaled’ RD 3.13 which no longer depends on

*N*,

*n*or

*x*

_{0}.

### 3.3. Beyond proportional growth generalizing equation (3.5)

The description based on equation (3.5) has limitations that we discuss in §4.1. A different possibility consist in appealing to the more general expression
3.14
where *q* is a dimensionless parameter. Using the definitions of the *q*-logarithm and *q*-exponential of Tsallis' statistics [35–37]
3.15
linearized by means of the new variable . The subindex + takes care of the so-called Tsallis cut off [38]: *e*_{q} vanishes if the bracket becomes negative. We obtain the dynamical relationship , and so the Shannon entropy is measured as
3.16
where *v*_{m} = 1/(*q* − 1) if *q* > 1 or infinite otherwise. The concomitant MaxEnt problem is now of the form
3.17
whose solution is . Changing back to the observable *x*, we find
3.18

The constraints derive from the conservation rules in the usual manner as 3.19

The current cumulative function turns out to be 3.20 and for the associated RD one has 3.21

We now consider a *q*-equilibrium system subject to proportional drift that may account for the ‘natural’ population-growth or fluctuations of proportional nature. One is here including a kind of noise that affects the underlying dynamical equation via
3.22
or, in terms of *u* = log(*x*),
3.23
where *k*_{1} characterizes proportional drift and *k*_{q} the hyper-exponential growth. Considering both *k*_{1} and *k*_{q} as independent Wienner coefficients, the system departs equilibrium via proportional diffusion. As seen in Hernando *et al.* [25], the kernel function for that kind of diffusion is a lognormal distribution. We face the convolution
3.24
where *σ* is the drift-scale parameter. The mean value reads
3.25

In this model, the 2*n* microscopic degrees of freedom are then reduced to six parameters (*n*, *N*, *x*_{0}, * Λ*,

*q*and

*σ*) related between them via this last equation. The shape of the macroscopic distribution is then determined by the value of

*N*/

*nx*

_{0},

*q*and

*σ*.

Note that we have derived our distributions using MaxEnt (with the standard Shannon entropy, but with modified variables). Many other authors use MaxEnt in conjunction with different entropic forms for which one of the four Khinchin's axioms is modified (the axiom of extensivity) [38,39], but keeping the original variables, not changing them as we do.

### 3.4. Numerical simulation with random walkers

We have performed numerical simulations with random walkers to test the theoretical results of our connection between the microscopic dynamics and the macroscopic state via MaxEnt. This is achieved following the algorithm described in Hernando *et al.* [25] for the dynamics. We first find the equilibrium state attained by *n q*-metric walkers following equation (3.14) with a given value of *q*, a initial total population *N*, and a minimum allowed population *x*_{0} (or the equivalent value of *u*_{0} if working with the linearized variable *u*). Without this lower bound, the dynamics may reach a peculiar stable state in which the population tends to zero owing to the eventual disappearance of the settlements. The effect of this kind of constraint on the final size-distribution is also studied in Blank & Solomon [2], where the above-referred peculiarity is avoided by recourse to introducing an exponential growth that takes place if one forces the average of *k* to be greater than zero. In our case, growth is introduced as a proportional drift, via equation (3.23), with a positive-valued mean for *k*. Accordingly, the system evolves in the following fashion:

walkers ‘move’ following the

*q*-dynamics, and populations are corrected as described in Hernando*et al.*[25] to reproduce the total population,we make the walkers to evolve by adding a proportional noise,

the ensuing drift generates population-growth and the value of

*N*obtained afterwards is updated accordingly, andwe keep going on during a specified number

*m*of iterations.

The walkers' final macroscopic size-distribution and their dynamics are analysed in the same way as we did for empirical data in §2. This final distribution is used as a control case. We will show in §4 that our approach is able to macroscopically describe the effects of (i) the lower bound and (ii) the positive mean growth (microscopic dynamics). Also, the preceding discussion is revisited in §5.2.

## 4. Results

### 4.1. Proportional growth analysis

We use as paradigmatic example of proportional growth the province of Alicante. We also use it to illustrate our empirical analysis. The inset in figure 2*a* depicts *u* versus obtained using equation (2.1) for all the municipalities in the last 15 years. The correlation coefficient between the two microscopic variables is found to be 0.16, small enough to consider, at zeroth-order approximation, that the growth is of proportional nature, and thus independently of the size and obeying equation (3.5).

As seen in §3.4, the macroscopic equilibrium RD predicted for proportional growth by MaxEnt follows equation (3.8), where * Λ* is univocally determined via equation (3.9), with the given total population

*N*= 1 926 285 (in 2010), the number of municipalities

*n*= 141 and an estimation of the minimum population

*x*

_{0}.

*N*and

*n*are well determined in all cases. The simplest way of estimating

*x*

_{0}is via extrapolation in the rank-plot of a linear fit to the logarithm of the lowest populations. A more sophisticated method (that we use here) uses a nonlinear fitting of the raw data to the MaxEnt rank-distribution via this single parameter, using equation (3.11) together with the definition of

*equation (3.9). We have obtained log(*

*Λ**x*

_{0}) = 4.83, with a correlation coefficient of 0.9997. The comparison of the raw data with the analytical rank-distribution is displayed in figure 2

*a*. The shadowed area represents the 90% CI owing to the finite value of

*n*(numerically estimated as described in appendix A). The empirical RD falls inside the CI, and the observed microscopic dynamics are compatible with the dynamical assumption of equation (3.5). We consider then that this case constitutes strong evidence for the applicability of the MaxEnt principle for scale-invariant social systems.

We have applied the same approximation of proportional growth to the rest of the Spanish provinces, also including the values of the total populations *N* in the fitting procedure. Using the corresponding values of *n* together with the estimated values of *x*_{0} and *N*, we get the value for * Λ*, and apply then the Gamma Scaling Law described in §3.4. The scaled RDs are displayed in figure 2 (raw data in the inset), and the numerical values of the fitted results can be found in electronic supplementary material, table S1. A quite nice adjustment ensues in general but not in all instances. This ‘failure’ is linked to the strong correlations between and

*u*for some provinces, that reaches significant values (of up to 0.57 for the province of Lugo). Although such correlations may compromise the validity of the approximation of proportional growth, the scaling features exhibited by this plot are remarkable indeed.

It is worth mentioning that the fitted value for the total population *N* is found to be systematically lower than the actual value for it. Such a scenario usually changes when the capital city of the province is not considered in the fitting process, and thus in the estimation of *N*. This indicates that the population of the capital city is systematically larger than what one would expect from the MaxEnt prediction (of the 90% confidence level). This is not surprising because, as mentioned earlier, provinces are not ideal, isolated systems: the actual administrative municipality for capital cities is usually the sum of the historical ones plus some near neighbours, and their economy is expected to be highly correlated with that of other capital cities. Although only 50 cities is a small sample, we have studied the rank-distribution of these capitals and their dynamics to shed some light on this observation. We have found the pleasing result that these capitals form *a scale-free system of their own*. We compare in figure 3 the RD with the MaxEnt prediction using the actual value *N* = 15528025 with the only fitting parameter log(*x*_{0}) = 10.80(5) (with a correlation of 0.99989). When studying the relative increment versus the log-size *u*, we find very low size-dependencies (correlation of 0.15, lower than the one prevailing for Alicante). This observation confirms the appropriateness of a proportional growth dynamics. Again, the capital city of Madrid exhibits a larger-than-expected population (just in the limit of the 90% confidence level), indicating the possible existence of a higher, international system of correlations.

### 4.2. *q*-Growth with proportional drift analysis

The main limitation of the proportional growth approximation is that the model is not able to describe some dynamical patterns such as, for example, the generalized migration from small villages to big cities. To reproduce such behaviour, some dependence of the relative growth with the population via *u* is needed. We start from a Taylor expansion in *u* up to second-order . Considering it as the second-order expansion of an exponential function, we write
4.1
where *a* = *k*_{1} + *k*_{q}, *b* = *k*_{q}(*q*-1) and *c* = *k*_{q}(*q*-1)^{2}/2. Thus, we face a microscopic ‘*q*-dynamics’ with proportional noise (equation (3.23)) whose macroscopic distribution will follow equation (3.24). The actual *q*-value can then be directly determined either from the macroscopic RD or from the microscopic dynamical data. If both estimations match, the theoretical procedure should be correct. For the first estimation, we obtain from the RDs the parameters *x*_{0}, * Λ*,

*q*and

*σ*of equation (3.24) using the fitting method described in appendix A. For the second estimation, we fit the ‘local’ mean value of in equation (3.23) to the expression (see appendix A) 4.2

#### 4.2.1. Results on random walkers simulation

To test both fitting procedures, we have performed a numerical simulation of drifted *q*-metric walkers as described in §3.4. We have arbitrarily used *n* = 200 walkers with an initial total population of *N* = 50 000, minimum allowed population *x*_{0} = 1 and a dynamical value *q* = 1.2. We use in the simulation a time interval of *Δ**t* = 0.03, and generate normally distributed random numbers for *k*_{1} and *k*_{q} with standard deviations *σ*_{1} = 0.6 and *σ*_{q} = 0.1, respectively. We also set a mean value 〈*k*_{1}〉 = 0.05. We stop after *m* = 2650 iterations. At the end of the simulation both the total population *N* and the minimum size *x*_{0} have grown by a factor of exp(3.98). Proportional noise contributes to the final width with . The final * Λ* value predicted by MaxEnt via equation (3.25) using these values is . We show the final RD in figure 4, with the dynamical data obtained using equations (2.1) in the inset.

Submitting the RD data to the fitting procedure obtains the estimated values , *q* = 1.32, log(*x*_{0}) = 3.56 and *σ* = 1.15. We show in the first panel of figure 4 the fit as a red line to the data. Comparing these values with the actual ones, we find a general discrepancy of 10 per cent (in particular, 20% for *σ*). This result gives us the order of the error of our fitting methodology. From the fit to the dynamics, we have obtained *q* = 1.16 ± 0.10 (red line in the inset of first panel of figure 4), which is closer to the actual value of 1.2 with a discrepancy of 3 per cent. The factor between the two independent *q*-fits is 0.89, which gives an idea of the accuracy of the method, approximately 10 per cent.

#### 4.2.2. Results on empirical city-population

As a general trend, we can fit the macroscopic RDs without too many technical complications. In many cases, the capital city has to be excluded, as found in §4.2.1. Also, in some other instances, a few of the largest cities have to be excluded. We have also found ‘outliers’ for very low population centres (details can be found in appendix A). We find for all the Spanish provinces (mean value ± s.d.): ; ; ; and . We show in figure 4 some interesting examples of RD (plots for all provinces are found in the electronic supplementary material, together with the table with all the numerical values). Most of the provinces follow the analytical curve quite well, with very few exceptions. The most dramatic of those exceptions correspond to Salamanca, Orense and Zamora, where a few cities account for the main part of the province's total population and the small villages follow a lognormal distribution. It is expected that owing to their small size, the finite effect of the last term in equation (3.4) becomes important. Thus, the distribution is found to be noisy in these cases.

Proportional drift noise, scaled by *σ*, has shown to be negligible in some cases but quite important in others. In general, better fits are obtained for those provinces for which the distributions' evolution during the last 15 years has been smooth and slow. High rates of change correlate with high *σ*. This rather surprising result tells us that if the system is able to reach dynamic equilibrium, it converges to the MaxEnt prediction. To illustrate this, we plot in figure 5 the evolution in time of *q* for some provinces. We have also found that the correlation between the pair *σ* and and the pair *q* and is rather important (*R* = −0.40 and −0.37, respectively). Indeed, equation (3.25) relates these parameters in an equation-of-state fashion.

On the other hand, the microscopic dynamical fit presents some preliminary difficulties. The main one is high noise for small city sizes (owing to the last term of equation (3.4), which cannot be neglected). Anyhow, because the mean value of this noise is zero, we can safely use equation (4.2) to fit the ‘local’ mean value of via *q*. However, in few cases, we have found technical troubles when the value of 〈*k*_{q}〉 is small in relation to the drift. Indeed, a low value of 〈*k*_{q}〉 generates a nearly linear dependence of with *u*, neglecting the second-order term of equation (4.1). This generates an ambiguous result for *q*, falling down to a generic *q* = 1 value in the fitting process. We have found this in nine of the 50 provinces. Unfortunately, an accurate estimation of *q* for these cases cannot be achieved from the dynamics.

We depict in figure 6 the comparison between both independent sources for *q*, macroscopic versus microscopic, which represent *the main result of this work* (we display in grey the cases where the dynamical fit fails, near the dynamical value *q* = 1). We find a mean proportionality of 0.85 with a correlation of 0.97, and compared with the proportionality found in the control case (the random-walker simulation), we can assume that the discrepancy is the expected one using the fitting procedure.

In such instances, *the microscopic dynamical process determines the macroscopic rank-distribution of the city-population. The equilibrium distribution is that found via maximization of the Shannon entropy in the terms discussed above.*

## 5. Further remarks

### 5.1. Origin of the *q*-dynamics

In §4.2.2, we have successfully parametrized the pertinent dynamics *without discussing possible underlying mechanisms*. We encounter two general trends: (A) migration from small villages to big cities, and (B) saturation for large cities. The cases of Valladolid, Castellón, Navarra, León and Ávila constitute good examples of the first scenario, the most frequent for Spain's provinces. We usually find a value *q* > 1 for them. Several of these examples illustrate the second possibility for cities with a population larger than 10 000, with a relative growth lower than expected. Such a situation is due to the finiteness of the resources needed to make a city grow, which impedes arbitrarily large growth rates. We also find examples, as shown for Guipúzcoa, for which the system follows the B-trend with values *q* < 1. These cases clearly indicate a generalized migration from cities to small towns, owing to local socioeconomic conditions. In other remarkable cases, both situations A and B occur in *the same region*, as in Madrid and Barcelona, where a migration to medium-sized towns occurs, a sign of non-monotonic behaviour for with respect to *u*, which in such instances compromises the dynamical fit (depicted here for Barcelona). On the other hand, the case of Girona deserves some comment. Even if it presents a tiny migration tendency, the system is very near total equilibrium, with an excellent fit for the RD.

The use of power laws to model both migration and saturation is here of a heuristic nature. We have no evidence of any ‘microscopic’ mechanism generating them, and the value of *q* is merely obtained from empirical observation. Identifying the mechanism that generates the dynamics constitutes a formidable challenge. It is worth mentioning that growth with saturation has been traditionally modelled using the so-called logistic function, or its generalization *X*(*t*) with parameters *α*, *K* and *v* that follow a differential equation of the type (http://en.wikipedia.org/wiki/Generalised_logistic_function)
5.1
which formally is identical to equation (3.22) with *v* = *q* − 1, * α* =

*k*

_{1}and

*/*

*α**K*=

*k*

_{q}. Accordingly, migration and growth with saturation relates with hyper-exponential growth. However, as far as we know, a pure theoretical determination of the

*q*-value from the underlying mechanism is still unknown.

### 5.2. The importance of the constraints in the microscopic dynamics

The constraints imposed on both the dynamics and the MaxEnt treatment play a fundamental role in the success of our approach. Indeed, fixing a minimum allowed population *x*_{0}, different from zero, prevents the eventual ‘death’ of the city, and thus maintains constant the total number of cities in the system. Without this condition, the dynamics could lead to a stable state in which the population tends to zero, a problem that could be overcome by introducing a continuous growth via a positive value of the mean of the growth. However, this situation is very hard to maintain for a long period, so that it cannot be considered as a normal situation on which to base a general theory. On the other hand, we need a mean growth different from zero to describe migration patterns. These could lead to an unstable situation of continuous growth. The condition of constant total population for internal migration solves this issue in a natural fashion, and macroscopic equilibrium configurations can then be arrived at.

The above-mentioned constraints, together with a smooth growth for the total population, allow one to simultaneously describe the RDs for both large populations and small ones. Thus, our distribution equation (3.24) describes the tails in a better fashion than simple power laws. Indeed, the tail for small sizes was usually not considered if one wants to check the adequacy of a power law or other functional form. The use of administrative data for calculating the size of a settlement is subject to some limitations: (i) the arbitrary administrative area may be too small in the case of large cities, or (ii) too large in the case of small settlements, including more than one town so that their actual population cannot be properly assessed (as commented in §2). The latter problem is particularly evident when one considers the difference between the number of municipalities in France (36 781) and Spain (8116), although the population of France is only 38 per cent higher than that of Spain. Equation (3.24) overcomes this limitation. *Per contra*, as we have seen in §4.2.2, it fails in some cases (and for a few populations in the distribution tail, due to ‘outliers’, see appendix A) indicating that a case-by-case analysis is needed to describe some areas in order to identify whether the hypothesis of the model match the empirical situation.

### 5.3. Beyond Spanish city populations

We have also found evidence of MaxEnt equilibrium both in other countries and in other kinds of social systems, such as electoral results, in current and previous years. As an example, we plot in figure 7*a* the rank distribution with theoretical fit for the state of Ohio [40] (, *q* = 1.62, log(*x*_{0}) = 10.1 and *σ* = 0.36). We also show in the bottom panel the electoral results for the 2005 UK general election [41] (, *q* = 13.5, log(*x*_{0}) = 4.15 and *σ* = 0.05). Deeper understanding of each territory/election correlated with the kind of study presented here may help to better appreciate the effects of regional policies. Work along such lines is in progress.

We emphasize that the formalism applied here is of a quite general nature and can be applied to other kind of systems as well. If a dynamical equation

can be parametrized, and

a change of variables able to linearize it can be concocted,

one would be able to (i) recover the present scenario and (ii) comfortably use the present procedure.

## 6. Summary

We have shown that the observed macroscopic city-population distributions of the Spanish provinces are, in general, MaxEnt equilibrium distributions, according to the extant microscopic growth dynamics. The MaxEnt variational process is constrained by appropriate constraints (minimum allowed size different from zero and conservation of the total population for internal migration flows). Our procedure in this communication can be summed up as follows.

We first considered a proportional growth approach to the problem, finding that the empirical distributions can be nicely scaled following the Gamma Scaling Law, derived from the equilibrium distribution of a scale-free system.

Afterwards, we dealt with

*q*-growth processes, with some proportional drift. We accounted in this way for both migration and ‘natural’ growth of the population, obtaining better fits to the data than in case (i).

We have checked that the value of the dynamical exponent *q*, estimated from the RDs, is equivalent to the one estimated directly from the dynamics. We read this result as a confirmation of the validity of our approach, establishing thus a connection between microscopic and macroscopic worlds via MaxEnt.

## Acknowledgements

We greatly appreciate the insightful comments of the two reviewers and their help in the improvement of the manuscript. This work was partially supported by ANR DYNHELIUM (ANR-08-BLAN-0146-01) Toulouse, and the Projects FQM-2445 and FQM-207 of the Junta de Andalucía, PIP1177 of CONICET (Argentina), FIS2008-00781/FIS (MICINN) and FEDER (EU; Spain, EU). A.P. acknowledges support from the Senior Grant CEI Bio-Tic GENIL-SPR.

## Appendix A

#### A.1. Estimation of the distribution's 90% CI

The confidence levels of the RDs for a given number *n* are estimated as follows.

A list of

*n*random numbers is generated, following the desired distribution, by inverse transform sampling.The list is sorted from largest to lowest in order to obtain the rank-distribution. The list is saved for further use.

A large number of lists is generated following (i) and (ii), obtaining a distribution of values for each rank.

The 0.95th and 0.05th quantiles are obtained form each rank, determining the lower and upper limits of the 90% CI.

#### A.2. Fit for q-exponential growth

A first method to estimate the MaxEnt *q*-distributions' parameters * Λ*,

*x*

_{0}and

*q*for a given empirical data is a direct fit of the rank-distribution to equation (3.21). We have used the Mathematica software (Wolfram Research Inc. 1988–2011) to this end, by means of the

`Nonlinear Model Fit`function using different guesses for the initial values. We also compare the pertinent results with a solution obtained from the reproduction of the first three moments of the distribution, finding, statistically and numerically, better stability using the logarithmic moments 〈(log(

*x*/

*x*

_{0}))

^{m}〉 instead of 〈(

*x*/

*x*

_{0})

^{m}〉. We have tabulated them as functions of

*q*and

*in the form , which are defined via A1 where*

*Λ**G*stands for the so-called Meijer G-functions. The associated system of equations is A2 which we deterministically solve using

*m*= 1,2,3 via

*,*

*Λ**x*

_{0}and

*q*, with the empirical expected values .

We have also introduced into our equations the proportional drift parametrized with *σ*. Its inclusion in the above equations can be easily materialized using the additive property of cumulants for convolutions (Wikipedia, en.wikipedia.org/wiki/Cumulant). Taking into account that for the centred normal distribution only the second cumulant does not vanish, we face the following system of equations:
A3
where T_{i}^{m} is the *i*th element of the *m*th row of the triangle of Bessel numbers [42] .

The ensuing system of equations is solved using the `FindRoot` functionality of Mathematica with different starting points so as to ensure the best possible result. We accept a set of parameters if the three different results (fit, and equations' system with/without drift) are reasonably similar. When that does not happen, we look for *outliers*, excluding some points from the tails (largest or smallest) until reaching a satisfactory convergency of the three results. The most frequently outlier case obtains for the capital city. The second case refers to a few small villages with undersized population. All outliers are indicated in the electronic supplementary material, table S1, together with the results of the three estimations. On the other hand, as commented in the text, we have found three notorious cases—Salamanca, Orense, and Zamora—where no satisfactory outcome has been achieved with this procedure.

#### A.3. Fitting the dynamical data

As commented in the text, we have fitted the mean value to equation (4.2) via 〈*k*_{1}〉, 〈*k*_{q}〉 and *q*. We have used again the `Nonlinear Model Fit` function, with different guesses for the initial values. The dataset for (and also the standard deviation) is systematically computed in bins of size *Δ**u* = 0.25 for all the provinces. Very few points can be found in some of the bins, which introduces high numerical error. Accordingly, we include the bin in the dataset if, and only if, a minimal number of points exists. We have assumed, for all the provinces, that this number is the 15 per cent of the bin with the larger number of points. The bins used in each fit are shown in the inset of the rank-distribution plots.

- Received September 19, 2012.
- Accepted October 24, 2012.

- © 2012 The Author(s) Published by the Royal Society. All rights reserved.