Migration is a key mechanism for expansion of communities. In spatially heterogeneous environments, rapidly gaining knowledge about the local environment is key to the evolutionary success of a migrating population. For historical human migration, environmental heterogeneity was naturally asymmetric in the north–south (NS) and east–west (EW) directions. We here consider the human migration process in the Americas, modelled as random, asymmetric, modularly correlated environments. Knowledge about the environments determines the fitness of each individual. We present a phase diagram for asymmetry of migration as a function of carrying capacity and fitness threshold. We find that the speed of migration is proportional to the inverse complement of the spatial environmental gradient, and in particular, we find that NS migration rates are lower than EW migration rates when the environmental gradient is higher in the NS direction. Communication of knowledge between individuals can help to spread beneficial knowledge within the population. The speed of migration increases when communication transmits pieces of knowledge that contribute in a modular way to the fitness of individuals. The results for the dependence of migration rate on asymmetry and modularity are consistent with existing archaeological observations. The results for asymmetry of genetic divergence are consistent with patterns of human gene flow.
Interesting phenomena emerge in population dynamics in heterogeneous environments. For example, experimental and theoretical studies have shown that spatial heterogeneity accelerates the emergence of drug resistance [1,2] and solid tumour evolution in heterogeneous microenvironments . On a larger scale, heterogeneity plays a central role in population biology of infectious diseases  and emerges in the development of large physics projects, such as ATLAS and CERN . Finally, in heterogeneous environments, evolved networks are modular when there are local extinctions .
Populations experience heterogeneous environments during migration. Migration can occur in different dimensions: for example, cells undergo one-, two- or three-dimensional migration . In two- or three-dimensional migration, the environmental gradient can additionally be distinct in the different directions. For example, in the case of human migration, the north–south (NS) direction has a greater environmental gradient than does the east–west (EW) direction . The heterogeneity is important in simulating human dispersal in the Americas . In the EW direction, food production spread from southwest Asia to Egypt and Europe at about 0.7 miles per year around 5000 BC, while in the NS direction, it spread northwards in the American continent at about 0.2–0.5 miles per year around 2000 BC . This spread is of the same order as the velocity of human migration, so we estimate that the human migration velocity in the EW direction is about two to three times faster than in the NS direction. Previous work has generated detailed migration paths using geographical data  as well as results that match existing archaeological evidences well after considering spatial and temporal variations . We do not try to generate a detailed map of human migration in this paper. Instead, we use a general model to generate EW NS asymmetry and study the role of a modular knowledge system.
Knowledge of local environments, such as effective agricultural or animal husbandry techniques, was vital to the survival of these early migrants . Evolutionary epistemology views the gaining of knowledge as an adaptive process with blind variation and selective retention . Communication of knowledge between individuals is also an efficient means to spread this discovered, locally adapted knowledge . Similarly, models of social learning theory stress the importance of social learning in the spread of innovations . Here we model the adaptation of a population to the local environment using an evolutionary model with natural selection, mutation and communication. The knowledge of an individual determines his or her fitness. Evolutionary psychology and archaeology posit that the human mind is modular , and that this modularity is shaped by evolution  and facilitates understanding of local environments . Conjugate to this modularity must be dynamical exchange of corpora of knowledge between individuals [16,17].
2. Material and methods
Table 1 shows the symbols in this paper. The observed emigration time and asymmetry of emigration time are critical in the determination of the values of these parameters. We consider migration in random, asymmetric, modularly correlated environments. We use 9 × 25 correlated, random environments, where 25 is the number of environments in the NS direction at the same longitude , and 9 is chosen so that 9/25 is approximately the ratio of the EW to NS dimension of the Americas. See figure 1 for an illustration, where each square block corresponds to an environment.
Each individual a has a fitness fa, as well as a sequence Sa that is composed of L loci, representing the knowledge of the individual. Fitness describes reproductive success and is proportional to the reproduction rate. For simplicity, we take We first consider a linear fitness landscape, later generalizing to an interacting landscape: 2.1where Ji is a quenched, Gaussian random interaction parameter, with variance and the offset 2L is chosen so that fitness is non-negative, as Hmin is For a given instance of the model, the interaction parameters Ji are randomly chosen and then fixed for that instance of the model. When for each i from 1 to L, siJi > 0, the fitness reaches its highest value, and natural selection selects the sequence with the best configuration.
The fitness of the population is influenced by the environment, quantified by interaction parameters J, describing the interaction between knowledge element i of the individuals and the environment (see also equation (2.1)). The interaction parameters J in two adjacent environments, J and J′, are correlated, 2.2where χ = χEW if the two have the same latitude, and χ = χNS if they have the same longitude. The smaller the χ, the bigger the environmental gradient is. Here 0 < χ < 1, and χNS < χEW, since the gradient of environment in the NS direction is more dramatic .
In each environment, we use a Markov process to describe the evolutionary dynamics, including replication with rate f, mutation with rate μ coming from discovering new knowledge through trial and error, and transfer of a corpus of knowledge of length L/K with rate ν. When individuals reproduce, they inherit the knowledge and genes from their parent without error. Both mutations and knowledge transfers are random, and they do not depend on the fitness of individuals. The relative rates of replication, mutation and transfer are f, μL and νK, respectively, so on average each individual makes μL/f ≈ μ/2 mutations, as f ≈ 2L at short times for which these populations evolve, and νK/f ≈ νK/(2L) knowledge transfers per lifetime of an individual. We set the information sequence length L = 100. Discovery of new facts, represented by mutation, changes one site, or 1% of the knowledge of an individual, whereas knowledge transfer changes 1/K of the knowledge. Discovery of new facts should be rare, and in our simulation we set μ = 0.5, so that approximately one-quarter of the individuals attempt to make a discovery through trial and error during his or her lifetime. We consider K = 5 corpora of knowledge. Transfer of one corpus, for example, could be one farmer attempting to communicate to another farmer how to grow a new crop in a new environment. Knowledge transfer must be rare, so we set ν = 6, so that approximately νK/(2L) ≈ 1/7 of the individuals attempt a knowledge transfer process during his or her lifetime. We additionally consider various values of ν in this work to investigate the coupling of ν to modularity. Selection is based on the fitness of the knowledge and it determines the the utility of theses mutation and knowledge transfer events.
This dynamics of migration is described by a Markov process, whose master equation is detailed in appendix A. Initially, one of the environments with the highest latitude is occupied by 1000 individuals with random sequences, as Native Americans are believed to have entered the Americas through Alaska in the north. Since the population migrates from north down to south, we only allow migration to the east, west and south. In each environment, the population evolves according to the Markov dynamics.
The qualitative behaviour of the migration depends on the carrying capacity, N*, and the fitness threshold, f*. The carrying capacity is defined as the maximum population load of an environment . After the population size reaches N*, we randomly kill an individual every time another individual reproduces, as described in detail in equation (A 1). As a result, the total number of individuals does not exceed N*. The initial colonization of the Americas occurred before the Common Era, for which there are no reliable population data.
It is estimated that there were seven million people in the Americas at the start of the Common Era , corresponding to 7 000 000/(25 × 9) = 31 111 individuals in each environment. We choose the carrying capacity to be N* = 10 000, less than 31 111, reflecting that the population size was smaller at the earlier time of initial population expansion. We show the results for various N* in figure 2. We introduce the fitness threshold, f*, because individuals need to be well prepared before emigrating to the next environment. For example, young male ground squirrels appear to disperse after attaining a threshold body mass , and dispersing males tend to have greater fat percentage for their bodies . The increased body mass and fat percentage are thresholds required for migration. Similarly, naked mole-rats migrate more frequently after body mass reaches a certain value . It is possible that some individuals try to emigrate without reaching the fitness threshold when the local population size reaches environmental capacity. However, they are not fit enough to colonize the new environment. Thus, we employ a fitness threshold in our approach, and allow no emigration before the average fitness value reaches f*. When the population size reaches N* and the average fitness reaches f* in an environment, we move N0 = 1000 randomly chosen individuals to one of the unoccupied adjacent environments. Fitter individuals may be more likely to migrate since they are physically better prepared to migrate, while on the other hand, less fit individuals may have more desire to migrate since they do not live well in the current environment. We randomly choose individuals to migrate because of this ambiguous relationship between fitness and migration. If we move fitter individuals instead of randomly chosen individuals, the initial fitness of the individuals in the new environment will be higher. Thus, effectively the χ would be higher. The time required for a population to emigrate from an environment is denoted by the emigration time, t, and the emigration velocity v is defined as v = 1/t. The emigration time of an environment is the time from the arrival of the first individuals to the departure of the first individuals.
To compare our results with current human genetic data, we assign to each individual another sequence S′, also composed of L loci, and each locus can take values ±1. These sites correspond to automosal microsatellite marker genotype data , which we will compare with later in this paper. The traits of the genetic data are neutral in our model. That is, the values of the loci in the sequence S′ have no effect on the fitness. The genetic sequence mutates at a rate μ′. When an individual reproduces, both the knowledge sequence and the genetic sequence reproduce.
We set the time scale in our simulation by the observation that Native Americans took about 10 000 years to migrate from the north tip of the American continent to the south tip , experiencing about 25 climate zones , so migration to a new environment occurred approximately every 400 years, i.e. approximately every 20 generations. In our simulation, we define a generation as the time period during which, on average, each individual is replaced by another individual. We find that the population migrates approximately once per 20 generations when χEW = 0.8, χNS = 0.4 and f* − 2L = 0.3L. One can estimate how many generations it takes to migrate to the next environment. The rate of change of fitness at short time roughly follows : 2.3
Since for migration from the north or for migrating from the east or west, the emigration time is 0.18 or 0.06 depending on the origin of migration, and this is consistent with figure 3. We use Δt = 0.1 as a rough estimate for emigration time. To convert this time in our simulation to number of human replications, we consider that one replication takes around dt = 1/f = 1/2L time, so one emigration takes Δt/dt = 20 generations. To compare the genetic data with current human data, we allow all environments to evolve for another 10 000 years after all environments are occupied, without migration between environments. We assume no gene flows between these environments, as previous work  assumes that the asymmetry in the genetic distance originates from the asymmetry of gene flows in different directions. Here we investigate another possible origin of the asymmetry of genetic distance, that is, the asymmetry already exists when the population colonized the Americas. It is quite possible that both mechanisms help to create this asymmetry, but in order to show that the initial colonizing process itself could generate this asymmetry, we suppress the possibly asymmetric genetic flows.
In figure 1, we show a snapshot of population distribution, approximately half way through the migration. Migration sweeps south and spreads both to the east and west. Migration forms a tilted front, with slope magnitude equal to vNS/vEW = 0.35, indicating the velocities of migration in different directions are different.
In figure 2, we show the three possible phases for different carrying capacity, N*, and fitness threshold, f*. Different phases correspond to whether the migration is limited by the fitness threshold or the population size threshold. In the left phase, the population is limited by the population size threshold, and there is no EW NS asymmetry. In addition, as the population migrates, the maximum fitness value increases since the population is allowed to evolve further after reaching the fitness threshold, as shown in the left inset of figure 2b. In the middle phase, the migration in the EW direction is limited by population size threshold while the migration in the NS direction is limited by the fitness threshold. The maximum fitness value increases as the population migrates in the EW direction, but in the NS direction, the maximum fitness value is f*. The degree of the EW NS asymmetry increases in this phase from the boundary with the left phase to the boundary with the right phase. In the right phase, migrations in both directions are limited by the fitness threshold, and the maximum fitness value remains the same as the population migrates. The EW NS asymmetry is approximately unchanged in this phase. The boundaries of these phases are determined by noting the times to reach the carrying capacity and the fitness threshold: 3.1where N0 is the initial population of one environment. Here we have used that the evolution of the fitness in one generation is small compared with the offset 2L, and that the evolution within one environment at steady state is from χ(f* − 2L) + 2L to f* in the rightmost phase. The left phase boundary in figure 2 is given by the condition in the NS direction, and the right phase boundary is given by in the EW direction. We note that our current choice of parameters is deep in the right phase, indicating that the EW NS asymmetry is robust to the change of f* or the ratio N*/N0.
We determine quantitatively how the environmental gradient influences the velocity of migration. In figure 3, we show the emigration time versus 1 − χ, the change between adjacent environments. It is interesting that the emigration time is approximately proportional to 1 − χ. This occurs because in our simulation for these parameters, the population reaches N* earlier than f*, so the emigration time is the time required to reach f*. For our model, f* − 2L = 0.3L, while max(f − 2L) ≈ 2L, so f* is still far from optimal, and the fitness increases linearly with time in the regime we are discussing. So , where vf is a constant for a fixed modularity. So and we quantify the ratio of velocity in the two different directions as 3.2
In the linear model, it is quite easy to evolve the optimal pieces of knowledge, while in reality, finding the best knowledge is difficult at the individual level. We now show that these results are robust to considering an interacting model, while also demonstrating the significance of the modularity order parameter in the interacting model. As finding optimal knowledge for a local environment is difficult, the fitness landscape is rugged , and we use a spin glass to represent the fitness: 3.3where Jij is a Gaussian random matrix, with variance 1/C. The offset value 2L is chosen by Wigner's semicircle law  so that the minimum eigenvalue of f is non-negative. The entries in the matrix Δ are zero or one, with probability C/L per entry, so that the average number of connections per row is C. The optimization of this fitness model is hard when L is large, and here we give a simple example to show why. Consider a case when Jij > 0, Jik > 0 and Jjk < 0 given i, j and k. To make Jijsisj positive and fitness value larger, si and sj must have the same sign. Similarly, to make Jiksisk and Jjksjsk positive, we need sk to have the same sign as si, and sk to have different sign from sj. This indicates that si and sj have different signs, contradicting that si and sj have the same sign. This phenomenon is called frustration in physics , making the fitness hard to optimize. Let us exemplify this using an example from the human knowledge system. Humans developed three pieces of knowledge: the knowledge of toxicity of mushrooms, the knowledge of red food and the knowledge of apples. The interaction between the first two pieces of knowledge implies that red food is bad and undesirable, while the latter two pieces of knowledge imply something to the contrary. As a result the human knowledge system can be difficult to optimize. We will discuss how modularity helps to reduce this frustration and thus makes it easier to optimize the fitness in §4.
We introduce modularity by an excess of interactions in Δ along the l × l block diagonals of the L × L connection matrix. There are K of these block diagonals, and K = L/l. Thus, the probability of a connection is C0/L when and C1/L when The number of connections is C = C0 + (C1 − C0)/K, and modularity is defined by M = (C1 − C0)/(KC). In figure 4, we illustrate three 20 × 20 matrices with modularities 1, 0.5 and 0 and C = 9.
Modularity, coupled with knowledge transfer, accelerates the evolution of a population in a new environment . We now check how modularity and knowledge transfer influence the velocity of migration. For different M and ν, the results are shown in figure 5. For small M, a larger ν implies a smaller migrating velocity, indicating that the transfer of (non-useful) knowledge slows down evolution. As modularity increases, the migration velocity at larger ν catches up with that of smaller ν. At M = 1, in the range of ν shown, the population migrates faster for larger ν.
We fit the curve of νNS − M for ν ≤ 4 in figure 5 with linear regression, observing R2 ≥ 0.95, except for ν = 0, which has a zero slope and larger noise. We also fit the data for ν = 1 and ν = 3, not shown in figure 5. We show dvNS/dM versus modularity for different ν in the inset to figure 5. For ν ≤ 4, the slope is proportional to ν. So, dνNS/dM = αNSν, and after integration we have 3.4where is determined by the evolutionary load of knowledge transfer. Linearity originates from perturbation of knowledge transfer when ν is small. Note that for ν = 6, the value used in most of this paper, the linear relationship no longer holds, indicating that ν = 6 is large enough to break the linearity.
From our model, we make a prediction by calculating genetic distances between populations in different environments, using the genetic sequence S′. For each pair of environments, we calculate the fixation index FST between them using eqn (5.12) from : 3.5where pi1 is the probability of the value of locus i being +1, and pi2 is the probability of the value of locus i being −1 in the first environment. is the corresponding probability in the other environment. Here n is the sample size drawn from the population to estimate FST, and in our case n = 18, in accordance with the average sample size used in .
The EW distance between environments (x1, y1) and (x2, y2) is dEW = |x1 − x2|, and the NS distance is dNS = |y1 − y2|. We also calculated heterozygosities of the population of environment a, defined as 3.6where pi1 and pi2 have the same meanings as those in equation (3.5). Each fixation index FST was regressed onto the sum of mean heterozygosity and geographical distance, which can be either EW distance or NS distance. The R2 of the regression is around 0.9. For each pair of environments a and b, we express the FST as 3.7and 3.8
The coefficient of geographical distance term using EW distance is cEW, and cNS using NS distance. The ratio of them, r = cNS/cEW, indicates the asymmetry of rate of change of genetic distance. For humans in the Americas, the ratio is approximately 1.26 . The mutational rates of genetic sequences at which the FST ratio is 1.26 depend on modularity, as shown in figure 6. The estimated mutation rate of human automosal microsatellites ranges from 10−4 to 10−2 . In our model, we can calculate the mutational rate per generation μg = μ′ × L/2L. So for M = 1 the mutational rate is 0.005 per locus per generation, and for M = 0 the mutational rate is 0.025 per locus per generation. Thus, the mutational rate for the M = 1 case falls within the range of experimental results, indicating that human knowledge system is probably modular.
So why is having a modular knowledge system so helpful in the human migration process? A migrating human population must adapt knowledge quickly. New knowledge is generated through trial and error (mutation, μ). Communication (knowledge corpus transfer, ν) propagates useful new knowledge in the population. If the knowledge system is non-modular, however, communication causes confusion. This is because transfer of simply a L/K segment does not transfer useful information in a non-modular knowledge system. For example, a hunter can teach a wood gatherer how to hunt, including how to make stone arrowheads. If the knowledge system of the wood gatherer is non-modular, the hunting module can interact with the wood gathering module, and the wood gatherer may wrongly believe that arrow-shaped tools could also work for cutting trees, and replace his or her axe with arrows. For a modular knowledge system, this frustrating confusion will not happen, and modularity reduces frustration. So if the knowledge system is modular, the population can take advantage of faster knowledge communication, while if the system is non-modular, knowledge communication can cause confusion and is deleterious between individuals with different specializations.
For a population with a modular knowledge system, a smaller mutational rate of genes creates the same FST ratio, so the evolutionary rate is higher than the non-modular counterpart when the mutational rates are the same. The population with a modular knowledge system evolves faster, and from Fisher's fundamental theorem of natural selection  we expect that the genetic diversity is higher in the more rapidly evolving population.
Why does environmental heterogeneity create an asymmetry of genetic distance in different directions, even if environmental change does not directly influence genes in our model? For a population migrating in the NS direction, the new environment poses severe challenges to the immigrants, and fewer founders may survive compared with a EW migration. This founder effect increases the genetic distance between the immigrant population and the population they originate from . For the population migrating in EW direction, much milder environmental changes largely reduce the founder effect, thus reducing the genetic distance from the original population.
In addition to spatial heterogeneity, our stochastic model naturally creates temporal inhomogeneity. Even though the average fitness of a population changes smoothly, fitness spikes appear occasionally, corresponding to knowledgeable people or ‘heroes’ in human history. Immediately after the initial colonization of one environment, the highest individual fitness value is more than five times the average fitness value of the population in our model. After evolution of the population for approximately 400 generations, the fitness is ‘saturated’, and the highest fitness is only 50% better than the average fitness. This is consistent with our impression that more heroes emerge in a fast-changing society than a stagnant one.
In conclusion, we built a model of population migration in an asymmetric, two-dimensional system. We have shown the vital role that modularity plays in the migration rates and gene flows. We have shown that a modular knowledge system coupled with knowledge transfer accelerates human migration. Our results demonstrate an EW and NS migration rate difference, and we have related environmental variation with longitude and latitude to migration rate. We have shown that the asymmetry of migration velocity originates from asymmetric environmental gradients. The asymmetry of migration velocity exists only if migration is limited by fitness. Predictions for asymmetry of genetic variation are in agreement with patterns of human gene flow in the Americas. Our model may be applied to other systems such as the spread of invasive species, cancer cells migration and bacterial migration.
D.W. wrote the codes and collected and analysed the data. Both D.W. and M.W.D. developed and analysed the model and drafted the manuscript. All authors gave final approval for publication.
We declare we have no competing interests.
The dynamics of evolution in one environment is described by a master equation: A 1where na is the number of individuals with sequence Sa, with the vector index a used to label the 2L sequences. The notation means the L sequences created by a single mutation from sequence Sa. The notation a/bk means the sequence created by transferring module k from sequence Sb into sequence Sa. Here N* is the environmental capacity of the environment.
- Received September 23, 2016.
- Accepted November 11, 2016.
- © 2016 The Author(s)
Published by the Royal Society. All rights reserved.