The evolution of hyperactivity, impulsivity and cognitive diversity

Jonathan Williams, Eric Taylor


The evolutionary status of attention deficit/hyperactivity disorder (ADHD) is central to assessments of whether modern society has created it, either physically or socially; and is potentially useful in understanding its neurobiological basis and treatment. The high prevalence of ADHD (5–10%) and its association with the seven-repeat allele of DRD4, which is positively selected in evolution, raise the possibility that ADHD increases the reproductive fitness of the individual, and/or the group. However, previous suggestions of evolutionary roles for ADHD have not accounted for its confinement to a substantial minority. Because one of the key features of ADHD is its diversity, and many benefits of population diversity are well recognized (as in immunity), we study the impact of groups' behavioural diversity on their fitness. Diversity occurs along many dimensions, and for simplicity we choose unpredictability (or variability), excess of which is a well-established characteristic of ADHD.

Simulations of the Changing Food group task show that unpredictable behaviour by a minority optimizes results for the group. Characteristics of such group exploration tasks are risk-taking, in which costs are borne mainly by the individual; and information-sharing, in which benefits accrue to the entire group. Hence, this work is closely linked to previous studies of evolved altruism.

We conclude that even individually impairing combinations of genes, such as ADHD, can carry specific benefits for society, which can be selected for at that level, rather than being merely genetic coincidences with effects confined to the individual. The social benefits conferred by diversity occur both inside and outside the ‘normal’ range, and these may be distinct. This view has the additional merit of offering explanations for the prevalence, sex and age distribution, severity distribution and heterogeneity of ADHD.


1. Introduction

ADHD is defined by a constellation of clinical features, including hyperactivity, inattention and impulsivity (American Psychiatric Association 2000). Despite an apparently clear definition, ADHD investigation remains in a preparadigm phase (Richters 1997). Many attempts have been made to explain ADHD, and have led to a vast array of experimental results—molecular, neurophysiological, structural, interactional and societal. Of these classes, perhaps the most successful in accounting for disparate findings has been the neuropsychological—yet even here there is a considerable array of theories and positive results (Sergeant 2000; Sonuga-Barke 2003a; Sagvolden et al. 2005). Though some believe that ADHD will turn out to have only one or two causes, it is becoming increasingly likely that ADHD (table 1) is far more heterogeneous than this.

Table 1


Relevant to all these classes of explanation—and a prerequisite for understanding them thoroughly—is the evolutionary status of ADHD. This is the obverse of the crucial issue of whether modern society caused it through toxins (Holene et al. 1998; Berger et al. 2001; Schettler 2001) or poor social environment (Joseph 2000; Timimi & Taylor 2004)—or whether apparent increases in its prevalence in recent decades could be due to improved recognition (see Prendergast et al. 1988) or indeed mis-attribution (Normand 1998; Timimi & Taylor 2004). It is important to note that even though evolutionary accounts of ADHD necessarily do address the question of whether ADHD is solely a modern problem, they do not generally favour one of the neuropsychological hypotheses over others. Neuropsychological hypotheses attempt to understand the brain mechanisms that create behavioural traits (Bolhuis & Macphail 2001), whereas this paper addresses the past and current effects of one such trait on fitness.

There is some environmental contribution, but several twin, family and adoption studies have established that ADHD has high heritability (Levy et al. 1997; Swanson et al. 2000). Statistically, ADHD represents the high end of the normal distribution of hyperactivity, impulsivity and inattention in the population, consistent with the main cause being the additive effects of several genes (Gjone et al. 1996). Several genes and alleles have been reliably identified as associated with ADHD, having odds ratios, pooled over three or more studies, in the range 1.16–1.44 (all significant) for variants of DRD4, DRD5, Dopamine transporter (DAT), Dopamine-beta-hydroxylase, SNAP-25, 5HT transporter (5HTT) and 5HT receptor-1B (Faraone et al. 2005). However, genome-wide linkage scans have revealed other unidentified associations (reviewed in Sklar 2005), and the proportion of ADHD attributable to these, possibly non-neurotransmitter, genes is not known. The basis of diagnosis is essentially symptom scores combined with subjective judgement of impairment (American Psychiatric Association 2000); these two measures show considerable agreement (Leung et al. 1996).

The need to base diagnoses on subjective information obtained from parents and teachers creates considerable difficulties when comparing diagnostic rates from widely varying cultures. It is, therefore, unsurprising to find considerable variation in estimates of ADHD prevalence—from 4 to 19% using various diagnostic methods in different countries (Bird 1996; Leung et al. 1996). However, when assessment methods are carefully standardized, prevalence is about 5–10% around the world (Leung et al. 1996; Rohde et al. 1999).

In this paper, we focus on the hyperactive/impulsive aspect (or dimension) of ADHD, referred to below as ADHD–HI. The inattentive aspect of ADHD is associated with lower IQ and more developmental difficulties than are seen in controls (e.g. Willcutt et al. 1999; Hinshaw et al. 2002; Pitcher et al. 2003). Clinically referred girls rarely have just hyperactivity/impulsivity problems (Rucklidge & Tannock 2001), and this tends to be true in epidemiological groups as well (Conners et al. 2003; Kooij et al. 2005), so girls are largely excluded. For simplicity we also exclude contributory environmental and rare genetic factors, though these are central in some cases (Gjone et al. 1996; Jensen et al. 1997).

In comparison with diagnoses, individual gene alleles offer more objective information on evolution. The seven-repeat subclass (DRD4-7R) of the dopamine receptor DRD4 appears to be associated, though weakly, with novelty-seeking (see Burt et al. 2002), and somewhat more strongly with ADHD (Kluger et al. 2002; Grady et al. 2003; Rogers et al. 2004), Tourette's (Diaz-Anzaldua et al. 2004), and possibly depression (Lopez et al. 2005). The prevalence of DRD4-7R varies from 0 to 60% in widely separated geographical areas (Chang et al. 1996). For example, China probably has an ADHD prevalence somewhat lower than the West (Leung et al. 1996) and essentially absent DRD4-7R (Chang et al. 1996). DRD4-7R shows considerable molecular evidence of having undergone recent positive selection, since the appearance of anatomically modern humans (Ding et al. 2002; Swanson et al. 2002; Wang et al. 2004). This creates a difficulty, because ADHD is widely viewed as maladaptive (e.g. Leung et al. 1996; Barkley 2001a). Its association with DRD4-7R raises the possibility that ADHD is also selected for in evolution. We therefore need an explanation of why ADHD–HI is seen in only a minority of humans.

2. Factors influencing the persistence of ADHD

In this section, we create a general framework (figure 1) that describes the evolutionary pressures relevant to ADHD. This includes several unitary hypotheses that have previously been put forward to explain the existence of ADHD.

Figure 1

Factors potentially governing prevalence of hyperactive-impulsive traits in the population. Parentheses indicate factors of debatable importance (see text).

Several factors are likely to have a role in determining the prevalence of ADHD-related genes (figure 1). We mention major risks to the individual only briefly, as they are extensively treated in the literature. People with ADHD have a very high risk of other psychiatric disorders (Jensen et al. 2001), and of educational and career difficulties (Mannuzza & Klein 2000), as well as approximately 20% greater use of emergency and outpatient medical services (DeBar et al. 2004). Modern society tends to see uncontrolled people as deviant (Harpending & Cochran 2002). Certain cultures may have done this even more in the past (Lakoff 2000).

A more enigmatic aspect of ADHD is behavioural variability. This is one of ADHD's most characteristic features (Leth-Steensen et al. 2000; Sagvolden et al. 2005; but see Saldana & Neuringer 1998). It is generally not known how much of the variability is truly random (absolutely unpredictable; Glimcher 2005), and how much is merely behaviour which we have not yet learned to predict (relatively unpredictable). True randomness is one of several causes of apparent impulsivity (Williams & Dayan 2005) or cognitive idiosyncrasy. Alternatively, a decrease in effective memory retentiveness would have similar effects—at least, in some paradigms. Another aspect of variability is the willingness to take risks, which may be useful in certain circumstances (Hartmann 1993; but see Matejcek 2003), and will be discussed later.

Novelty-seeking is another aspect of variability. The linkage of novelty-seeking to DRD4-7R and detailed data on the geographical distribution of DRD4-7R have led to two important hypotheses. First, DRD4-7R may impose a degree of novelty-seeking that is a benefit in certain styles of society, particularly female-dominated farming (for male fighting and mating-related competition), and a hindrance when men must work, as in both hunter-gatherer and urbanized society—rather than becoming a problem in increasingly advanced societies per se (Harpending & Cochran 2002). This hypothesis explains several aspects of the geographical data well, but fails to explain why the alleles would be dramatically more important in the five South American groups than in all 33 ethnic groups tested in other continents (Chang et al. 1996). Alternatively, novelty-seeking could propel migration (Swanson et al. 2002) or, more likely, have adaptive value in migrated/migrating societies (Chen et al. 1999). Migratory societies introduce extra difficulty to the assessment of adaptation, because they usually involve a relatively small number of ancestors, reducing the genetic heterogeneity (Harpending & Rogers 2000). However, mechanisms of exploration that allowed ‘pre-emptive exploitation of the environment’ would be particularly beneficial in groups that underwent repeated evolutionary (or migrational) divergence (Kirschner & Gerhart 1998). This may be the reason for DRD4-7R's exceptionally high prevalence in South America (Chang et al. 1996), where unexpected or empty ecological niches would reward the more innovative new arrivals. However, associations with specific alleles may be too fine-grained to reveal the real evolutionary trend, which is governed by phenotypes. There will be no selection pressure between multiple mutations that cause the same phenotype (see Chen et al. 1999).

There are several other individual characteristics associated with ADHD that are adaptive without being clearly related to behavioural variability. A greater ability to elicit maternal attention has been suggested (Shelley-Tremblay & Rosen 1996). Indeed, being ‘difficult’ can improve survival of infants during drought (deVries 1984). However, between such crises maternal attention to children with ADHD is more negative than that toward controls (Befera & Barkley 1985), and this can have negative effects on cognitive and emotional development, acting directly as well as through reduced maternal warmth (Blair 2002; Tully et al. 2004).

After the children have grown up, do they reproduce more? The answer to this is not clear. There is indirect evidence that people with ADHD are more likely to have unprotected sex (Tims et al. 2002), and that women prefer sensation-seeking men (Zaromatidis et al. 2004). However, ADHD is comorbid (in our present society) with many psychiatric disorders (Jensen et al. 2001), most of which decrease reproduction (Puente 1977).

Other suggestions in the general category of individual adaptations are even less persuasive. For example, greater creativity has been suggested (Shelley-Tremblay & Rosen 1996), but formal measures of this are no higher in children with ADHD than in controls (Funk et al. 1993). The ability to vary one's behaviour unpredictably is useful in fighting (Barraclough et al. 2004), but children with ADHD are generally unable to confine their variability to situations in which it is useful. Increased exploration of territory could improve foraging, detection of dangers and (at least in principle) learning (Jensen et al. 1997); set against this, hyperkinesis is relatively unusual and severely impairing, particularly when pervasive. The usefulness of aggression has been pointed out (Shelley-Tremblay & Rosen 1996), but it is more likely to be associated with oppositionality co-occurring with ADHD, rather than with ADHD itself (Barkley et al. 1992). Vigilance, response-readiness, enthusiasm and flexibility have been suggested (Hartmann 1993; Jensen et al. 1997), but these are not actually characteristic of ADHD (Goldstein & Barkley 1998). Finally, none of the above benefits is likely to be found in the inattentive subtype of ADHD, which seems unlikely to be adaptive for either the individual or society (Matejcek 2003; but see Jensen et al. 1997).

Disorders can also appear without adaptation (cf. figure 1). De novo mutation is bound to happen occasionally, but cannot be a major factor, given the high familiality of ADHD and the degree of consistency of associated alleles (reviewed in Faraone et al. 2005). Additionally, DRD4-7R, correlated with ADHD, shows several ‘fingerprints’ of positive selection (Ding et al. 2002; Swanson et al. 2002; Wang et al. 2004). Mere genetic drift seems unlikely, given the worldwide existence of ADHD and its behavioural significance. Preferential mating between people with ADHD could in theory contribute to the persistence of genetic differences, once they had arisen, but has not been found (Minde et al. 2003). In summary, high-effect mutations, drift and preferential mating are likely to have only minor effects. However, heritability of composite phenotypes (such as ADHD) will be much higher than heritability of individual factors, so the possibility that individual factors are unselected or rapidly mutating cannot be excluded.

Probably the most prevalent current view of the evolutionary status of ADHD is that it is a side-effect of alleles which usually help, but which in particular unfortunate combinations, or large numbers, cause individual impairment (Gangestad & Yeo 1997; Goldstein & Barkley 1998; Ding et al. 2002; Swanson et al. 2002). According to this view, ADHD is not the result of adaptive pressure for itself, but of adaptation for something else, i.e. a ‘maladaptive spandrel’ (Gould & Lewontin 1979; Andrews et al. 2002). The characteristic selected for is often presumed to be either a particular collection of individual traits (Sih et al. 2004), or variety per se, which is independently selected for (Lloyd & Gould 1993).

3. Informational aspects of DRD4 and ADHD–HI

Three of the hypotheses listed above are related to the gathering of information by individuals. They are (i) men's social success in some cultures; (ii) the special utility after migration; and (iii) the value of risk-taking. These may all be describing aspects of the same core value. Specifically, we reframe DRD4-7R's effect, and more generally ADHD–HI's, as not merely novelty-seeking and risk-taking for its own sake, but for exploratory knowledge acquisition.

Exploration in this context means bringing information to light, as opposed to exploiting, or utilizing the information. In this sense, we explore not just for food and shelter, but also for nebulous assets such as knowledge of how to play hopscotch or how to resolve an argument. Knowledge about fires includes not only the definition and appearance, but also what looks like fire but is not, what happens if you shout ‘Fire’, and how much it hurts to touch the gas cooker. This kind of auxiliary information is often learned through exploration (or observation of exploration) rather than through teaching.

Exploration does not always need to be well organized. Random search strategies are important, and in the absence of other knowledge can be a reasonable way to search for things (Gordon 1995). For example, in the 1800s, in the absence of sophisticated geological techniques, random drilling would have been a fairly efficient way to find very large oil fields (Reynolds 2001).

4. Advantages of confining unpredictability to a subgroup

When sufficient information can be obtained by a minority and passed on to the group, there is no longer any need for all individuals to explore. Even nowadays, small oil companies' low overheads allow them to try relatively unlikely sites; when they prove successful, and more information becomes available, larger oil companies can exploit the find. The larger companies use the information revealed by the less constrained small companies. Similarly, queen bees that stay in the hive are not only reproducing and being supplied with food, but are also being protected from exploratory risk.

Exploration by an individual can be a major benefit to his social group. Possible benefits of having people with ADHD–HI include the exploration not just of physical space, but of options or ‘semantic space’, including testing social limits, overcoming superstitions and performing dangerous experiments. Cumulative cultural systems are more likely to develop when reliable individuals express views about what unreliable individuals are doing (Castro & Toro 2004).

At a practical level, how could the children in a village benefit from having one or two children with ADHD–HI in their midst? The children with ADHD–HI do things they should not. They burn their hands on the stove. They eat poisonous berries and fall out of trees. They do not focus on their classwork, and they break the rules in games. All these physical and social mistakes provide useful lessons for the majority, while the majority remain safe. Much less often, but possibly also importantly, children with ADHD–HI discover something advantageous, that the other children would not. The reduction in ADHD–HI symptoms with increasing age (El Sayed et al. 2003) may have evolved because of the reducing likelihood that such disorganized experimentation would produce novel information, combined with the increasing cost of losing the individual.

5. Computational demonstrations of the mechanism

5.1 Simulation 1: the changing food task

We use a very simple paradigm to show the benefits of a mixed society (figures 2 and 3). This roughly follows the optimal diet model of foraging by hunter-gatherers (Hawkes et al. 1982). Precise timing of activity can be used to maximize an individual's rewards in optimal foraging theory (Kacelnik & Brunner 2002), but the current model will consider the effect of variability on group foraging. Learning to avoid poisons and seeking good food involve both shared and distinct faculties (Galef & Giraldeau 2001), but we simplify these into a unitary mechanism.

Figure 2

Knowledge acquisition in a changing environment (simulation). Four groups of individuals aim to maximize their knowledgeability about locally available food resources, while minimizing the risk of testing unknown foods. (a)–(c) show four independent groups of 40 individuals: a highly predictable group which does very little experimentation; a highly unpredictable group; and two mixed groups. The mixed groups contain mainly predictable individuals, with either 2 or 10 unpredictable individuals. (a) illustrates the paradigm. At any time, four foods (shown as four letters) are available to be freely chosen. This shows the accuracy of the group's knowledge of available foods, which changes as time progresses during a single run (unspecified time units, t.u.). Error increases each 10 t.u. when one food is replaced by another of unknown quality. Error then reduces as each group learns about the new food. (b) illustrates the rate of acquisition of knowledge of food by the four groups. The highly predictable group acquires information slowly, because individuals always choose their favourite three foods, so rarely explore new ones. (c) shows the gradual loss of individuals by each group, over nine food changes with no offspring (mean of 40 runs). In this particular task, groups in which 5% of the individuals are unpredictable survive the best. For discussion see text; for details see appendix A.

Figure 3

Effect of group and environment on survival (simulation). This shows the same overall task as figure 2, but with a much wider variety of groups. In (a) and (b), each datapoint indicates the number of survivors after nine food changes, one every 10 t.u., from an independent group initially of 10 members. (a) shows the effect of the predominant population brittleness on survival. The solid line shows that the survival of a completely uniform group declines when its brittleness b exceeds 5. The dashed line is produced in exactly the same way, except for the substitution of two unpredictable individuals for two of the others. (b) shows the effect of the rate of environmental change on group survival. When food changes occur more frequently than once every 150 t.u., the presence of unpredictable individuals improves group survival. For discussion see text; for details see appendix A.

5.1.1 Subjects

Four groups of individuals are simulated. Group size is 10–40 individuals, based on estimates of early hominid group size (see Boehm 1997). In the homogeneous groups, all or none of the individuals are unpredictable; in the mixed groups, 5 or 25% of individuals are unpredictable. We use unpredictability (or level of randomness) as a simple form of behavioural variability.
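The group compositions described above can be written down directly. The following is a minimal sketch, assuming the 40-member groups of figure 2; the function name make_group and the role labels are illustrative and do not come from the original simulation code.

```python
def make_group(size, n_unpredictable):
    """Return a list of role labels for one simulated group (illustrative only)."""
    if not 0 <= n_unpredictable <= size:
        raise ValueError("n_unpredictable must lie between 0 and size")
    return (["unpredictable"] * n_unpredictable
            + ["predictable"] * (size - n_unpredictable))


# The four groups of 40 individuals described for figure 2:
# two homogeneous groups and two mixed groups (2 or 10 unpredictable members).
groups = {
    "predictable": make_group(40, 0),
    "mixed_5pct": make_group(40, 2),
    "mixed_25pct": make_group(40, 10),
    "unpredictable": make_group(40, 40),
}

# Fraction of unpredictable members in each group.
fractions = {name: members.count("unpredictable") / len(members)
             for name, members in groups.items()}
```

With these numbers, 2 of 40 and 10 of 40 members correspond to the 5% and 25% mixed groups mentioned in the text.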

Using true randomization of behaviour in the simulations excludes the possibility that some hidden aspect of the behaviour is helping the group; this simplifies interpretation of results. However, we expect that if true randomness of behaviour can be shown to have a value for society, then a wide range of other idiosyncrasies would too. Indeed, most efforts to find any particular deficit to be increased within ADHD have succeeded (e.g. Luman et al. 2005). The rare exceptions are cases where the unfound deficits can be interpreted as definitionally excluded by higher-priority diagnoses or deficits (e.g. Kaplan et al. 1998; Scheres et al. 2004).

5.1.2 Procedure

We simulate changing food availability, in order to model the effect of environmental change on learning and survival. Such environmental change maintains genetic variance (Gangestad & Yeo 1997). However, we focus on a single generation to clarify the effect of subgroup interactions.
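As a rough illustration of the changing environment, the sketch below replaces one of four available foods every 10 t.u., for nine changes, as in the figure 2 caption. Which food is replaced (chosen at random here), the letter labels, and the names replace_one_food and run_schedule are assumptions made for this example; appendix A should be consulted for the procedure actually used.

```python
import random


def replace_one_food(available, new_label, rng):
    """Swap one randomly chosen available food for a new, untested one.

    Random choice of the slot is an assumption of this sketch.
    """
    slot = rng.randrange(len(available))
    removed = available[slot]
    available[slot] = new_label
    return removed


def run_schedule(n_changes=9, interval=10, seed=0):
    """Return a list of (time, available_foods) after each food change."""
    rng = random.Random(seed)
    available = ["A", "B", "C", "D"]   # four foods available at any time
    next_code = ord("E")               # labels for foods introduced later
    history = []
    for change in range(1, n_changes + 1):
        t = change * interval          # one change every `interval` t.u.
        replace_one_food(available, chr(next_code), rng)
        next_code += 1
        history.append((t, list(available)))
    return history


history = run_schedule()
```

Each entry of history holds the time of a change and the four foods available immediately afterwards, one of which is of unknown quality until some member of the group tests it.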

5.1.3 Individual roles

There are many known examples of individual behavioural differences within species, which may in some cases be sufficiently discrete, fixed and complementary to amount to ‘specializations’ (Bolnick et al. 2003; Sih et al. 2004). The clearest example of complementary exploratory specializations within a strain is a strain of bees whose members differ in whether they forage for water, nectar or pollen; this difference in adult roles is associated with perceptual differences early in life (Pankiw & Page 2000). Within a strain, individual rats differ in novelty-seeking, open-field exploration (Antoniou et al. 2004), learning rates and eating rates (Dewar 2004). It is not known how these relate to discrete individual exploratory roles for wild rats, but such individual differences necessarily confine certain risks to subpopulations. Macaques probably have individual differences in foraging methods (Drapier et al. 2002). Mantled howling monkeys track the changing seasonal availability of foods, with a single adult ‘sampling’ new trees before other monkeys join in (Glander 1981, p. 247). Even though people with and without ADHD–HI are on a continuum, for simplicity we simulate two discrete groups.

5.1.4 Sharing of information between individuals

Foragers are exposed to two risks: of dying from malnutrition (if they do not know enough nutritious foods) or from poisoning (if they try a poison). Rats, for example, minimize both risks by using social learning as well as individual observations (Dewar 2004). Macaques appear to alternate between searching by themselves and monitoring the discoveries of their neighbours (Drapier et al. 2002). In the current simulation, individual observations are immediately distributed through the group.
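Immediate distribution of observations through the group can be pictured as a single knowledge store that every member reads and writes. The sketch below is only one way to express this; the class name Group, its methods, and the numeric quality scale are hypothetical and are not taken from the original simulation.

```python
class Group:
    """A group whose members share one pool of knowledge about foods."""

    def __init__(self, members):
        self.members = list(members)
        self.knowledge = {}   # food -> observed quality, visible to everyone
        self.log = []         # (tester, food) pairs: who bore the risk

    def record_test(self, tester, food, quality):
        """One member tests a food; the result is shared at once."""
        self.knowledge[food] = quality
        self.log.append((tester, food))

    def best_known_food(self, available):
        """Best available food according to shared knowledge, or None."""
        known = [food for food in available if food in self.knowledge]
        if not known:
            return None
        return max(known, key=self.knowledge.get)


group = Group(["p1", "p2", "u1"])
group.record_test("u1", "berry", -1.0)   # a poison, discovered by u1
group.record_test("u1", "root", 0.8)     # a nutritious food
choice = group.best_known_food(["berry", "root", "leaf"])
```

In this picture the cost of a test falls on the tester recorded in log, while the benefit, an updated knowledge entry, is available to all members.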

5.1.5 Results

In figure 2, the highly unpredictable group acquires the most accurate knowledge of food quality (figure 2b), but does not use this information reliably, so its population falls the fastest, through poisoning (figure 2c). The highly predictable group has the least knowledge of food quality (figure 2b), but always eats the best food it knows, so tends to lose members more to malnutrition than to poisoning. The two mixed groups have intermediate knowledge of food quality (figure 2b), and most of their members eat the best foods they know. Of these two groups, the one with 5% unpredictable individuals survives best of all groups; this group has its level of exploration matched to the risks and benefits of the environment. The group with 25% unpredictable individuals has about the same survival rate as the predictable group.

Figure 3a shows that there are many ways for a population to perform well in this task. When most individuals in a group are sufficiently unpredictable (i.e. brittleness b<6; for details see appendix A), no benefit is obtained by adding a small number of highly unpredictable (b=0.3) individuals to the group: the group collectively performs sufficient random testing without them. However, when most members of the group are quite predictable (b>5), the presence of a few highly unpredictable individuals bestows a marked benefit in this task. This is for two reasons. First, the homogeneous groups, containing only predictable individuals, do not explore new foods adequately, so suffer from malnutrition. Second, exploration by identical individuals is inherently inefficient, as they multiply the risk of poisoning, while obtaining redundant information. For example, when there are two foods which appear very risky and somewhat risky, predictable individuals will eat the somewhat risky food many times before they try the very risky food once (clearly, this is appropriate in some tasks but not in all). Reduction in this inefficiency accounts for the gradual increase in survival of the mixed group as predictability increases. Additionally, the presence of the unpredictable individuals allows the majority to be very predictable or not, depending on other constraints. For example, the addition of a co-task requiring high predictability in a proportion of group members (e.g. for boat-building or food preparation) would reduce group survival at the left of figure 3a—and thereby establish the presence of a small unpredictable minority in a generally predictable group, as the optimal solution. Figure 3b is discussed below.
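The definition of brittleness b is given in appendix A and is not reproduced here. Purely as an illustration of how a single parameter can range from near-random to near-deterministic choice, the sketch below treats b as the inverse temperature of a softmax rule. This is an assumption made for the example, not the authors' rule; the function name choice_probabilities and the food values are hypothetical.

```python
import math


def choice_probabilities(estimated_values, b):
    """Softmax over estimated food values, with b as inverse temperature.

    Assumed for illustration only: low b gives near-uniform (unpredictable)
    choice, high b concentrates choice on the best-valued food.
    """
    top = max(estimated_values)  # subtract the maximum for numerical stability
    weights = [math.exp(b * (value - top)) for value in estimated_values]
    total = sum(weights)
    return [weight / total for weight in weights]


values = [1.0, 0.5, 0.0, -1.0]                        # hypothetical estimates
p_unpredictable = choice_probabilities(values, 0.3)   # b = 0.3
p_predictable = choice_probabilities(values, 9.0)     # b = 9
```

Under this assumed rule, an individual with b = 0.3 spreads its choices across all four foods, including the one that looks worst, whereas an individual with b = 9 almost always eats the food currently believed to be best.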

5.2 Simulation 2: evolution based on the above factors

There are many possible ways of linking characteristics of ADHD (figure 1) to survival, reproduction and other processes that govern prevalence. A detailed account would involve risk-taking; sexual attractiveness; type-selective mating; co-inheritances; human and environmental variability over several dimensions; and multiple stages, including Richerson and Boyd's (2004) important model of the evolution of cooperation by cultural group selection followed by gene-culture coevolution. Such detailed work needs to be preceded by a general mapping of the territory.

5.2.1 Evolutionary mechanisms

Group selection has been the subject of much controversy (Dawkins 1994; Wilson & Sober 1994; Morell 1996). It is likely to be more important in humans than in animals, because of our inherently egalitarian (rather than strictly competitive) social organization (Boehm 1997) and our exceptionally severe inter-group conflict (Moore 1994). However, the field was devalued in the 1960s and 1970s by overstated claims that group selection was a predominant, rather than a contributory factor, in the fitness of particular traits (e.g. Ward & Zahavi 1973). Avoidance of this oversimplification has allowed progress (e.g. Goodnight & Stevens 1997; Barta & Giraldeau 2001). In the current context, it is important to note that examples of group selection pressures complementing or even opposing individual selection pressures are documented (Stevens et al. 1995; Goodnight & Stevens 1997). Self-sacrifice has been held out as a feature especially indicative of group selection (Alexander & Borgia 1978; Wilson & Sober 1994), and individually maladaptive impairment seems to us to be similar in this respect.

5.2.2 Reproductive mechanisms

Reproductive differences associated with ADHD were discussed earlier. We do not address the mechanisms by which subgroups may come to be reproductively favoured (red, green and light blue lines in figure 5). Females might be expected to favour predictable males, because such males are more reliable husbands and fathers, but the results from Simulation 1, taken together with the multifactorial view in figure 1, suggest that female reproductive fitness will be maximal in females who select the optimal balance of predictability (for themselves and their children) and unpredictability (for the group). Clearly, determination of the optimal balance is a complex issue. Because there is insufficient information available to allow this to be considered in detail, we reduce reproduction to simple rates.

Figure 5

Effects of rate of environmental change, and subgroup reproductive fitness, on group predictability and survival (simulation). X-axes are evolutionary time, notionally in units of 2000 months or approximately 200 years. In the first evolutionary period (white, time 0–10), one food changes every 1000 months, in the second (grey, time 10–16) every 20 months, and in the third (white, time 16–22) every 1000 months. Four separate simulations are shown in different colours. Each simulation starts with a population of 50 individuals each with an arbitrary uniform brittleness (9). The blue simulation has all individuals reproducing at the same rate, but in the other simulations subgroups have increased reproduction (see key). Whole-population reproduction rates are normalized for all simulations. Std, standard deviation; SEM, standard error of means on 30 runs. For discussion see text.

5.2.3 Developmental mechanisms

The causes of ADHD have a great deal of overlap with those of other neurodevelopmental disorders, including autism spectrum disorders, Tourette disorder, specific learning disabilities and early forms of schizophrenia (Taylor & Rogers 2005). All the neurodevelopmental disorders have substantial genetic contributions as well as non-specific environmental influences, all are commoner in boys than in girls (Taylor & Rogers 2005), and there is a high degree of comorbidity (Gilger & Kaplan 2001).

Attempts to link individual disorders with specific evolutionary mechanisms (e.g. Leckman & Mayes 1998) have difficulty with these overlaps, and with the diversity within disorders. These problems are reduced in broader concepts such as deficits in attention, motor control and perception (DAMP; Kadesjo & Gillberg 1998; Sonuga-Barke 2003b) and atypical brain development (Gilger & Kaplan 2001), which encompasses enhanced abilities as well as impairments. However, though there is good evidence for male preponderance in neurodevelopmental problems (Kadesjo & Gillberg 2000; Rucklidge & Tannock 2001; Wilens et al. 2002; Conners et al. 2003; Kooij et al. 2005), there is little evidence for usefully enhanced abilities in males (Lynn & Irwing 2002; Anonymous 2005).

Within a similarly general framework, but focusing on impairments, the developmental instability (DI) model (Yeo et al. 1999) proposes that individuals differ in their ability to ‘buffer’ development against the effects of damaging mutations and environmental insults; and that deficiencies in buffering account for the neurodevelopmental disorders. Genes are grouped into those responsible for general developmental processes, specific subsystems, and buffering mechanisms (used for reducing the effects of mutations and environmental agents). In their account, failure of buffering is a defect maintained in the population by mutation, ever-changing pathogens and recombination, and to this list we add the group effects already discussed. This account (figure 4) restricts both genetic and behavioural ‘exploration’ primarily to the males, who have a lower level of investment in parenting (MacDonald 1998; Rucklidge & Tannock 2001). In the next generation, females share their genes with selected novel male genes, honing and diluting the genetic and behavioural exploratory effects.

Figure 4

Model of ADHD effects on individual and group. Three classes of humans are shown: males with and without ADHD–HI (a continuum in reality), and females. The large grey circuit indicates the reproductive cycle from conception to adulthood and mating. At the right are shown the group benefits that arise from ADHD–HI, which may in turn be caused by the failure of developmental buffering in a minority of males. For more details see text.

5.2.4 Design of the evolutionary simulation

We simulated the evolution of groups in a changing environment. There were initially five groups of 10 individuals. All individuals started with a brittleness (i.e. predictability) of nine, a high value chosen to demonstrate the reduction during evolution. Information about local foods was shared within each group. Ninety per cent of matings were outside the group. In these, the baby's five brittleness genes were randomly selected from the two parents, and in 10% of matings one of the five genes randomly mutated. Whenever the overall population exceeded 200, a group was randomly selected for elimination. Individuals started reproducing at age 10, and lifespan was limited to 50 years. When groups exceeded 20 individuals, they split into two groups.

Most of the parameters of our model are selected for their plausibility in early human evolution. The low population limit was needed to make the simulation computable.
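
The reproduction and group-management rules described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used for the simulations: the text does not specify how a gene mutates, how a group is divided when it splits, or whether more than one group is eliminated at a time, so `step_mutation`, the random halving in `split_oversized` and the loop in `enforce_cap` are hypothetical choices. Partner choice, ageing and foraging are omitted.

```python
import random

N_GENES = 5            # five brittleness genes per individual (from the text)
MUTATION_RATE = 0.10   # one gene mutates in 10% of matings (from the text)
MAX_POPULATION = 200   # above this, a randomly selected group is eliminated
MAX_GROUP_SIZE = 20    # groups larger than this split into two


def step_mutation(gene, rng):
    """Hypothetical mutation rule: move the gene value by +/-1, never below 0."""
    return max(0, gene + rng.choice((-1, 1)))


def make_child(mother, father, rng, mutate_gene=step_mutation):
    """Take each gene at random from one parent, then occasionally mutate one gene."""
    child = [rng.choice(pair) for pair in zip(mother, father)]
    if rng.random() < MUTATION_RATE:
        i = rng.randrange(len(child))
        child[i] = mutate_gene(child[i], rng)
    return child


def brittleness(genes):
    """Brittleness (predictability) taken as the sum of the gene effects."""
    return sum(genes)


def split_oversized(groups, rng):
    """Split every group above MAX_GROUP_SIZE into two halves (assumed random)."""
    result = []
    for group in groups:
        if len(group) > MAX_GROUP_SIZE:
            members = list(group)
            rng.shuffle(members)
            half = len(members) // 2
            result.extend([members[:half], members[half:]])
        else:
            result.append(list(group))
    return result


def enforce_cap(groups, rng):
    """Eliminate randomly selected groups until the population is within the cap."""
    groups = [list(g) for g in groups]
    while groups and sum(len(g) for g in groups) > MAX_POPULATION:
        groups.pop(rng.randrange(len(groups)))
    return groups
```

Eliminating whole groups, rather than individuals, is what lets selection in this sketch act at the level of the group.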

5.2.5 Results of the evolutionary simulation

The results (figure 5) are averages of several runs. For example, the low plateau at the right of one of the population curves in figure 5 was not seen in any run, but is the mean of several runs, including some extinctions.

The simulation started with a simple demonstration of the likely origin of the inverted-U shape of impulsivity in the general population. In this account, the distribution arises from the simple summation of effects of independent genes. In the simplest case, i.e. in the absence of any effects of the gene on fitness, the asymptotic mean brittleness would be the sum of the means contributed by the individual genes, i.e. 5×1. We previously demonstrated the benefits of cooperation between two distinct subpopulations, and we have now shown that the same advantages of cooperation are found in more realistic continuously varying populations.
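
The summation argument can be illustrated with a short sketch. It assumes, purely for illustration, that each of the five genes independently takes the value 0, 1 or 2 with equal probability (mean 1 per gene); these gene values, and the function name `simulate_brittleness`, are hypothetical and are not taken from the simulation itself. In the absence of selection, the summed trait is bell-shaped with a mean close to 5×1.

```python
import random
from collections import Counter
from statistics import mean


def simulate_brittleness(n_individuals, n_genes=5, gene_values=(0, 1, 2), seed=0):
    """Sum n_genes independent gene effects for each individual (no selection)."""
    rng = random.Random(seed)
    return [
        sum(rng.choice(gene_values) for _ in range(n_genes))
        for _ in range(n_individuals)
    ]


scores = simulate_brittleness(20000)
histogram = Counter(scores)

# The sample mean should be close to the sum of the per-gene means (5 x 1).
print(round(mean(scores), 2))

# Crude text histogram: a single central peak with tails on both sides.
for value in sorted(histogram):
    print(f"{value:2d} {'#' * (histogram[value] // 100)}")
```

Any selection pressure, such as the reproductive biases examined below, would shift this distribution away from its neutral mean.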

We simulated the ability of two ‘interventions’ to alter this mean. These were (i) the rate of environmental change and (ii) the relationship between an individual's predictability and rate of reproduction. Increased environmental variability produced a small, but definite and reversible, reduction in population brittleness (figure 5). Reproductive bias favouring the unpredictable individuals helped populations cope with rapid environmental change, without imposing major cost during periods of stability. Reproductive bias favouring predictable individuals had the opposite effect, which was somewhat mitigated by introducing an opposed bias at the other end of the spectrum.

The simulations demonstrate effects that cannot be attributed to individual adaptation: the individuals do not compete at all, with the debatable exception of the competition to be in a diverse group. In a simulation that is reproductively unbiased (blue lines in figure 5), people with ADHD die sooner than others, particularly in rapidly changing environments (grey region in figure 5), and so reproduce less than others. The mean brittleness of the groups in which they die obviously increases; but because these groups tend to die en masse (figure 2c is the average of many runs and so conceals the catastrophic losses), the population brittleness then reduces. Broadly, when a homogeneous predictable group, with a sole explorer, loses that explorer, it will die, as a group, but when a group with a wide spread of predictability loses its best explorer, it will have a good chance of replacing that explorer soon by genetic recombination.

This process lacks the essential feature of frequency-dependent selection, i.e. ‘that fitnesses are not fixed, but variable, and the values they take on vary as functions of the frequencies of the diploid genotypes they describe’ (Gromko 1977), because the reproductive fitness of the individuals depends not on their frequency, but on the groups' ability to produce an environmentally appropriate distribution of brittleness in their offspring. The process can reasonably be called diversity-dependent group selection.

6. Discussion

Based on a review of the literature, we have presented a multifactorial view of the evolutionary status of ADHD–HI. Within this framework, we have introduced two novel components: (i) the value of unpredictable behaviour in changing environments and (ii) the value of confining such unpredictable behaviour to a minority.

We have demonstrated that there is a class of tasks (‘group exploration tasks’) in which unpredictable behaviour by a minority optimizes results for the group. Characteristics of these tasks are (a) risk-taking, whose cost is borne mainly by the individual, and (b) information-sharing, whose benefits increase with group size. Such tasks have not been the subject of much experimental work. Because observational learning is so important to humans, we suggest that group exploration tasks model real-life activities that were important in human evolution.

Our synthesis accounts for, and indeed requires, several apparently unrelated factors about ADHD–HI: (i) it is primarily heritable; (ii) it is highly heterogeneous and highly polygenic; (iii) ADHD impulsivity is reduced by adulthood (El Sayed et al. 2003), when the cost of losing an individual is maximal; (iv) the severity of ADHD is usually limited by the need for people with it to both engage in tasks that their compatriots will learn from, and share the information they have obtained—unlike mental retardation; (v) it genetically entrains a style which can also be achieved voluntarily (i.e. the Baldwin effect, described below); (vi) it is more common in the sex with the lower level of parental investment (MacDonald 1998; Rucklidge & Tannock 2001); (vii) it is confined to a small minority yet sufficiently common to exist in most villages; and (viii) its severity has an approximately normal distribution in the population (Li et al. 2005). Taken together, (i)–(viii) constitute a complex design for ADHD (Barkley 2001b), supporting the argument that it is an adaptation.

We have used a group version of the Baldwin effect (Baldwin 1896; Hinton 1987). In the Baldwin effect, an organism's efficient exploration of solutions, to evolutionarily relevant tasks, in turn increases individual fitness and thereby increases the likelihood of the next generation being genetically predisposed to find the same solutions. In our simulations, the groups evolve, over many generations, to become genetically predisposed to perform explorations of the environment in an optimal way.

Interactions between learning and evolution are not trivial (Hinton 1987); adding cooperating subgroups within a species brings a new level of complexity (Aoki 2001; Fehr & Fischbacher 2003; Nowak & Sigmund 2004). For example, when asexual creatures find (through solely evolutionary change) a rare and difficult solution to a survival problem, they will obviously pass it on to subsequent generations. Sexual reproduction at a stroke loses this particular benefit, but it can be largely regained by the addition of behavioural experimentation and learning (Smith 1987). Somewhat similarly, the increased exposure of a person with ADHD–HI to danger risks wasting all the information acquired in his lifetime—but when he is grouped with more reliable people his errors can be seen by others, and information will not be lost.

Game theory is widely used in evolutionary models (e.g. Colman & Wilson 1997; Dall et al. 2004), and sheds some light on the evolution of ADHD. Over long evolutionary time, many generations of individual learners (roughly comparable to our unpredictable individuals) and social learners (like our predictable individuals but lacking the capacity for individual learning) can compete, with the final balance between them determined by the rate of change of the environment (Rogers 1988; Wakano et al. 2004).

A striking result of the simulations is that, while evolutionary pressure can quite readily increase the amount of unpredictability in a population, it is much harder to get rid of that unpredictability. Figure 3a explains why: if a population starts at the right end of the lower curve (homogeneous population), there is great advantage in moving left to the midpoint where the two curves meet (this is at about b=7, which is also the mean brittleness achieved by the evolutionary simulation in figure 5). Once the population has moved to the central point of figure 3a, there is no longer any advantage in moving back. Figure 3b shows that the cost of including an unpredictable minority in a stable environment (right of figure) is small compared with the cost of lacking such a minority in a rapidly changing environment. This offers an explanation for the difficulty in removing DRD4-7R from the population, after it has outlived its evolutionary usefulness as suggested by Harpending & Cochran (2002) (discussed above).

6.1 Testable predictions

We expect that

  1. experimentally assembled small groups will perform computerized group foraging tasks better if an ADHD–HI individual is substituted for a control member of the group. The difference may also be detectable when a child with ADHD–HI joins a class, or stops taking his very reliably administered long-term medication,

  2. younger siblings (or step-siblings) of ADHD children will be socially slightly more advanced and less accident-prone than older siblings (or step-siblings),

  3. animals with solitary lives, particularly solitary upbringing, or that engage in solitary foraging, will show less inter-individual variation, on genetic or behavioural measures.

6.2 Limitations

The most important potential criticism of our proposal is that it is merely another evolutionary ‘just-so’ story. Although we have demonstrated that an unpredictably behaving minority can help the larger group on a particular class of tasks (described above), we cannot demonstrate that these tasks have actually occurred sufficiently to have affected the course of evolution. However, we have satisfied many of the criteria for an adaptationist hypothesis (Andrews et al. 2002), including (i) weighing alternative mechanisms; (ii) establishing a relationship with genetic findings; (iii) accounting for several aspects of ADHD–HI that cannot be accounted for by learning mechanisms or other hypotheses; and (iv) making testable predictions.

Frustratingly, we are unable to suggest a size for the group benefits of ADHD–HI. This is partly because reliable statistics on ADHD reproductive fitness are not available. Even when they do become available, debate will continue on whether effects act on the individual or the group. For example, if people with ADHD have twice as many babies as people without, is this because the ADHD is helping the individual directly, or because attractiveness of risk-takers to females has increased sufficiently to preserve these endangered genes in the gene pool?

Our approach predicts that any mutation creating individual cognitive variation will, if not too severe, confer group benefit. Indeed, serotonergic genes have been linked to ADHD (Bobb et al. 2004), and it is plausible that 5HT-linked obsessionality traits in a minority would fill a similar societal role to ADHD per se. However, the low incidence, adolescent onset, secretive nature, repetitiveness and severity of obsessive-compulsive disorder make it unsuitable for a social learning role. Anxiety states (as well as hypomania and transient excitement) may occur briefly, in order to extract relevant information efficiently from the environment (Hanoch & Vitouch 2004). Autistic spectrum disorder (Toichi & Kamio 2002) and some personality disorders (Dall et al. 2004) could have informational benefits, but this seems unlikely to be the case for more severe impairments, such as mental retardation. At first sight, ADHD appears totally different from schizophrenia, which with its much lower incidence may result from ‘developmental “noise” in the machinery of selective population genetics’ (Wilson 2004). However, diagnostic thresholds and recognition of subclinical states are somewhat arbitrary in both cases, and the ‘developmental noise’ description may also fit severe cases of hyperkinetic disorder.

7. Conclusion

Previous hypotheses for the adaptive evolution of ADHD fail to account for the multifactorial nature of the disorder and the worldwide confinement of the syndrome to a minority of the population. In order to overcome these problems, we have reviewed the balance between the benefits and disadvantages of ADHD, to both the individual and the group in which he lives. Besides the often-studied effects on individual morbidity and reproduction, which deserve further quantification, we have suggested two advantages of ADHD–HI to society: first, increased exploration of behavioural possibilities and second, the confining of concomitant social and physical risk to a minority.

Evolution's drive to variability offers an explanation for the difficulty in finding one, or even several, core deficits in ADHD (Castellanos & Tannock 2002; Luman et al. 2005). Even within the restricted subtype ADHD–HI, it seems likely that evolutionary time will have produced a long list of variations. This does not imply that the search for causes is futile: a search for causes of hypertension has painstakingly peeled away many contributory genes and interactions (Takahashi & Smithies 2004). Future work in this area should focus not on discovering which one of the various genes or neuro-psychological hypotheses best describes ADHD, but rather on estimating the weights of the factors. Similarly complex networks of causation remain to be studied in most areas of psychology and psychiatry (Kendler 2005).


We are very grateful to Hans Pecseli and Edmund Sonuga-Barke for useful discussions; to Hannah Buchanan-Smith, Gwen Dewar, Kenneth Glander and Peter Richerson for helpful information; and to Cole Coleman, David Berger, Peter Dayan, Espen Borga Johansen, Peter Killeen, Terje Sagvolden, Rosemary Tannock, Jeff Wickens and three anonymous reviewers for comments on earlier drafts of the manuscript. We are grateful to the Gatsby Charitable Foundation and the Centre for Advanced Study in Oslo for financial support. These had no role in study design or publication.


    • Received April 27, 2005.
    • Accepted October 8, 2005.

