## Abstract

We apply three plausible algorithms in agent-based computer simulations to recent experiments on social learning in wild birds. Although some of the phenomena are simulated by all three learning algorithms, several manifestations of social conformity bias are simulated by only the approximate majority (AM) algorithm, which has roots in chemistry, molecular biology and theoretical computer science. The simulations generate testable predictions and provide several explanatory insights into the diffusion of innovation through a population. The AM algorithm's success raises the possibility of its usefulness in studying group dynamics more generally, in several different scientific domains. Our differential-equation model matches simulation results and provides mathematical insights into the dynamics of these algorithms.

## 1. Background

Social learning is an important way for individuals to acquire useful information and skills without the need for costly trial and error learning [1,2]. Such social transmission can lead to the rapid spread of individual discoveries through a population [3–5]. Although the transmission of new behaviours via social learning has been described in diverse animal species [6] and detailed in laboratory experiments [7,8], the population-level implications are less clear. It seems likely that individual biases, social interactions and population structure all influence how, when and what information is transferred [9,10]. However, disentangling these predictive factors has proved difficult, owing to the challenges of tracking the spread of innovations and of eliminating alternative explanations such as individual learning [3,8].

Learning biases may mean that some innovations are more likely to spread than others. These biases include model-based strategies (e.g. copy dominant individuals), state-based strategies (e.g. copy when uncertain) and frequency-dependent strategies (e.g. copy the majority) [11–15]. Copying frequently seen behaviour may be a shortcut to ascertaining the best locally adaptive information, and could optimize learning outcomes in spatially variable environments where individuals are dispersing into habitats of which they have little experience [11,16–18]. Empirical evidence for conformity in non-human animals has derived largely from laboratory experiments (e.g. in fish, rats and primates) [19–22], or from reports of changes in the behaviour of immigrants who move into different groups [18,23].

An early source of evidence for the spread of innovation by social learning involved piercing of foil caps of home-delivered milk bottles by some bird species, with birds rewarded by accessing cream at the top [24]. Reports suggested that there might have been several points of innovation, with localized social learning leading to additional spread of information [25]. Although the evidence that this behaviour was socially learned and culturally transmitted was observational and controversial, it became a classic example of innovation and transmission [26–28].

A more recent experiment elucidated these patterns using novel methods, allowing detailed tracking of the spread of seeded behaviour within and between groups in the wild. These researchers introduced alternative novel foraging techniques into several wild subpopulations of birds [16]. The foraging techniques consisted of pushing a bi-directional sliding door of a puzzle box either left or right to access food concealed behind it. After two birds were trained on one of these two techniques and released back into their group, automated tracking recorded diffusion of this behaviour in each subpopulation over four weeks, and followed its persistence over 2 years. Among the important findings were that information transmission followed sigmoidal diffusion curves, plateauing at 75% of solvers in each population; greater numbers of solvers were present in seeded treatment conditions than in unseeded control conditions; and there was a high level of conformity, approaching consensus. This conformist copying led to increasingly entrenched population-level preferences for the seeded technique over time and across generations.

In social psychology, conformity is defined as changing individual human behaviour to match that of others, with synonyms including acquiescence, compliance, conventionality and conversion [29]. Different degrees of conformity can be produced by various motivations and methods. In behavioural biology, conformity has been studied from a population perspective in a variety of species. There, conformity is sometimes specified in terms of frequency dependence, a tendency for agents to conform either more (positive frequency dependence) or less (negative frequency dependence) than would be expected by the proportional frequency of the majority behaviour in a population [11–13,30]. A distinction is sometimes drawn between initial learning and switching to a group solution. A continuum from conformity to anti-conformity has been noted in both fields [31].

Our agent-based computer simulations of the empirical phenomena from [16] reflect these definitions as we show how three well-specified social-learning algorithms achieve different types and levels of conformity at the population level. Although some of the empirical phenomena from [16] are simulated by all three learning algorithms, several manifestations of social conformity bias are simulated by only the approximate majority (AM) algorithm, a population protocol with intellectual roots in chemistry, molecular biology and computer science. These simulations provide explanatory insights into the spread of innovation by social learning and generate predictions. We include a mathematical model that provides additional insights into empirical and simulation phenomena.

## 2. Material and methods

Our agent-based simulations employ three different social-learning algorithms: AM, always copy (AC) and copy if uncertain (CIU). Within each of these algorithms, an agent can be in either a decided or an undecided state. Although the number of decided states can vary, here we concentrate on having at most two decided states, representing a novel behaviour with two possible variants. This reflects the experimental design from [16] and is more broadly consistent with cultural-diffusion studies that use a dual-action, control-experimental design, where groups are exposed to a puzzle with two equally difficult, and equally rewarding, solutions [8,32]. In our simulations, each algorithm runs for hundreds of time cycles, where, in each cycle, a randomly selected initiator exhibits a food-gathering behaviour observed by a randomly selected recipient. That behaviour may or may not change the recipient's state, depending on the social-learning algorithm used and the state of each of the two agents.

In AM, such an interaction converts an undecided recipient to the state of the initiator and makes a recipient undecided if that recipient is in a decided state different from that of the initiator. This is illustrated more formally in table 1, where *u* represents the undecided state, and *x* and *y* represent two different decided states. Reading down the initiator columns, an undecided initiator has no effect on any recipient. An initiator who is decided on *x* converts an undecided recipient to *x*, has no effect on an *x* recipient and causes a *y*-decided agent to become undecided. Initiations from decided *y* agents have the analogous effects.
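The pairwise rule in table 1 can be written as a small transition function. The following is an illustrative sketch in Python; the function name and structure are ours, not the authors' implementation:

```python
def am_step(initiator, recipient):
    """Approximate majority (AM) interaction from table 1.

    States: 'u' (undecided), 'x' and 'y' (decided).
    Returns the recipient's new state; the initiator never changes.
    """
    if initiator == 'u':
        return recipient      # undecided initiators have no effect
    if recipient == 'u':
        return initiator      # undecided recipients copy the initiator
    if recipient != initiator:
        return 'u'            # a conflicting decided recipient becomes undecided
    return recipient          # matching decided states: no change
```

Reading down the columns of table 1 corresponds to fixing `initiator` and varying `recipient`.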

This algorithm also works with any other number of decided states. With a single decided state, such as *x*, results can be seen in the first two rows and columns of table 1: a decided *x* initiator converts an undecided recipient to *x*, but otherwise there are no changes. Adding a row and column for each decided state beyond the second extends table 1 in the same manner. That is, a decided initiator converts an undecided recipient to its own state and causes agents with other decided states to become undecided, while an undecided initiator continues to have no effect on other agents.

The AM algorithm, which has intellectual roots in chemistry and molecular biology, has been shown analytically to be the most effective way to transform an initial majority behaviour into a population consensus through entirely local interactions [33]. The authors in [33] presented three formal proofs for two decided states. First, *n* agents reach consensus within *n* log *n* interactions. Second, the final consensus value is reliably the initial majority state, if its initial margin exceeds √*n* log *n*. Third, consensus can be reliably achieved despite up to √*n* agents displaying Byzantine behaviour. Byzantine behaviour refers to agents which act in a random or even purposefully scheming, deviant or disruptive manner. In our simulations, Byzantine agents (represented as *b*) do not copy or demonstrate successful states, even if they are randomly selected as recipient or initiator, respectively. Relevant to our purposes, AM is a fast and effective way to transmit knowledge through a population.

To the best of our knowledge, AM has not previously been applied to any social or behavioural phenomena. Its only published application was to the cell-cycle switch in molecular biology [34]. Our simulations are thus the first attempt to apply AM to social learning. We consider that an initiator demonstrates how to deliver a desirable piece of food by opening a puzzle box, and that this sequence of events is observed by a recipient. Consistent with the tenets of social learning theory, learning can occur by observing another agent's behaviour and by noting the consequences of that behaviour [35]. Learning by imitation can often occur in a single trial (as noted in diffusion of innovations [36]), and the initial reinforcement is often experienced vicariously rather than directly.

Mathematical modelling of social learning tends to emphasize two important features: observed pay-offs and frequency dependence [30]. Although we do not vary the amount of reinforcement, we assume that a successful demonstration produces vicarious reinforcement (i.e. desirable food). Moreover, we monitor both proportional and conformity-biased trends in copying. Because our agent-based modelling involves contact between individual agents operating with a specified copying strategy, it is different from, and complements, purely mathematical approaches that rely on mean-field approximations, which are valid primarily in large populations.

Although the bird experiments that we simulate often involved observation by more than one recipient, their data do not provide sufficient guidance on the frequencies and sizes of such audiences [16]. Rather than making possibly unwarranted assumptions about these audiences, we retain the random pairing of one initiator and one recipient from the classic AM algorithm. This is also mindful of the notion that all learning is ultimately individual, even though it is sometimes guided by social cues (BG Galef Jr 2013, personal communication).

If AM proves to be successful in simulating imitation in birds, we would want to know why, in terms of its essential characteristics. For this reason, we design and include the AC and CIU algorithms, which possess some but not all of AM's features. In AC, a decided initiator converts any recipient to the state of the initiator. In CIU, a decided initiator converts an undecided recipient to the state of the initiator, but a decided recipient remains in its previous state.
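In the same sketch style as for AM, the two comparison rules can be written as follows (hypothetical helper names; updates are one-directional, from initiator to recipient, as in the text):

```python
def ac_step(initiator, recipient):
    """Always copy (AC): a decided initiator converts any recipient."""
    return recipient if initiator == 'u' else initiator

def ciu_step(initiator, recipient):
    """Copy if uncertain (CIU): only undecided recipients change state."""
    if initiator != 'u' and recipient == 'u':
        return initiator
    return recipient      # decided recipients retain their previous state
```

Note that neither rule ever returns an agent to the undecided state; only AM does that.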

Comparing the three algorithms, we note that all assume that agents can be in any of four states: *u*, *x*, *y* or *b* (Byzantine). All three algorithms also reduce undecidedness (or uncertainty) in a population by converting undecided agents to a decided state. However, only AM and CIU specify that agents must be in the *u*-state to be converted to a decided state, and only AM allows decided agents to return to the *u*-state. There is extensive support, from both modelling and empirical work, for the hypothesis that animals are more likely to copy when they are uncertain or undecided [14,37–45]. The AM and CIU algorithms gain support from such findings.

Unlike threshold models, which typically assume that agents somehow possess global knowledge of population states [46], our three learning algorithms function under the simpler assumption that agents notice only one instigating agent at a time, better matching the bird experiments, in which only one bird opened a box at a time [16].

Our simulations specify that agents possess a behavioural tendency, which can have any of four values: *u*, *x*, *y* or *b*. Alternatively, we might have allowed for compound states [47] of *xy* or *yx*, reflecting the possibility that an agent could discover one solution on its own and imitate another solution. We did not use compound states for most simulations because any bird recorded in [16] could execute only one solution in any given episode, and our simulations ought to mimic that behavioural constraint. We do, however, explore the relation of compound states to our standard model.

Most of our simulations contain 100 agents to allow easy conversion to proportions by the reader. This is also realistic because the populations that we simulate each averaged about 100 individuals [16]. To further ensure realistic simulations, we seed our populations with numbers of initial agent states matching those manipulated or observed in the bird experiments [16]. Such seeding allows experimental control over the conditions present in each study. Although copying never fails in our presented simulations, introducing a failure rate merely slows the pace at which meaningful interactions occur. For simplicity, we therefore set the probability of copying to 1, without loss of generality.

AM possesses several characteristics that make it a promising candidate for simulating information diffusion in wild birds. First, AM allows for multiple decided states (two are required for the bird experiments) and for return to an undecided state (if that proves to be important). Second, AM has also been shown to robustly establish population consensus, which could potentially yield the conformity bias seen in the bird experiments. Third, it is easy to seed AM to mimic the various initial experimental conditions used in the bird studies. Fourth, AM's ability to change agent states through communication of a demonstrator's current state is a natural way to implement imitation of a successful demonstration. And fifth, AM can be readily compared with plausible alternative algorithms (AC and CIU) that could help explain AM's predicted success.

In the main text, we present the results of our agent-based simulations. To gain additional, mathematical insight into these results, we present differential equation models of most of these phenomena in the electronic supplementary material. We also provide an alphabetical glossary in the electronic supplementary material as a convenient reminder of technical terms and novel acronyms used throughout the paper.

## 3. Results

Simulation results are presented under nine headings: (i) time to achieve consensus, (ii) diffusion of knowledge, (iii) naive agents joining an experienced group, and six analyses of conformity bias ((iv) agents joining a group with a different solution, (v) groups with two behavioural variants, (vi) changes in the proportion of the seeded option, (vii) per cent changes in majority and minority states, (viii) first learning versus switching, and (ix) compound states).

### 3.1. Time to achieve consensus

To determine how long it takes for each of our three algorithms to achieve consensus among non-Byzantine agents, we run 20 preliminary simulations for each algorithm seeded 73*u*–2*x*–0*y*–25*b*, meaning 73 undecided agents, 2 *x* agents, 0 *y* agents and 25 Byzantine agents. Consensus is defined as having all *x*- or all *y*-states (0 undecided) among the non-Byzantine agents. The mean cycles to reach this level of consensus are plotted in figure 1 for each algorithm, along with s.d. error bars. These results indicate that, when there is only one decided state, we can be assured of nearly complete social learning in about 75 of 100 agents if we run simulations to 1100 cycles.
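The cycle structure described above can be sketched as a simple loop. This is our reconstruction under the stated assumptions (one random initiator–recipient pair per cycle; Byzantine agents neither demonstrate nor copy), not the authors' code, and the default parameters mirror the 73*u*–2*x*–0*y*–25*b* seeding:

```python
import random

def am_step(initiator, recipient):
    # AM rule: undecided recipients copy; conflicting decided recipients revert to 'u'.
    if initiator == 'u':
        return recipient
    if recipient == 'u':
        return initiator
    return recipient if recipient == initiator else 'u'

def cycles_to_consensus(u=73, x=2, y=0, b=25, rule=am_step,
                        max_cycles=200_000, seed=1):
    """Run one simulation until all non-Byzantine agents share one decided state."""
    rng = random.Random(seed)
    pop = ['u'] * u + ['x'] * x + ['y'] * y + ['b'] * b
    for cycle in range(1, max_cycles + 1):
        i, j = rng.sample(range(len(pop)), 2)   # initiator i, recipient j
        if pop[i] != 'b' and pop[j] != 'b':     # Byzantine agents opt out entirely
            pop[j] = rule(pop[i], pop[j])
        active = [s for s in pop if s != 'b']
        if all(s == 'x' for s in active) or all(s == 'y' for s in active):
            return cycle
    return None
```

With this seeding only the *x* solution can spread, so consensus means all 75 non-Byzantine agents reaching *x*; typical runs finish in roughly a thousand cycles, consistent with the 1100-cycle horizon used here.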

### 3.2. Diffusion of knowledge

To simulate a scenario with an initial pair of knowledgeable individuals (treatment condition), we run 20 simulations seeded 73*u*–2*x–*0*y–*25*b* for 1100 cycles. To simulate a situation without any seeded agents (control condition), we begin the simulation at 330 cycles and run for 770 cycles, with a seeding of 74*u*–1*x*–0*y*–25*b*. This corresponds to about one-third of the way through the standard treatment experiment in [16], by which time a solution was discovered by about one bird. Essentially, social learning starts later in the control condition with only one knowledgeable agent who individually discovers a solution. Treatment and control results are presented in figure 2 for each of the three algorithms.

To determine whether diffusion in the treatment conditions takes a sigmoidal shape as in the bird experiments, the simulated data are fitted with a three-parameter sigmoid function,

$$y(t) = \frac{\kappa}{1 + e^{-\beta (t - \alpha)}}, \qquad (3.1)$$

where *α* is the inflection point (when growth in *y* starts to decrease), *β* is the steepness (growth rate) and *κ* is the upper asymptote (carrying capacity). The sigmoid fit is compared with a linear fit, represented by a two-parameter function,

$$y(t) = a t + b, \qquad (3.2)$$

where *a* is the slope and *b* is the *y* intercept. These fits, done with the Curve Fitting App in Matlab R2016a, are shown for data generated by the AM, AC and CIU algorithms in figure 3.
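For readers working in Python, an equivalent comparison can be sketched with `scipy.optimize.curve_fit` (the original fits used Matlab's Curve Fitting App); the diffusion data below are synthetic stand-ins with parameter values of our own choosing, not the simulated results:

```python
import numpy as np
from scipy.optimize import curve_fit

def sigmoid(t, alpha, beta, kappa):
    # alpha: inflection point, beta: steepness, kappa: upper asymptote
    return kappa / (1.0 + np.exp(-beta * (t - alpha)))

def linear(t, a, b):
    # a: slope, b: y intercept
    return a * t + b

# Synthetic diffusion curve: solvers over 1100 cycles, asymptote near 75.
rng = np.random.default_rng(0)
t = np.arange(0.0, 1100.0, 50.0)
solvers = sigmoid(t, 400.0, 0.01, 75.0) + rng.normal(0.0, 1.0, t.size)

sig_p, _ = curve_fit(sigmoid, t, solvers, p0=[500.0, 0.01, 70.0])
lin_p, _ = curve_fit(linear, t, solvers)
sse_sig = np.sum((solvers - sigmoid(t, *sig_p)) ** 2)
sse_lin = np.sum((solvers - linear(t, *lin_p)) ** 2)
```

The sums of squared errors (`sse_sig`, `sse_lin`) feed directly into the AIC comparison described next.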

Sigmoid and linear fits to the same dataset are evaluated with the Akaike information criterion (AIC) [48], computed for the least-squares minimization used in Matlab as

$$\mathrm{AIC} = n \ln\!\left(\frac{\mathrm{SSE}}{n}\right) + 2K, \qquad (3.3)$$

where SSE is the sum of squared errors of the fit, *K* is the number of parameters and *n* is the number of data points. The preferred data-fitting model has the smallest AIC value. AIC thus rewards goodness of fit while imposing a penalty for the number of estimated parameters, to discourage over-fitting.

The smallest (best) AIC value can be compared with the next best AIC using relative likelihood (RL), computed as

$$\mathrm{RL} = e^{(\mathrm{AIC}_{\min} - \mathrm{AIC}_i)/2}, \qquad (3.4)$$

where AIC_{min} is the smallest AIC value and AIC_{*i*} is the next best AIC value. RL is the probability of the alternative model providing as good a fit as the best model. In each of the three pairs of fits, the RL of the linear model is extremely close to 0. This means that, for each of the three simulation algorithms, simulation of the treatment condition yields a sigmoidal diffusion curve, just as with the bird data in [16].
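Both criteria are easy to compute. This sketch uses hypothetical SSE values of our own (for illustration only), with the sigmoid taking *K* = 3 parameters and the linear fit *K* = 2:

```python
import math

def aic_least_squares(sse, n, k):
    # AIC for a least-squares fit: n*ln(SSE/n) + 2K parameters penalty.
    return n * math.log(sse / n) + 2 * k

def relative_likelihood(aic_best, aic_next):
    # RL that the next-best model provides as good a fit as the best model.
    return math.exp((aic_best - aic_next) / 2.0)

# Hypothetical fits to n = 22 data points: sigmoid (K = 3) vs linear (K = 2).
aic_sig = aic_least_squares(sse=25.0, n=22, k=3)
aic_lin = aic_least_squares(sse=9000.0, n=22, k=2)
rl = relative_likelihood(aic_sig, aic_lin)   # extremely close to 0
```

An RL near 0, as here, indicates that the linear model is effectively ruled out relative to the sigmoid.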

### 3.3. Naive agents joining an experienced group

In [16], puzzle boxes were reinstalled nine months after the original experiment for just 5 days in order to test for cross-generational transmission of information. On average, only 40% of this population had been present in the original treatment condition, while the remainder were naive about the puzzle boxes. Solving was much faster for the 40% of prior solvers than in their original treatment-condition performance, indicating retention of the learned solution, and solving was also faster for the naive birds than for birds in the original treatment conditions.

We simulate the faster learning of the naive birds with a population seeding of 45*u*–30*x*–0*y*–25*b*, run for 275 cycles, which is one-quarter of the original 1100 cycles. The change here is that there are many *x*-seeded agents (30), estimated as 0.4 of the original population of 75 solvers. Results are shown in purple for the AM, AC and CIU algorithms in figure 4 compared with the original simulated treatment condition (in black) for that algorithm. In each case, naive agents quickly reach about 35% success, as did the naive birds who joined a relatively knowledgeable group. Thus, as with the shape of diffusion curves, all three simulation algorithms capture the learning boost afforded to naive agents joining a group having many knowledgeable agents. Further simulations show that these naive joiner agents reach about 45 solvers when simulation continues to 1100 cycles, yielding an *r*-shaped diffusion curve.

### 3.4. Agents joining a group with a different solution

We next turn to several manifestations of conformity bias, which begin to differentiate the three learning algorithms. Situations in which knowledgeable immigrants (*n* = 14) join groups that exhibit a different solution provide a straightforward test of conformity bias [18]. Immigrants in this situation can do any of three things: retain their original solution, become undecided or convert to the solution of the new majority. In [16], 71% of these individuals changed their behaviour to match that exhibited by the majority of their new group. We simulate this by starting with a completed standard treatment condition in diffusion simulations and adding 14 immigrants with the opposite *y* solution, a seeding of 0*u*–75*x*–14*y*–25*b*. Means and 95% CIs for AM and AC are plotted in figure 5, revealing that these two algorithms capture conversion of the immigrants to the new majority. With CIU, there is complete retention of the former solution. There are not yet sufficient empirical bird data to decide between the quality of the coverage provided by AM and AC, but it is noteworthy that AC does not permit any return to undecided states. The AM results suggest a sensible narrative in which the 14 immigrants enter with their former solution (orange), become undecided as they see others with a different solution (black), and eventually conform to this new majority solution (purple).

### 3.5. Groups with two behavioural variants

A larger-scale test of conformity bias can be made using populations with both majority and minority seeds from a relatively early cycle. Here, we start the simulation one-quarter of the way into the experiment, at 275 cycles, and run for 825 additional cycles, recording majority and minority solutions at each cycle. By that starting point in the bird experiments, there was likely to be about one individual discovery of a minority solution, while the majority solution originally seeded at *n* = 2 was likely to have expanded to approximately *n* = 10. Thus, our seeding at cycle 275 is 64*u*–10*x*–1*y*–25*b*. The numbers of majority (*x*), minority (*y*) and undecided agents (*u*) are plotted for each of the three learning algorithms in figure 6. For each learning algorithm, there is a sigmoidal increase in agents with the majority solution and a corresponding reverse sigmoidal decrease in undecided agents. Where the algorithms differ is in the minority solution, which increases slightly and then decreases for AM agents, but continually increases for AC and CIU.

In these AM simulations, seeded with 10 *x* agents and 1 *y* agent, our initial margin is only 9. Nonetheless, the *x* solution is in the final majority in all 20 replications, with the mean state frequencies of *x* = 72.8 and *y* = 0.15. The mean final margin is 72.65. This is consistent with mathematical proofs about AM, which establish that relatively small initial margins reliably achieve consensus [33].

### 3.6. Changes in the proportion of the seeded option

Subtle differences between the algorithms can also be identified in the proportion of the seeded option over cycles. In [16], a generalized estimating equation was applied to the data to reveal a consistent increase in the proportion of seeded majority solutions in their birds, computed as *x*/(*x* + *y*) at each cycle. Figure 7 shows that only the AM algorithm simulates this result, using the data generated in §3.5. The proportion of the seeded option increases to near 1.0 for AM, as it did for the birds, while it remains near the starting value of 0.91 for AC and CIU. An ANOVA of the proportion of the seeded option on the last cycle yields a main effect of algorithm, *F*_{2,57} = 7.4, *p* < 0.001. Multiple comparisons using the LSD test indicate that the only significant differences are that AM finishes with a higher proportion of the seeded majority than does either AC (*p* < 0.01) or CIU (*p* < 0.001), which do not differ from each other or from the initial seed proportion of 0.91.

### 3.7. Per cent changes in majority and minority states

A particularly informative indicator of conformity bias involves a comparison of per cent change in the number of agents in the majority and minority states from the start of the experiment. Per cent change is computed as

$$\text{per cent change} = 100 \times \frac{n_c - n_0}{n_0}, \qquad (3.5)$$

where *n*_{c} is the number of decided agents at cycle *c* (greater than 0) and *n*_{0} is the number of decided agents at the start, cycle 0. Results from §3.6 are plotted in figure 8 as the mean per cent change at each cycle for each algorithm across 20 replications. Per cent increases in solution states are substantial for all combinations of state and algorithm except for the minority solution *y* under AM, which initially increases and then decreases.
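Equation (3.5) is straightforward to apply; for example, a group growing from 2 seeded solvers to 75 shows a 3650% increase:

```python
def per_cent_change(n_c, n_0):
    """Per cent change in decided agents from cycle 0 to cycle c (equation (3.5))."""
    return 100.0 * (n_c - n_0) / n_0

growth = per_cent_change(75, 2)   # 2 seeded solvers growing to 75
```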

To analyse these results, we perform an algorithm by state mixed ANOVA of the per cent change on the last (825th) cycle, with state (*x* or *y*) as a repeated measure. This yields an interaction between algorithm and state, *F*_{2,57} = 7.83, *p* < 0.001, reflecting the fact that only the AM algorithm produces a difference between the per cent change of *x*- versus *y*-states. For AM, the majority *x*-state shows a large per cent increase, while the minority *y*-state does not. Means and s.e. bars are plotted in figure 9. By contrast, the other two algorithms (AC and CIU) produce large per cent increases in both the *x-* and *y*-states. This pattern is further explored by performing a separate ANOVA for each algorithm comparing per cent change of *x-* and *y*-states, which is equivalent to a dependent *t*-test for each algorithm. This yields an *F*-value of 3831 for AM, *p* < 0.0001, and *F*-values of 0.072 for AC and 1.223 for CIU, neither of which is significant.

Once again, AM is the only learning algorithm to exhibit conformity bias, resulting in consensus. Because this is such a strong prediction of our simulations and was not examined in the report of the bird data [16], we reanalysed the bird data to test this specific prediction. Per cent changes in majority and minority solutions at each of the five subpopulation sites are plotted in figure 10. Data points are based on the number of birds who exhibited each solution more than 50% of the time on each given day. The seeding for each group is based on results during day 1. Per cent change is calculated from these seeds using equation (3.5), just as in the simulations. The mean cumulative per cent change for the last day is greater for majority (mean of 1478) than for minority (mean of 33) solutions, *t*_{8} = 3.19, *p* < 0.01 (one-tailed). This confirms AM's prediction that per cent change over time is greater for majority than for minority solutions and is inconsistent with the predictions of the AC and CIU algorithms that majority and minority solutions increase at the same rate. This outcome resembles the simulations on the proportion of the seeded majority option, but is even more revealing about the changes occurring in both minority and majority solutions.

Further simulations show that CIU yields fewer state changes than either AC or AM, which do not differ on this measure. This makes sense because only CIU agents always retain their first decisions.

### 3.8. First learning versus switching

To further understand how AM achieves conformity, it is helpful to draw a distinction between conformist learning by naive individuals and conformity that arises as a result of switching between solutions [17]. Theoretical models typically focus on the former, where biases in social learning influence which solution is first learned by an individual [11,12]. By contrast, empirical studies have often concentrated on conformity in the latter sense, where individuals switch from a known solution to a socially learned norm [21,22,31].

To elucidate this distinction, we run a series of simulations where initial seeding is iterated from 90*u*–0*x*–10*y*–0*b* up to 90*u*–10*x*–0*y*–0*b*. Specifically, we simulate a world with 0 *x* agents and 10 *y* agents, then replace one of the starting *y* agents by an *x* agent and repeat the simulation. Iteration continues until there are 10 *x* agents and 0 *y* agents. This affords examination of seeding conditions where *x*/(*x* + *y*) ranges from 0 to 1, in increments of 0.1. We record two new quantities. The first tracks the proportion of undecided agents first learning the *x* (as opposed to *y*) solution, given the current *x*/(*x* + *y*) proportion in the population. The other quantity tracks the proportion of decided agents switching to *x* (as opposed to *y*) if knocked back into an undecided state, as a function of the *x*/(*x* + *y*) proportion at the time the agent reverts to undecided. To ensure a large and representative sample of learning and switching events at each *x*/(*x* + *y*) value, we run 1000 simulations for 2000 cycles each.

We find that AM, AC and CIU all show a linear relation between the proportion of the population exhibiting *x* and the proportion first learning *x* (figure 11*a*). However, when we consider the proportion switching from a decided state to *x*, this picture changes. Although the relation remains linear for AC (and cannot occur for CIU where switching is impossible), it is sigmoidal for AM (figure 11*b*). We conduct linear and sigmoid fits to the AM switching data, using the same techniques described in §3.2 (fit results in figure 12). The RL of an equally good linear fit for AM switching is 6.72 × 10^{−8}. AM is therefore positive frequency-dependent in terms of its switching probability, but not in terms of which solution is first learned. This nonlinearity arises because, between the time an agent is returned to an undecided state and when it finally settles on a solution, the population make-up changes, usually in favour of the majority.

### 3.9. Compound states

To better match the bird experiments [16] that we model, we consider agent states in our current models to represent behavioural tendencies, not knowledge states. At any given time, a bird could search for food in a puzzle box from either the left or the right side, and that is what is coded empirically and in our foregoing models. However, the question arises whether birds who return to an undecided state (as in AM) are better represented as exhibiting neither solution. Not knowing which behaviour to exhibit could plausibly lead to both behaviours being exhibited, rather than neither. Although we have no empirical data on this, we test the implications of such a change by implementing a different population protocol, the binary agreement model (BAM; [47]). In BAM, interaction between an *x* and a *y* agent yields a compound state *xy*, rather than conversion of one of the agents to undecided. The effect of an *xy* agent on other individuals depends on which solution it is exhibiting, which is treated as a coin flip. Depending on whether the *x* solution or *y* solution is exhibited, an *xy* agent acts as an *x* or *y* initiator (table 2).
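Our reading of the BAM rule can be sketched as follows. The handling of a compound recipient meeting a decided initiator is an assumption on our part, based on the naming-game version of BAM [47], in which a recipient that already holds the exhibited solution collapses to it:

```python
import random

def bam_step(initiator, recipient, rng=random):
    """One binary agreement model (BAM) interaction, as we read table 2.

    'xy' is the compound state; an 'xy' initiator exhibits 'x' or 'y'
    with equal probability (a coin flip) and then acts as that state.
    """
    if initiator == 'xy':
        initiator = rng.choice(['x', 'y'])    # coin flip on the exhibited solution
    if initiator == 'u':
        return recipient                      # undecided initiators have no effect
    if recipient == 'u':
        return initiator                      # undecided recipients adopt the solution
    if recipient == 'xy':
        return initiator                      # assumed: compound recipients collapse
    if recipient != initiator:
        return 'xy'                           # conflicting solutions yield a compound state
    return recipient
```

The key contrast with AM is the `'xy'` branch: a conflict produces a compound state rather than undecidedness.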

Plotting means for 1000 simulations over 1500 cycles, using the same initial conditions as in §§3.5–3.7 (64*u*–10*x*–1*y*–0*xy*–25*b*), we find that the addition of a compound *xy*-state does not change the qualitative outcome of our AM results. BAM reaches consensus on one solution to the exclusion of the other, albeit at a slower pace than when the intermediate state is an undecided one as in AM (figure 13). Moreover, BAM yields qualitatively similar results to AM when examining first learning versus switching (figure 14). For BAM, like AM, first learning is linear, whereas switching is sigmoidal in relation to the proportion of solution *x* in the population (figure 15). The RL of an equally good linear fit is 0.12 for BAM switching. The compound proportion of *x* in the population is more complicated for BAM in figures 14 and 15; the denominator includes all the compound *xy*-states, while the numerator includes half of them because the *x*-solution is exhibited about half the time. These findings with BAM support the idea that consensus can be achieved with a compound *xy*-state as well as an undecided state.

### 3.10. Analytical model

Our method so far consists of using agent-based simulations to compare our three algorithms' fits with real-world data. To gain additional insight into our results, we construct an analytic model, presented in the electronic supplementary material. In this model, we represent each algorithm as a system of differential equations, where the probability of a given conversion becomes a deterministic rate of change. The advantage of this approach is its abstract simplicity, although it fails to capture the effects of stochasticity when population sizes are small. This mathematical analysis is therefore primarily intended to guide intuitions about each algorithm's average behaviour in large populations. Results indicate that the phenomena in our simulations are well approximated by this model. The resulting mathematical insights for the relevant sections are integrated into the Discussion.
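As an illustration of the differential-equation approach, here is a minimal sketch of mean-field dynamics for AM, not the supplementary model itself, under standard well-mixed assumptions: pairwise transition rates proportional to products of state proportions, with the Byzantine fraction inert.

```python
def am_mean_field(x=0.10, y=0.01, u=0.64, b=0.25, dt=0.01, t_max=200.0):
    """Euler-integrate mean-field ODEs for approximate majority (AM).

    Transitions (recipient changes on observing a demonstration):
      u observes x-demo -> x,  u observes y-demo -> y   (undecided adopts)
      x observes y-demo -> u,  y observes x-demo -> u   (opposite -> undecided)
    Byzantine fraction b is inert; rates are products of proportions.
    """
    for _ in range(int(t_max / dt)):
        dx = x * u - x * y              # gains from undecided, losses to y-demos
        dy = y * u - x * y              # gains from undecided, losses to x-demos
        du = 2 * x * y - u * (x + y)    # gains from reversions, losses to adoptions
        x, y, u = x + dt * dx, y + dt * dy, u + dt * du
    return x, y, u, b
```

Seeded as in §§3.5–3.7 (64*u*–10*x*–1*y*–25*b*, as proportions), the minority and undecided states go extinct and *x* approaches the 0.75 asymptote, matching the deterministic behaviour of the analytic model.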

## 4. Discussion

Our results fall into two broad categories: empirical data that are successfully covered by all three learning algorithms, and empirical evidence of conformity bias, which is covered only by the AM algorithm.

### 4.1. Results covered by all three algorithms

Among the empirical phenomena that can be covered by all three algorithms are differential solution levels between treatment and control conditions, the sigmoidal shape of diffusion curves in the treatment conditions reaching an asymptote of about 75% solvers, and the rapid acquisition of solutions by naive agents who join a knowledgeable group. Each of these phenomena can be explained by the number of knowledgeable agents in the group. The treatment condition is seeded with two knowledgeable agents, while the control condition has none until a single agent individually discovers a solution. Both the simulations and the mathematical analysis capture this difference between conditions: the treatment condition produces a full sigmoidal curve, while the control condition, starting late, produces only the initial rise before the experiment ends. The asymptote of 75% solvers in both analytical and simulation methods is due to the inclusion of 25 Byzantine agents, whose defining characteristic is never to exploit this food source. The relatively rapid diffusion of solutions among naive agents who immigrate into a knowledgeable group also depends on the number of seeded knowledgeable agents: starting with 30 knowledgeable agents, as opposed to 2, increases the initial number of *x* agents by a factor of 15.

Because these experiments involve cases where only one solution is available, all three algorithms are equivalent here. This can be verified by eliminating the third column and third row in table 1, representing an alternative (*y*) solution, and noting that all three algorithms implement only a single type of change—converting an undecided agent to solution *x* when the *x*-solution is observed by that undecided agent. The corresponding mathematical insight (in the electronic supplementary material) is that, without any *y*-states, all three algorithms exhibit identical dynamics. Curve fitting establishes that simulated knowledge diffusion over the full experimental period takes a sigmoidal shape. Our mathematical model corroborates the finding that this knowledge diffusion is sigmoidal, and we identify a precise form of that function (electronic supplementary material, equation S18).
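The single-solution equivalence can be made concrete with a short derivation (our sketch under standard well-mixed assumptions, not the supplementary material's equation S18 itself). With only solution *x* available, every algorithm reduces to one transition: an undecided agent that observes an *x*-demonstration adopts *x*. Writing *c* = 1 − *b* for the non-Byzantine fraction, so that *u* = *c* − *x*, and *k* for the interaction rate:

```latex
\frac{dx}{dt} = k\,x\,u = k\,x\,(c - x),
\qquad\Longrightarrow\qquad
x(t) = \frac{c}{1 + \left(\dfrac{c - x_0}{x_0}\right) e^{-kct}} .
```

This is the logistic equation: growth is slow when *x* is rare (few demonstrators), fastest at *x* = *c*/2, and saturates at the asymptote *c* (here 0.75, the ceiling imposed by the 25 Byzantine agents).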

### 4.2. Conformity bias

Conformity bias phenomena begin to differentiate these learning algorithms. By including the AC and CIU algorithms, which contain key aspects of AM, we gain insight into why AM creates conformity bias. This pattern holds for all our conformity bias results: switching to the majority solution, decreasing the frequency of minority solutions, increasing the proportion of majority solutions, increasing per cent changes only in majority solutions, and first learning versus switching. The reason for AM's superior coverage of conformity bias can be traced to its unique property of revisiting undecided states: observing a novel solution causes a recipient with a different solution to become undecided. Although AC and CIU, like AM, reduce uncertainty by converting agents to decided states, they do not allow transitions from decided back to undecided states. This return to undecided states is what allows AM to eliminate minority solutions and achieve consensus around a majority solution. By contrast, minority solutions are more likely to be maintained under the AC and CIU algorithms, roughly in proportion to their initial seeding. Although there is apparently no direct published evidence for becoming undecided before changing one's mind, our extension of a leading computational model of decision-making [49] suggests that simulated agents virtually always become undecided before changing a decision. Our BAM results support a more general conclusion: any neutral state (here, compound knowledge or undecided behaviour) might suffice to produce conformity bias, positive frequency dependence and consensus.
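The role of the undecided state can be isolated in a small agent-based sketch. The AM rules follow the description above; the no-reversion variant is our simplified reading of CIU (decided agents never change), since table 1 is not reproduced here.

```python
import random

def simulate(revert_to_undecided, n_x=10, n_y=1, n_u=64, n_b=25,
             max_steps=1_000_000, seed=3):
    """Contrast AM with a no-reversion (CIU-like) variant.

    Each step, one initiator demonstrates its solution to one recipient.
    With revert_to_undecided=True (AM), a recipient holding the opposite
    solution becomes undecided; with False (CIU-like, our simplified
    reading), decided agents never change. Byzantine agents are inert.
    """
    rng = random.Random(seed)
    agents = ['x'] * n_x + ['y'] * n_y + ['u'] * n_u + ['b'] * n_b
    counts = {'x': n_x, 'y': n_y, 'u': n_u, 'b': n_b}
    n = len(agents)
    for _ in range(max_steps):
        if revert_to_undecided:
            done = counts['u'] == 0 and min(counts['x'], counts['y']) == 0
        else:
            done = counts['u'] == 0     # CIU-like freezes once undecideds run out
        if done:
            break
        i, j = rng.sample(range(n), 2)  # initiator, recipient
        demo, recip = agents[i], agents[j]
        if demo not in ('x', 'y') or recip in ('b', demo):
            continue                    # no demonstration, or no change
        if recip == 'u':
            new = demo                  # undecided adopts what it observes
        elif revert_to_undecided:
            new = 'u'                   # AM: opposite solution -> undecided
        else:
            continue                    # CIU-like: decided agents never change
        counts[recip] -= 1
        counts[new] += 1
        agents[j] = new
    return counts

am = simulate(revert_to_undecided=True)
ciu = simulate(revert_to_undecided=False)
```

With the same 64*u*–10*x*–1*y*–25*b* seeding, the AM run drives the minority solution extinct, while the CIU-like run simply exhausts its undecided agents and retains both solutions, illustrating why reversion to undecided states is the key to consensus.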

This does not mean that AC and CIU agents fail to conform. Indeed, they exhibit about a 600% increase in the frequencies of both majority and minority behaviours over their initial seeded values (illustrated most clearly in the solid and dotted gold and purple curves in figure 8). To summarize, AC and CIU agents show strong conformity, in the psychological sense of increased behavioural sameness, in both their minority and majority behaviours, while AM achieves near consensus on its majority behaviour, thus uniquely exhibiting positive frequency dependence and covering the conformity bias observed in the bird data that were the target of our simulations [16]. The near consensus achieved by AM can be regarded as an extreme degree of positive frequency dependence. This does not imply that algorithms like AC and CIU are uninteresting for understanding lesser degrees of conformity; they may yet provide viable explanations of the maintenance of minority behaviours in phenomena such as political elections in human populations. It is also worth noting that AM could preserve minorities across different groups, depending on initial seeding conditions and social structure.

AM's unusual propensity for consensus conformity can be further understood by analysing population states as a dynamical system, as shown in the electronic supplementary material, §§4–7. In AC and CIU, the system changes only until no undecided agents remain, at which point the mixture of minority and majority solutions is proportional to the initial seeding values. By contrast, as long as two solutions exist in AM, the system continues to change, with no stable equilibrium to settle into until the minority and undecided states become extinct.

Our results clarify how the tendency for individual agents to imitate vicariously rewarded behaviour relates to conformity bias at the population level. Individual imitation is necessary but not sufficient for producing conformity bias or consensus. All three algorithms employ individual imitation of vicariously rewarded behaviour. But only AM exhibits conformity bias at the population level, via its unique capacity to send decided agents back into undecided states, thus making them susceptible to further social influence.

While previous mathematical models identify frequency phenomena as important in social learning [30], our agent-based and mathematical models show precisely when and how conformity bias occurs, and how it differs from changes proportional to initial seeding. By building consensus, AM exhibits conformity bias (extreme positive frequency dependence), while AC and CIU preserve the initially seeded proportions in a population.

### 4.3. First learning versus switching

Another sort of evidence for conformity bias was reported in [16], in the form of plots of the proportion of first-learning states as a function of the proportion of those states in the groups observed prior to the learning event. These plots showed a sigmoidal relation between the probability of adopting a behaviour and its frequency in the foraging group, suggesting that individuals were assessing multiple others when learning. Our simulations do not reproduce this pattern for first learning: AM, AC and CIU all show a linear relation between the proportion of the population exhibiting a solution and the proportion first learning that solution. Switching probability is also a linear function of population proportion for AC (and switching cannot occur for CIU), but it is sigmoidal for AM. This confirms that AM is positive frequency-dependent in its switching probability, although not in its first learning. The sigmoidal relation arises because the proportion of *x* agents changes between the time an agent becomes undecided and the time it finally settles on a solution. These results underscore that conformity may arise in more than one way, via first learning or via switching.

Interestingly, and unlike most theorizing about conformist transmission, where naive recipients observe a group to assess the relative frequency of behaviours, our model starts with the simpler assumption that an individual recipient can change its state after observing a single initiator. AM integrates information over time, as agents move between decided and undecided states. This yields a sigmoidal relation between the current frequency of a behaviour and the proportions of agents switching to that behaviour, offering a cognitively simpler and local route to consensus, without tracking more than one initiator at a time, as in threshold models [46].
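A back-of-envelope approximation (ours, holding frequencies momentarily fixed rather than drifting as noted above) shows why switching, unlike first learning, accelerates with frequency. Under AM, an agent holding *y* must first observe an *x*-demonstration to become undecided, and then resolve to *x* rather than back to *y*. If *p_x* and *p_y* are the probabilities of observing *x*- and *y*-demonstrations:

```latex
P(y \to x) \;\approx\; p_x \cdot \frac{p_x}{p_x + p_y} \;=\; \frac{p_x^{2}}{p_x + p_y},
```

Because two favourable observations are required, switching grows faster than linearly in *p_x*, whereas first learning (*u* → *x*) requires only one observation and so remains linear, consistent with the linear and sigmoidal fits reported above.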

### 4.4. Time scaling

There is an issue of coordinating the scaling of time across the bird experiments in [16] (usually 20 days), our computer simulations (hundreds of cycles) and our mathematical analysis (in the electronic supplementary material, 10–12 steps). We do not have enough information about audience sizes in the bird experiments [16] to precisely match time cycles in the simulations. Note that each day of the bird experiments contains many demonstrations observed by many knowledgeable and naive birds, while each simulation cycle corresponds to a single demonstration observed by a single agent, and each time step in the mathematical analysis represents the mean outcomes over large numbers of demonstrations and observations. However, we are careful to launch both of our models with the same initial seeding of agent states and stop at the same asymptotes as in the bird experiments, as an approximate way to standardize time.

### 4.5. Evolution

Because there is no difference in pay-offs between alternative foraging techniques in the bird experiments [16] that we simulate, it is puzzling why individual and group-level preferences for one technique did not erode over time. By the same token, it is worth asking why the AM algorithm, which encourages consensus conformity through uncertainty and conversion, should best capture these data. One possible explanation is that there is some evolutionary basis for conformity bias. Because our present models stay within a single generation, they cannot shed much direct light on the evolution of social learning methods. However, conformity is thought to offer some evolutionary advantages in environments with spatial [13] and even temporal [11,12] variation because it allows immigrants to quickly adopt locally adaptive behaviour. Though these conclusions have been debated [13,50], empirical evidence suggests that conformity can be found in a variety of species [16,18,51]. An evolutionary basis for conformity bias therefore seems worthy of further research.

### 4.6. Social networks

An interesting aspect of the study in [16] that we do not include is that information was observed to spread across social network connections. An investigation of the effect of social networks on learning rules is beyond the scope of our paper. It is interesting to note, however, that all other empirical phenomena in [16] can be simulated and produced analytically with AM without spatial constraints. This suggests that our simulations would be robust against variation in social networks. The birds from which empirical data were derived in [16] belong to a highly social species that forms fission–fusion flocks over winter, and thus show relatively little structure in their social networks [52]. We predict that the main influence of social networks in our simulations would be to slow the diffusion of information, and that this slowing effect would probably increase as networks become more modular. Even with simulated social networks, we expect that all other key phenomena in [16] would emerge as we report here.

### 4.7. Other potential applications of approximate majority

It is possible that AM could prove useful across a wide range of fields, including the study of epidemics, quorum sensing, social influence, rumours, conversions, the spread of memes, emotional contagion, marketing, opinion formation and decision-making. AM, and presumably any other algorithm that achieves consensus through reversion to neutral states, could be worth exploring in other systems involving the diffusion of information.

## 5. Conclusion

We apply three algorithms (AM, AC and CIU) in agent-based computer simulations to recent experiments on the spread of innovations in wild birds. Some of the phenomena found in the bird experiments are simulated by all three learning algorithms and explained by variation in the number of knowledgeable agents, and thus the number of solution demonstrations. Of the three featured learning algorithms, only AM achieves consensus and shows positive frequency dependence, due to revisiting undecided states. A differential-equation model matches the simulation results and provides additional insights into the dynamics of these algorithms.

## Data accessibility

The computer code for running these simulations is stored at the Dryad Digital Repository: http://dx.doi.org/10.5061/dryad.2p5rb [53].

## Authors' contributions

T.R.S. conceived of the study, designed and executed most of the computer simulations and data analyses, drafted many sections and participated in all the writing. M.M. created the mathematical models, drafted the text for those, designed and executed the simulations on first learning, switching and compound states, and contributed to all the writing. L.M.A. was the first author of the original inspirational bird paper, answered follow-up questions about that study, drafted the literature review, assembled the bird data to test model predictions on per cent increases of solutions and contributed to all the writing. All three authors gave final approval for publication.

## Competing interests

We declare we have no competing interests.

## Funding

This work was supported in part by operating grants to T.R.S. from the Natural Sciences and Engineering Research Council of Canada (7927-2012-RGPIN) and from the Social Science and Humanities Research Council of Canada (410-2011-0099). M.M. is supported by a graduate fellowship from the Natural Sciences and Engineering Research Council of Canada. L.M.A. is supported by a Biotechnology and Biological Sciences Research Council (BB/L006081/1) grant (P.I. Ben Sheldon) and a junior research fellowship from St John's College, University of Oxford.

## Acknowledgements

We are grateful for comments and suggestions from Artem Kaznatcheev, Peter Helfer, Kevin Da Silva Castanheira, Simon Reader, Milad Kharratzadeh, Ardavan Salehi Nobandegani, Michael Smilovitch and our reviewers.

## Footnotes

Electronic supplementary material is available online at https://dx.doi.org/10.6084/m9.figshare.c.3801436.

- Received March 21, 2017.
- Accepted May 31, 2017.

- © 2017 The Author(s)

Published by the Royal Society. All rights reserved.