## Abstract

Cultural change can be quantified by temporal changes in frequency of different cultural artefacts and it is a central question to identify what underlying cultural transmission processes could have caused the observed frequency changes. Observed changes, however, often describe the dynamics in samples of the population of artefacts, whereas transmission processes act on the whole population. Here we develop a modelling framework aimed at addressing this inference problem. To do so, we firstly generate population structures from which the observed sample could have been drawn randomly and then determine theoretical samples at a later time *t*_{2} produced under the assumption that changes in frequencies are caused by a specific transmission process. Thereby we also account for the potential effect of time-averaging processes in the generation of the observed sample. Subsequent statistical comparisons (e.g. using Bayesian inference) of the theoretical and observed samples at *t*_{2} can establish which processes could have produced the observed frequency data. In this way, we infer underlying transmission processes directly from available data without any equilibrium assumption. We apply this framework to a dataset describing pottery from settlements of some of the first farmers in Europe (the LBK culture) and conclude that the observed frequency dynamic of different types of decorated pottery is consistent with age-dependent selection, a preference for ‘young’ pottery types which is potentially indicative of fashion trends.

## 1. Introduction

How does cultural change happen? What are the cultural transmission processes that underlie observed episodes of cultural change? Those and similar questions have been the subject of intense research in the last few decades. Naturally, fine-grained individual-level data detailing who learns from whom would be most suited to answer these questions empirically. However, outside experimental conditions, large individual datasets of this kind are difficult to obtain, especially in pre-modern contexts. Studies such as [1] that used historical records of the board game Go, or [2] that used contemporary data from Fiji to determine the relative importance of different cultural transmission processes are remarkable exceptions. Much available data in archaeology, biological anthropology, animal behaviour or psychology, however, describe the population-level outcomes of the transmission processes. The data are often in the form of frequency distributions of a number of different variants of a cultural trait in the population at a certain point in time, or in a time-series describing the dynamics of the frequency change of cultural variants over time. Furthermore, the data usually comprises sparse samples from the whole population.

But how well can population-level data answer questions about the underlying cultural transmission processes that produced observed episodes of cultural change? It has been suggested that the shape of the adoption curve (detailing the fraction of the population that has adopted a specific cultural variant) might be indicative of the underlying transmission processes (e.g. [3–6]). However, the diagnostic power of these approaches, especially when applied to sparse data, has been proved to be limited (e.g. [7–10]).

In archaeological applications, a lot of emphasis has been put on testing the consistency between observed empirical patterns and the hypothesis of neutral evolution. It has been argued that neutral evolution provides the most parsimonious mechanism to account for the common observation that new artefact attributes, such as decoration patterns, appear, rise in popularity then decline in frequency and go extinct [11]. To determine whether the composition of an observed archaeological assemblage is consistent, or not, with unbiased social learning (the mechanism of neutral cultural evolution), empirical diversity statistics of the observed assemblage have been compared with their theoretical expectations at innovation–drift equilibrium (e.g. [11–14]). Further, Bentley *et al*. (e.g. [15–17]) have used the full variant frequency distribution of a dataset and compared it with the right-skewed power-law distribution expected to arise under unbiased social learning at innovation–drift equilibrium [18]. However, it has been pointed out that the power law as limiting distribution can be generated by a large number of underlying mechanisms [19,20]. Nevertheless, Premo [21] showed that the power-law approach produces more robust results when time-averaging processes are considered than the diversity approach. To examine the temporal dynamic of neutral evolution, Bentley *et al.* [22] and Acerbi & Bentley [23] used the rate of turnover within the *n* most popular variants in a population to test for departures from the neutral evolution hypothesis within a model comparison framework. Potentially, it may be difficult to apply this approach to sparse archaeological data where the total number of observed cultural variants is low (such that a ‘top list’ of size sufficient to accurately calculate turnover rates cannot be defined). Crucially, the approaches described above are based on the assumption that the cultural system considered is at innovation–drift equilibrium. This means the system is in a state where the loss of cultural diversity through drift and the introduction of diversity through innovations are balanced. For a more detailed review of this literature, see [24].

In this paper, we contribute to this discussion by developing a generative inference framework, going beyond our previous work on this topic [25] to address sampling and other issues. Generative inference frameworks have already been successfully applied to genetic applications (e.g. [26,27]). Importantly, the framework will not rely on the assumption that the cultural system is at equilibrium. It has been shown that the inference of underlying transmission processes should, if possible, be based on the temporal dynamic of cultural change and not on the cultural composition of the population at a single point in time [28,29]. Therefore, the aim of our framework is to evaluate whether frequency changes of different types of cultural variants observed in two archaeological assemblages between two time points *t*_{1} and *t*_{2} (with *t*_{2} > *t*_{1}) are consistent with a specific hypothesis about the underlying mechanisms of cultural transmission [25]. The approach is similar in spirit to the model selection framework developed by McElreath *et al.* [30], but we do not restrict the choice of the modelling framework by the availability of an analytical representation of the corresponding likelihood function. In particular, we use the frequency distribution of the observed sample, denoted by (throughout the paper, the ‘overline’ symbol stands for observed quantities), of cultural variants at time *t*_{1} as the starting point of our analysis. But importantly a sample describes only a fraction of the whole population of cultural variants. Cultural transmission processes, however, act on the whole population and therefore any inference framework must be based on the cultural composition of the population rather than the observable sample. To incorporate this fact, we firstly generate population structures from which the observed sample could have been drawn at random. Then we determine the statistical properties of theoretical samples, denoted by *S*(*t*_{2}), produced under the assumption that changes in frequencies seen at time *t*_{2} are caused by a specific transmission mechanism. By this means, we also account for potential time-averaging processes, where time-averaging refers to the phenomenon where cultural remains deposited at different times are preserved together and cannot be distinguished from one another, the rule rather than the exception in archaeology (e.g. [21]). Subsequent comparisons of the properties of the theoretical sample with properties of the observed sample will allow for conclusions about the consistency between the transmission hypothesis and the observed changes in frequency of the cultural variants in the time period [*t*_{1}, *t*_{2}]. In particular, we are interested in exploring the consistency with neutral evolution, frequency- and age-dependent selection processes. For that we use approximate Bayesian computation (ABC) to establish which transmission processes could have produced the observed frequency data and therefore infer the underlying processes directly from the available data without any equilibrium assumption. The outcomes of the framework developed are posterior distributions for the model parameters (characterizing the specific transmission process), indicating the range of the transmission processes that could have produced the observed changes in frequency.

We stress that it is not known *a priori* whether observable population-level patterns, such as observed changes in frequency of cultural artefacts, carry a strong signature of the underlying transmission process and therefore whether reliable inference is possible on the base of the observed data [3]. However, we envision that a generative inference framework might shed light on this problem. Knowing when inference should not be based on population-level data is equally important, as in these situations other lines of evidence have to be considered.

## 2. The inference framework

The inference framework is designed to investigate whether observed changes in frequencies of cultural variants between two time points *t*_{1} and *t*_{2} are consistent with a specific hypothesis of cultural transmission. It consists of two main parts. Firstly, we develop a generative model which, based on the observed sample at time *t*_{1}, generates theoretical samples, *S*(*t*_{2}), at time *t*_{2} conditioned on a specific cultural transmission process. Secondly, we use Bayesian inference techniques to establish whether the cultural transmission process considered can produce samples *S*(*t*_{2}) that are similar to the observed sample at time *t*_{2}. In this section, we discuss both parts of the framework in detail.

Throughout the paper we use the following notation. A sample of cultural variants is characterized by the number *k* of different types of variants, the total number *n _{i}* of variants of type

*i*,

*i*= 1, … ,

*k,*the sample size and the relative frequencies ,

*i*= 1, … ,

*k*. Similarly, the corresponding population, denoted by

**, is described by the following usually unknown quantities: the number**

*P**K*of different types of variants (

*k*≤

*K*), the total number

*N*of variants of type

_{i}*i*,

*i*= 1, … ,

*K*, the population size and the relative frequencies

*R*=

_{i}*N*/

_{i}*N*,

*i*= 1, … ,

*K*. Importantly, we summarize the population structure by with or with and therefore consider all variant types which are not observed in the original sample as type

*k*+ 1. This is a simplifying assumption as variant type

*k*+ 1 most probably consists of several unobserved (and potentially rare) variant types. The determination of the number of unobserved variant types is not the subject of this inference approach as we aim to establish the statistical power of inference methods based on observable population-level quantities; it will be left for future research.

### 2.1. The generative model

As already mentioned, the observed sample of size *n*(*t*_{1}) only describes a fraction of the population of cultural variants present at *t*_{1}. Cultural transmission processes, however, act on the whole population. Consequently, the first step of our model must include the determination of a population structure ** P**(

*t*

_{1}) that is consistent with the observed sample or in other words, of a population structure from which the observed sample could have been drawn by a process of random sampling. Grounding the inference framework on sample frequencies would inevitably overestimate the role of cultural drift. Based on a consistent population

**(**

*P**t*

_{1}), we generate the population-level frequencies of the different cultural variant types conditioned on the assumption that frequency changes between the time points

*t*

_{1}and

*t*

_{2}are caused by the assumed evolutionary process. In this way, we generate theoretical populations and the last modelling step consists of sampling from those populations to generate a sample

*S*(

*t*

_{2}) of size

*n*(

*t*

_{2}) (

*n*(

*t*

_{2}) denotes the size of the observed sample ) at time

*t*

_{2}.

#### 2.1.1. Generating population structures based on a given sample

Given the observed sample , we determine consistent population structures meaning population structures from which the observed sample could have been drawn by a process of random sampling, by using a Dirichlet distribution approach. This approach has already been successfully applied in population genetics (e.g. [26,27]). It results in a distribution of consistent population ** P** of the form
2.1

Consequently, sampling from distribution (2.1) generates population structures from which the observed sample could have been drawn at random. As an example, we assume a sample consisting of seven variant types with the frequencies . Figure 1 shows the 95% intervals of the possible population-level frequency ranges of the seven variant types present in the sample and of variant type eight summarizing all variants not observed in the sample. As expected, those 95% intervals are relatively wide, demonstrating the importance of this modelling step.

In more detail, we assume that the total population size *N* is much larger than the sample size *n.* Now, given a population structure the probability of a specific sample with the absolute frequencies is determined by the multinomial distribution

In this case, the conjugate prior is given by the Dirichlet distribution of the form [31]
with exponents *α*_{i} > 0 and we can derive the unnormalized posterior distribution of the population ** P** given the sample (see equation (2.1)) [31]

We note that this procedure does not need an estimate of the actual population size. The parameter setting assigns equal density to any population ** P** satisfying . However, one needs to be cautious when there are many variant types with very low frequencies in the sample. In this situation, a uniform Dirichlet distribution (i.e. ) tends to underestimate the fraction of unobserved variant types and the parameters

*α*might have to be chosen appropriately.

_{i}#### 2.1.2. Modelling the evolutionary process

In the following, we develop a framework that models the change in frequency of the different variant types in the population at each time point (where *t*_{1} and *t*_{2} are measured in years and a time step has the duration of a year) conditioned on a specific underlying transmission mechanism. To do so, we determine the total population sizes for each time point between *t*_{1} and *t*_{2}, sample a population structure ** P**(

*t*

_{1}) at

*t*

_{1}from equation (2.1) and assume that in each time point there is a fraction of the population of cultural variants that is removed from the population and subsequently replaced. While the removal process is modelled in a random fashion, the replacement process occurs according to the cultural transmission process considered. In this way, we generate theoretical population structures

**(**

*P**t*

_{1}+

*τ*) at times

*t*

_{1}+

*τ*with conditioned on a specific underlying transmission mechanism, and additionally, allow for the inclusion of potential time-averaging effects (see [21] or [32] for detailed investigations of these effects). The assumption of a random removal process is a modelling choice. This is motivated by the fact that we later apply the inference framework to a dataset describing temporal frequency changes of different types of decorated ceramic vessels. The removal of certain vessels from the population might coincide with being physically broken. It seems reasonable to assume that this occurs in a random rather than in a selective manner.

More specifically, we determine the population sizes *N*(*t*_{1}) and *N*(*t*_{2}) at times *t*_{1} and *t*_{2} by
where *s* (0 < *s* ≤ 1) describes the ratio between population size and sample size. To account for uncertainties in the estimate of this ratio, we define *s* as normally distributed with mean *s* and variance . This implies that on average the sample represents of the population; however, the concrete realization varies around this mean ratio. The population sizes *N*(*t*_{1} + *τ*) for time points between *t*_{1} and *t*_{2} are obtained by linear interpolation; in other words, they are inferred on the basis that the difference between the sample sizes at *t*_{1} and *t*_{2} is evenly distributed across the time steps between them.

Next, we sample a population structure from equation (2.1) and determine the absolute frequencies by multiplying *R _{i}* by the estimate of the total population size

*N*(

*t*

_{1}) at

*t*

_{1}.

To describe the evolutionary process between *t*_{1} and *t*_{2}, we model the frequencies of the *k* + 1 variant types as a Markov process. In each time step , we firstly randomly choose *u* variants to be removed from the population. It holds
2.2with denoting the integer part. The variable 0 < *r* ≤ 1 describes the fraction of the population being replaced based on the population size and therefore 1/*r* defines the average lifespan of each variant. We assume that *r* is modelled as a normally distributed random variable with mean 〈*r*〉 and variance . Again the use of a random variable aims to capture the uncertainty and temporal variation in the fraction of the population that is replaced in each time step and therefore in the average lifespan of a cultural variant. Equation (2.2) ensures that the number of variants subsequently added to the population is not influenced by the direction of the change in population size between two time steps. Further, *u _{i}* records the total number of removed variants of type

*i*in this time step; it holds . Secondly, we add cultural variants to the population and ensure that the total population size at

*t*

_{1}+

*τ*+ 1 is maintained. Importantly, the choice of which variant type to add is guided by the assumed cultural transmission process and the variables

*v*denote the total number of added variants of type

_{i}*i*so that it holds .

Summarizing, the two random processes, the removal and subsequent replacement of cultural variants, result in a one-step transition probability for transforming the population of cultural variants at time *t*_{1} + *τ* into at time *t*_{1} + *τ* + 1 of the form
The set *G* contains all pairs of vectors and satisfying and . (see the electronic supplementary material, appendix A for more mathematical details). In the following, we explore the temporal dynamic of the frequency change under the assumption that the replacement process is guided by neutral evolution, frequency- and age-dependent selections.

*Neutral cultural evolution* means that in finite populations cultural variant types are chosen to be added to the population according to their relative frequency and new variants are introduced by a process of random mutation at rate *μ* (see [11,12,17]). This implies that the probability for choosing a variant of type *i* to be added to the population is given by
Variant type *k* + 1 contains all variant types not observed in the sample and also by definition all mutations.

*Frequency-dependent selection* is defined as the disproportional preference for either low-frequency variant types (anti-conformist bias) or high-frequency variant types (conformist bias) (e.g. [4]). In order to incorporate those biases in our framework, we define the probability of choosing a variant of type *i, i* = 1, … , *k* + 1 to be added to the population by with

and . The parameter *b*_{freq} defines the strength and direction of the selection. For *b*_{freq} > 0, selection favours variant types with frequencies higher than the relative majority (where denotes the number of variant types with non-zero frequency at *t*) and for *b*_{freq} < 0, selection favours variant types with frequencies lower than . In other words, *b*_{freq} controls the importance of selection compared with neutral evolution (equation (2.4) coincides with neutral evolution for *b*_{freq} = 0). The simplifying assumption that variant type *k* + 1 contains all unobserved and newly invented variants (i.e. variant type *k* + 1 has a higher frequency than any of the individual types) will slightly overestimate the selection pressure on this variant type in conformist situations and underestimate the selection pressure in anti-conformist situations. This naturally influences the temporal frequency change dynamic but simulation studies indicate that those influences are very small for sample sizes of the magnitude considered in §3.

*Age-dependent selection* (or pro-novelty selection) is here defined as a disproportional preference for ‘young’ variant types (e.g. [33]). Naturally, there can exist different preferences; however, we restrict ourselves to the situation where young variant types possess a selective advantage just because they are novel. We note that age-dependent selection and anti-conformity result in different cultural dynamics. While anti-conformity favours variant types below a certain threshold regardless of their age, age-dependent selection only supports ‘young’ variants. Consequently, the effects of both transmission processes might be similar on newly invented and therefore low-frequency variant types; however, they are very different on ‘older’ variant types returning to a low frequency.

Besides its frequency, each variant type *i* is now also characterized by its age, denoted by *a _{i}*, defined as the time it has been present in the population. We assume: the younger the age of type

*i*the stronger it is favoured by selection. Then the probability of choosing a variant of type

*i*, to be added to the population is given by with 2.5and . The parameter

*b*

_{age}defines how quickly the selective advantage of young variant types fades with increasing age. Similar to frequency-dependent selection,

*b*

_{age}controls the importance of selection compared with neutral evolution (equation (2.5) coincides with neutral evolution for

*b*

_{age}= 0). As variant type

*k*+ 1 contains all unobserved and newly invented variants the age-dependent selection pressure on this variant type can only be an assumed average value. Again this influences the temporal frequency change dynamic. But as it is almost impossible to obtain information about the age of unobserved variant types from observable data, we believe this simplification is justified.

#### 2.1.3. Generating theoretical samples

Lastly, we generate a theoretical sample *S*(*t*_{2}) at time *t*_{2} of size *n*(*t*_{2}). Instead of sampling from the theoretical population ** P**(

*t*

_{2}) at time

*t*

_{2}only, we randomly draw

*n*(

*t*

_{2}) variants evenly from the populations that have been determined as described in the previous section. In this way, we account for possible time-averaging effects in the considered interval [

*t*

_{1},

*t*

_{2}] by, for example, allowing variant types to be present in the sample

*S*(

*t*

_{2}) at

*t*

_{2}which are already extinct in the population

**(**

*P**t*

_{2}).

### 2.2. Inference of underlying transmission process

The second part of the inference framework includes the statistical comparison between observed frequency changes in the interval [*t*_{1}, *t*_{2}] and frequency changes expected under a specific hypothesis about the underlying mechanisms of cultural transmission. As described above, we consider in the following the hypotheses ‘neutral evolution’, ‘frequency-dependent selection with strength *b*_{freq}’ and ‘age-dependent selection with strength *b*_{age}’.

To do so, we generate 100 000 theoretical samples *S*(*t*_{2}) conditioned on a specific hypothesis according to the procedure described in §2.1 and derive their statistical properties. In particular, we determine (i) the marginal frequency distributions of all *k* + 1 variant types at *t*_{2} and (ii) the probability distribution of the number of variant types, denoted by , that are present in the sample at *t*_{1} and are still present in the sample *S*(*t*_{2}) at *t*_{2} conditioned on the assumed transmission hypothesis . Comparisons of these distributions with the observed frequencies of the *k* variant types and the observed number of remaining initially present variants at time *t*_{2} allow for conclusions on whether the assumed transmission mechanism could have produced the frequency changes in the samples and

While the procedure described above is appropriate for neutral evolution, our definition of frequency- and age-dependent selections (equations (2.4) and (2.5)) allows for varying selection strengths *b*_{freq} and *b*_{age}, respectively. Naturally, different selection strengths affect the distributions and . This leads to the question of which parameter values of *b*_{freq} and/or *b*_{age} describe the observed sample best. As the complete exploration of the relevant parameter spaces is too costly, we use the statistical inference method ABC.

#### 2.2.1. Approximate Bayesian computation

The general idea of this method is to approximate the (joint) posterior distributions of the model parameters, summarized in the vector *θ*, given the observed sample in situations where the likelihood function of the considered model is either impossible or computationally prohibitive to obtain (e.g. [34]). Broadly speaking, the posterior distributions describe the parameter ranges which could have produced the observed sample under a given tolerance level *ɛ*. These distributions are obtained by generating samples *S*(*t*_{2}) with parameter values *θ* sampled from appropriately chosen prior distributions. Parameter combinations resulting in samples ‘close enough’ to the observed sample are retained and provide a sample of the distribution . The function *d* describes the distance between the theoretical and observed sample and *ɛ*, the tolerance level that determines the level of approximation. As *ɛ*→0, the retained sample asymptotically approaches the true posterior distribution (e.g. [34]). In other words, the distributions are only a meaningful approximation of the likelihood function of the generative model if the tolerance level *ɛ* is sufficiently small. In the following, we use a sequential Monte Carlo approach (e.g. [35,36]). The idea here is to decompose the problem of sampling from the distribution into a series of simpler sub-problems. For that we choose a sequence of tolerance levels with and draw in the first step a sample from with *ɛ*_{0} large and subsequently from an increasingly constrained sequence of distributions
until the desired tolerance level *ɛ* = *ɛ _{n}* is reached. The crucial question, however, is how we define the distance between the theoretical samples

*S*(

*t*

_{2}) and the observation . In practice, data are often reduced to a set of summary statistics, denoted by , producing samples of the distribution . This approach only generates exact inference results if the summary statistic is sufficient. Here we determine the distance between theoretical and observed sample by calculating

— the absolute distance between observed and theoretical frequencies 2.6

— the absolute difference between the median of the distributions and the observed frequencies 2.7

Additionally, we require that the number of remaining initially present variant types coincides with the observed number. We note that the statistics *d _{d}* and

*d*are based directly on the actual frequencies. While

_{m}*d*compares the observed and theoretical frequencies directly

_{d}*d*compares the observed frequencies to the median frequencies (a measure of the average value of the possible frequency range) under the considered transmission hypothesis . Consequently, using

_{m}*d*as the statistic infers likely parameter constellations whereas using

_{m}*d*infers possible parameter constellations which, however, might only produce the observed sample with a small likelihood. In the electronic supplementary material, appendix B (cf. figure S1), we show that inference based on the statistic

_{d}*d*provides the tightest posterior distribution of the model parameters and therefore we build the analysis in the next section on this statistic. In the electronic supplementary material, appendix F, we show the corresponding results for the analysis based on

_{m}*d*.

_{d}### 2.3. Theoretical limitations of the inference procedure

To explore the theoretical limits of inference based on population-level data, we assume the presence of seven cultural variant types with the frequencies at *t*_{1} = 0. We use the modelling framework to generate samples under the hypothesis of neutral evolution (i.e. *b*_{freq} = 0), negative frequency-dependent selection (i.e. *b*_{freq} = −0.01), positive frequency-dependent selection (i.e. *b*_{freq} = 0.0.1) and the parameter constellation *r* = 0.1 and *s* = 0.3 at *t*_{2} = 50. Application of the ABC procedure identifies the cultural transmission processes (parametrized by the selection strength *b*_{freq}) that are consistent with the observed data. Figure 2 shows the posterior distributions of the parameter *b*_{freq} if the ‘true’ selection strength is *b*_{freq} = −0.01 (figure 2*a*), *b*_{freq} = 0 (figure 2*b*) and *b*_{freq} = 0.01 (figure 2*c*). Those posterior distributions describe the parameter range of *b*_{freq} for which the distance between theoretical and observed samples after 50 time steps as measured by equation (2.7) is almost zero. The widths of the distribution show that the information that can be inferred from sparse frequency data is limited. For example, in figure 2*b*, the data were produced under neutral evolution but frequency-dependent selection can generate almost identical frequency patterns resulting in a relatively wide posterior distribution for *b*_{freq}. It is theoretically impossible to distinguish between neutral evolution and weak frequency-dependent selection on the basis of such sparse data. This implies that there exist theoretical limits on the amount of information that can be extracted from population-level frequency data.

## 3. Case study

In the following, we apply the framework developed in §2 to a dataset describing the culture of the earliest farmers in Central Europe, the so-called Linear Pottery Culture (or LBK after its German name), *ca* 7500–7000 years ago. The dataset (given in [25,37]) records the frequency of different types of decorated vessels at seven different points in time, denoted with , defining six phases of cultural change which vary in duration. Our aim is to explore whether observed frequency changes in different types of pottery between the beginning and the end of each of the six phases are consistent with a specific hypothesis about the underlying mechanism of cultural transmission. The assumption that cultural transmission is relevant in this case is based on ethnographic studies showing that knowledge of pottery-making techniques is transmitted between potters. Previous studies of the transmission of LBK decoration patterns [12,25,38,39] have been inconclusive, some consistent with neutral evolution and others rejecting it in favour of anti-conformist transmission.

For the analysis, we use the observed sample sizes *n*(*t _{j}*) at times

*t*and assume that the corresponding population sizes

_{j}*N*(

*t*) are determined by with . Estimating even the average sampling fraction

_{j}*s*is, of course, very problematical for archaeologists because the currently available record is the result of a large number of local factors relating to site survival and recovery. However, in the case of the LBK in Germany the work of Zimmermann

*et al.*[40,41] provides a basis. Using data from small areas that have been very thoroughly excavated they calculated the estimated number of households for the different LBK settlement areas of Germany [40, fig. 15]. For the areas included in Strien's study this is

*ca*2300 at the peak density. The total number of distinct finds in Strien's dataset [37], which should roughly correspond to households, is 270. While this is based on all the phases, it suggests that an average sampling fraction of 0.1 is the right order of magnitude and we assume . The samples represent on average 10% of the population at

*t*but we allow for sufficient variation to account for uncertainties in those estimates. Population sizes for intermediate time steps are obtained by linear interpolation between

_{j}*N*(

*t*) and

_{j}*N*(

*t*

_{j}_{+}

_{1}). Further, in temperate climates such as Central Europe pottery making is largely seasonal and generally takes place in one bout of activity per year [42]. Accordingly, the lengths of the phases correspond to the number of potential opportunities for copying within a given phase and thus the number of opportunities for the population of decorative motifs to change as a result of the operation of drift, innovation and other potential forces.

### 3.1. Analysis of neutral evolution

We start our analysis by investigating whether observed frequency changes in different types of pottery between different phases are consistent with the hypothesis of neutral evolution. For that we determine the marginal frequency distributions of all initially present variant types and the probability distribution of the remaining initially present variant types for all phases under neutral evolution. We assume that the fraction *r* in equation (2.2) is given by with 〈*r*〉 = 0.1; 0.3; 0.5. Varying the mean value 〈*r*〉 allows us to explore the effects of different replacement fractions and consequently different individual lifespans of the cultural variants on the frequency change dynamic. Shott [43] reviews a range of ethnoarchaeological studies of pottery use life and finds a general correlation with vessel size: smaller vessels tend to have short use lives with larger vessels surviving longer. The data show that the replacement fractions used are plausible ones. It should also be noted that in the case of the LBK the vessels are very uniform in size and shape.

The 95% intervals of the marginal frequency distributions of all initially present variant types at the end of each phase represent the frequency ranges of each type of pottery that are possible under neutral evolution (see the electronic supplementary material, appendix C, figure S3 for the concrete intervals). Table 1 shows that in each phase a number of observations lie outside these intervals for 〈*r*〉 = 0.1; 0.3 and therefore we argue that the observed frequencies are not likely to be produced by neutral evolution with an average lifespan of an individual variant of 10 and 3.33 years, respectively. For 〈*r*〉 = 0.5 the neutral hypothesis gains slightly more support; fewer observations lie outside the intervals; however, with the possible exception of the phases II, III and VI, we also conclude that the observed frequencies are not likely to be produced by neutral evolution with an average lifespan of an individual variant of 2 years.

In the next step, we analyse the distributions of the number of remaining initially present variant types for each phase. The relatively long lifespan of the individual cultural variants for 〈*r*〉 = 0.1 predicts that under neutral evolution more variant types than observed should survive until the end of phases IV and V. For 〈*r*〉 = 0.3; 0.5, the number of remaining initially present variants in phase V contradicts the hypothesis of neutral evolution (see the electronic supplementary material, appendix C, figure S4 for details).

Summarizing, based on the analysis of the two statistics we found no consistent evidence that neutral evolution with an average sampling fraction 〈*s*〉 = 0.1 was the dominant transmission mechanism over the 500 years the LPK lasted.

### 3.2. Analysis for frequency-dependent selection

In this section, we ask whether frequency-dependent selection can describe the observed changes better than neutral evolution and assume that the probability of choosing a particular variant type is given by equations (2.4). To do so, we apply the ABC inference procedure based on the statistic *d _{m}*. The results of this procedure are posterior distributions of the model parameters

*r*and

*b*

_{freq}describing the parameter ranges for which the distance between theoretical and observed samples as measured by equation (2.7) is smaller than or equal to some tolerance level

*ɛ*. Figure 3 shows those distributions for the phases I and V (figure 3

*a*,

*d*: posterior distribution of

*b*

_{freq}; figure 3

*b*,

*e*: posterior distribution of

*r*; figure 3

*c*,

*f*: joint posterior distributions of

*θ*= (

*r*,

*b*

_{freq})); the complete analysis can be found in the electronic supplementary material, appendix D. It is apparent that

*b*

_{freq}= 0 is contained in the ranges of the posterior distribution of all phases meaning that frequency-dependent selection does not explain the observed samples better (but also not worse) than neutral evolution. In detail, based on the posterior distribution for

*b*

_{freq}, we exclude the presence of conformist selection in the population for phases I–III. By contrast, we exclude the presence of anti-conformist selection in phase V. The posterior distributions in the phases IV and VI indicate no strong frequency-dependent selection pressure. We conclude that a relatively large parameter range of

*b*

_{freq}could produce the observed frequency changes between the different phases (given a certain tolerance level) which demonstrates the uncertainties associated with inferences based on sparse data. But importantly we note that the achievable tolerance level

*ɛ*in the ABC analysis is relatively large in the different phases, indicating that frequency-dependent selection (and neutral evolution) might not be the best model to explain the observed data. To explore how well the identified parameter ranges are able to replicate the observed changes in frequency in more detail, we determine the marginal frequency distributions of all

*k*variant types present at

*t*based on the joint distributions of the model parameters shown in figure 3. Electronic supplementary material, figure S6 in appendix D shows that a similar number of variant types as in the case of neutral evolution with (cf. Table 1) lie outside the 95% intervals of pointing again to the fact that the hypotheses of neutral evolution and frequency-dependent selection have similar explanatory power.

_{j}### 3.3. Analysis of age-dependent selection

Next, we explore whether age-dependent selection can describe the observed changes better than neutral evolution and assume that the probability *p _{i}* of choosing a particular variant type is given by equations (2.5). This means that

*p*not only depends on the frequency but also on the age of the variant type, and consequently, we need to estimate the time of innovation for each type. This is a challenging task as archaeological insights can help constructing an interval estimate, but an accurate innovation time cannot be obtained with any certainty. Therefore, we apply the ABC inference procedure not only to determine the posterior distribution of the fraction

_{i}*r*of the population that is replaced in every time step and the strength of age-dependent selection

*b*

_{age}but also to determine the posterior distributions of the innovation times of all variant types and the average age of the variant types not observed in the sample at the beginning of each phase. Consequently, age-dependent selection represents a more complex hypothesis than frequency-dependent selection. Archaeological interval estimates allow us to define relatively tight prior distributions for the innovation times (at least for the observed variant types) which restrict the variability of those parameters. Figure 4 shows the posterior distributions of

*b*

_{age}(figure 4

*a*,

*d*),

*r*(figure 4

*b*,

*e*) and the joint posterior distribution of

*θ*= (

*r*,

*b*

_{age}) (figure 4

*c*,

*f*) for the phases I and V; a complete analysis can be found in the electronic supplementary material, appendix E. We note that in contrast with frequency-dependent selection, the achievable distance

*ɛ*between theoretical and observed samples as measured by equation (2.7) is relatively small for all phases. We further observe that

*b*

_{age}= 0 is not contained in the ranges of the posterior distributions and therefore conclude that the hypothesis of age-dependent selection describes the observed samples better than neutral evolution. Further, the marginal frequency distributions of all

*k*variant types present at

*t*produced by the joint distributions of the model parameters

_{j}*θ*= (

*r*,

*b*

_{age}) shown in figure 4 indicate that, apart from phases II and V, almost no variant type lies outside the 95% intervals of (see the electronic supplementary material, appendix E, figure S8). Therefore, we conclude that the observed frequency changes especially between the beginning and the end of the phases I, III, IV and VI are consistent with the hypothesis of age-dependent selection. The Linear Pottery early farmer population might have had a mild preference for novel as opposed to longer established types of decorated pottery.

## 4. Qualitative differences between neutral evolution and age-dependent selection

We have shown that age-dependent selection, i.e. the preference for novel cultural variants, is able to explain the LBK dataset better than neutral evolution. But what are the qualitative differences between the two cultural transmission processes? To explore these differences, we use our simulation framework and record the cumulative frequencies of each variant type that has been introduced in the population at some point over a period of 20 000 time steps under the assumption of *r* = 0.3 and neutral evolution and age-dependent selection, respectively. To study the effects of the transmission processes only we keep the population size constant over time (*N* = 500) and allow for an initial burn-in time. Figure 5 shows that age-dependent selection produces variant frequency distributions which differ from neutral evolution (see [15–17] for results for neutral evolution with *r* = 1). Age-dependent selection results in more variant types with medium-high frequencies and fewer variant types with high frequencies.

In the electronic supplementary material, appendix G, we show that under neutral evolution variant types with a high frequency are usually ‘old’. Consequently, the lack of high-frequency variant types in situations of age-dependent selection is explained by the fact that ‘old’ variant types are selected against and therefore variant types do not have sufficient time to accumulate a high frequency. Further, we demonstrate in the electronic supplementary material, appendix G, that age-dependent selection leads to a higher turnover rate of top lists compared with neutral evolution (see [22,45] for results under neutral evolution). Additionally, under age-dependent selection, variant types show a more monotonic decrease behaviour after they have reached their frequency maximum (see electronic supplementary material, table S1). Summarizing, age-dependent selection accelerates the speed of cultural change (see also [33]) by selecting against ‘old’ and therefore cumulatively high-frequency variant types.

## 5. Conclusion

In this paper, we developed an inference framework which evaluates whether observed frequency changes of cultural variant types between two time points *t*_{1} and *t*_{2} are consistent with a specific hypothesis about the underlying process of cultural transmission. In particular, we were interested in exploring the consistency with neutral evolution, frequency- and age-dependent selections. Naturally, this is not an all-encompassing analysis and alternative transmission hypotheses could also be consistent with the observed frequency changes. The framework developed takes into account that observed frequency data often describe the composition of samples and *not* of whole populations of cultural variants. Using the Dirichlet distribution approach, we generated population structures from which the observed sample at time point *t*_{1}, denoted by , could have been drawn randomly. Based on those inferred population structures, we determined the temporal evolution of the frequencies of the cultural variant types conditioned on a specific transmission hypothesis. In doing so, we assumed that only a fraction of the population, denoted by *r*, is renewed in every time step, which implies that individual variants have an average lifespan of 1/*r*. In this way, we generated populations of cultural variants at each time point in the considered interval [*t*_{1}, *t*_{2}]. To further account for potential time-averaging processes, we produced samples with a size equal to the observed sample at *t*_{2}, denoted by , by randomly sampling from all generated populations in [*t*_{1}, *t*_{2}]. Consequently, the outputs of the developed framework are theoretical samples at *t*_{2} conditioned on the cultural transmission hypothesis considered. Statistical analysis of the samples then informs about, for example, the frequency ranges of the cultural variants that are expected under the considered hypothesis.

Here we use ABC to determine which transmission processes are consistent with the observed changes in frequency. To test whether the hypothesis of frequency- or age-dependent selection explains the data better than neutral evolution, we approximated the posterior distribution of the parameters *θ* = (*r*, *b*_{age}) (with *b*_{age} describing the strength of frequency-dependent selection) and *θ* = (*r*, *b*_{age}) (with *b*_{age} describing the strength of age-dependent selection), respectively. Those posterior distributions indicate the parameter values and therefore the transmission hypotheses which could have produced sample frequencies similar to the observed ones. To this end, we evaluated the similarities between the theoretical samples obtained by frequency- or age-dependent selection and the observed sample at *t*_{2} in two different ways. Firstly, we compared the observed frequencies to the median frequencies (a measure of the average value of the possible frequency range) of the considered transmission process with the chosen parametrization *θ* (cf. statistic *d _{m}* described in equation (2.7)). We retained the parameter constellations which produced frequency data sufficiently close to the observed data and subsequently generated posterior distributions for the model parameters describing probable parameter constellations. Secondly, we compared the obtained theoretical frequencies directly to the observed frequencies (cf. statistic

*d*described in equation (2.6)). Again we retained the parameter constellations which produced frequency data sufficiently close to the observed data and subsequently generated posterior distributions for the model parameters describing possible parameter constellations. Possible means that the transmission hypothesis with the parametrization

_{d}*θ*could have produced the observed sample frequency but sometimes only with a very small probability. We found, unsurprisingly, that applications of the ABC algorithm based on

*d*resulted in substantially wider posterior distributions indicating that many transmission processes could theoretically produce the observed frequencies although this might only happen with a very small probability (see the electronic supplementary material, appendices D and F). This strongly points to the uncertainties associated with inferences based on sparse population-level data.

_{d}The observed widths of the posterior distributions led to the question: How well should we expect to be able to distinguish between different transmission processes on the base of the observed changes in frequency? We used the developed model to generate frequency data under a known underlying transmission hypothesis (in detail, we assumed frequency-dependent selection with *b*_{freq} = −0.01; 0; 0.01 and *r* = 0.3, see the electronic supplementary material, appendix B) and applied the inference framework to these data. The resulting posterior distributions of the model parameters *r* and *b*_{freq} indicate the ranges of the parameter values that could have produced the observed data (figure 2) and consequently the widths of those distributions describe the information about the underlying transmission process contained in the population-level data. Our example showed clearly that the information is limited. For example, it is theoretically impossible to distinguish between neutral evolution and weak frequency-dependent selection on the base of sparse population-level frequency data (cf. figure 2*b*). Even though the data were produced under the assumption of neutral evolution, i.e. *b*_{freq} = 0, values of *b*_{freq} within the interval [−0.005; 0.01] corresponding to weak/medium anti-conformity and conformity can generate almost identical frequency patterns. The main insight from the described inference analysis is the exclusion of cultural transmission processes (e.g. conformity with *b*_{freq} > 0.01) that could *not* have produced the observed sample based on the inference procedure used and therefore the reduction of the pool of potential hypotheses [46].

Further, our results show that the cultural system considered is more sensitive to changes in the selection strengths, *b*_{freq} or *b*_{age}, than to the fraction *r* of the population that is renewed in every time step (and therefore the average lifespan of an individual variant) (cf. the widths of the posterior distributions for *r* in figures 3 and 4). Considering the difficulties of estimating *r* from archaeological data, this points to the fact that moderate estimation errors do not affect the temporal dynamic of cultural change substantially.

We applied this framework to a dataset describing the culture of the earliest farmers in Central Europe, the so-called LBK, *ca* 7500–7000 years ago. Our aim was to investigate whether the observed frequency changes between the beginning and the end of a phase of the different types of pottery decoration are consistent with the hypothesis of neutral evolution. To do so, we determined the marginal frequency distributions of all variant types initially present at *t _{i}* and the probability distribution of the initially present variant types remaining at under the assumption that cultural transmission occurs in an unbiased way. We concluded that neutral evolution is not a likely hypothesis for the underlying transmission process as in almost all phases a relatively large proportion of the observed frequencies of the different variant types were outside the 95% interval of the ranges expected under neutrality. Additionally we found that in phase V the observed number of surviving variant types between the beginning and the end of this phase was too low to have been produced by neutral evolution.

To test whether frequency-dependent selection describes the observed frequency changes better than neutral evolution, we employed ABC. Based on the resulting posterior distributions of the model parameters *r* and *b*_{freq}, we were able to exclude conformity in phases I–III and anti-conformity in phase V as probable causes of the observed frequency changes. Overall, we found that frequency-dependent transmission did not explain the data better than neutral evolution. The parameter value *b*_{freq} = 0 (indicating neutral evolution) was contained in the range of the posterior distribution of *b*_{freq} in all phases and a similar number of variant types lay outside the 95% intervals of marginal frequency distributions . Comparing these results with [25] points to the crucial importance of rooting any inference framework in population frequencies and not in sample frequencies.

We applied the same statistical procedure to test whether age-dependent selection (defined as a disproportional preference for ‘young’ variant types) explains the data better than neutral evolution. But besides the parameters *r* and *b*_{age}, describing the strength of the age-dependent selection process, we also estimated the innovation times of the variant types present in the dataset (as generally this information cannot be obtained precisely from the archaeological record), resulting in a more complex model than considered for frequency-dependent selection. We concluded that with the possible exception of phases II and VI, the observed frequency changes between the beginning and the end of a phase are consistent with the hypothesis of weak age-dependent selection. In particular, the hypothesis of age-dependent selection is able to explain the LBK data better than neutral evolution.

In this context, it has been demonstrated that neutral evolution is able to replicate the population-level pattern of fashion trends (e.g. popularity of baby names [15,17], dog breeds [16], popular music [22]). In particular, it has been argued that neutral evolution may explain why a few cultural variant types become highly popular but then fade away, producing the well-observed temporal dynamic of fashion trends [22]. Age-dependent selection favours ‘young’ recently invented variant types but disfavours ‘old’ variant types which have been in the population for some time. Consequently, age-dependent selection equally produces the characteristic frequency behaviour of fashion trends (see §4 and electronic supplementary material, figure S11) and we hypothesize that it could be an alternative explanation for the dynamic of fashion trends (see also [47] for a different hypothesis), or rather that the concept of ‘fashion trends’ may actually encompass dynamics of different types as age-dependent selection does not produce a power law. Summarizing, the observed frequency dynamic of the different types of decorated pottery in the LPK is consistent with age-dependent selection potentially indicating that the early farmers treated decorated pottery as fashion items, with more recent innovations being preferred because of their novelty. This observation might present the earliest instance of such fashion trends yet recorded.

## Authors' contributions

A.K. and S.S. designed the study and wrote the paper, A.K. carried out the analysis.

## Competing interests

We declare we have no competing interests.

## Funding

S.S. thanks the European Research Council for its Advanced Grant for project 249390, EUROEVOL, Cultural Evolution of Neolithic Europe, which has made possible his work on this paper.

## Acknowledgement

We thank four anonymous referees for their constructive comments, which helped improve this paper.

- Received October 15, 2015.
- Accepted November 20, 2015.

- © 2015 The Author(s)