## Abstract

Collective phenomena, whereby agent–agent interactions determine spatial patterns, are ubiquitous in the animal kingdom. On the other hand, movement and space use are also greatly influenced by the interactions between animals and their environment. Despite both types of interaction fundamentally influencing animal behaviour, there has hitherto been no unifying framework for the models proposed in both areas. Here, we construct a general method for inferring population-level spatial patterns from underlying individual movement and interaction processes, a key ingredient in building a statistical mechanics for ecological systems. We show that resource selection functions, as well as several examples of collective motion models, arise as special cases of our framework, thus bringing together resource selection analysis and collective animal behaviour into a single theory. In particular, we focus on combining the various mechanistic models of territorial interactions in the literature with step selection functions, by incorporating interactions into the step selection framework and demonstrating how to derive territorial patterns from the resulting models. We demonstrate the efficacy of our model by application to a population of insectivore birds in the Amazon rainforest.

## 1. Introduction

Recent years have seen an explosion in the number of studies devoted to collective animal movement modelling, largely enabled by the availability of fast computational power and vastly improved tracking data [1,2]. They have succeeded in explaining a wide variety of patterns observed in nature due to the movements and interactions of animals [3,4], such as bird flocking [5], ant raids [6] and fish schooling [7]. Furthermore, in the last few years, the collective behaviour paradigm has been extended to include territorial patterns, which arise from conspecific avoidance mechanisms rather than those of alignment or attraction [8–10].

Despite these myriad advancements, animal-interaction models remain disparate and varied, with no formulation of a unifying framework encompassing the variety of interaction mechanisms, direct or mediated, attractive or repulsive. This makes it difficult to compare models quantitatively and so determine which behavioural aspects are necessary for causing the observed behaviour. Though several techniques have recently been proposed for selecting between models of collective movement and interactions [11], they tend to be system-specific. For example, fish repulsion–alignment–attraction mechanisms have been measured using several different techniques [12–14], as have the geometric nature of their interactions [15] and their decision processes [16,17]. Other examples include alignment and leadership decisions in bird flocks [5,18,19]. There are also a few theoretical studies aimed at more general application (e.g. [20]). However, it is not clear whether they can readily be applied to behaviours beyond grouping phenomena such as swarming, flocking or schooling.

On the other hand, mechanistic models of territorial interactions have typically been analysed by fitting the emergent spatial patterns, rather than the underlying movement processes, to positional data (e.g. [10,21]). While this is a reasonable way of testing hypotheses about the underlying causes of spatial patterns [22], it is not sufficient for concrete quantification of the underlying movement and interaction processes, because many different model processes can give rise to the same emergent spatial patterns. Furthermore, territorial modelling approaches have hitherto followed two separate paradigms. The first involves constructing partial differential equations (PDEs) either from details of the underlying movement and interaction processes or from more phenomenological descriptions, and then using these equations to derive territorial patterns mathematically [8,21–23]. The second approach is based more on statistical physics, analysing the individual movement and interaction processes themselves in discrete space, without taking a mean-field continuum limit [9,24]. A recent review explains the biological lessons that can be learned from these models [25]. These approaches would benefit from unification both with each other and with the rest of the collective behaviour literature.

Parallel to the collective animal literature, many studies have sought to understand and predict space use patterns by examining interactions between animals and their environment. Resource selection analysis, positing that animal space use is a function of the distribution of underlying resources, is perhaps the widest used tool in this regard and has a long history [26]. Recently, this has been integrated with animal movement processes by constructing step selection functions [27–29], where the selection of resources is constrained by the ability of an animal to move. Such functions are built by rigorously deriving parameter values from the data using well-developed statistical techniques [30]. Step selection functions, in turn, have been used to build mechanistic models to derive space use patterns from the underlying movement processes and animal–environment interactions [31], representing the first step in unifying resource selection with mechanistic models.

Some studies in the step selection function literature have factored into their analysis either positions of other individuals [32] or traces left in the environment by animals [33]. However, to model simultaneously more than one interacting group of animals, so that it is possible to build a mechanistic model to predict the resulting space use patterns, would require having different interacting movement kernels for each group. For example, in a territorial system there would be one function for each group maintaining a territory. These would then have to be coupled together so that each function depends on the animals modelled by the other.

In this paper, we present a modelling framework that unifies movement with both animal–environment and inter-animal interactions. The inter-animal interactions may either be direct or mediated by a stigmergent process [34,35] such as pheromone deposition or visual cues. Our framework includes as special cases both step selection functions and the two approaches to mechanistic territorial modelling mentioned above. Though we focus specifically on combining territorial interactions into the concept of step selection, our framework also happens to incorporate a variety of other collective motion models, suggesting far broader application (table 1). As such, our framework provides a useful way to codify movement and interaction processes, giving a generic starting point for modelling these processes and a clear way of testing which combinations of them best describe the underlying data. This will both help future researchers in model construction and provide a concrete means by which to compare and contrast different modelling approaches.

We show how to use our model to test hypotheses about the interaction mechanisms underlying territorial behaviour, by application to movement data on a community of territorial insectivore bird flocks in the Amazon rainforest. Parameter values for a model of movement and territorial interactions naturally arise from this hypothesis testing. This model can then be analysed either using PDE techniques [8] or by simulation analysis [9,24]. This enables the spatial territorial patterns to be derived from the underlying movement and interaction processes, which can be compared with spatial data. We demonstrate how to make this comparison quantitative, thereby giving a technique for determining which processes are the key drivers of space use in the study population. As such, the framework used here provides a vital bridge between the selection of models on the individual level and the validation of their emergent features on the population level. In figure 1, we give a schematic that represents the central place of coupled step selection functions (CSSFs) for the programme of constructing a ‘statistical mechanics for ecological systems’ [38].

## 2. Material and methods

### 2.1. Modelling framework

Our model is based around the notion of a step selection function [27]. However, simultaneous modelling of various interacting animals, or groups of animals, requires having a different step selection function for each animal or group. Therefore, instead of having one function that models all agents, as with previous approaches, we construct a different function for each agent and link them together with a coupling term. We use the term ‘agent’ here to refer to either a single animal, or a group of animals that are modelled as moving together as a single entity, for example a pack or a flock. The result is what we call a system of CSSFs, where each function has the following form:
2.1represented pictorially in figure 2. The function is the probability of agent *i* moving to position **x** at time *t* + *τ*, given that the agent was at position **y** at time *t* and had arrived there on a bearing *θ*_{0}. The term *ϕ _{i}*(

**x**

*|*

**y**,

*θ*

_{0}) represents the movement process of agent

*i*, disregarding the effect of the environment or other agents. For example, this could contain the step length and turning angle distribution for a correlated random walk [39].

The function is a weighting function containing information about the desirability of moving across the environment from position **y** to **x**. For example, if there is a partial barrier to movement between **y** and **x**, then may be lower than if the barrier were not there. On the other hand, if **x** were in a very desirable habitat for the agent compared with **y**, then would be higher than if the habitats were equal in quality. See [27] for a good example of the variety of animal–environment interactions that can be modelled this way.

The collective aspects of motion, i.e. the agent–agent interactions, are represented by . The term represents both the population positions and any traces of their past positions left either in the environment or in the memory of agent *i*. For example, if the agents were schooling fish, then perhaps the pertinent interactions would be direct [40]. However, if the agents were ants then might represent the pheromones left by other ants, to which ant *i* responds by tending to move up the pheromone gradient [6]. As a third example, if the agents were territorial bird flocks, then might include the memory that the birds in flock *i* have of past territorial conflicts or vocalizations. In most realistic cases, including the ones detailed here, this enables us to convert ostensibly non-Markovian processes, such as memory and correlations, into one-step Markov processes, possibly requiring high dimensions to encapsulate appropriately.

As is a probability, it must integrate or sum to 1, depending on whether continuous space or discrete space is being used, respectively. Therefore, we use the ∝ sign in equation (2.1), noting that this becomes an equality if the right-hand side is divided by the integral (continuous space) or sum (discrete space) over the possible target positions **x**.

We demonstrate the generality of our formalism by showing that it reduces to ordinary step selection functions [27], resource selection functions [36] and a variety of previously published examples of collective motion models. The latter include models of trail-following ants [6], collective patterns in animal populations through alignment and attraction [4,37] and territorial canids [8,10,24].

It is possible to generalize equation (2.1) further by writing the right-hand side as an arbitrary function of **x**, **y**, *t*, *θ*_{0}, and . This would enable the construction of dependencies between the three aspects of movement, environmental interactions and collective interactions. For example, this could describe the animal's speed varying over time due to seasonal changes, or the turning angle distribution being affected by habitat type, and so forth. However, the models from both previous collective animal behaviour studies and the step/resource-selection literature tend not to incorporate such dependencies, because they can be written in the form of equation (2.1). Therefore, for simplicity, we treat the functions *ϕ _{i}*,

*W*and

_{i}*C*as independent.

_{i}### 2.2. Application to bird data

As a demonstration of how to apply our model, we use movement data on a community of territorial insectivore bird flocks in the Amazon rainforest. These flocks are multi-species, with around five to 10 mating pairs consistently present sharing a territory [41]. Each pair will defend its territory from conspecifics, using a mixture of vocalizations and direct territorial conflicts [42]. The birds from each flock meet together at a ‘gathering point’ at dawn every day, usually in a central position within their territory, from where they forage within the territory for around 11–12 h, moving together as a flock.

We use flock movement data from 11 different territories to test hypotheses about the territorial interaction mechanisms used by the birds. We focus, for simplicity, on the vocal aspect of interactions. Vocalizations make neighbouring flocks aware of areas they have recently visited, causing the neighbours to alter their movement processes in or near these areas. We test three hypotheses: whether (1) flocks are likely to avoid areas that neighbours have visited in the past, due to the vocalizations made there, (2) flocks tend to move back towards their gathering site having visited such an area, (3) the time since the area was visited by a neighbour affects the response of the flock, so that old vocalizations are ignored. This demonstrates the ability of our modelling framework to select between competing theories about the nature of interaction mechanisms.

We analysed movement of 11 different flocks in the Amazon rainforest over 3 years during the dry season between June and November. The study site is about 70 km north of Manaus, Brazil. They were each tracked for between 4 and 18 days. The flock positions were recorded every minute during the time that they were active. Flock activity is conspicuous, so that birds can be followed on foot. As flocks moved, geolocations were recorded with a hand-held GPS unit (Garmin Vista HCX). The observer maintained a distance of 10–20 m from the flocks to ensure no alarm or avoidance behaviour was induced in the birds.

To examine which territorial interaction processes best fit these data, we constructed a CSSF (equation (2.1)) where the terms *ϕ _{i}*(

**x**

*|*

**y**,

*θ*

_{0}) and were obtained from a previous study on the same population [43]. In that paper, we found that setting

*ϕ*to be a product of the exponentiated Weibull distribution [44] for the step lengths and a von Mises distribution [45] for the turning angles fitted the data well. This led to the following step length and turning angle distribution: 2.2where each agent

_{i}*i*is an individual flock,

*θ*is the bearing from

**y**to

**x**,

*a*= 1.06 ± 0.03,

*b*= 6.90 ± 0.34,

*c*= 1.82 ± 0.11,

*k*= 0.336 ± 0.015 (error bars are 1 s.d.) and

**I**

_{0}(

*k*) is a modified Bessel function of the first kind. The best fit model from [43] for the term is , where

*C*(

**x**) and

*T*(

**x**) are, respectively, the forest canopy height and topography in metres, at position

**x**. The time interval

*τ*is 1 min and the best fit values for the parameters are

*α*= 0.0952 ± 0.037 and

*β*= 1.658 ± 0.345 (error bars are 1 s.d.). These were derived by performing the model fit while neglecting interaction mechanisms (see [43] for details).

For the interaction term , we set if any flock *j* ≠ *i* is at position **x** at time *t*, and otherwise. Here, *T*_{*} represents the amount of time a bird will remember a conspecific bird call from a particular location, and so respond to this memory when in that location. The cinereous antshrike from each flock tends to make a call about every 2–5 min, which can be detected by other birds at a distance of about 50 m (K. Mokross 2010, personal observation). In our model, we implicitly assume, for simplicity, that birds make calls each time they move and that they are always heard by neighbouring flocks. Note that this construction is similar in mathematical form to the territoriality model from [10], used to uncover behavioural mechanisms in a red fox (*Vulpes vulpes*) population. However, in that study, *T*_{*} represented the longevity of scent cues rather than the memory of vocalizations.

To test hypothesis (1), we examined whether using the following coupling function:
2.3gives a better fit to the data than the case of no interactions, . For hypothesis (2), we used the following coupling function:
2.4with *T*_{*} = *∞*, where *V*(*λ*, *ψ*) is a von Mises distribution (a single-mode distribution often used in the ecology literature for biasing angles [45]), *I*[*X*] is an indicator function equalling 1 if *X* is true and 0 otherwise, *θ* is the bearing from **y** to **x** and *θ*_{g} is the bearing from **y** to the gathering point. For hypothesis (3), we used the coupling function from equation (2.4), but with *T*_{*} a finite free parameter, to test whether allowing *T*_{*} to be finite significantly improves the fit.

We fitted the various models to the data using a maximum-likelihood technique, whereby we found the free parameters that maximize the product over *i* and *n* of , where are the positions of flock *i* at times . To find this maximum, we used the Nelder–Mead simplex algorithm as implemented in the Python `maximize()` function from the SciPy library [46]. For hypothesis (1), the free parameters are *T*_{*} and *γ*. For hypothesis (2), the free parameter is *κ*, and for (3) they are *T*_{*} and *κ*. The *p*-values for hypothesis testing were obtained using the likelihood ratio test.

One of the strengths of the CSSF approach is that the result of hypothesis testing and/or model selection naturally gives rise to a mechanistic movement model, given by the particular version of equation (2.1) that corresponds to the best fit model and parameter values. This enables one to determine the space use (i.e. home range) patterns that emerge from the model. We test whether the patterns that emerge from the best model that includes resource selection, topographical selection and territorial interactions are a significantly better fit to the data than the same model without the territorial interactions.

To do this, we constructed a simulation model for the bird flocks, whose movements each step are determined by drawing from the time-dependent probability distribution from equation (2.1) with the best-fit parameter values found by the hypothesis testing technique above. As each flock gathers in one particular place each day and moves around the terrain for a total of about 11 h 30 min during the day, we started the simulated birds at the gathering point and ran the simulation for 690 time steps, each step representing *τ* = 1 min (giving 11 h 30 min in total), taking a note of all the positions at which the flock landed after each step. We repeated this 100 times, representing 100 days, giving 69 000 simulated positions for each flock, from which we calculated home ranges using the kernel density estimation (KDE) method. We also ran identical simulations except where the model has , so that no territorial interactions were included.

To test which model performed better at predicting space use, we compared the Kullback–Leibler (K–L) distance [47] between each model's KDE distribution and the KDE distribution for the data. The K–L distance differs by a constant from 1/2 times the average Akaike information criterion (AIC) of a single sample from the data's KDE distribution (see [47] for details). Therefore, the difference in AIC (ΔAIC) for two different models of the same data distribution can be thought of as twice the difference in K–L distance, by considering a single KDE distribution as a single data sample. We have 11 flocks, so 11 KDE distributions. The ΔAIC is twice the sum of the differences in K–L distance across these flocks. We use this value to assess whether the resulting model is better at predicting *space use*, as opposed to just movement choices, than the model with no territorial interactions. To test whether the models are a good fit to the data, we used a Pearson *χ*^{2} test, treating each 10 × 10 m square as a single data bin. For this, we used the positional data rather than the smoothed data.

## 3. Results

### 3.1. Framing existing models as coupled step selection functions

#### 3.1.1. Step selection and resource selection

Step selection functions are simply single examples of equation (2.1) with the collective term equal to 1 [27,29,32,33]. In other words, we just consider one animal at a time, and how it interacts with its environment, without attempting to use the results to construct a mechanistic model of interacting animals. Resource selection functions are similar, but the environment-independent movement term *ϕ _{i}*(

**x**

*|*

**y**,

*θ*

_{0}) is replaced with an availability function, which can take whichever form the user feels is appropriate for study (e.g. [28,36]).

#### 3.1.2. Individual-based territory models

The selection of studies by Giuggioli *et al.* [9,48] and Potts *et al*. [10,24] modelled territorial interactions using moving agents on a square lattice. The initial model from [9] has agents performing nearest neighbour random walks and depositing scent as they move. The scent remains for a finite time *T*_{*}, the so-called *active scent time*, after which it is no longer considered as ‘active’ by conspecifics. Each animal's movement is restricted by the fact that it cannot move into an area that contains active scent of a neighbour.

This can be framed as a CSSF where *ϕ _{i}*(

**x**

*|*

**y**,

*θ*

_{0}) = 1/4 if

**x**is the lattice site either immediately above, below, to the right or to the left of

**y**, and

*ϕ*(

_{i}**x**

*|*

**y**,

*θ*

_{0}) = 0 otherwise. Additionally, as this model does not include any environmental interactions, we set . The term represents the presence of scent at position

**x**and time

*t*, so that 3.1Then the collective interaction term is 3.2The CSSF formalism (equation (2.1)) gives a natural way of incorporating environmental interactions into such territoriality models, an aspect of this approach hitherto lacking, as noted in [35].

#### 3.1.3. Advection–diffusion territory models

The type of territorial models described in [8] provides several other examples of CSSFs. We describe an individual-level model in a one-dimensional interval [0, 1] that has as its continuum limit the original advection–diffusion model of [23]. To do this, we first set
3.3where *a* is the average step length, and . This means that the intrinsic movement of each agent (pack of wolves) is a random walk with no correlation, and we are ignoring the effects of the environment on movement.

There are two agents in the model, so *i* ∈ {0, 1}. The collective action is mediated by scent deposition so that represents the scent mark density of pack 1 − *i*. Marking by individual *i* occurs at a rate , where *m* is typically a monotonic increasing function, representing the tendency of wolves to mark more heavily when conspecific marks are present. is governed by the following equation:
3.4where *x _{i}* is the position of agent

*i*at time

*t*−

*τ*, and

*μ*is the scent decay rate.

Packs have a tendency to move back towards their home range centre on encountering foreign scent. Assuming that the home range centre of pack 0 is to the left of the study area and pack 1 to the right, the collective interaction term is given by
3.5and
3.6where *D* and *C* are parameters, which can be determined by model fitting, and *I*(*X*) is an indicator function that is equal to 1 if *X* is true and 0 otherwise.

Now we move from an individual description to positional probability density functions. Let *u*(*x*, *t*) (resp. *v*(*x*, *t*)) be the probability distribution of pack 0 (resp. pack 1). For notational convenience, we rename the scent levels of packs 0 and 1 to *p*(*x*, *t*) and *q*(*x*, *t*), respectively. Then standard theory (e.g. [8, ch. 2]) means that the limit as *τ* → 0, *a* → 0 of *u*(*x*, *t*) is governed by the following advection–diffusion equation:
3.7where the advection and diffusion functions (*c _{u}*(

*x*,

*t*) and

*d*(

_{u}*x*,

*t*), respectively) are the following limits: 3.8This theory is built by constructing the master equation for

*u*. Implicit in the construction is the so-called ‘mean-field’ approximation, which assumes that the covariance between the scent mark density and the position of the pack is (approximately) zero. A direct calculation shows that

*c*(

_{u}*x*,

*t*) =

*Cq*(

*x*,

*t*) and

*d*(

_{u}*x*,

*t*) =

*D*. The equation for

*v*(

*x*,

*t*) is analogous, but with

*ϕ*

_{0}, ,

*c*,

_{u}*d*and

_{u}*q*replaced by

*ϕ*

_{1}, ,

*c*,

_{v}*d*and

_{v}*p*, respectively. Therefore,

*c*(

_{v}*x*,

*t*) =−

*Cp*(

*x*,

*t*) and

*d*(

_{v}*x*,

*t*) =

*D*.

The advection–diffusion equations for this system of CSSFs are then 3.9

Furthermore, the continuous-time limits of the scent marking equations (3.4) are as follows [49, ch. 3]: 3.10

Equations (3.9) and (3.10) form the system studied in [23]. This process can be generalized to derive advection–diffusion equations describing territorial pattern formation in two dimensions [8].

#### 3.1.4. Alignment-and-attraction models

Equation (2.1) also reduces to a variety of collective motion models other than territorial ones, including trail-following ants [6] and collective patterns in animal populations through alignment and attraction [4,37]. Here, we address one of these modelling frameworks [4] with the others left to the electronic supplementary material.

To write the model from [4] as a CSSF, we first note that each animal, *i*, has a fixed speed, *s _{i}*. Therefore, we set

*ϕ*(

_{i}**x**

*|*

**y**,

*θ*

_{0}) =

*δ*

_{D}(

*|*

**x**−

**y**

*|*−

*s*), where

_{i}τ*δ*

_{D}is the Dirac delta function. as there are no environmental interactions in the model from [4]. All the other animals in the population can influence animal

*i*'s subsequent movement, so where

**y**

*is the position of animal*

_{j}*j*at time

*t*, having arrived there on a bearing of

*θ*.

_{j}The model incorporates attraction, alignment and repulsion. Repulsion occurs if there are other animals within distance of *r*_{r} from animal *i*, to ensure that animals do not collide. If there is no repulsion, then animal *i* will align with any others that are greater than a distance of *r*_{r}, but less than a distance of *r*_{o}, from *i*. They will also be attracted to animals *j* such that *r*_{o} ≤ *|***y*** _{j}* −

**y**

*≤*

_{i}|*r*

_{a}(see [4] for details).

To aid in writing the interaction term, we let be the repulsion angle, which is the bearing given by the vector 3.11

We also define an alignment and attraction angle, , which is the bearing given by the direction of 3.12

The interaction term from [4], section ‘Behavioural rules: description’, is then
3.13where SG(*ψ*) is a spherical Gaussian.

### 3.2. The example of Amazonian bird flocks

When we apply our technique to data on Amazonian birds, there is no significant improvement in fit (*p* = 0.60) if we model birds as having a tendency not to go into areas from where they have heard conspecific bird calls in the past (hypothesis (1) from the Material and methods section). However, when flocks are modelled as being allowed to move into neighbouring territories, but then having a tendency to retreat in the direction of the gathering point (hypothesis (2)), we observe a significant improvement in fit (*p* = 0.022). If we assume that the territorial cues have a finite lifetime (hypothesis (3)), the maximum-likelihood estimator for *T*_{*} is larger than the length of the time-series data, suggesting that birds are able to remember these cues for a very long time after they have been made.

To demonstrate the space use patterns that arise from these results, we constructed simulations using the gathering point attraction model, used to test hypothesis (2), with the best fit parameters of *T*_{*} = *∞* and *κ* = 0.0597 (figure 3). For nine of the 11 flocks, the resulting KDE distributions are closer to those of the data than the KDE distributions without territorial interactions (table 2). Furthermore, the resulting difference in AIC (ΔAIC) between the two models is ΔAIC = 4.07, giving reasonable evidence to suggest that the model including territorial interactions is better at predicting space use patterns than that without. This is demonstrated pictorially in figure 3*b*, which shows that the model including territorial interactions is more highly peaked at the centre and includes a lower density of outliers.

Of the two flocks that are not well modelled by incorporating territorial interactions, for Cap North we have no data on adjacent flocks (figure 3*a*) so the inability of the model to detect territorial interactions is unsurprising. Cap II, on the other hand, is located in the most degraded area of all flocks in the study. Subsequent observations of the study area suggest that it did not persist over time, as key species either abandoned the area or died. Therefore, the territory could well be in the process of moving or degrading during the study period, mechanisms that are likely to be key drivers in shaping the space use, but which are absent from our current model.

For all of the flocks except Cap II, there was insufficient evidence to suggest that the data did not come from the model distribution that included territorial interactions (*p* < 0.0001 for Cap II, *p* > 0.999 for the others). The same test with the model that excluded territorial interactions suggested that there was only sufficient evidence to reject the hypothesis that the data came from the model for Cap II and Central (*p* < 0.0001 for Cap II and Central, *p* > 0.999 for the others). Therefore, we have significantly improved the absolute fit of the Central data by including territorial interactions. Central is the only flock for which we have data on all surrounding flocks so it is precisely the flock for which one would most expect to see improvement of absolute fit.

## 4. Discussion

We have constructed a general model for the effects on movement of both animal–habitat and between-animal interactions. We have demonstrated how the model encompasses, as special cases, a variety of disparate collective motion models as well as resource and step selection functions. By fitting a version of our model to data on bird flock locations, we have shown how it can be used to determine and quantify the nature of territorial interactions, as well as modelling simultaneously the effects of both conspecifics and the environment on movement processes. As we framed the system as a one-step Markovian model of both the animals and their environment, our framework allows for relatively simple calibration of models, which makes the process computationally fast. This contrasts with methods that fit the movement path as a whole, such as state-space models, which can be difficult to fit [50].

Though we have focused on territorial modelling, so not given an exhaustive demonstration of how our framework might be reducible to all collective behaviour models in the literature, we display a variety of different examples, encompassing both direct and mediated interactions, both conspecific attraction and avoidance processes. These demonstrate the possible wide applicability of our approach and potential to frame many more models as CSSFs. Encompassing competing models of collective behaviour under this unifying framework will make future comparisons easier, aided by the methods given here for fitting CSSFs to data. Furthermore, it will enable transference of techniques and results between the hitherto disparate fields of collective motion, resource selection and mechanistic territorial modelling. To give one example, research into ungulate behaviour often looks at the effects of the environment on movement but ignores herding interactions (e.g. [27]) or looks at herding behaviour but ignores the resource aspect (e.g. [51]). Our framework links these two ideas so will help future researchers build and validate models that account for both.

By applying our model to movement patterns of bird flocks, we were able to test hypotheses about the mechanisms behind the interaction processes. Previous studies of mechanisms underlying territorial patterns in populations of scent-marking animals postulated that they will avoid areas that have recently been claimed by others as their territory [10]. Here, we have shown that the territorial interaction mechanism in bird flocks is quite different. There is no evidence to suggest that they tend to avoid places that have previously been claimed as other flocks' territories. However, after visiting the outskirts of neighbouring territories, they will change their movement processes to include a tendency to retreat back inside their territory. These visitations explain the observed slightly overlapping utilization distributions in the birds' spatial patterns (figure 3).

Our framework can also be used to build predictive, mechanistic models showing how utilization distributions arise from the underlying movement and interaction processes. To demonstrate this, we used stochastic simulations of the best fit system of CSSFs for the bird data. Recently, step selection functions have been used to construct deterministic master equation [43] and PDE models [31], from which the resulting spatial distributions can be analysed using well-studied mathematical tools (e.g. [8]). While the coupling term in our framework makes such analysis significantly more complicated than for ordinary step selection functions, deterministic mathematical formulations would ultimately enable concrete conclusions to be reached without the need for extensive, time-consuming computer simulations. We therefore hope, in future work, to begin a programme of analysing coupled step selection models mathematically.

Though mechanistic models have previously been proposed to explain space use patterns by examining movement, territorial interactions and environmental features [22], those models fit the emergent space use distribution to relocation data, whereas our model is directly fitted to the movement trajectory itself, enabling the space use distribution to arise with no additional fitting. The advantage of this is twofold. First, there is no need to throw away data in order to make sure each data point is an independent sample of the spatial distribution from the others (see [8] for details of, and rationale behind, this procedure). Therefore, we can use the complete movement trajectory, containing much more information.

Second, fitting the model to the underlying movement choices ensures that the parameter values used to construct the model arise from the movement and interaction processes rather than the emergent patterns. This means that we can assess to what extent these processes predict space use, and where they fail. For example, in the data studied here, the space use of two flocks (Cap II and Cap North) were not predicted by the territorial interaction model as well as by the no-interaction model, unlike the other nine flocks (table 2). Therefore, we can postulate hypotheses about what other processes may be required to predict space use in these instances. On the other hand, fitting directly to the space use distribution implicitly assumes that the mechanistic model describes well all aspects of movement that give rise to the spatial patterns. Consequently, this procedure may cause inaccurate inferences to be made about the parameter values of the underlying processes. In other words, our approach is more cautious, therefore less likely to lead to incorrect results and more likely to reveal the extent to which certain processes fail to predict accurately the spatial patterns.

As an alternative to mathematical models of space use, simulations of individual-based models have also been used to attempt to understand animal movement decisions and emergent spatial patterns [52]. Typically, they take a pattern-oriented approach [53,54], beginning by including as many aspects of the animal's movement and interaction processes as are believed to cause the observed patterns. If the empirical patterns, also called *summary statistics*, are observed in the model output then the model is simplified to try to understand exactly which of the processes are causing the patterns to emerge. The aim of this approach is to find models that replicate as many of the summary statistics observed in the data as possible, with as few model parameters.

Our approach, on the other hand, is *process-based* in nature [55], seeking to build an individual-based mechanistic model by testing hypotheses about the underlying processes one at a time. The key difference is that we test the model parameters against the data for validity on the same level of description at which the model is constructed. The pattern-oriented approach tests the model parameters at a different level of description: that of the summary statistics. However, this is not sufficient for making inferences about the parameter values put into the model. Though analysis of a mechanistic model, individual based or analytic, shows that process *A* implies pattern *B*, showing that pattern *B* replicates the data does not imply that the underlying mechanism is actually process *A*. Therefore, it is not possible, purely using a pattern-oriented approach, to make solidly grounded inferences about the nature of the mechanisms that have gone into construction of the model. In our approach, we circumvent this issue by testing and parametrizing the model's mechanisms on the level of description at which they are constructed, then observing the patterns as an emergent feature of the model, which can in turn be compared with the patterns from data.

Recent developments in the collective behaviour literature provide many good examples of process-based modelling and model parametrization [5,12–17,19]. However, very few examine the emergent features of these data-parametrized models and test whether they accurately replicate the population-level patterns seen in the data, as we do here. That said, there are exceptions, for example [19,56,57], and these models could, in principle, be used in conjunction with theoretical mechanistic models of pattern formation, such as [37,58], to provide a full story. If they were to be framed under a single overarching methodological framework, such as the CSSFs proposed here, then this would aid the unification of process-based model construction and theoretical process-to-pattern analysis that has recently been sought [11].

Though our model was significantly better at predicting space use than the model free of territorial interactions, it is clear from figure 3*a* that our model does not capture all aspects of the birds' spatial patterns. However, the strength of our approach is that we can readily add further behavioural features one at a time, testing the efficacy of each using the techniques detailed here. For example, the birds are known to have direct territorial conflicts, which affect where they move in subsequent days and weeks. Also, the movement is driven by intra-flock interactions, with one particular species, the cinereous antshrike (*Thamnomanes caesius*), playing the main role in maintaining cohesiveness. By using our techniques to test the effect of such behavioural phenomena on movement and space use, we can move towards building truly accurate, predictive models linking movement processes, conspecific interactions and collective behaviour to the emergent space use distributions.

## Data accessibility

This article represents publication no. 643 in the BDFFP Technical Series. This is contribution no. 33 in the Amazonian Ornithology Technical Series of the INPA Zoological Collections Program. This manuscript was approved for publication by the Director of the Louisiana Agricultural Experiment Station as manuscript 2014-241-15478.

## Funding statement

This study was partly funded by NSERC Discovery and Accelerator grants (M.A.L., J.R.P.). M.A.L. also gratefully acknowledges a Canada Research Chair and a Killam Research Fellowship. Funding for the research was provided by US National Science Foundation grant no. LTREB 0545491 awarded to Phil Stouffer, which helped fund K.M.'s work, and by the AOU 2010 research award to K.M.

## Acknowledgements

K.M. would like to acknowledge the Biological Dynamics of Forest Fragments Project (BDFFP) staff for providing logistic support; J. Lopes, E. L. Retroz, P. Hendrigo, A. C. Vilela, A. Nunes, B. Souza, M. Campos for field assistance; M. Cohn-Haft for valuable discussions. LIDAR images for canopy height models and digital elevation models were provided by Scott Saleska (University of Arizona) and Michael Lefsky (Colorado State University). We are grateful to Phil Stouffer, Greg Breed and Andrew Bateman for helpful discussions and suggestions, as well as several others who examined previous versions of our manuscript, including four anonymous referees.

- Received April 1, 2014.
- Accepted April 24, 2014.

- © 2014 The Author(s) Published by the Royal Society. All rights reserved.