## Abstract

Searching experiments conducted in different virtual environments over a gender-balanced group of people revealed a gender-independent, scale-free spread of searching activity on large spatio-temporal scales. We have suggested and solved analytically a simple statistical model of the coherent-noise type describing the exploration–exploitation trade-off in humans (‘should I stay’ or ‘should I go’). The model exhibits a variety of saltatory behaviours, ranging from Lévy flights occurring under uncertainty to Brownian walks performed by a treasure hunter confident of the eventual success.

## 1. Introduction

The Lévy foraging hypothesis [1–3] suggests that a saltatory search composed of consecutive displacement lengths *l* drawn from a power-law distribution,

$$P(l) \propto l^{-\mu}, \qquad (1.1)$$

with the scaling exponent *μ* approaching the theoretically optimal value *μ* = 2, would maximize a forager's chance to locate sparsely and randomly distributed prey [4–6] and therefore presents an evolutionarily beneficial alternative to spatially intensive search. Foraging movement patterns that closely fit walks of the form (1.1), known as Lévy flights [7], were identified in many living species ranging from micro-organisms to humans [4–6,8–20], although the reported scaling exponents varied substantially for different animals and in different environmental contexts. The decay of flight lengths in the population data is typically described by a power law with an exponential cut-off, revealing that an exponential decay dominates for extremely long travels [13,21]. It was further suggested [22,23] that while the distribution of displacements for the population aggregate appears to show a fat tail, the individual bout distributions do not. Furthermore, when movement lengths within tracks fit a Brownian walk (being exponentially distributed), but differ in the parameter of this exponential distribution, the power law with exponential cut-off in the population data would result from a superposition of these exponential distributions [23,24]. The detailed, statistically robust analysis of foraging trajectories of albatrosses [25] provided strong support for the individual character of search patterns: some trajectories were well approximated by truncated Lévy flights (1.1), others fitted Brownian movement patterns, whereas a significant portion of trajectories were not fitted by either distribution, because they had both Lévy and Brownian features.
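The superposition argument can be made concrete with a short worked calculation. As a purely illustrative assumption (not taken from [23,24]), suppose every individual performs exponentially distributed steps with its own rate *λ*, and that the rates are spread uniformly over an interval [*λ*_{min}, *λ*_{max}] across the population. The symbols *λ*, *λ*_{min}, *λ*_{max} and *P*_{pop} are introduced here only for this sketch.

```latex
P_{\mathrm{pop}}(l)
  = \frac{1}{\lambda_{\max}-\lambda_{\min}}
    \int_{\lambda_{\min}}^{\lambda_{\max}} \lambda\, e^{-\lambda l}\, \mathrm{d}\lambda
  = \frac{(1+\lambda_{\min} l)\, e^{-\lambda_{\min} l}
          - (1+\lambda_{\max} l)\, e^{-\lambda_{\max} l}}
         {(\lambda_{\max}-\lambda_{\min})\, l^{2}} .
```

For 1/*λ*_{max} ≪ *l* ≪ 1/*λ*_{min} the numerator is close to 1, so the aggregate decays as *l*^{−2} even though no individual track has a fat tail. For *l* ≫ 1/*λ*_{min} the factor exp(−*λ*_{min}*l*) takes over and produces an exponential cut-off. A different spread of rates would, in general, give a different apparent exponent.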

Further progress in understanding search behaviour calls for a model of a feasible biological mechanism that allows animals and humans (i) to reproduce a variety of statistically different movement patterns without demanding and tedious computations, and (ii) to switch spontaneously between them in different environments. In this paper, we propose such a model and solve it analytically (§4).

The virtual environment (VE) provides a simplified way to see and experience the real world, supporting the sense of spatial presence via virtual locomotion, rendering a clear sense of navigation and allowing for interactions with objects through a user interface [26]. Not surprisingly, VE have gained widespread use in recent years as a tool for studying human behaviour, offering the capacity to create unique experimental scenarios under tightly controlled stimulus conditions. A major problem for users of VE is maintaining knowledge of their location and orientation while they move through space, especially when the whole path cannot be viewed at once but is occluded by objects in the environment [27]. Most human spatial abilities (such as navigating a large-scale space and identifying a place) have evolved in natural environments over a very long time, using properties present in nature as cues for spatial orientation and wayfinding [28]. However, many of the natural body-based self-motion cues are absent in VE, causing systematic spatial orientation problems in subjects, and therefore calling for the instant invention of new adaptive strategies to move through VE under reduced multisensory conditions. The analysis of adaptive strategies instantaneously evolving in VE accentuates the key biological mechanisms of searching behaviour more vividly than the analysis of empirical data recorded *in situ*. Furthermore, in VE, we can study the mobility patterns of humans with extremely high resolution, down to the scales of millimetres and milliseconds, obtaining by far the most accurate data when compared with either the scales of a few metres and tenths of seconds or kilometres and hours (in GPS tracking data; [19]), or the scale of a few thousand kilometres and weeks (in banknote travel patterns; [11]) discussed in previous studies.

In order to understand the adaptive movement strategy and to clarify the role of environmental structure in searching and browsing, we conducted a treasure-hunting experiment (§2.1) with a gender-balanced group of participants (§2.3) in two different office VE (§2.2). People who participated in our study had to decide how to proceed amid uncertainty by solving the *exploration–exploitation dilemma* (‘should I stay’ or ‘should I go’) [29]. On the one hand, there was an option to continue searching in the nearest neighbourhood (*exploitation*), in the hope of being rewarded beyond the next door. Alternatively, subjects could *explore* the parts of the environment they had never been to. The actual trajectory of search resulted from a permanent balance between exploitation and exploration confronted at all levels of behaviour across all timescales. We show that balancing exploitation and exploration can be responsible for the statistically variable searching behaviour, ranging from Lévy flights to Brownian walks. It is worth mentioning that the setting of our experiments was inconsistent with the assumptions of Gittins, who presented an optimal strategy for trading off exploration and exploitation [30]. The probability of delivering a reward was not fixed in our case; subjects did not discount the value of each reward exponentially as a function of when it was acquired; and, finally, the time of our experiment was essentially limited (in contrast to the infinite time horizon in Gittins' approach [30]). Subjects in our study acted amid uncertainty, so that any prior calculation of an optimal strategy was impossible for them.

Gender is often reported as a decisive factor in spatial cognition research [31,32]. A review of gender differences in spatial ability in real-world situations can be found in [33]. In this paper, we do not discuss the gender mobility differences observed in our experiments, leaving a detailed report on them for a forthcoming publication. Based on the results of the statistical data analysis, we discuss the role of scanning and reorientations in compensating for information deficiency while moving through VE (§3.1), and the experimentally observed superdiffusive spread of searching activity on large spatio-temporal scales (§3.2).

In §4, we formulate a mathematical model of decision-making when no precise information on the possibility of rewards is available. The model can be solved analytically for some important cases (see §4.3) and helps to generate biologically relevant hypotheses about the fundamental process of decision-making. We conclude in §5.

## 2. Methods

### 2.1. Experimental design and procedure

In our treasure-searching experiments, every participant was asked to browse an office VE searching for collectable objects. For each time frame, the position and heading orientation of the participant were tracked and subsequently analysed by considering the collections of displacements and turns as a series of random events whose spatial and temporal distributions are assumed to possess certain statistical regularities.

To motivate subjects to search thoroughly, each found object was rewarded with an extra 50 cent coin, in addition to the basic remuneration for participation in the study. Treasure hunters had neither visited a real prototype of the VE model nor seen its floor plan before participating in the experiment. The objects of search were large, contrast-coloured, clearly visible toys: teddy bears and locomotives. At the beginning of each trial, a number of toys (10 toys for the smaller environment and 15 toys for the bigger one; see §2.2 for details) were allocated to randomly chosen offices, behind closed doors, one toy per room. Objects could be found immediately, as soon as the subject opened the door and entered the room. In order to focus subjects on the task, there was no communication between experimenter and subject during the experiment.

Before entering the main exploration areas, every participant was trained in a virtual tutorial room, in order to get used to stereoscopic imaging of the computer-simulated environment (two slightly different images accounting for the interpupillary distance, paired with stereo glasses providing a three-dimensional display of the environment), to obtain a good command of a Nintendo Wii remote controller, and to judge their perceived motions via button presses. Although the search time was not limited, we restricted the total number of doors subjects could open during the experiment (to 10 doors in the smaller environment and to 15 doors in the bigger one), in order to prevent a sequential search of every office and to stimulate exploration activity in subjects. The experiment ended when the participant had opened 10 (15) doors.

Two AVI video fragments showing the records of the actual searching experiment from a first-person perspective can be found online.^{1}

### 2.2. Virtual environments

The virtual models of two office environments existing at the University of Bielefeld were rendered with the Autodesk 3ds Max Design v. 2010 software and then projected for any user viewpoint onto a wall-wide laboratory screen (4 × 2 m) with the use of Barco's Galaxy NH-12 active stereoscopic three-dimensional projector. The sense of spatial presence in subjects was supported by natural colour reproduction, extended grey levels and high brightness of the projector. The control of user viewpoint motion through the VE was implemented via the Bluetooth-connected Wiimote, the primary controller for Nintendo's Wii console, which features a motion-sensing capability that allows the user to manipulate items on screen via gesture recognition and pointing through the use of accelerometer and optical sensor technology.

VE model A (figure 1*a*) exactly reproduced the second floor of a temporary building (2012) belonging to the Cognitive Interaction Technology-Centre of Excellence (CITEC, Bielefeld University), and the VE model B (figure 1*b*) imitated the ground floor of the future Interactive Intelligent Systems Institute (Bielefeld University) presently under construction. Both environments consist of the standard adjacent offices, meeting rooms, hallways and corridors providing space where people can move, meet and discuss. Emergency exits and elevators that exist in the actual buildings were not taken into account in our experiments. The VE model A consists of 48 interconnected individual spaces of movement (represented by nodes in the spatial graph shown in figure 1*a*), and the VE model B includes 68 interconnected individual spaces of movement (the nodes of the spatial graph shown in figure 1*b*).

The spatial structure is important because of its effect on proximity: greater connectedness of a built environment generally means more direct routes and thus shorter distances between possible destinations. Connectedness also affects walking by expanding the choice of routes, thereby enabling some variety in routes within the environment. Discovering important spaces of movement and quantifying differences between them in a spatial graph of the environment is not easy, because any two spaces can be related by means of many paths. In Blanchard & Volchenkov [34], we suggested using the properties of random walks in order to analyse the structure of spatial graphs and to spot structural isolation in urban environments. Each node of the graph can be characterized with respect to the entire graph structure by the *first-passage time*, the expected number of steps required to reach the node for the first time (without revisiting any intermediate node) starting from any node of the graph chosen randomly, in accordance with the stationary distribution of random walks. In built environments, the first-passage time to a place can be understood as an average number of elementary wayfinding instructions (such as ‘turn left/right’, ‘pass by the door’, ‘walk on the corner’, etc.) required to navigate a wanderer to the place from elsewhere within the environment. The values of first-passage times to the nodes of the spatial graphs A and B are shown in figure 1 (in colour online): the central places characterized by the minimal first-passage times are red, and the secluded places with the maximal first-passage times are shown in violet. The spatial structure of VE model A, analysed by means of the spatial graph in figure 1*a*, is essentially simpler than the structure of model B. In contrast to model A, where offices are located along three sequentially connected corridors providing a single path between most possible destinations, the spatial structure of model B allows for many cyclic trips owing to a network of interconnected places and has a number of vantage points from which a walker could observe substantial parts of the environment from different perspectives (see §3.1). The single central spatial node (the central corridor) in VE model A is located nine steps away from any randomly chosen node in the spatial graph (figure 1*a*). VE model B contains a network of well-connected spaces of movement characterized by first-passage times ranging from 10 to 14 steps.
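To make the first-passage time concrete, the following is a minimal sketch for a toy graph. It assumes the standard hitting-time definition for a simple nearest-neighbour random walk, averaged over start nodes with the stationary weights (node degree divided by the sum of all degrees); the procedure used in [34] may differ in detail. The function name `first_passage_times` and the three-node corridor are hypothetical and serve only as an illustration.

```python
from fractions import Fraction


def first_passage_times(adj):
    """Stationary-averaged first-passage time to every node.

    adj maps each node to the list of its neighbours in a connected,
    undirected graph. For a target j, the hitting times h(i) of a simple
    random walk satisfy h(j) = 0 and h(i) = 1 + mean of h over the
    neighbours of i. They are then averaged over start nodes i with the
    stationary weights deg(i) / (sum of all degrees).
    """
    nodes = list(adj)
    total_degree = sum(len(adj[v]) for v in nodes)
    result = {}
    for target in nodes:
        others = [v for v in nodes if v != target]
        index = {v: k for k, v in enumerate(others)}
        m = len(others)
        # Augmented matrix of (I - P restricted to non-target nodes) h = 1.
        rows = []
        for v in others:
            row = [Fraction(0)] * (m + 1)
            row[index[v]] += 1
            for w in adj[v]:
                if w != target:
                    row[index[w]] -= Fraction(1, len(adj[v]))
            row[m] = Fraction(1)
            rows.append(row)
        # Gauss-Jordan elimination with exact rational arithmetic.
        for col in range(m):
            pivot = next(r for r in range(col, m) if rows[r][col] != 0)
            rows[col], rows[pivot] = rows[pivot], rows[col]
            rows[col] = [x / rows[col][col] for x in rows[col]]
            for r in range(m):
                if r != col and rows[r][col] != 0:
                    factor = rows[r][col]
                    rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
        result[target] = sum(
            Fraction(len(adj[v]), total_degree) * rows[index[v]][m] for v in others
        )
    return result


# Toy corridor of three spaces in a row: 0 - 1 - 2 (hypothetical layout).
corridor = {0: [1], 1: [0, 2], 2: [1]}
fpt = first_passage_times(corridor)
print({node: float(t) for node, t in fpt.items()})
```

In this toy corridor the middle space receives the value 0.5 and the two end spaces 2.5, which mirrors the distinction between central and secluded places described above.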

### 2.3. Participants

Two gender-balanced groups of volunteers (82 participants in total) took part in the controlled searching experiments conducted in the office VE shown in figure 1*a*,*b*. Participants (mostly university students, with a mean age of 24.2 years and a s.d. of 3.7 years) were recruited personally and through advertisements at the University of Bielefeld. None of them was familiar with the actual buildings on which the virtual models A and B were based. Prior to testing, all adult participants and parents of children younger than 16 years old gave their informed written consent for participation in the study. Participation in the study was voluntary, and participants could revoke their consent and quit at any time and for any reason. The standard provisions for data protection were adhered to: all test results were kept confidential. All individual data were managed and processed anonymously, which eliminated any possibility of identification of participants.

## 3. Results and discussions

### 3.1. Scanning turns, reorientations and explorative rotations in virtual environments

Self-motion through the VE suffers from a lack of many natural body-based cues. Natural methods of visual exploration are also restricted in VE to that experienced through the display representing only a limited field of view, suffering from the degradation of sensory cues owing to device latencies and blocking out all surrounding visual input. It was concluded from various experiments that the optic flow without proprioception, at least for the limited field of view of the virtual display system, appears to be not effective for the updating of heading direction [35], and even when physical motion cues from free walking are included this is not necessarily sufficient to enable good spatial orientation in VE [36].

Perhaps owing to systematic spatial orientation problems occurring in VE compared with real-world situations, most of the subjects who participated in our study permanently performed fast scanning turns (of 200–300 ms) by quickly alternating presses on the left and right buttons of the remote controller, each time causing them to turn through a greater or lesser angle. Such a movement routine probably played an important role in proper self-motion perception by compensating for the information deficiency experienced by subjects while moving through VE under reduced multisensory conditions. Longer turns (taking about half a second) were observed when subjects redirected their walk or avoided obstacles. Finally, very long explorative rotations, often including several complete revolutions (each lasting up to a few seconds), occurred after far relocations, at vantage points, along the borders of two or several vista spaces, and at intersections of corridors that afford a broader view of the environment.

The typical distribution of the durations of scanning turns, reorientations and explorative rotations while exploring a VE is presented in figure 2. The areas of adjacent rectangles in the histogram are equal to the relative frequency of observations in the duration interval. The total area of the histograms is normalized to the number of data occurrences. The data show that the vast majority of all reorientations performed by subjects were quick scanning turns, which are an essential part of the adaptive walking routine in VE. The vertical dashed line in figure 2 marks the rotation duration of 1.5 s; the locations of the corresponding points, at which participants performed longer turns (without translations), are displayed on the floor plans by circles (figure 3*a*,*b*). The diameter of a circle is proportional to the number of long turns recorded at its central point over all subjects.

Several studies conducted on small animals [14,15,37,38] suggested that the switch between scanning and reorientation behaviour in movement patterns of animals that search emerges from complex mechanic-sensorial responses of animals to the local environment and could infer the effects of limited perception and/or a patchy environmental structure. When exploring patchy resources, animals could adjust turning angle distributions, selecting a preferred turning angle value that would allow organisms to stay within the patch for a proper amount of time, maximizing the energetic gain. For example, the zigzag motion of *Daphnia* appears to be an optimal strategy for patch exploitation [38]. Therefore, the distinction between quick scanning turns and a longer reorientation behaviour is crucial to understand the statistical patterns of search [1].

Being a part of the travelling routine in VE, quick scanning turns performed by subjects during the walk induce correlations between the total rotation durations and the net displacements, on relatively small spatio-temporal scales sensitive to the local structural features of the environment. After reaching the natural limits of the available space of unobstructed motion or entering a new movement zone, the subject performs a longer reorientation, perhaps in order to explore the new environment visually, which breaks the correlations. We analysed the data on the duration of reorientations of subjects travelling through both VE models with the use of the root mean square fluctuations (RMSFs), which are suitable for detecting correlations [4,39].

The RMSF of displacements is calculated by in which the net displacement of the walker by the *n*th reorientation is **r**_{k} is the recorded position of the *k*th reorientation, and the angular brackets denote averaging over all data points *n*_{0} = 1, *…*, *n*_{max}. Similarly, the RMSF of rotation durations is calculated by in which the total rotation duration of the subject by the *n*th reorientation is *τ*_{k} is the duration of the *k*th reorientation, and the angular brackets again denote the average over all data points. In figure 4, we juxtaposed the RMSF of the net displacement and the RMSF of the total rotation durations, in the log–log scale, for all recorded reorientations of subjects, separately for VE models A and B. For both VE models, the graphs show the super linear slope, indicating the presence of a strong positive relation that reinforces the total duration of quick scanning turns with the increase in the net displacement of subject. In the long run, the correlations generated by the fast-scale scanning behaviour vanish, which is typical for a correlated random walk process [14]. Interestingly, the ranges of correlations of the RMSF of turning durations and the RMSF of net displacements in VE models A and B were different: for VE model A, the correlations exist in the range of net displacements from 0.1 to 6 m, and from 0.3 to 10 m for VE model B. Perhaps the difference in the correlation ranges of the net displacements is due to the different size of the average space available for unobstructed movements, in models A and B. It is obvious that such an intensive space scan is performed by subjects only within their immediate neighbourhoods and principally cannot be extended either to the entire VE, or even to any of its significant parts, as the superlinear increase of required time makes the scanning process on large spatio-temporal scales biologically unfeasible. 
Thus, after completing a phase of intensive search within the patch of a size depending on the structural properties of the environment, a treasure hunter moves to some other area where the phase of intensive search is resumed. It is important to mention that changes in reorientation behaviour, on large spatio-temporal scales, can generate different anomalous diffusion regimes, which, in turn, can affect the search efficiency of random exploration processes [14].
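A minimal sketch of how an RMSF curve can be computed from a recorded series is given below. It assumes the commonly used definition, in which the fluctuation over a window of *n* events is the standard deviation of the change of the running total across all starting points *n*_{0}. The function name `rmsf`, the restriction to a scalar series and the sample durations are illustrative assumptions, not the analysis code used in this study.

```python
import math
from itertools import accumulate


def rmsf(increments, n):
    """Root mean square fluctuation of the running total over windows of n events."""
    # y[m] is the running total after m events, with y[0] = 0.
    y = [0.0] + list(accumulate(increments))
    if not 1 <= n < len(y):
        raise ValueError("window n must be between 1 and the number of events")
    # Change of the running total over every window of n consecutive events.
    deltas = [y[n0 + n] - y[n0] for n0 in range(len(y) - n)]
    mean = sum(deltas) / len(deltas)
    mean_sq = sum(d * d for d in deltas) / len(deltas)
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(mean_sq - mean * mean, 0.0))


# Hypothetical rotation durations (seconds) for eight successive reorientations.
durations = [0.25, 0.30, 0.20, 1.60, 0.25, 0.30, 0.50, 0.20]
for n in (1, 2, 4):
    print(n, rmsf(durations, n))
```

Plotting the values obtained for the rotation durations against those obtained for the displacements on log–log axes would give a comparison of the kind shown in figure 4. A series of identical increments gives zero fluctuation at every window size, which is a convenient sanity check.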

### 3.2. Heavy-tailed distributions of human travels in virtual environment

In order to identify phases of specific activity in recorded movement patterns and to reveal the underlying cognitive mechanisms from their statistical properties, we have studied the probability distributions of time intervals and distances between consecutive observable searching events (door openings), as they can determine strong changes in the diffusive properties of movement and in relevant spatial properties of the trajectories. The distributions representing the data on dispersal of the treasure hunters during the experiments are shown in figure 5 for both VE models. Gender has been reported as a factor influencing navigation in VE: males were reported to acquire route knowledge from landmarks faster than females [40] and to spend less time in locating targets [41]. Interestingly, the form of the empirically observed distributions (figure 5) was neither gender-specific nor sensitive to the different global structure of the environments.

The data show that most of the consecutive door openings happen in the immediate neighbourhood of the actual position of a participant. The dispersal statistics on the small spatio-temporal scales shown in figure 5 can be well approximated by uncorrelated Gaussian random walks (see the dotted bell-shaped curve) insensitive to the local structure of the environment. Both distributions in figure 5 are remarkable for the long right tails dominated by *quadratic hyperbolas*, which indicate the super-diffusive spread of treasure hunters on large spatio-temporal scales. A power-law tail in the probability distributions of both travelling times and travelled distances could arise from processes in which neither time nor distance has a specific characteristic scale, so that rare but extremely long and far ‘explorative’ travels can occur, alternating with sequences of many relatively short, ‘exploitative’ travels featuring local searches. The power-law tails in the dispersal data provide evidence in favour of a strong coupling between the large-scale movements along paths that cannot be viewed at once [16] and the environmental structure. As in many empirical phenomena, the tails of the distributions shown in figure 5 follow a power law only for values greater than some minimum scales *s*_{min} in space (about 10 m) and in time (about 20 s), which is consistent with the conclusion of the previous section about the dynamic formation of immediate neighbourhoods in which subjects perform intensive scanning. Our observations are in agreement with the hypothesis that both travelling times and travelled distances observed over the group of people participating in the study were drawn from a scale-invariant distribution of the form *f*(*s*) ∝ (*s*_{min}/*s*)^{2} and thus do not change their form if scales are multiplied by a common factor. However, it is important to note that the statistics of dispersion estimated over a sample of individual searching behaviour could differ from the power law (1.1) so strongly as to follow an exponential distribution, as many of these samples were not statistically representative. Therefore, although the power-law tails of the distributions shown in figure 5 are consistent with the Lévy foraging hypothesis [1–3], they rather represent a group aggregate feature.
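The scale-invariant form *f*(*s*) ∝ (*s*_{min}/*s*)^{2} can be explored numerically with a short sketch. It draws synthetic values from the normalized density *s*_{min}/*s*^{2} on *s* ≥ *s*_{min} by inverse-transform sampling and recovers the exponent with the standard maximum-likelihood estimator for a continuous power-law tail. The function names, the sample size and the fixed seed are assumptions made for illustration; the data are synthetic and are not the experimental records.

```python
import math
import random


def sample_power_law(n, s_min, rng):
    """Draw n values with density s_min / s**2 on s >= s_min."""
    # rng.random() lies in [0, 1), so 1 - u is never zero.
    return [s_min / (1.0 - rng.random()) for _ in range(n)]


def tail_exponent(samples, s_min):
    """Maximum-likelihood exponent of a continuous power-law tail above s_min."""
    tail = [s for s in samples if s >= s_min]
    return 1.0 + len(tail) / sum(math.log(s / s_min) for s in tail)


rng = random.Random(0)  # fixed seed so that the run is repeatable
distances = sample_power_law(20000, 10.0, rng)  # s_min = 10 m, as quoted above
alpha = tail_exponent(distances, 10.0)
print(f"estimated exponent: {alpha:.3f}")

# Scale invariance: multiplying every distance (and s_min) by a common
# factor leaves the estimated exponent unchanged up to rounding.
rescaled = [3.0 * s for s in distances]
print(f"after rescaling:    {tail_exponent(rescaled, 30.0):.3f}")
```

The estimate should lie close to 2 for a sample of this size. Applying the same estimator to a small individual record would give a much noisier value, which illustrates why the power-law tail is better regarded as a group aggregate feature.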

Geometrically, discrete-step random walks with movement displacements drawn from the Lévy distribution consist of walking clusters with very large displacements between them, repeated over a wide range of scales [8,42]. Although Lévy flights are ubiquitous in representing intermittent search, cruise and foraging strategies in living species, to our knowledge, this is the first report on the observation of Lévy flight search patterns in humans exploring the VE.

## 4. Model of exploration–exploitation trade-off for searching behaviour

### 4.1. Arguments in favour of a self-organized critical model

The observed statistical properties of human search behaviour in the VE call for a model that could exhibit the scale-invariant characteristic of a critical point of a phase transition (correspondent to the super-diffusive scale-free spread described by the Lévy distribution) spontaneously, without the need to tune any control parameter to precise values. In statistical physics, such a property of dynamical systems is known as self-organized criticality [43]. Furthermore, a plausible model has to be of a discrete nature, as the biological principle of intermittent locomotion assumes that animal behaviour unavoidably produces observable punctuations, ‘producing pauses and speeding patterns on the move’ [14]. These motivations reflect a fundamental ‘trade-off’ confronted by a treasure hunter choosing between an *exploitation* of the scanning movement routine within the familiar environment of nearest neighbourhood, possibly at no reward, and a fast relocation aiming at *exploration* of unknown but potentially more rewarding areas.

From a theoretical perspective, it is known that in a stationary setting there exists an optimal strategy for exploration [30] maximizing the reward over an infinite horizon when the value of each reward is discounted exponentially as a function of when it is acquired. However, to date, there is no generally optimal solution to the exploration versus exploitation problem [29,44], as humans and other animals are prone to *dynamically update* their estimates of rewards in response to diverse, mutable and perhaps discrepant factors, including the elapsed time of search, annoying failures to predict the location of a searched object, and instantaneous mood swings that could change in a matter of seconds. Therefore, it seems that a stochastic model managing a balance between exploration and exploitation may be more biologically realistic [29].

There is growing evidence that the neuromodulatory system involved in assessing reward and uncertainty in humans is central to the exploration–exploitation trade-off decision [44]. The problem can be cast in terms of a distinction between *expected uncertainty*, coming from the known unreliability of predictive cues and coded in the brain by a neuromodulatory system with acetylcholine signals, and *unexpected uncertainty*, triggered by strongly unexpected observations promoting exploration and coded in the brain with noradrenaline (norepinephrine) signals [45]. It was suggested in Yu & Dayan [45] that an individual decides on whether to stay or to go according to the current levels of acetylcholine and noradrenaline, encoding the different types of uncertainty.

Summarizing the above-mentioned arguments, we are interested in a self-organized critical model driven by a discrete time random process of competing between two factors featuring the different types of uncertainty. Various coherent-noise models possessing the plausible features were discussed [46,47] in connection to the standard sand pile model [43] developed in self-organized criticality, where the statistics of avalanche sizes and durations also take a power-law form.

### 4.2. Mathematical model of decision-making in a random search

The movement ecology framework [48] explicitly recognizes animal movement as a result of a continuous ‘dialogue’ between environmental cues (external factors) and animal internal states [14]. In our model, we rationalize the dialogue nature of a decision-making process to take on searching in highly unpredictable situations when no precise information on a possibility of rewards is available. Despite its inherent simplicity, the mathematical model formulated below can help generate a hypothesis about fundamental biological processes and bring a possibility to look for a variety of biological mechanisms under a common perspective.

We assume that an individual decides on whether to ‘exploit’ an immediate neighbourhood by searching beyond the next door or to ‘explore’ other parts of the environment yet to be visited by comparing the guessed chance of getting a reward beyond the next door, *q*, with the guessed chance of finding a treasure elsewhere, *p*. It is not necessary that *p* + *q* = 1. We suppose that at each time step the subject updates one or both estimates and decides to proceed to a part of the environment yet to be explored if *q* < *p*. Otherwise, if *q* ≥ *p*, she picks a next door randomly among those not yet opened and searches for a treasure in the room behind. We consider *p* and *q* to be random variables distributed over the interval [0, 1] with respect to the probability distribution functions (pdf) Pr{*p* < *u*} = *G*(*u*) and Pr{*q* < *u*} = *F*(*u*), respectively. In general, *F* and *G* are two arbitrary left-continuous increasing functions satisfying the normalization conditions *F*(0) = *G*(0) = 0, *F*(1) = *G*(1) = 1.

We model the intermittent search patterns by a discrete time random process in the following way. At time *t* = 0, the variable *q* is chosen with respect to pdf *F*, and *p* is chosen with respect to pdf *G*. If *q* < *p*, the subject relocates by pressing a button on the controller and goes to time *t* = 1. Given a fixed real number *η* ∈ [0, 1], at time *t* ≥ 1, the following events happen:

(i) with probability *η*, the chance to find a treasure in the immediate neighbourhood (*q*) is estimated with pdf *F*, and the chance to obtain a reward elsewhere (*p*) is chosen with pdf *G*. Otherwise,

(ii) with probability 1−*η*, the chance to find a treasure in the immediate neighbourhood (*q*) is estimated with pdf *F*, but the chance to find it elsewhere (*p*) keeps the value it had at time *t*−1.

Therefore, the parameter *η* quantifies the degree of coherence between the two stochastic subprocesses characterized by pdf *F* and *G*, respectively: the processes are coherent if *η* = 1, and incoherent if *η* = 0.

If *q* ≥ *p*, the local search phase continues; however, if *q* < *p*, the subject presses the controller button and moves further, going to time *t* + 1. Eventually, at some time step *t*, when the estimated chance *q* exceeds the value *p*, the subject stops and resumes searching within the immediate neighbourhood. The integer value *t* = *T* acquired in this random process limits the time interval (and travelled distance) between sequential phases of searching activity.
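The update rules of this section can be turned into a short Monte Carlo sketch. It follows the two rules literally as worded above and assumes, purely for illustration, that both *F* and *G* are uniform on [0, 1]. The function names, the cap `t_max` on the number of steps (added only to bound the run time), the number of runs and the seed are hypothetical choices and not part of the model.

```python
import random
from collections import Counter


def relocation_duration(eta, rng, draw_q=None, draw_p=None, t_max=10**5):
    """Return T, the first time step at which q >= p, for one realisation.

    If the cap t_max is reached first, t_max is returned instead.
    """
    draw_q = draw_q or rng.random  # pdf F, uniform on [0, 1) by assumption
    draw_p = draw_p or rng.random  # pdf G, uniform on [0, 1) by assumption
    q, p = draw_q(), draw_p()      # t = 0: both estimates are drawn
    t = 0
    while q < p and t < t_max:     # q < p: the subject relocates
        t += 1
        if rng.random() < eta:
            p = draw_p()           # first rule: p is drawn anew
        # otherwise (second rule) p keeps the value it had at time t - 1
        q = draw_q()               # q is drawn anew under both rules
    return t


def duration_histogram(eta, runs, seed=0):
    """Count how often each duration T occurs in `runs` realisations."""
    rng = random.Random(seed)
    return Counter(relocation_duration(eta, rng) for _ in range(runs))


for eta in (0.0, 0.5, 1.0):
    hist = duration_histogram(eta, runs=50000)
    print(eta, [round(hist[t] / 50000, 3) for t in range(5)])
```

With uniform *F* and *G*, the fraction of runs with *T* = 0 should be close to Pr{*q* ≥ *p*} = 1/2 for every *η*, which gives a quick check of the implementation. Other choices of *F* and *G* can be tried by passing different sampling functions as `draw_q` and `draw_p`.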

### 4.3. Analytical solutions for the decision-making process

While studying the model introduced in §4.2, we are interested in the distribution of durations of the relocation phases, *P*_{η}(*T*; *F*, *G*), provided the probability distributions *F* and *G* are given and the coherence parameter *η* is fixed. For many distributions *F* and *G*, the model can be solved analytically. We shall denote *P*_{η}(*T*; *F*, *G*) simply by *P*(*T*). A straightforward computation shows directly from the definitions that the relocation phase has zero duration when *q* ≥ *p* at *t* = 0, so that

$$P(0) = \int_0^1 \mathrm{d}G(u)\,\bigl(1 - F(u)\bigr).$$

For *T* ≥ 1, the individual can either depart elsewhere (‘*D*’) or stay in the neighbourhood (‘*S*’). Both events can take place either in the ‘correlated’ way (with probability *η*; see (i)), denoted *D*_{c} and *S*_{c}, or in the ‘uncorrelated’ way (with probability 1 − *η*; see (ii)), denoted *D*_{u} and *S*_{u}. The probabilities *P*(1), *P*(2), … are obtained by summing over the admissible sequences of these events and are expressed through the quantities

$$A(n) = \int_0^1 \mathrm{d}G(u)\,F(u)^{n}, \qquad B(n) = A(n) - A(n+1), \qquad n = 0, 1, 2, \ldots$$

It is useful to introduce the generating function of *P*(*T*),

$$\hat{P}(s) = \sum_{T=0}^{\infty} s^{T} P(T).$$

Defining the auxiliary functions *x*(*l*), *y*(*l*) and *z*(*l*), we find the closed-form expression (4.1) for the generating function, written in terms of the generating functions of *x*(*l*), *y*(*l*) and *z*(*l*), respectively. In the marginal cases *η* = 0 and *η* = 1, the probability *P*(*T*) can be readily calculated. For *η* = 0, equation (4.1) reduces to (4.2), from which one gets (4.3). Therefore, in this case, for any choice of the pdf *F* and *G*, the probability *P*(*T*) decays exponentially. For *η* = 1, (4.1) yields the result (4.4). For the special case of uniform densities d*F*(*u*) = d*G*(*u*) = d*u*, for all *u* ∈ [0, 1] and for any *η* ∈ [0, 1], simpler and explicit expressions can be given for the generating function and for *P*(*T*); namely, from equation (4.1), we get (4.5).

The asymptotic behaviour of *P*(*T*) as *T* → ∞ is determined by the singularity of the generating function that is closest to the origin. For *η* = 0, the generating function has a simple pole, and therefore *P*(*T*) decays exponentially, in agreement with the result (4.3).
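As a quick consistency check of the pole argument, one can take the uniform-density law for *η* = 0 quoted later in this section, *P*_{η = 0}(*T*) = 2^{−(T + 1)}, and sum its generating function explicitly. The worked line below assumes that the generating function is the ordinary power series in *s* with coefficients *P*(*T*); that convention is an assumption of this sketch.

```latex
\hat{P}_{\eta=0}(s) = \sum_{T=0}^{\infty} s^{T}\, 2^{-(T+1)}
  = \frac{1}{2}\sum_{T=0}^{\infty}\left(\frac{s}{2}\right)^{T}
  = \frac{1}{2-s}, \qquad |s| < 2 .
```

Under this assumption the only singularity is a simple pole at *s* = 2, which corresponds to an exponential decay of *P*(*T*) with rate ln 2.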

For the intermediate values 0 < *η* < 1, the generating function has two singularities. The first, a pole at *s* = *s*_{0}, corresponds to the vanishing denominator 1 + (1 − *η*)*γ*(*s*), where *s*_{0} = *s*_{0}(*η*) is the unique non-trivial solution of the equation −ln(1 − *ηs*) = *ηs*/(1 − *η*). The second singularity, at *s* = *s*_{1} = *η*^{−1}, corresponds to the vanishing argument of the logarithm. It is easy to see that 1 < *s*_{0} < *s*_{1}, so that the dominant singularity of the generating function is of the polar type, and for times much larger than the crossover time *T*_{c}(*η*) ∼ [ln *s*_{0}(*η*)]^{−1} the corresponding decay of *P*(*T*) is exponential, with the rate ln *s*_{0}(*η*).
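The position of the dominant pole can be located numerically from the transcendental equation −ln(1 − *ηs*) = *ηs*/(1 − *η*). The following is a minimal sketch, not the authors' procedure: the function names `dominant_pole` and `crossover_time` are hypothetical, and the substitution *ε* = 1 − *ηs* is introduced here only to keep the bisection numerically stable near *s* = 1/*η*.

```python
import math


def dominant_pole(eta, iterations=200):
    """Non-trivial root s0 of -ln(1 - eta*s) = eta*s / (1 - eta), 0 < eta < 1."""
    if not 0.0 < eta < 1.0:
        raise ValueError("eta must lie strictly between 0 and 1")

    # With eps = 1 - eta*s the equation reads -ln(eps) = (1 - eps) / (1 - eta).
    # s = 1 maps to eps = 1 - eta, and s -> 1/eta maps to eps -> 0.
    def g(eps):
        return -math.log(eps) - (1.0 - eps) / (1.0 - eta)

    lo, hi = 1e-300, 1.0 - eta  # g is decreasing on this interval
    if g(lo) <= 0.0:
        raise ValueError("eta is too close to 1 for double precision")

    # Plain bisection: g(lo) > 0 and g(hi) < 0 bracket the unique root.
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid

    eps = 0.5 * (lo + hi)
    return (1.0 - eps) / eta


def crossover_time(eta):
    """Order-of-magnitude crossover time T_c ~ 1 / ln(s0(eta))."""
    return 1.0 / math.log(dominant_pole(eta))
```

For example, `dominant_pole(0.5)` returns a value close to 1.59, which lies between 1 and *s*_{1} = 2 as stated above. The sketch is only meant for values of *η* not extremely close to 1, where the root approaches *s*_{1} faster than double precision can resolve.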

Eventually, when *η* tends to 1, the two singularities *s*_{0} and *s*_{1} merge. More precisely, *P*(*T*) then takes the asymptotic form (4.6), whose dominant term is of order *O*(*T*^{−2}) [49]. This agrees with the exact result (4.7) that one obtains from equation (4.4) with d*F*(*u*) = d*G*(*u*) = d*u*. Let us note that in the case of uniform densities it is possible to obtain an expression for *P*_{η}(*T*) valid for all times and for any value of *η*, given by (4.8).

Figure 6 presents the probability distributions of the searching durations for increasing values of *η*. The proposed mathematical model suggests that the algebraic tail dominated by a quadratic hyperbola, observed in the distribution of time intervals and distances between sequential phases of searching activity (figure 5), can arise owing to a trade-off between exploitation and exploration amid uncertainty. In such a case, the variable *T* is set to count bins in the histogram shown in figure 5 on the large scales. In contrast to locomotion in real environments, mobility in VE depends on self-motion perception in virtual space and on the convenience of the locomotion interface, rather than on physiological factors of an individual such as height, weight, age or fitness. The instantaneous translation velocity in the VE is kept constant for any individual, as long as she presses a button. Therefore, the scale-free distribution of time intervals induces the scale-free distribution of travelling distances, with the same scaling exponent.
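The last step can be made explicit with a one-line change of variables. The notation below is assumed for illustration only: *v* is the constant translation speed, ℓ the travelled distance, *C* a normalization constant and *μ* the tail exponent (*μ* = 2 for the quadratic hyperbola).

```latex
% Assumed notation: v = constant speed, \ell = travelled distance,
% C = normalisation constant, \mu = tail exponent of P(T).
\ell = v\,T, \qquad
\Pr\{\ell = vT\} = P(T) \simeq C\,T^{-\mu}
  = C\,v^{\mu}\,\ell^{-\mu} \quad (T \gg 1).
```

A constant speed therefore changes only the prefactor of the tail and leaves the exponent untouched.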

We have shown that when, balancing the chances to be rewarded in the immediate neighbourhood (‘now and here’) and later elsewhere (‘then and there’) amid uncertainty, the subject estimates both uniformly at random at each step (*η* → 1), the inverse quadratic tail always dominates the bout distributions on large spatio-temporal scales, so that the observed walking behaviour is reminiscent of Lévy flights. Thus, while searching for sparsely and randomly located objects, subjects would adopt an ‘explorative’ movement strategy that takes advantage of the Lévy stochastic process in order to minimize the mean time for target detection or the mean first-passage time to a random target, as well as to maximize the energetic gain in the case of sparsely and randomly distributed resources, because the probability of returning to a previously visited site is smaller for a Lévy flight than for the usual random walk [5]. However, if the subject is convinced about the chance to get rewarded for exploration and keeps this estimated value *p* fixed, the value of *η* slides from 1 to 0 (the two stochastic subprocesses become incoherent), and the walking statistics shift away from the power law (8) towards Brownian walks characterized by the exponential decay *P*_{η = 0}(*T*) = 2^{−(T + 1)} (figure 6). For the intermediate values of the coherence parameter, 0 < *η* < 1, the saltatory searching behaviour possesses both Lévy and Brownian features.

If we assume that the probability density functions characterizing the chances for exploration and exploitation in individuals are uniform, d*F*(*u*) = d*G*(*u*) = d*u*, we can calculate from (4.8) the intermediate value of the coherence parameter 0 < *η* < 1 for which the probability distribution *P*_{η}(*T*) fits a given empirical distribution best. The obtained value of *η* can be considered a measure of instability in the subject's estimation of the chance of success. It is also important to mention that the equation for *η* at a given *T*, with *P*_{η}(*T*) defined by (4.8), can have many solutions within the interval 0 < *η* < 1, so that several consecutive values of *T* have to be checked in order to determine the value of the uncertainty parameter *η* balancing exploration and exploitation behaviours in a searching individual. It is remarkable that the value of *η* can be evaluated formally for a sample of individual searching behaviour even when just a few points are available. However, there are also samples that cannot be fitted reasonably well with (4.8) by minimizing the discrepancy between the model and the empirical distribution. Perhaps some individuals exhibit certain preferences while estimating the chances for exploration and exploitation, so that the pdf *F*(*u*) and *G*(*u*) for them are not uniform.
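The fitting step can be sketched as a one-parameter grid search. This is a hedged illustration and not the authors' code. It assumes uniform densities and does not use the closed form (4.8). Instead, *P*_{η}(*T*) is built by a renewal recursion of our own, in which *η* is taken to be the probability that the estimate *p* is carried over unchanged to the next step, while with probability 1 − *η* both estimates are redrawn. This is the reading under which the limits quoted in this section (2^{−(T + 1)} at *η* = 0, an inverse-quadratic tail at *η* = 1) are recovered. The names `relocation_pmf` and `fit_eta` and the squared-error discrepancy are hypothetical choices.

```python
def relocation_pmf(eta, t_max):
    """P_eta(T) for T = 0..t_max under uniform F and G (assumed recursion)."""
    # Uniform densities: A(n) = 1/(n+1), B(n) = A(n) - A(n+1).
    A = [1.0 / (n + 1) for n in range(t_max + 2)]
    B = [A[n] - A[n + 1] for n in range(t_max + 1)]
    P = []
    for T in range(t_max + 1):
        # p is never redrawn during the whole relocation phase.
        value = eta ** T * B[T]
        # First redraw of p happens after l steps; the process then restarts.
        for l in range(1, T + 1):
            value += (1.0 - eta) * eta ** (l - 1) * A[l] * P[T - l]
        P.append(value)
    return P


def fit_eta(empirical, grid_size=1001):
    """Grid search over eta in [0, 1] minimising the summed squared discrepancy."""
    t_max = len(empirical) - 1
    best_eta, best_err = 0.0, float("inf")
    for k in range(grid_size):
        eta = k / (grid_size - 1)
        model = relocation_pmf(eta, t_max)
        err = sum((m - e) ** 2 for m, e in zip(model, empirical))
        if err < best_err:
            best_eta, best_err = eta, err
    return best_eta, best_err
```

Here `empirical` is a list of relative frequencies for *T* = 0, 1, 2, …. Using several values of *T* at once, rather than a single one, is what removes the ambiguity between multiple solutions mentioned above. A large residual `best_err` would indicate a sample that the uniform-density model does not describe well.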

## 5. Conclusions

We have studied human search behaviour in different office VE. High-resolution data on the displacements and reorientations of 82 subjects who participated in the treasure-hunting experiments were analysed in search of statistical regularities. The data show that the vast majority of reorientations performed by subjects were quick scanning turns (of 200–300 ms), which are an essential part of the adaptive movement strategy under reduced natural multisensory conditions in VE. The analysis of the root mean square fluctuations of displacements and turning durations gives conclusive evidence that the total reorientation durations are strongly reinforced with the net displacement of subjects, which makes the intensive scanning process biologically unfeasible on large spatio-temporal scales. In the absence of any specific cues marking the location of hidden treasures, subjects searched in a saltatory fashion: they marched along corridors and across halls, paused for a local search in the nearest rooms, and then resumed traversing the VE. In built environments, the area of an intensive local search is naturally limited to the space available for unobstructed movement, in which the searching activity in humans is characterized by the usual diffusive spread. However, on large scales, the searching behaviour has super-diffusive characteristics: we have found that the empirical distributions of time intervals and distances between consecutive searching events are dominated by quadratic hyperbolas that fit the Lévy flight patterns identified in the intermittent search, cruise and foraging behaviours of many living organisms.

Contrary to previous approaches focused primarily on applying various statistical methods for the detection of Lévy flight patterns in the available empirical data, without discussing the possible biological mechanisms causing the super-diffusive spread of searching activity in unpredictable environments, we have suggested a simple stochastic model of the coherent-noise type describing the exploration–exploitation trade-off in humans. According to our model, the saltatory search behaviours in humans can be characterized by balancing between exploration and exploitation amid uncertainty (‘should I stay’ or ‘should I go’) by way of a regularly recurring comparison of the estimated chances to find a treasure in the immediate neighbourhood and to get rewarded in other parts of the environment yet to be explored. We have solved the model analytically and investigated the statistics of possible outcomes of such an exploration–exploitation trade-off process. It is important to mention that our model exhibits a variety of saltatory behaviours, ranging from Lévy flights occurring under uncertainty (when the chances to get rewarded for further exploration are re-evaluated by subjects recurrently, at each step) to Brownian walks, with an exponential distribution of movement bouts, taking place when the subject's estimation of the chance of success remains stable over time (although not necessarily high). Our model of decision-making amid uncertainty provides a possible explanation for the appearance of Lévy flight patterns in the foraging behaviour of animals occupying unpredictable environments, such as habitats with sparsely distributed resource fields or new environments where experience may not help, as well as for switching between different types of behaviours toward the more intensive searching strategy that could occur during a single trip of a treasure hunter confident of the eventual success.

## Acknowledgements

The treasure-hunting experiments were supported by the Cognitive Interaction Technology-Centre of Excellence (CITEC, Bielefeld University). D.V. gratefully acknowledges the financial support by the project *MatheMACS* (mathematics of multilevel anticipatory complex systems), the grant agreement no. 318723, supported by the EC Seventh Framework Programme FP7-ICT-2011-8. Conceived and designed the experiments: M.T., S.K., D.V. Performed the experiments: J.H. Analysed the data: D.V., J.H. Wrote the paper: D.V.

## Footnotes

1. Two AVI video fragments showing the records of actual searching experiments from the first-person perspective can be found at http://youtu.be/17aNxvZFMRw (the VE model A) and http://youtu.be/_Jooi9ZXRGs (the VE model B).

- Received April 17, 2013.
- Accepted May 29, 2013.

- © 2013 The Author(s) Published by the Royal Society. All rights reserved.