Falls not only present a considerable health threat, but the resulting treatment and loss of working days also place a heavy economic burden on society. Gait instability is a major fall risk factor, particularly in geriatric patients, and walking is one of the most frequent dynamic activities of daily living. To allow preventive strategies to become effective, it is therefore imperative to identify individuals with an unstable gait. Assessment of dynamic stability and gait variability via biomechanical measures of foot kinematics provides a viable option for quantitative evaluation of gait stability, but the ability of these methods to predict falls has generally not been assessed. Although various methods for assessing gait stability exist, their sensitivity and applicability in a clinical setting, as well as their cost-effectiveness, need verification. The objective of this systematic review was therefore to evaluate the sensitivity of biomechanical measures that quantify gait stability among elderly individuals and to evaluate the cost of measurement instrumentation required for application in a clinical setting. To assess gait stability, a comparative effect size (Cohen's d) analysis of variability and dynamic stability of foot trajectories during level walking was performed on 29 of an initial yield of 9889 articles from four electronic databases. The results of this survey demonstrate that linear variability of temporal measures of swing and stance was most capable of distinguishing between fallers and non-fallers, whereas step width and stride velocity prove more capable of discriminating between old versus young (OY) adults. In addition, while orbital stability measures (Floquet multipliers) applied to gait have been shown to distinguish between both elderly fallers and non-fallers as well as between young and old adults, local stability measures (λs) have been able to distinguish between young and old adults. 
Both linear and nonlinear measures of foot time series during gait seem to hold predictive ability in distinguishing healthy from fall-prone elderly adults. In conclusion, biomechanical measurements offer promise for identifying individuals at risk of falling and can be obtained with relatively low-cost tools. Incorporation of the most promising measures in combined retrospective and prospective studies for understanding fall risk and designing preventive strategies is warranted.
1.1. Importance of falls
In western civilization, falls not only place a heavy economic burden on society but are also responsible for a considerable loss of quality of life. Costs resulting from falls in 2009 alone have been reported to range between 0.85 and 1.5 per cent of the total healthcare expenses within the USA, Australia, EU and the UK. In addition, falls have a critical influence on an individual's health status, especially in elderly adults, with approximately 81–98% of hip fractures caused by falls. Furthermore, the risk of falling increases with age [3,4]. The main associated costs therefore tend to occur in higher age groups and in the wake of fractures, a problem that is further exacerbated by the increasing proportion of elderly. Reliable and widespread methods for the identification of elderly individuals with lower postural and locomotive stability are therefore essential for effective implementation of preventive strategies (table 1).
1.2. Identification of fallers
There are currently over 400 known risk factors for falls , which are broadly classified into environmental, task-related and personal factors [12,13]. Environmental or extrinsic factors are generally considered to comprise all influences external to a subject and might include factors, such as lighting, surface elevation, surface roughness, obstacles, presence of support or external perturbations [14,15]. Task-related factors include task complexity and speed, progressive tiredness or fatigue during the task, load handling, etc. [14,15]. Personal factors are intrinsic factors reflecting individual differences in, among others, age and gender [14–16], muscular strength , reaction time , vision , ethnicity , use of drugs and medications , living alone , sedentary behaviour , psychological status, impaired cognition and foot problems . In addition, history of falls  as well as impaired balance and mobility  can be considered as higher level factors owing to their interdependency with both intrinsic and extrinsic factors. In-depth overviews of known fall risk factors have been presented elsewhere [13,26,27]. While knowledge of the environment is known to play a role in minimizing the effect of intrinsic and task-related factors on instability [13,26], extrinsic factors cannot generally be controlled, tested or accounted for in clinical assessment. Intrinsic factors (particularly impaired balance, mobility and muscle weakness) on the other hand can not only be quantified, but have also consistently been identified as major risk factors for falling [14–17].
1.3. Quantification of fall risk
Given the multitude of factors that influence fall risk, the sensitivity, specificity and applicability of subject-specific assessment of fall risk in a clinical setting remains unclear. Recently, meta-analyses and systematic reviews have been performed to identify effective screening instruments/scales for predicting fall risk in the elderly, reporting different success rates for predicting future fallers [28–33]. The most established techniques to quantify fall risk have been (i) motor performance tests, (ii) questionnaires, and (iii) biomechanical laboratory-based measurements. Owing to the high number of risk factors and the large number of assessment techniques, however, comparison between studies and thus appropriate selection of approaches for assessing fall risk becomes problematic.
Gait instability is a major fall risk factor, particularly in geriatric patients [34–36]. The term ‘stability’ is considered to be the behaviour of a system under small perturbations . A stable system would remain either in (or return to) a state of equilibrium under static conditions or in a state of uniform motion (maintain a specific trajectory) under dynamic conditions, after being perturbed [37,38]. Complementary to this, the term ‘gait stability’ is considered to comprise both direct as well as indirect biomechanical aspects of stability during gait. As parameters that can be directly measured and quantified, such measures could therefore contribute towards an understanding of subject-specific fall risk.
1.4. Deficits in common tests
While motor performance tests (e.g. Berg scale, Timed Up and Go, POMA, etc. [29,31,33,39]) are popular in clinical settings for examining the functional status of a patient, these tools are generally not capable of providing a quantitative predictive assessment of gait stability or fall risk. Even in specific test batteries designed to assess fall risk, Laessoe et al.  reported low fall prediction rates, with a sensitivity and specificity of only 50 per cent and 43 per cent, respectively. As these probabilities are rather low, the authors concluded that fall risk cannot be predicted in a healthy and active elderly population by assessing motor performance alone . In a similar manner, Boulgarides et al.  reported that current tools were found to be more successful in predicting fall risk in a frail population than in active older adults. The authors indicated the need to develop and test new motor performance assessment tools for an increasingly active and healthy ageing population. In order to detect differences between fallers and non-fallers, however, these parameters must ensure both assessment sensitivity and specificity.
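As a reminder of how the sensitivity and specificity figures quoted above are defined, the following minimal sketch computes both from confusion-matrix counts. The function name and the counts are hypothetical; the counts were chosen only to reproduce rates of 50 per cent and 43 per cent and are not taken from any cited study.

```python
def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    Sensitivity: share of actual fallers classified as fallers.
    Specificity: share of actual non-fallers classified as non-fallers.
    """
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical counts (assumption, for illustration only).
sens, spec = sensitivity_specificity(
    true_pos=10, false_neg=10, true_neg=43, false_pos=57
)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

With these values, half of the actual fallers and more than half of the actual non-fallers would be misclassified, which illustrates why such rates are regarded as low.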
In an attempt to identify methodological deficits in measures to assess fall risk, Myers  has challenged the clinical applicability and external validity of motor performance assessment tools for the general population. Furthermore, a study by Scott et al.  found that only few tools have been investigated more than once or in more than one setting. According to the authors, no single tool could, therefore, be recommended for use in all settings or for multiple sub-populations within each setting . These findings suggest that current clinical motor performance tests may possess limitations as tools for subject-specific fall risk assessment.
Questionnaires filled out by (i) clinical staff [41–45], (ii) the subjects themselves [46,47], or (iii) by telephone interviews  represent one of the most prevalent approaches for fall risk assessment in the clinic. While questionnaires can reach out to large cohorts in an efficient manner, a number of limitations are associated with this form of data collection; for questionnaires filled out by employees, well-trained staff are required in order to avoid bias; self-reported questionnaires can only measure explicit knowledge and this knowledge—especially in the elderly—is often not an objective reflection of the truth . In many cases, subjects have the tendency to conform to the behaviour they think is appreciated or expected within their social environment. In addition, people are inclined to perceive and interpret the world in a way that is appropriate to their beliefs , particularly when they have to estimate health risk . Such behavioural shifts can project stable personality differences, but these are known to cause problems in self-reported measures . Moreover, consideration and exploitation of all known fall risk factors retrieved via questionnaires are difficult owing to the complexity of the data collected. Evaluation of questionnaires is further complicated since self-reported information has low reliability  and written information is difficult to convert into a quantification of gait stability.
1.5. The assessment of gait stability
Given the limitations in assessing fall risk using motor performance tests and questionnaires, particularly considering the increasing age of the population, the need for objective, cost-effective and clinically applicable methods, as well as methods that possess high sensitivity and specificity for assessing gait stability on a subject-specific basis, is clear. While biomechanical laboratory-based measurements are usually (but not always) expensive and require trained staff, their effectiveness for fall risk assessment remains unknown. In addition, existing evidence suggests that falls in the elderly occur mostly in dynamic rather than static settings [5,52–55], thus indicating a need to assess dynamic characteristics during activities of daily living among elderly individuals. While biomechanical approaches for determining specific parameters of mobility are now gaining increasing recognition for assessing the function and stability of older individuals, they have hardly been taken up in clinical settings. This could be due to the unclear sensitivity and specificity of these approaches, together with the time and effort required for their use.
Since walking is the most frequent activity of daily living, this review considers only walking. Gait stability is an important and necessary precondition for walking without falling. As motion patterns deteriorate with age [34,35], establishing age- and fall-related differences during gait may therefore form a critical step towards better subject-specific evaluation of gait stability than using motor performance tests or questionnaires.
In this review, measures to assess gait stability include parameters of stability as well as variability. Here, stability parameters not only provide information regarding the noise present in the motor task performance, but also explicitly quantify the performance of the dynamic error correction. On the other hand, variability during specific tasks results from noise present in the motor task and in the environment [56–58]. In addition, variability will increase at a given noise level when error corrections are less effective. It can therefore be assumed that variability is related to fall risk because increased variability may bring the dynamic state of the person closer to their limits of stability. In this context, variability can be considered an indirect assessment of gait stability.
In order to maintain balance during dynamic activities, corrections to the base of support relative to the centre of mass are provided by suitable foot placement. For continuous motion, such foot trajectories, therefore, provide the primary mode of error correction to allow stability during gait. Evaluation of foot trajectories, therefore, provides the key to understanding gait stability. In addition, gait analysis in the clinic has traditionally been performed with relatively cheap devices such as floor mats, etc. which are able to assess only foot kinematics and variations in foot trajectories [34,35,52,59–68]. For this reason, time series of foot kinematics also form the largest proportion of data on gait stability, thus providing the focus of this study. Consequently, measures to quantify gait stability are considered here to include stability measures (direct assessment), and measures of kinematic variability (indirect assessment) that are related to fall risk assessed using spatio-temporal time series of the foot.
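The indirect (variability) measures referred to above are typically simple summary statistics of a spatio-temporal time series. The sketch below computes the s.d., coefficient of variation (CV) and inter-quartile range for a series of stride times. The function name and the sample values are hypothetical, and the use of the sample s.d. and of Python's default quartile method are assumptions; individual studies may define these quantities differently.

```python
import statistics


def variability_measures(values):
    """Return s.d., CV (per cent) and inter-quartile range of a series."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)           # sample s.d. (n - 1 denominator)
    cv_percent = 100.0 * sd / mean          # s.d. relative to the mean
    q1, _median, q3 = statistics.quantiles(values, n=4)
    return {"sd": sd, "cv_percent": cv_percent, "iqr": q3 - q1}


# Hypothetical stride times in seconds (assumption, for illustration only).
stride_times_s = [1.0, 1.1, 0.9, 1.0]
result = variability_measures(stride_times_s)
print(result)
```

In practice, such statistics would be computed over many consecutive strides, since estimates from a handful of strides are unlikely to be reliable.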
Although the importance of variability in certain gait cycle parameters was recognized more than 30 years ago for identifying fallers , studies that include measures of variability and stability now seem to be capable of identifying unique aspects of gait stability such as noise in human motor performance or dynamic error correction. Furthermore, the demand for rapid, cost-effective and subject-specific assessments of gait stability is growing, particularly in clinical settings, to improve the effectiveness of fall prevention strategies. The purpose of this systematic literature review was therefore to evaluate the relative efficacy of biomechanical measures obtained from time series of foot kinematics to quantify gait stability in the elderly during walking.
2.1. Search strategy
This systematic literature search examined all original articles published after 1980 until 11th March 2010 from the following electronic databases:
— Pubmed (http://www.ncbi.nlm.nih.gov/sites/entrez),
— Cochrane Library (http://www.thecochranelibrary.com/view/0/index.html),
— Embase (http://www.embase.com/search), and
— CINAHL (http://web.ebscohost.com/ehost).
Additionally, article reference lists of papers selected for inclusion were screened by two authors (D.H. and N.B.S.) to ensure that all relevant articles were considered. In order to be incorporated within this review, articles had to be published in a peer-reviewed journal, be written in English and include the following combination of words in their title, abstract or keywords (KW indicates the search phrase combined with the aid of the ‘history’ function):
— KW walk* OR gait OR locomot* OR ambulat*;
— AND KW stability OR variab* OR symmetry OR pattern OR balance OR equilib* OR ‘dynamic posturography’;
— AND KW fall OR falls OR falling OR elder* OR old* OR geriatric* OR age OR aging OR ageing.
Since this study focused on methods that quantify human dynamic stability in elderly subjects, reports related to robotics, children, neurological or cardiac diseases and animal research were deemed non-relevant and were therefore excluded. To exclude these from the search results, the KW was constrained as follows:
— NOT KW child OR children OR animal OR robot* OR amput* OR stroke OR diabetes OR hypertension OR heart OR cardiac* OR coron* OR pulmon* OR blood* OR ‘Parkinson's disease’.
When ‘*’ was used as a wild card character (e.g. step*, stabil*, fall*, age* and child*), Pubmed returned the statement that ‘only the first 600 variations’ were considered. Therefore, the key words step, stepping, stability, fall, falls, falling, age, aging, ageing, child and children were used in the search strategy instead of the shortened expressions with a wild card character attached. In Pubmed, Embase and CINAHL, the NOT KW condition was searched only in the words appearing in the title of the articles (title search), whereas in the Cochrane Library, a search was performed only within the keywords of the articles, as this library does not offer a title search.
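The Boolean structure of the search described above can be sketched as a small query builder. This is an illustration only: the helper names are hypothetical, the term lists are abridged, and the resulting string is generic Boolean syntax rather than the exact input accepted by any one database. The title-only restriction on the NOT condition is not modelled.

```python
def or_group(terms):
    """Join terms with OR and wrap them in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def build_query(include_groups, exclude_terms):
    """AND together the OR-groups, then exclude the NOT terms."""
    included = " AND ".join(or_group(group) for group in include_groups)
    return included + " NOT " + or_group(exclude_terms)


# Term lists taken from the search strategy (abridged for brevity).
activity = ["walk*", "gait", "locomot*", "ambulat*"]
stability = ["stability", "variab*", "symmetry", "pattern", "balance"]
population = ["fall", "falls", "falling", "elder*", "old*", "geriatric*"]
excluded = ["child", "children", "animal", "robot*"]

query = build_query([activity, stability, population], excluded)
print(query)
```

Keeping each concept as a separate list makes it straightforward to swap a wild card term for its explicit variants when a database limits the number of expansions.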
The final search results from the four databases were combined and checked for duplicates using the EndNote package (Endnote v. 8.0.2, Thomson Reuters, California). The articles' abstracts were then further scrutinized to ensure the inclusion of the following criteria:
— the objective to distinguish between the stability of either fallers versus non-fallers (FNF) or healthy younger versus healthy older subjects;
— a cohort of healthy old and/or elderly population after proper screening tests excluding individuals with neurological diseases and/or musculoskeletal impairments;
— a method that included biomechanical measurements during level walking and that quantified gait pattern characteristics using kinematic and/or spatio-temporal indices of foot trajectories.
If doubts regarding the inclusion or specific exclusion of an article still existed, agreement was reached together with a third party (W.R.T.). If further detailed information was required, the full text of the article was perused.
2.2. Data extraction
All data for assessment were extracted from the full articles independently (D.H. and N.B.S.) using a standardized data extraction scheme, with any disagreements resolved together with W.R.T. The following information, if possible and applicable, was extracted for each included article: first author; year of publication; sample sizes for each included cohort; each cohort's demographic information; definition of a faller if applicable; number of strides, duration or distance of walking; statistical values; means and standard deviations (s.d.s) of outcome measures; assessment methods; relevant significant differences; and the devices used for measuring. If the means and s.d.s of the outcome measures were not listed in a table or mentioned in the text, the data were extracted from their plots using Adobe Photoshop v. 17.0. Furthermore, if those data were divided into other subgroups and listed in a table, then the means, x̄, and s.d.s were determined from those subgroups.
In one case, as only the median of the outcome measures was reported, it was extracted and used as the mean. All units were standardized to metres (m), seconds (s) and metres per second (m s−1).
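The text does not state how values reported for subgroups were combined into a single cohort mean and s.d. One standard approach, shown here purely as an assumption, weights the subgroup means by sample size and adds the between-subgroup spread to the within-subgroup variance. The function name and the example values are hypothetical.

```python
import math


def combine_subgroups(groups):
    """Combine (n, mean, sd) tuples into one (n, mean, sd) for the cohort.

    Assumes each sd is a sample s.d. (n - 1 denominator).
    """
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * mean for n, mean, _ in groups) / total_n
    within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    grand_sd = math.sqrt((within + between) / (total_n - 1))
    return total_n, grand_mean, grand_sd


# Hypothetical subgroups: the values 1, 2, 3 and the values 4, 5, 6, 7.
subgroups = [(3, 2.0, 1.0), (4, 5.5, math.sqrt(5.0 / 3.0))]
n, mean, sd = combine_subgroups(subgroups)
print(n, mean, round(sd, 4))
```

The result matches the mean and sample s.d. obtained by pooling the raw values 1 to 7 directly, which is the property that makes this formula suitable when only summary statistics are published.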
2.3. Effect sizes
In statistical terms, a high sensitivity for a biomechanical measure in a clinical setting would imply that an individual is correctly classified as a ‘faller’, whereas a high specificity would imply that an individual is correctly classified as a ‘non-faller’. However, since sufficient information to perform subject-specific classifications was not available for all studies, effect sizes (ESs) were calculated to quantify the magnitude of the differences between groups over the gait measures used. As ESs are independent of sample size, they are helpful in comparing results across studies. In the present study, the ES measure of Cohen's d was obtained and used to quantify the differences in gait stability between old versus young (OY) and FNF cohorts. Cohen's d was calculated using equation (2.1):

$$d = \frac{\bar{x}_t - \bar{x}_c}{s_{\mathrm{pooled}}} \qquad (2.1)$$

where $d$ is Cohen's d, used in this case as one measure for ES; $\bar{x}_t$ is the mean of the treatment group (old in OY cohorts and fallers in FNF cohorts); $\bar{x}_c$ is the mean of the control group (young in OY cohorts and non-fallers in FNF cohorts); and $s_{\mathrm{pooled}}$ is the pooled s.d. from the cohort s.d.s.
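Equation (2.1) can be evaluated in a few lines, as in the minimal sketch below. The text does not give the formula used for the pooled s.d., so the common sample-size-weighted form is assumed here; the function names and the input values are hypothetical.

```python
import math


def pooled_sd(n_t, sd_t, n_c, sd_c):
    """Sample-size-weighted pooled s.d. (an assumed, standard form)."""
    numerator = (n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2
    return math.sqrt(numerator / (n_t + n_c - 2))


def cohens_d(mean_t, mean_c, s_pooled):
    """Cohen's d: treatment mean minus control mean, over the pooled s.d."""
    return (mean_t - mean_c) / s_pooled


# Hypothetical stride-time variability (s) for fallers and non-fallers.
s_pooled = pooled_sd(n_t=20, sd_t=0.02, n_c=20, sd_c=0.02)
d = cohens_d(mean_t=0.05, mean_c=0.03, s_pooled=s_pooled)
print(f"pooled s.d. = {s_pooled:.3f}, d = {d:.2f}")
```

Because the treatment group is entered first, a positive d indicates a larger value in the old or faller cohort than in the young or non-faller cohort.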
Cohen classified ES, d, into categories, with d < 0.2 as small, 0.2 < d < 0.8 as medium and values greater than 0.8 as large . Cohen's d provides an estimate of the overlap between the distributions of treatment and control conditions, in the current study, between OY and FNF groups. For example, in the case of OY comparison, an ES, d, of 0 indicates that the mean of the old group is at the 50th percentile of the control or young group, which could be interpreted as a complete overlap between the two groups. Similarly, an ES, d, of 0.8 indicates that the mean of the old group is at the 79th percentile of the young group, indicating that the difference between the two groups' means (non-overlap) is approximately 47.4 per cent in the two distributions . Quantitative evaluation of all studies was undertaken using Cohen's d, which is presented in groups of FNF versus OY investigations and subsequently as linear versus nonlinear assessment measures.
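The percentile and non-overlap figures quoted above follow from Cohen's d if both groups are assumed to be normally distributed with equal variance. The sketch below reproduces them under that assumption; the function names are hypothetical, and the quantities correspond to what Cohen termed U3 (percentile standing) and U1 (per cent non-overlap).

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def percentile_of_treatment_mean(d):
    """Percentile of the control group at which the treatment mean lies."""
    return 100.0 * normal_cdf(d)


def percent_non_overlap(d):
    """Share of the combined area of both distributions not shared."""
    p = normal_cdf(abs(d) / 2.0)
    return 100.0 * (2.0 * p - 1.0) / p


for d in (0.0, 0.8):
    print(d, round(percentile_of_treatment_mean(d), 1),
          round(percent_non_overlap(d), 1))
```

For d = 0 the treatment mean sits at the 50th percentile with no non-overlap, and for d = 0.8 the values are close to the 79th percentile and 47.4 per cent quoted in the text.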
2.4. Statistical heterogeneity
The ES, although a powerful tool for analysis, provides only an estimate of the true effect in a particular study and is subject to imprecision. Some of this imprecision could be a result of clinical and methodological heterogeneity, such as variation within the populations investigated in different studies .
In the current review, the definition of fallers in FNF comparisons was identified as a criterion leading to heterogeneity among study populations. Faller definitions were classified into five categories: studies with at least one fall in the 12 months following data collection [5,53] (category 1); studies with at least one fall in the 12 months prior to data collection [8,60,72,73] (category 2); studies with at least one fall in the 5 years prior to data collection (category 3); studies with at least one fall in the 13 weeks following data collection (category 4); and finally, studies with at least three falls in the 6 months prior to data collection (category 5). One study did not provide a coherent definition of fallers.
Similarly, for the case of OY comparisons, age differences were identified as one of the most important factors leading to heterogeneity among the study populations. Age differences between OY cohorts were classified into four categories: mean age difference less than 40 years; between 40 and less than 45 years [8,35]; between 45 and 50 years [34,59,64,65,67,68,76–81]; and greater than 50 years [6,63,82–84].
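The age-difference classification can be written as a simple binning function, sketched below. The function name, the labels and the example ages are hypothetical. The text leaves the handling of a difference of exactly 50 years ambiguous; this sketch assigns it to the highest category, which is an assumption.

```python
def age_difference_category(mean_age_old, mean_age_young):
    """Bin the mean age difference (years) between old and young cohorts."""
    difference = mean_age_old - mean_age_young
    if difference < 40:
        return "<40"
    if difference < 45:
        return "40 to <45"
    if difference < 50:
        return "45 to <50"
    return ">=50"  # boundary at exactly 50 years is an assumption


# Hypothetical cohort mean ages (assumption, for illustration only).
print(age_difference_category(72.0, 25.0))
```

Grouping studies in this way allows effect sizes for the same measure to be compared between categories, so that age gap can be considered as a source of heterogeneity.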
2.5. Assessment of devices used and associated costs
The cost of the measures used was estimated from the minimum requirements (equipment and/or devices) for use in a complete clinical assessment. Based on an overview of the reviewed studies, the devices were divided into five categories—inertial sensors, foot switches (worn as insoles in the subject's shoes or taped to the bottom of bare feet), movement/motion laboratory, walking mats (walkway embedded with pressure sensors) and paper/PVC mats—which were then classified into three cost categories. The measures that require a fully developed movement analysis laboratory with motion capture technology for providing a quantification of gait stability were assigned to ‘high-cost’. On the other hand, the inertial sensors, foot switches and walking mats were categorized as ‘medium-cost’ and the paper/PVC mat was classified into the ‘low-cost’ category.
3.1. Study selection and characteristics
The electronic search yielded a total of 9889 articles. After screening, 29 articles were considered eligible for inclusion in this review (figure 1). Twenty-one studies considered research questions focussed on OY [6,8,34,35,52,59,63–68,76–78,80–85], whereas only 10 studies were designed to assess FNF [5,8,35,53,60,62,72–75]. Granata & Lockhart  and Khandoker et al.  presented a combination of both cohort comparisons within the same study.
In general, the selected articles ranged from 1980 to 2010 (30 years). Only four studies (13.8%) were performed before the year 2000, whereas 38 per cent of the studies were conducted in the last 2 years (2008–2010). More than half of the selected studies were conducted in the USA (approx. 55%), followed by Europe (24%) and Australia (4%). Furthermore, 26 articles reported linear measures of gait variability (s.d., coefficient of variation (CV) and inter-quartile range (IQR); figures 2 and 4), six articles reported nonlinear measures for assessing variability (wavelets and detrended fluctuation analysis (DFA)) and dynamic stability (Lyapunov exponents (LEs) and Floquet multipliers (FMs)) (figures 3 and 5), and three articles reported both [6,34,73]. Articles with nonlinear measures were generally more recent (2003–2009) than articles with linear measures (1984–2010).
The cohort sample sizes varied considerably from 4  to 422  subjects. Cohort sample sizes in studies with nonlinear measures were smaller than those using linear approaches, ranging from 4 to 32 subjects and 4 to 422 subjects, respectively. Furthermore, 63 per cent of the studies that used linear measures had different sample sizes within study cohorts, compared with 57 per cent of studies using nonlinear measures.
Among the studies that investigated FNF, two studies carried out a prospective rather than a retrospective fall examination [5,53]. Out of all studies, 22 reported the gender of participants; seven studies considered only female subjects [34,67,72,76,80,82,84], one study measured only male participants , while the remaining studies examined cohorts with both genders. In FNF settings, 6/10 studies were age-matched [5,35,53,62,72,75], with only 4/10 studies gender-matched [5,53,72,74]. In the OY setting, 50 per cent of all studies measured gender-matched participants [6,34,63–67,76,80,84].
3.2. Fallers versus non-fallers
Studies that performed an assessment of FNF [5,8,35,53,60,62,72–75] reported 15 measures of gait variability (linear) in total. The s.d. of stride time was reported three times [5,53,74] and was the most reported measure among the included studies (figure 2). Nine of these measures were reported only once [5,53,60,73,74], but variability (both s.d. and CV) of the temporal parameters, namely stride, swing and stance times, had average ESs greater than 1 and were reported more than once (figure 2).
Interestingly, the variability measures of swing time, stride time and stance time were reported for studies with faller definition categories 3  and 1 [5,53], with measures included in studies incorporating faller category 3 having higher ESs compared with those with faller category 1. Studies with faller category 2 and faller category 1 reported variability of step width with similar ES values [5,8,53,60,72,73]; however, studies with faller category 4  and faller category 1 [5,53] reported variability of step length, with ESs for step length with faller category 4 being considerably higher than that for faller category 1. None of the studies reporting gait variability incorporated a faller category 5 for their studies (figure 2).
When nonlinear measures were considered, three of seven parameters of the minimum foot clearance (MinFC) time series, namely multi-scale exponents of MinFC (MinFC(β)), wavelet transform coefficient of MinFC (MinFC(Wυ)) and the long axis of the Poincaré plot of MinFC (MinFC(PPISD_2)), showed ESs > 1, with MinFC(β) having the highest ES, although this was reported only once (figure 3). Khandoker et al. reported that PPISD_2, the major axis of the Poincaré ellipse plot for the minimum foot clearance time series, was significantly (p < 0.05) longer for the fallers when compared with the non-fallers. Using summary measures from wavelet analysis, the authors also reported that, based on the scaling exponent, β, a significantly lower spectral density in the higher frequency bands was found for elderly fallers compared with non-fallers, implying longer term variability in gait patterns. Importantly, none of the nonlinear measures was evaluated for more than one faller category (figure 3).
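To illustrate what the axes of a Poincaré plot capture, the sketch below uses the conventional lag-1 definition, in which each value is plotted against its successor and the spread is measured across (SD1) and along (SD2) the line of identity. Whether the cited study used exactly this definition is not stated in the text, so this is an assumption; the function name and the sample series are hypothetical.

```python
import math
import statistics


def poincare_axes(series):
    """Return (sd1, sd2) of the lag-1 Poincaré plot of a time series.

    sd1: spread perpendicular to the line of identity (short-term).
    sd2: spread along the line of identity (longer-term), i.e. the
         long axis of the fitted ellipse.
    """
    pairs = list(zip(series[:-1], series[1:]))
    across = [(nxt - cur) / math.sqrt(2.0) for cur, nxt in pairs]
    along = [(nxt + cur) / math.sqrt(2.0) for cur, nxt in pairs]
    return statistics.stdev(across), statistics.stdev(along)


# Hypothetical, steadily drifting clearance values (arbitrary units).
sd1, sd2 = poincare_axes([1.0, 2.0, 3.0, 4.0, 5.0])
print(round(sd1, 4), round(sd2, 4))
```

A steadily drifting series such as this one has no stride-to-stride scatter about the trend, so sd1 vanishes while sd2 is large; a longer major axis therefore reflects greater longer-term variability.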
Furthermore, FMs, which assess orbital stability, were also reported to produce Cohen's d values greater than 1 in FNF comparisons . Importantly, although reported only once for an FNF study, inconsistency of s.d. had one of the highest ESs, indicating that fallers had a highly inconsistent variability in the stride time series compared with non-fallers .
3.3. Old versus young comparisons
Variability of 20 linear measures was reported across all studies conducting an OY comparison. Variability of conventional spatio-temporal parameters such as the s.d. of step width [64,77,80,81,85], s.d. of stride time [64,77] and CV of stride velocity [63,82,84] had large mean Cohen's d (>1) values (figure 4), suggesting sufficient sensitivity of these conventional parameters to age-related differences. In addition, although reported only once, CV of MinFC showed the highest ES (ES = 2.98) . Furthermore, CV as well as s.d. of MaxFC  and acceleration of the ankle in the sagittal plane  reveal a Cohen's d value greater than 1. Thus, age-related differences were observed in the variability of the lowest as well as the highest foot clearance and variability of ankle flexion–extension accelerations (figure 4).
Two linear measures, s.d. of stride time [6,59] and step time [80,81,85,86], were reported for studies with different age categories for experimental and control groups (figure 4). In both cases, measures estimated on cohort groups with the age difference of 50 years and above yielded higher ES values compared with cohort groups with a mean age difference of 45 to less than 50.
Among the nonlinear measures used to quantify the effects of age, only the non-stationary index of stride time yielded a Cohen's d value greater than 1 (figure 5), but this parameter was evaluated in only one study . Stride FM mean  and stride time α  also showed some of the highest ES values (greater than 0.6), indicating that summary FM measures and DFA exponents might also be able to distinguish between young and old individuals under certain conditions. Both the average long- and short-term LEs showed larger effects than FM measures, indicating that local stability measures, when compared with orbital stability analyses, seem to be more suitable in detecting age-related differences. On average, however, linear outcome measures have been shown to produce higher ES values (mean ± s.d.: 1.04 ± 1.32) than nonlinear measures (0.63 ± 0.54), indicating that linear outcome measures may be more sensitive to the changes that occur because of ageing than nonlinear measures.
Maximum FMs were reported for two studies with different age categories of the experimental and control groups [35,65] (figure 5). The orbital stability value estimated by the study with the larger age difference (45 to less than 50 years) between the control and experimental group  also showed larger ES than the same measure evaluated in the experiment using cohort groups with the lower average age difference (40 to less than 45 years) .
3.4. Combined old versus young and fallers versus non-fallers studies
In general, all studies in this review that reported significant differences (p-values < 0.05) between cohorts (OY and FNF) demonstrated higher variability (linear as well as nonlinear measurements) for old adult and faller groups, respectively (except Maki ). Temporal measures of linear variability in stride, swing and stance were capable of distinguishing between fallers and non-fallers (ES > 1), whereas variability in step width, stride time and velocity were more identifiable between age groups (ES > 0.5).
Of all the nonlinear measures reported, the Poincaré measures on MinFC time series (variability) and FM measures on stride trajectory (stability) were best able to capture fall-related differences (figure 3). On the other hand, nonlinear measures of variability for stride time (e.g. non-stationary index, DFA exponent) were most able to capture age-related differences (figure 5).
Only two studies, Granata & Lockhart  and Khandoker et al. , included both FNF and OY comparisons, but with relatively low sample sizes. The summary outcome measures included by Granata & Lockhart  were stability measures of left and right steps (FMmax and FMmean) and stride parameters (FMmean and FMmax), whereas Khandoker et al.  used nonlinear measures of variability, namely wavelet transform coefficients (Wυ), Fractal scaling indices (α) and multi-scale exponents (β) of MinFC time series during gait. The FNF comparisons yielded higher Cohen's d values than for OY (FNF = 0.95 ± 0.32 versus OY = 0.35 ± 0.21).
Of the 92 different outcome measures, 48 were measured with the aid of a movement laboratory, consisting of optical motion tracking camera systems, force plates, treadmills, etc. [8,34,35,64,65,73,76–78,80,81,85] and were assigned to the high-cost category (table 2). Seventeen measures were determined using a foot/floor mat [52,59,60,63,75,82,84] (medium cost), while 20 measures used foot switches or a portable insole plantar pressure measurement system [5,6,53,68,74] (medium cost). Finally, four measures used a paper/PVC mat [62,67,72] (low cost) and three used inertial sensors, including accelerometers [66,83] (medium cost).
Until recently, LEs and FMs have always been evaluated in a movement laboratory [34,35,65], making them relatively expensive to obtain. With the availability of inertial sensors, however, these measures can now be accessed at lower cost . Linear variability measures, on the other hand, are evaluated using a wide spectrum of equipment, so the cost of evaluating them ranges from cheap to very expensive. Foot switches were able to provide only temporal measures of variability [5,6,53,68,74], but achieved the highest average ES values [64,88] while also being one of the cheaper methods of evaluating gait stability. Studies using foot/floor mats, on the other hand, although cheap, obtained the lowest average ES.
Although the threat posed by falls in western civilization is receiving increasing clinical and economic attention, assessment methods designed to identify fall-prone individuals remain controversial. While biomechanical approaches for assessing gait stability seem able to contribute towards quantifying the dynamic stability of older individuals, they have hardly been taken up in clinical settings. This could be a consequence of the unclear effectiveness of these approaches, together with the additional time and effort required for their use. The objectives of this systematic review were to address this issue by firstly ascertaining the strength of biomechanical measurements of gait stability in detecting fall- and age-related differences, and secondly assessing their applicability in a clinical setting, including subject mobility and associated costs.
To this end, a literature search was performed and data were extracted with regard to variability as well as stability measures, cohort sample sizes and the methodology used. The ES was then calculated from the cohort means and s.d.s of the outcome measures in order to assess quantitatively the effects of age- and fall-related differences in a standardized manner. The results of this survey demonstrate that linear variability of temporal measures of swing and stance was capable of distinguishing between fallers and non-fallers, whereas step width and stride velocity proved more capable of discriminating between old and young adults. The variability in stride time was able to identify both age- and fall-related differences. Examination of nonlinear measures, observed using wavelet exponent of the MinFC series (variability) and FM parameters (stability), revealed more effective identification of fall-related than age-related differences. The results, therefore, indicate that not only are measures of both variability and stability able to capture aspects of gait stability (level of noise in the human motor performance as well as dynamic error correction), but also that gait stability might be differentially affected by ageing versus falling. This systematic review also confirms that gait instability, as indicated by increased variability of linear measures or altered structure of variability of nonlinear measures of foot trajectories, is higher in both elderly and fall-prone adults. Biomechanical tools for assessing these measures therefore seem to be effective for identifying subjects with a higher fall risk and should be considered when designing fall prevention strategies.
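The effect size calculation described above can be illustrated with a minimal sketch. It assumes the common form of Cohen's d, in which the difference between cohort means is divided by a sample-size-weighted pooled s.d.; the review states only that ES was computed from cohort means and s.d.s, so the exact pooling formula, the function name and the argument names below are assumptions made for illustration.

```python
import math


def cohens_d(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Cohen's d for two independent cohorts (illustrative helper).

    Assumes the pooled s.d. is weighted by each cohort's degrees of
    freedom; other pooling conventions exist and give slightly
    different values.
    """
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var)
```

With hypothetical inputs such as `cohens_d(3.0, 1.0, 10, 2.0, 1.0, 10)`, two cohorts whose means differ by exactly one pooled s.d. give d = 1, the threshold used in this review to flag highly discriminative measures.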
4.1. Study characteristics
Approximately two-thirds of the studies reviewed performed FNF comparisons rather than OY comparisons. However, only two studies combined both comparisons in one experiment [8,35], and the number of investigations able to capture the relationship between falling and age was therefore limited.
The choice of technology/measurement equipment is driven by the aspects of gait stability to be assessed (variability versus dynamic stability), with certain parameters requiring the analysis of a large number of continuous strides [8,78]. Here, the advent and increased use of remote-sensing devices (e.g. inertial sensors) may aid in greatly reducing the costs of gait stability assessments in future, while allowing the required number of cycle repetitions for assessment of parameters for both stability and variability.
Since elderly females are known to be one of the groups at highest fall risk [89–91], it is critical that studies match or control for gender. In this review, however, fewer than one-half of the articles reviewed accounted for gender differences [5,6,53,63–67,72,74,80,84]. Four of the 10 FNF studies [5,53,72,74] and 10 of the 21 OY studies [6,34,63–67,76,80,84] met this criterion and thus avoided introducing possible bias (even though none of the studies reported any gender interaction effects). FNF comparisons conducted by Hausdorff et al. , Heitmann et al.  and Maki  assessed gait stability in cohorts that were not only gender-matched but also age-matched (table 3), thus avoiding any possible secondary bias.
4.2. Assessment of fallers versus non-fallers
Almost all of the studies that investigated linear parameters indicated that variability of temporal measures (ES > 1) is best able to discriminate between fallers and non-fallers. Although reported only once for an FNF study, inconsistency of s.d. had the highest reported ES among all the linear measures, indicating that fallers had a highly inconsistent variability in the stride time series compared with non-fallers . However, no collective trends were apparent among nonlinear measures investigated in an FNF setting . The nonlinear measures of variability β42, Wυ and PPISD_2, as well as stability measures of both peak and average values of FM (figure 3), were best able to differentiate between fallers and non-fallers. These results suggest that variability in gait patterns, which is relevant for fall risk, may be associated with specific time scales. This is demonstrated by the change in slope of β, representing a reduction in the spectral density (variability in the time-domain distributed at different frequencies), inconsistency of s.d. and the major axis of the Poincaré plot PPISD_2, which represents long-term variability. In addition, differences in stability measures of FMmax and FMmean showed slower dynamic error correction during foot kinematics among fallers (figure 3).
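The Poincaré descriptors mentioned above can be sketched as follows. This is a minimal illustration that assumes the conventional lag-1 definition, in which each value of a gait time series is plotted against the next and the dispersion is measured across (SD1, short-term) and along (SD2, long-term) the line of identity. Whether the reviewed study computed PPISD_2 in exactly this way is an assumption, and the function name is hypothetical.

```python
import math
import statistics


def poincare_sd(series):
    """Return (SD1, SD2) of the lag-1 Poincaré plot of a time series.

    SD1 is the dispersion perpendicular to the line of identity
    (stride-to-stride changes); SD2 is the dispersion along it
    (slower, long-term fluctuations).
    """
    current = series[:-1]
    following = series[1:]
    # Rotate the (x_n, x_{n+1}) cloud by 45 degrees.
    across = [(b - a) / math.sqrt(2) for a, b in zip(current, following)]
    along = [(b + a) / math.sqrt(2) for a, b in zip(current, following)]
    return statistics.stdev(across), statistics.stdev(along)
```

A steadily drifting series has no dispersion across the line of identity (SD1 = 0) but a large SD2, whereas a strictly alternating series shows the opposite pattern, which is why SD2 is read as a marker of long-term variability.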
Approximately half of the measures incorporated in the 10 FNF studies were temporal measures and had a Cohen's d value of greater than 1, whereas only one measure, namely CV of step time as assessed by Brach et al. , showed no significant differences between fallers and non-fallers (ES = 0). The same authors also investigated the CV of step width and stance time (figures 2 and 3). As in the case of step time, no significant results were reported for stance time (ES < 0.2). These results contradict other studies included in the review, which reported temporal measures to be highly sensitive in the FNF comparison. The authors noted, however, that their study participants walked faster than in other studies.
Linear variability measures of swing time, stride time and stance time reported for studies with faller category 3 (at least one fall in the 5 years prior to the experiment) showed higher ES values  than prospective studies that examined faller category 1 (at least one fall in the 12 months following data collection) [5,53], indicating that a fall event might lead individuals to compromise their gait patterns, with an associated increase in gait instability (figure 2). It should be noted, however, that the ESs for step width s.d. were very similar between these two categories. As expected, measures estimated in experiments using faller category 4 (at least three falls in the 6 months following data collection) showed considerably higher ESs than measures in studies using faller category 1. The higher ES for faller category 4 indicates that recurrent fallers show deteriorated levels of gait stability compared with first-time fallers (figure 2). This poorer performance might be associated with fear of falling , but recent studies have been unable to establish a link between the likelihood of future falls and fear of falling .
4.3. Assessment of old versus young subjects
Nine of 31 measures reached an ES > 1 regarding age-related differences in gait kinematics (figures 4 and 5), with the linear variability measure of step width producing the most repeatable results: this parameter resulted not only in very high ES values, but this result was also confirmed in eight studies [52,59,64,67,77,79–81] (figure 4). Malatesta et al. , Buzzi et al.  and Kang & Dingwell [64,65] have all evaluated CV and s.d. measures in the same cohorts, as well as nonlinear outcome measures of LEs in OY subject cohorts. Here, the calculated ES values demonstrated higher sensitivity concerning age-related differences of linear measures than their nonlinear counterparts. Elderly subjects therefore seem to not only walk with increased variability, but also have a poorer performance in correcting local disturbances compared with young adults [93–96].
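For readers less familiar with the linear variability measures compared here, the sketch below shows how the s.d. and CV of a gait parameter are typically obtained from a series of stride-by-stride values. The function name and the example values are hypothetical, and the CV is assumed to be expressed as a percentage of the mean.

```python
import statistics


def stride_variability(values):
    """Return (s.d., CV in %) for a series of stride-by-stride values,
    e.g. stride times in seconds or step widths in centimetres."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample s.d. (n - 1 denominator)
    return sd, 100.0 * sd / mean
```

For example, `stride_variability([1.0, 1.1, 0.9, 1.0])` describes four hypothetical stride times with a mean of about 1 s. Because the CV normalizes by the mean, it allows comparison between cohorts that walk at different cadences, whereas the s.d. retains the original units.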
Increased variability, as assessed using linear measures, has additionally been observed in the gait parameters of elderly subjects during obstacle crossing [88,97,98]. Malatesta et al.  reported that CV of stride time was significantly higher in elderly compared with young subjects, whereas the three nonlinear measures assessed in their study, associated with the fluctuation dynamics of stride time (non-stationary index, inconsistency of variance and fractal scaling index), did not significantly differ among the cohorts. The results corresponded with the calculated ES values, suggesting that linear measures may indeed be more able to detect age-related differences than the nonlinear indices investigated. Caution must be exercised, however, as measures in studies estimating gait stability with a larger mean age difference between cohorts yielded higher ES values than measures in studies with a lower mean age difference between cohorts. This indicates that the mean age difference between experimental and control groups in OY comparisons critically leads to heterogeneity and can influence ESs in outcome measures.
4.4. Combined old versus young and fallers versus non-fallers
When considering comparisons between OY and FNF groups, the variability (s.d. and CV) in step width (measured 11 times in the studies included) was found to have a higher discriminative power in the OY comparisons [52,59,64,67,77,80,81,85,99] than in the FNF comparisons [53,60,72] (figures 2 and 4). This suggests that although step width variation may increase with age, it is not necessarily a dominant factor in fall risk. Maki  even found that variability in step width was lower in fallers. On the other hand, variability of almost all temporal measures (stride, step, stance, swing and double support) showed the highest ESs in the FNF comparisons (figure 2). These results suggest that the ability to reliably reproduce a gait pattern is important for avoiding falls, but that its deterioration, as indicated by increased temporal variability, is unavoidable with age. If this deterioration continues unimpeded without compensation or correction, it may lead to increased fall risk, a concept that is consistent with the increased propensity to fall among subjects in higher age categories [3,4]. Although caution must be exercised in interpreting these results because of small sample size, it does seem that nonlinear measures might be more sensitive to fall-related than to age-related differences.
4.5. Sources of heterogeneity among studies
In any comparison of different studies, it is difficult to ensure homogeneity when faced with data from a range of different sources. One of the biggest hurdles in this review was to standardize the assessment of a wide variety of methodology and experimental protocols and settings. As a prime example, cohort sample sizes ranged from 4  to 422  subjects. Although the ES were calculated from pooled s.d.s, thereby standardizing the sample size, the results obtained from studies that had very low sample sizes must be interpreted with caution.
As noted previously, one of the major sources of heterogeneity was preferred walking speed. Measuring gait stability at preferred walking speeds would provide assessments that are closer to the real-life scenario (i.e. to daily routine). This approach, however, is not without controversy. For example, England & Granata  and Kang & Dingwell  found increased LEs at increasing walking speeds, indicating decreased dynamic stability at higher speeds. On the other hand, Bruijn et al. , using LEs at different walking speeds (in young cohorts), found both increasing and decreasing stability values at a higher speed, depending on the estimates used (λ_s versus λ_l) as well as the projected direction (anteroposterior (AP) versus mediolateral (ML)). Furthermore, Kang & Dingwell  as well as Bruijn et al.  found decreased gait variability at increased walking speeds, while Maki  found decreased step width variability at increased walking speeds among fallers. Thus, it appears that walking at different speeds is likely to influence not only the level of noise in the human motor performance but also dynamic error corrections.
Furthermore, heterogeneity from walking speed is induced not only because of differences in protocols but also, and more importantly, because of the dynamic characteristics of different cohorts. More specifically, assessment of gait stability at preferred gait speeds can be seen as an evaluation of the actual, subject-specific behaviour. Although walking may differ between cohorts (e.g. younger cohorts are likely to walk faster than their older counterparts), quantification of gait stability in such scenarios is more representative of performance in daily life. However, if the aim of a study is to assess differences in task performance in a standardized manner, then gait stability should be determined at fixed or specific walking speeds, in order to capture the spectrum of performance at different effort levels.
Another factor leading to heterogeneity in the experimental protocol could be the number of cycles analysed, which varied from as few as four strides to as many as 1000. While investigating LEs and FMs, Bruijn et al.  found that increasing the number of strides increased the specificity of both Lyapunov and Floquet measures; thus, it is likely that the ES estimates provided here could also be influenced by differences in the experimental protocols of the various studies included in this review. In addition, factors such as whether or not these gait cycles were collected continuously, and whether or not a treadmill was used, all influence the outcome in different ways. Although the use of a treadmill allows a considerable number of strides to be measured, a necessary requirement to gain reliable and precise estimates of linear  as well as nonlinear variability , walking on a treadmill is known to differ from walking over ground [103,104]. Studies that incorporated pressure walkways often measured multiple runs to record sufficient strides in order to obtain precise and reliable estimates of variability, thus making these protocols non-continuous. Here, it is important to note that the repetitive stop-and-go movements induce transients in the stride trajectories and potentially bias the variability estimates.
As a final point, groups tended to favour the investigation of specific parameters; e.g. Khandoker et al. [8,73] were the only research group to examine wavelets and Poincaré maps, but did this in two separate studies. It is also possible to reduce heterogeneity through publication of datasets from the same subjects in several articles, e.g. Kang & Dingwell [64,65] or Khandoker et al. (n = 10; age = 71.0 ± 2.1  versus n = 30; age = 69.1 ± 5.12 ), thus providing a comparison of different measures in the same/similar cohorts.
4.6. Perspectives on outcome measures
In order to obtain homogeneity across the measures as well as studies assessed and increase comprehensibility of results, the scope of this review was limited not only in terms of experimental protocol but also in design. Thus, studies that investigated kinematics of foot time series in OY and FNF populations were included, while studies that included patients with musculoskeletal disorders [105–107] were excluded, as well as studies using external perturbations [108–110] or the assessment of trunk kinematics [35,65]. This review, therefore, offers a somewhat restricted vision of the complete domain of gait stability. Furthermore, the assessment of foot kinematics may result in a biased interpretation of the ability of parameters to discriminate between fallers and non-fallers on a subject-specific basis.
Only one study investigating FNF and OY comparisons used FMs to examine geriatric populations and foot kinematics . This study found an association between fall risk and FMmax and FMmean, a finding that was corroborated by investigations on trunk kinematics in young healthy adults in normal as well as visually and mechanically perturbed environments . McAndrew et al.  found that subjects demonstrated direction-specific responses to perturbations using FMmax. While perturbations in ML and AP directions affected the orbital stability of trunk movements in the associated directions, both these perturbations had an opposite effect on orbital stability in the axial direction. These suggestions were confirmed by Granata & Lockhart , providing evidence that FMs are (i) capable of identifying differences in FNF and OY cohorts, (ii) sensitive in identifying the effects of visual and mechanical perturbations, (iii) unaffected by walking speed [35,111], and (iv) direction-specific . In contrast, however, Hobbelen & Wisse  suggested that FMs might have severe disadvantages as a measure for fall avoidance in bipedal robots. Furthermore, evidence from both modelling [113,114] and experimental human studies [108,115] suggests that FMmax might not necessarily be related to fall risk. The use of FMmax in the assessment of fall risk, therefore, remains controversial. Moreover, no studies that met our inclusion criteria found an association between LEs (λs) and fall risk (FNF comparisons). However, this could be due to the inclusion and exclusion criteria of this review rather than any ineffectiveness of LEs in identifying fallers [34,65,111].
Although DFA exponents were able to quantify the age-related differences in gait variability, caution must be exercised in interpreting this parameter owing to the fact that this measure (i) has been reported only in a limited number of studies and (ii) quantifies the structure rather than the quantity of variability associated with time series being analysed . Here, higher exponents (absolute alpha values >0.5) indicate larger correlations within the time series, which could be interpreted as the variability being largely deterministic, whereas smaller exponents could be interpreted as variability being largely stochastic (i.e. random). In a similar manner, it could be expected that higher absolute alpha values (>0.5) are associated with gait in the elderly compared with younger subjects owing to the more deterministic nature of their neuromuscular system. Indeed, Khandoker and co-workers  found higher exponents for both elderly healthy as well as elderly fallers. The implications of a deterministic or stochastic response on fall risk based on gait time series, however, remain unknown.
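To make the interpretation of the DFA exponent more concrete, the following minimal sketch estimates α for a time series. It assumes first-order DFA with non-overlapping windows and linear detrending; the reviewed studies may have used other window ranges or detrending orders, and all function and parameter names here are hypothetical.

```python
import math


def _linear_fit(xs, ys):
    """Least-squares slope and intercept of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def dfa_alpha(series, window_sizes):
    """Estimate the DFA scaling exponent (alpha) of a time series."""
    # Step 1: integrate the mean-centred series to obtain the profile.
    mean = sum(series) / len(series)
    profile, total = [], 0.0
    for value in series:
        total += value - mean
        profile.append(total)

    log_n, log_f = [], []
    for n in window_sizes:
        # Step 2: remove a linear trend from each non-overlapping window.
        squared_residuals = []
        t = list(range(n))
        for start in range(0, len(profile) - n + 1, n):
            segment = profile[start:start + n]
            slope, intercept = _linear_fit(t, segment)
            squared_residuals.extend(
                (y - (slope * ti + intercept)) ** 2 for ti, y in zip(t, segment)
            )
        # Step 3: root-mean-square fluctuation at this window size.
        fluctuation = math.sqrt(sum(squared_residuals) / len(squared_residuals))
        log_n.append(math.log(n))
        log_f.append(math.log(fluctuation))

    # Step 4: alpha is the slope of log F(n) against log n.
    alpha, _ = _linear_fit(log_n, log_f)
    return alpha
```

Applied to uncorrelated noise, this procedure is expected to return an exponent near 0.5, while strongly correlated series such as a random walk give values well above 1. This is the sense in which the exponent describes the structure, rather than the magnitude, of variability.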
The studies summarized here indicate that altered structure of the variability in the temporal components of gait kinematics is associated with fall risk. This is in agreement with the studies of diseased populations, including Schaafsma et al. , who reported increased stride time variability among Parkinsonian fallers and non-fallers. In a similar manner, both stride interval  and step length  variability increased in cohorts of healthy young adults during walking on a compliant surface compared with a normal surface. Such changes in variability of spatio-temporal time series during gait therefore seem to be a potential biomarker for fall risk assessment in disturbed intrinsic and extrinsic scenarios.
This systematic review of the literature has been performed in order to understand better the ability of biomechanical measures to distinguish potential fallers from non-fallers at an early stage. The results suggest that linear variability measures of stride, swing and stance time are the parameters most capable of distinguishing between fallers and non-fallers, whereas variability in step width, stride time and velocity is more attuned to identifying kinematic differences between old and young adults. Measures of stability have proved to be more effective in identifying fall-related differences than age-related differences. Present results show that gait instability, indicated by higher variability of foot kinematics, is increased in elderly as well as in fall-prone adults. The results of this review, therefore, suggest that low-cost devices, together with quantitative evaluation of suitable measures to assess gait stability, may indeed provide the predictability required to identify future fallers in a clinical setting, and should be considered when designing subject-specific fall prevention strategies and interventions. However, it does seem clear that the best study designs would incorporate objective measurements of gait stability and fall detection in combined retrospective and prospective investigations.
Given the evidence provided in the current review, an assessment of the changes in measures for assessing gait stability after specific intervention programmes or therapies might provide improved access towards understanding the relative sensitivity and specificity of these measures for differentiating between fallers and non-fallers in clinical settings. Further research that includes not only the cross-fertilization of stability and variability parameters, but also sufficiently large sample sizes, is however required in this field before conclusions can be drawn regarding their efficacy for early identification of fallers. In conclusion, biomechanical measurements appear promising for identifying individuals at fall risk and can be obtained with relatively low-cost tools. Incorporation of the most promising measures in combined retrospective and prospective studies for understanding fall risk and designing preventive strategies is warranted.
This research was supported by the EU framework 7 project VPHOP (FP7-223864) and the EU framework 7 project MXL (FP7-248693).
- Received June 28, 2011.
- Accepted August 5, 2011.
- This journal is © 2011 The Royal Society