## Abstract

Hidden Markov models are widely used to describe single channel currents from patch-clamp experiments. The inevitable anti-aliasing filter limits the time resolution of the measurements and therefore the standard hidden Markov model is not adequate anymore. The notion of time-interval omission has been introduced where brief events are not detected. The developed, exact solutions to this problem do not take into account that the measured intervals are limited by the sampling time. In this case the dead-time that specifies the minimal detectable interval length is not defined unambiguously. We show that a wrong choice of the dead-time leads to considerably biased estimates and present the appropriate equations to describe sampled data.

## 1. Introduction

Ion channels are large proteins which are situated in the cell membrane and which control the flux of ions from one side of the cell membrane to the other. It is assumed that these proteins can exist in different configurations which correspond to minima of the energy landscape of the protein. The stochastic switching of the protein between these configurations or states is thus adequately described by a continuous-time Markov chain.

Usually the channels exhibit only two experimentally distinguishable conductance levels—open or closed. Since in general more physiological configurations of the protein exist the switching between the states cannot be observed directly. Therefore, the measured current through single ion channels is described by an aggregated Markov process. The unobserved dynamical behaviour of the channel is modelled by a Markov chain and the observed current depends on whether the occupied state belongs to the closed or the open aggregate. Information on modelling of ion channels with aggregated Markov models can, e.g. be found in Colquhoun & Hawkes (1982) or Fredkin *et al*. (1985). Hidden Markov models additionally incorporate the observational noise which is assumed to be white and Gaussian. Since in aggregated models the states of the underlying Markov chain are also hidden we will in the following not distinguish between these two terms. For a detailed description of hidden Markov models confer, e.g. Rabiner (1989) and MacDonald & Zucchini (1997). A thorough review can be found in Ephraim & Merhav (2002).

Owing to the aliasing effect the measured currents have to be low-pass filtered before sampling which violates the assumption of the standard hidden Markov model that the noise-free current changes instantaneously between open and closed and the observational noise is white. If the noisy, filtered time series is fitted to a hidden Markov model it has been shown that this violation biases the parameter estimates (Venkataramanan *et al*. 1998*b*). Furthermore when different models are compared this misspecification can lead to the selection of the wrong model (Michalek *et al*. 1999). Several approaches to extend hidden Markov models to incorporate the filter effects have been developed (Venkataramanan *et al*. 1998*a*,*b*, 2000; Michalek *et al*. 2000; Qin *et al*. 2000; Fredkin & Rice 2001). All these extensions have the major drawback that they are numerically expensive.

In another approach the noisy signal is idealized and converted into a sequence of open and closed dwell times, e.g. by means of a half-amplitude threshold. The filter gives rise to the omission of brief intervals since this threshold is not achieved by events that are too short. Thus, a constant dead-time *τ* of the filter is introduced and, for example, all open times shorter than this dead-time *τ* are added to the adjacent closed times forming long apparent closings.

Hawkes *et al*. (1990) have developed recursion formulas for the exact solution of the problem. Since with these equations the open time and closed time distributions cannot be calculated reliably for large dwell times, *t*, Jalali & Hawkes (1992) have given an approximate solution for this case. However, their model assumes that the measured time-intervals are obtained exactly, i.e. without sampling. Such continuous interval durations can, e.g. be achieved if the idealization of the data is performed by interpolation or time course fitting (Colquhoun & Sigworth 1983).

If instead the dwell time lengths are obtained by a simple threshold crossing method, they are multiples of the sampling time. In this paper we derive the appropriate solution for this situation. In this case the dead-time *τ* cannot be imposed on the data unambiguously. Let Δ*t* denote the sampling time and *n* a natural number and consider the case that all events from the sampled data record shorter than, e.g. *n*Δ*t* are missed. If the dead-time *τ* is defined such that all events strictly shorter than *τ* are missed and all events longer than or equal to *τ* are detected, the dead-time should be chosen as *τ*=*n*Δ*t*. If, conversely, the dead-time *τ* is defined such that all events shorter or equal to *τ* and all intervals strictly longer than *τ* are detected, the dead-time should be chosen as *τ*=(*n*−1)Δ*t*. Even all times between (*n*−1)Δ*t* and *n*Δ*t* would be possible choices of the dead-time *τ*.

We show here that the choice of *τ* being a multiple of the sampling time leads to a substantial bias in the estimation. Moreover, we derive the appropriate equations for the case of sampled data in the fashion of Hawkes *et al*. (1990) and Jalali & Hawkes (1992).

The method presented here is applicable for the analysis of ion channels that exhibit only two conductance levels but a generalization to ion channels with subconductance levels is possible.

The corresponding theoretical considerations are presented in the next section where the exact and the approximate solution for sampled data are determined. In §3 we investigate the validity of the approximate solution and compare the sampled and unsampled versions in their ability to estimate parameters.

## 2. Theory

### 2.1 Filtering with fixed dead-time

First, we introduce the filter model which is illustrated in figure 1. The upper panel displays an idealized single channel current before filtering. The second open time which is shorter than *τ* is not detected which is shown in the lower panel. Our definition of the time resolution *τ* is such that intervals that are shorter than or equal to *τ* are not detected.

This is an approximation to the behaviour of real low-pass filters but it is applicable if short intervals do not occur in quick succession. Indeed, the commonly used Bessel filter distorts the form of the incoming signal. However, relevant for the analysis is only the duration above or below the half-amplitude threshold. If an interval is long enough that the Bessel-filtered signal reaches its full amplitude, the duration above threshold has the same length as the unfiltered signal. Thus, the dead-time *τ* is chosen such that intervals shorter than that are missed and has to be imposed on the current record retrospectively.

Such a simple filter model is not adequate if the incoming signals contain many consecutive short intervals. The Bessel-filtered output would result in an averaged, observed current near the half-amplitude threshold.

In figure 1 the definitions of the extended open and closed time-intervals ^{e}*t*_{O} and ^{e}*t*_{C} are also illustrated which were introduced by Ball & Sansom (1988). The length of an extended dwell time is the same as the corresponding observed dwell times. The extended dwell times are merely shifted by an amount *τ*. We will denote by ^{e}*t* without index ‘O’ or ‘C’ an extended dwell time regardless of being an open or closed time.

### 2.2 Computation of the likelihood

Here, we will derive the likelihood for the observed process with sampling time Δ*t* and dead-time *τ*. To this end, consider the underlying Markov chain with *m* states, generator matrix *Q* and initial probability distribution *π*. Without loss of generality the state space is partitioned into ={1, …, *n*_{C}} and . In the following we will adopt the convention that all times are specified in units of the sampling time Δ*t*.

Let ^{e}*g*_{ij}(*t*) denote the probability that an extended dwell time ^{e}*t* ends at time *t* and the Markov chain is in state *X*(*t*)=*j* at time *t* given that state *i* is occupied at time *t*=0. ^{e}*G*(*t*), the matrix with entries ^{e}*g*_{ij}(*t*), possesses the natural partitioning of

The matrices *Q*, *A*=exp(*Q*Δ*t*) and the vector *π* are partitioned in the same fashion. If *τ* equals zero we return to the classical hidden Markov models and we can readily calculate the matrices(2.1)We will omit the superscript ‘e’ to denote the matrices corresponding to the standard hidden Markov case. Given a sequence of measured dwell times the likelihood can be calculated from these matrices as (Colquhoun *et al*. 1996)(2.2)*u*_{O} denotes a column vector of ones with appropriate dimension. The form of the likelihood is similar to that for hidden Markov models and suggests to perform the calculation in the fashion of the forward-algorithm developed by Baum *et al*. (1970).

The underlying Markov process including time-interval omission is usually referred to as a semi-Markov process. A semi-Markov process passes through states according to a Markov chain having a transition probability matrix *P*. The sojourn times in the individual states are conditionally independent given successive states visited and the distribution of the sojourn times depends only on the state currently visited and the state subsequently entered (Karlin & Taylor 1975). For the unobserved switching of the single channel with time interval omission these properties are inherited from the underlying Markov process. The sojourn time distributions are given by the matrix ^{e}*G*(*t*) and the transition probability matrix *P* by . The problem of time interval omission when the underlying process itself is a semi-Markov process has been addressed by Ball and co-workers (Ball *et al*. 1991, 1993*a*,*b*; Ball & Yeo 1994).

To calculate the matrix-valued functions ^{e}*G*_{OC}(*t*) it is convenient to define the matrix _{O}(*t*). The *ij*th component of _{O} is the probability that the Markov chain is in state *X*(*t*)=*j*∈ at time *t* given that the state *i*∈ has been occupied at time zero and no closing has been detected in the interval [0,…,*t*]. The matrix _{C} is defined similarly. Thus, the matrix ^{e}*G*_{OC} can be expressed as(2.3)

### 2.3 Exact solution of _{O}(*t*)

The key to the calculation of the likelihood lies in the determination of _{O}. The corresponding form for _{C} can be obtained by exchanging the indices O and C.

Hawkes *et al*. (1990) derived an expression for the Laplace-transform of _{O}(*t*). When we replace the Laplace-transforms in their derivation by *Z*-transforms we can apply the same arguments and obtain the following equations for :(2.4)(2.5)An asterisk will be used throughout this paper to indicate the *Z*-transform *f**(*z*) of a function *f*(*t*), . The matrix function *T*(*t*)≔*A*^{t} describes transitions of the underlying Markov chain. Let us define(2.6)Then the *Z*-transform equation (2.5) can be inverted, resulting in(2.7)where ⊗ denotes the discrete convolution. Note, that from the infinite sum in equation (2.5) only a finite number of terms is nonzero in equation (2.7) and the upper limit *K* is the greatest natural number obeying *t*>*K*(*τ*+1).

Hawkes *et al*. (1990) found that for exactly measured intervals the *k*th-summand in equation (2.7) is a product of a polynomial in *t* of degree *k* with exponential terms exp(*μ*_{i}*t*). The time constants *μ*_{i} of the exponential terms are the eigenvalues of the matrix *Q*.

When the data are sampled the solution of equation (2.7) similarly leads to the product of a polynomial in *t* with exponential terms . In this case the constants *λ*_{i} are the eigenvalues of the matrix *A*.

To calculate the convolutions of equation (2.7) it is necessary to compute the finite sums which arise in discrete convolutions. Therefore, we make use of the following two lemmata. The proofs are given in appendices A and B.

*For r*, *and* , *a*≠1 *the following equation holds*:(2.8)*Recursion formulas for the coefficients* *and* *are given in* *appendix A*.

*For t*, *the following sum can be written as a polynomial in t of degree r*+1:(2.9)*The coefficients* *can be calculated recursively as is shown in* *appendix B*.

With these lemmata we are able to solve the convolutions of equation (2.7). It is convenient to define . Then equation (2.7) reads(2.10)and the solution of _{O} is given by the following theorem.

*If A has the real eigenvalues λ*_{i}*, i*=1, …, *m*, *then**where B*_{ik}(*t*) *is a polynomial in t of degree k with coefficients C*_{ikr}*, so**The coefficients C*_{ikr} *are n*_{O}×*n*_{O}-*matrices*.

The proof is by induction over *k*. Consider the spectral decomposition of . Then, from the *Z*-transform of *T*(*t*) follows:and the theorem is true for *k*=0 with . We assume that the theorem is true for *k* and calculatewhere *H*_{OO}(*t*) has been already defined as . For convenience we define the *n*_{O}×*n*_{O}-matrix and note that . We thus obtainIf the terms (*) and (**) are replaced by equations (2.8) and (2.9) of lemmata 2.1 and 2.2, respectively, the equation readsIf we interchange the indices *i* and *j* of the summand in the first row we obtainExchanging the order of summations over *s* and *r* and sorting the terms by powers of *t*^{s} gives the desired result ▪

The recursion formulas for *C*_{ikr} can be read off by comparing the coefficients of powers of *t*^{s}. The comparison givesThe recursion starts with .

### 2.4 The asymptotic form of _{O}

The computational complexity for the calculation of _{O}(*t*) increases with growing dwell time *t* and eventually becomes unstable. The absolute value of the involved coefficients becomes large whereas the entries of the matrix _{O} remain between zero and one. The subtraction of the large, rounded quantities leads to a cancellation error in the calculation. We, therefore, present an asymptotic form of _{O} valid for large dwell time *t*. The derivation follows the method of Jalali & Hawkes (1992).

We define the matrix-valued functions *H*(*z*) and *W*(*z*)(2.11)and obtain from equation (2.4)The formula for the inverse *Z*-transform and the residue theorem(2.12)imply that the asymptotic behaviour of _{O}(*t*) is given by the poles *z*_{i} of _{O}(*t*) with largest absolute value. Here Res_{i} denotes the residues of at the pole *z*_{i} and the sum is over all poles of . The pole with largest absolute value corresponds to the root of the determinant of *W*(*z*) with largest absolute value.

We will now state the main results in the form of two theorems which give the asymptotic distribution of _{O}(*t*).

*If H*(*z*) *is irreducible*, det *W*(*z*) *always has a simple real root z*_{1}<1 *which is greater than the absolute value of any other root*. *Then, as t*→∞, _{O}(*t*) *is given by**where c*_{1} *and r*_{1} *denote the right and left eigenvectors of H*(*z*_{1}) *corresponding to the eigenvalue z*_{1}.

The proof is shown in appendix C. The asymptotic distribution can be improved, when the generator matrix *Q* of the underlying Markov chain obeys the law of detailed balance. A system that is in equilibrium is subject to this principle which is sometimes also termed microscopic reversibility. In this case we can show that has exactly *n*_{O} real roots which also contribute to the asymptotic behaviour.

*If H*(*z*) *is irreducible and Q obeys the principle of detailed balance*, det *W*(*z*) *has exactly n*_{O} *real roots z*_{i}. *If all roots are distinct*, *then*, *as t*→∞,*Again c*_{i} *and r*_{i} *denote the left and right eigenvectors of H*(*z*_{i}) *corresponding to the eigenvalue z*_{i}.

The proof is performed exactly as the proofs of theorems 2.2 and 3.2 in Jalali & Hawkes (1992).

It is reported that in the continuous-time case these asymptotic solutions work well for even moderate dwell times *t* which is in accordance with our experience. From the numerical studies, we have performed we suppose that the same applies for the approximation presented here.

## 3. Simulation study

### 3.1 Open and closed time distributions

As a numerical example we consider the gating scheme sketched in figure 2. The sampling time was chosen as Δ*t*=0.02 ms and a dead-time of *τ*=4Δ*t* was imposed on the data. In figure 3 the open and closed time distributions are displayed. The open time distribution is calculated by*u*_{C} denotes a vector of ones with dimension *n*_{C}. The exact dwell times can be calculated reliably for dwell times as large as . The approximate solution has been applied for larger times.

In figure 4 we compare the exact open and closed time distribution with the asymptotic approximation for large dwell times *t*. The difference between these distributions relative to the exact distribution is shown. For *t*≤2*τ* the approximation underestimates the true distribution. For moderate values around *t*>2*τ* the approximation is already good. This is in accordance with the findings of Hawkes *et al*. (1990, 1992) who reported a good agreement for similar times *t*.

### 3.2 Parameter estimation

In this section we compare the continuous-time model of Hawkes *et al*. (1990) and the procedure derived in §2 with respect to estimation of rate constants. We show that for sampled data the solution of Hawkes *et al*. (1990) leads to biased estimates if the dead-time *τ* is chosen as a multiple of the sampling time Δ*t*.

#### 3.2.1 Two-state model

We simulated 500 datasets each with 1 500 000 data points from the simple two-state gating scheme sketched in figure 2. Again, the sampling time was chosen as Δ*t*=0.02 ms and open and closed time intervals were determined by a half-amplitude threshold. This resulted in dwell times that are integer multiples of the sampling interval. A dead-time *τ* was imposed on the data such that all events shorter than or equal to *τ*=4Δ*t*=0.08 ms are missed.

We re-estimated the parameters by maximizing the likelihood with the equations given by Hawkes *et al*. (1990) and Jalali & Hawkes (1992) using three different dead-times. The time resolution has been chosen as *τ*=4Δ*t*, 4.5Δ*t* and 5Δ*t*. Furthermore, we re-estimated the parameters with the equations developed in §2, where the dead-time is uniquely defined. The likelihood has been maximized numerically by a quasi-Newton method from the NAG-library (subroutine e04ucf of The Numerical Algorithms Group Ltd 1999). In all cases we used the approximate solution for values of *t*>11*τ*. The necessary search for the roots of det *W*(*z*) has also been performed by a routine from the NAG-library (subroutine c05avf and c05azf).

In table 1 the mean value and standard deviation of the estimates for the various methods are summarized. It shows that the method of Hawkes *et al*. (1990) and Jalali & Hawkes (1992) leads to a considerable bias downwards for both rate constants when the dead-time is chosen as *τ*=4Δ*t*. The rate constant *q*_{CO} is underestimated about 30 Hz which corresponds to a relative difference of 16% and the estimate of the rate constant *q*_{OC} has a bias of 550 Hz giving a relative difference of 7%. Both parameters obtained with a dead-time of *τ*=5Δ*t* are overestimated. The magnitude of the bias is the same as for the case of *τ*=4Δ*t*. When the dead-time is chosen in the centre between these two times the parameters are estimated unbiased. The model incorporating the sampling of the data also gives the correct estimates. The standard deviations increase with larger estimates. The standard deviations obtained from the continuous-time formulation with *τ*=4.5Δ*t* and from the model incorporating sampling are essentially the same.

The distribution of the parameter estimates for the different dead-times are displayed in figure 5. The right-most graphs show the distribution of parameter estimates obtained from the model incorporating the sampling of the data. The arrows mark the true values. The figure confirms the biases when using a multiple of the sampling time as dead-time. It also shows that the maximum likelihood estimators are in good approximation Gaussian distributed.

#### 3.2.2 Four state model

As a second example the four state gating scheme sketched in figure 6 is investigated. Again, 500 datasets each with a length of 1 500 000 data points have been simulated from the model. The sampling time has been chosen to be 0.02 ms and the dead-time of *τ*=4Δ*t*=0.08 ms has been imposed on the data. For each dataset the maximum likelihood estimators have been calculated.

Again, the maximum likelihood estimators have been calculated with the method of Hawkes *et al*. (1990) for three distinct dead-times and with the procedure developed in §2. In all cases the approximate solution has been applied for dwell times larger than *t*>11*τ*. The findings for the four state scheme are similar to those for the two-state model and are summarized in table 2. It shows the mean values with standard errors and the standard deviations of the parameters determined from the 500 simulated datasets. The parameter estimates obtained from the method of Hawkes *et al*. (1990) and Jalali & Hawkes (1992) with a dead-time of *τ*=4Δ*t* are biased downwards whereas the estimates determined with a dead-time of *τ*=5Δ*t* show a bias upwards. The largest relative bias amounts to approximately 35% in this simulation study. If the dead-time is chosen as *τ*=4.5Δ*t* the parameters are estimated unbiasedly. The method developed here yields also unbiased estimates.

## 4. Discussion

Continuous-time hidden Markov models with binary outcomes are widely used to describe single ion channel currents. To account for the necessary low-pass filter the standard model has been extended taking into account that intervals shorter than a fixed dead-time *τ* are not detected.

This task has been solved exactly by Hawkes *et al*. (1990) and their method is widely used for the analysis of single-channel currents (Colquhoun *et al*. 2003; Hatton *et al*. 2003; Beato *et al*. 2004; Burzomato *et al*. 2004). For the determination of dwell time distributions it is necessary to calculate expressions containing *m*-fold convolutions, where *m* denotes a natural number. The corresponding integrals can be solved straightforwardly by integration by parts. The resulting distributions are piecewise defined and in the range [(*n*+1)*τ*,(*n*+2)*τ*] they have the form of a sum of polynomials of degree *n* multiplied with exponentially decaying terms of the form exp(*λ*_{i}*t*), where *λ*_{i} denotes the eigenvalue of the matrix *Q*. The coefficients of the polynomials are calculated recursively from the rate constants *q*_{ij}.

However, the model of Hawkes *et al*. (1990) assumes dwell time intervals on a continuous time-scale. If the idealization of the data is performed by time course fitting (Colquhoun & Sigworth 1983) or by interpolating between sample points, continuous durations can be obtained. However, there exist other idealization techniques, e.g. half-amplitude threshold crossing, the Hinkley detector (Schultze & Draber 1993) or the segmental *k*-means method (Qin 2004), such that the accuracy of the measured times is limited by the sampling interval Δ*t*. Then, the dead-time entering the equations of Hawkes *et al*. (1990) is not clearly defined. If all intervals shorter than or equal to *n*Δ*t* are undetected it is not clear which time between *n*Δ*t* and (*n*+1)Δ*t* should be chosen as dead-time *τ*.

In this paper we have derived the appropriate equations to analyse a sampled, continuous-time hidden Markov model incorporating time interval omission if the idealization yields time intervals that are multiples of the sampling time. We used the same method as Hawkes *et al*. (1990) to determine dwell time distributions and also obtained expressions containing *m*-fold convolutions given by equation (2.7). In contrast to the arising integrals of the continuous-time case, the corresponding finite sums are not straightforwardly computed. We have shown that these sums are polynomials and have derived recursion formulas to calculate the corresponding coefficients.

With lemmata 2.1 and 2.2 we were able to show that the exact dwell time distribution resembles the solution of Hawkes *et al*. (1990). It also is piecewise defined and in the range [(*n*+1)*τ*, (*n*+2)*τ*] it has the form of a sum of polynomials of degree *n* multiplied with exponentially decaying terms of the form where *λ*_{i} denotes the eigenvalues of the matrix *A*. The resulting recursions are somewhat more involved compared to the solution in the continuous-time formulation. Again, the complexity increases with increasing *t* and becomes computationally infeasible for large dwell time *t*.

Therefore, we give in §2.4 an asymptotic solution for large *t*. The resulting open time distribution is a sum of exponential terms *μ*_{i}*t*. The number of summands equals the number of open states *n*_{O} and the constants *μ*_{i} are given by the root of the determinant of *W*(*z*) defined in equation (2.11). The same applies for the closed time distribution. Thus, for large dwell time *t* the approximate solution has the same form as for the ideal case with no time interval omission. In this case the constants describing the exponential decay are given by the eigenvalues of the matrices *A*_{OO} and *A*_{CC}, respectively.

We have shown in §3 that the asymptotic solution is an accurate approximation even for moderate values of *t* such as *t*≈2*τ*. Correspondingly, in the case of continuous-time modelling Hawkes *et al*. (1992) reported a good agreement between the exact and the approximate solution for similar values of *t* and recommended the use of the approximate solution for times *t*>3*τ*. We found that the computation of the exact solution was even feasible for values of *t* up to *t*<22(*τ*+1), i.e. the equations involve polynomials of degree 20.

In §3 we also performed a simulation study to compare the model of Hawkes *et al*. (1990) and our model with respect to parameter estimation. Using the equations derived in §2 we obtained unbiased estimates of the rate constants. With the method of Hawkes *et al*. (1990) and Jalali & Hawkes (1992) it was also possible to estimate the rate constants of the sampled process correctly when the dead-time is chosen properly. Using a time resolution which lies in the middle of the range of possible dead-times resulted in unbiased estimates. Consequently, for a lower dead-time the parameters were underestimated, a higher dead-time gave overestimated parameters. The magnitude of the bias reached up to 35% in our simulation study.

With the proper choice of the dead-time the continuous-time model and the one incorporating sampling do not differ with respect to the standard deviation of the estimates. Thus, for the numerical example investigated here these two methods perform equally well regarding parameter estimation. However, it is not clear if the choice of time resolution that resulted in unbiased parameter estimates for the continuous-time formulation is applicable as a general rule.

Therefore, we present here the appropriate model which reflects the experimental conditions correctly and takes into account the fact that measured data are sampled.

## Footnotes

- Received April 7, 2005.
- Accepted August 3, 2005.

- © 2005 The Royal Society