## Abstract

In this research, we hypothesized that novel biomechanical parameters are discriminative in patients following acute ST-segment elevation myocardial infarction (STEMI). To identify these biomechanical biomarkers and bring computational biomechanics ‘closer to the clinic’, we applied state-of-the-art multiphysics cardiac modelling combined with advanced machine learning and multivariate statistical inference to a clinical database of myocardial infarction. We obtained data from 11 STEMI patients (ClinicalTrials.gov NCT01717573) and 27 healthy volunteers, and developed personalized mathematical models for the left ventricle (LV) using an immersed boundary method. Subject-specific constitutive parameters were achieved by matching to clinical measurements. We have shown, for the first time, that compared with healthy controls, patients with STEMI exhibited increased LV wall active tension when normalized by systolic blood pressure, which suggests an increased demand on the contractile reserve of remote functional myocardium. The statistical analysis reveals that the required patient-specific contractility, normalized active tension and the systolic myofilament kinematics have the strongest explanatory power for identifying the myocardial function changes post-MI. We further observed a strong correlation between two biomarkers and the changes in LV ejection fraction at six months from baseline (the required contractility (*r* = − 0.79, *p* < 0.01) and the systolic myofilament kinematics (*r* = 0.70, *p* = 0.02)). The clinical and prognostic significance of these biomechanical parameters merits further scrutinization.

## 1. Background

Acute myocardial infarction (MI) is a common cause of premature morbidity and mortality. Although early survival post-ST segment elevation MI (STEMI) is improving [1], the longer term risk of heart failure remains persistently high [2]. The standard of care for assessing the initial severity of heart injury is left ventricular (LV) systolic function, and in particular, LV ejection fraction (LVEF) [1]. Nonetheless, global measures of LV pump function are simplistic [3,4], as compared to biomechanical parameters of pump function, such as myocardial contractility and stiffness.

There is a growing recognition that a computational approach for ventricular biomechanics, when integrated with clinical imaging, can provide insights into heart function and dysfunction [5,6]. For example, carefully designed models have been used to inform various improvement therapies post-MI [7].

Cardiac dynamics are complex multi-physics problems that involve dynamic blood flow, electrophysiology, nonlinear deformation and interactions among them [8]. Substantial effort has been devoted to developing computational models of the heart from simplified representations to more realistic image-derived cardiac models [8–12]. Among the computational frameworks that have been developed, the hybrid finite difference–finite-element version of the immersed boundary method (IB/FE) [13], one of the recent extensions of the immersed boundary (IB) method, has also been used by the authors' group to simulate LV dynamics with a hyperelastic representation of the fibre-reinforced myocardium [12,14,15].

One of the challenges for modelling the heart is that the myocardial constitutive parameters need to be determined prior to the modelling. Various approaches have been developed to estimate those parameters by matching the available clinical measurements (displacement, strain or pressure–volume curve) [16–20]. For example, Genet *et al.* [21] developed an image-based LV model by matching measured strain and volume data to construct a reference stress map which may be used in restoring LV stresses back to a normal level. Predicting the systolic stress inside LV wall will further require the incorporation of myocardial active contraction [8,9,22,23].

However, the biomechanics leading to LV adverse remodelling and heart failure remain incompletely understood [24], and controversy exists [25]. Modelling of diseased hearts has attracted a wide interest [12,26] in recent years. Using a Fung-type constitutive law [23], Guccione and colleagues found that the myofibre stress is increased in the infarct zone, and the contractility in the border zone is reduced. They suggested that the changed mechanical environment may lead to adverse remodelling post-MI [7,16,26,27]. By developing a biomechanical porcine heart model, Chabiniok *et al.* [28] estimated myocardial contractility from *in vivo* data at three time points after acute-MI, and found that the contractility in the remote regions of MI increased 10 days after acute-MI, followed by a further increase 38 days later. By simulating LV dynamics using a patient-specific clinical data, Gao *et al.* [12] reported that required myocardial contractility after acute-MI was much higher compared with a control heart, suggesting an increased use of the contractile reserve in the myocardial remote zone for the patient. Asner *et al.* [29] estimated peak contractility in a healthy volunteer and two patients with dilated cardiomyopathy using personalized mechanical LV models and, again paradoxically, the higher peak contractility was observed in the patients. However, myocardial contractility varied considerably for healthy and diseased hearts when estimated using computational models. The reasons for this variability are unclear but may relate to inter-individual variations, sample size or technical factors.

Biomechanical parameters, such as myocardial contractility and mechanical properties, are more directly linked with pump performance than global measures of systolic function (i.e. LVEF), and should have greater discriminative value for heart function and prognosis. However, direct measurements of these indices are very challenging, if not impossible *in vivo*, which limits the use of higher fidelity measures of pump function in clinic. Furthermore, it is not immediately obvious which of the biomechanical parameters have more clinical relevance or prognostic value [30]. The lack of a case-controlled study of the myocardial mechanical modelling between the healthy and diseased LVs makes it hard to identity the links between biomechanical characteristics and MI pathologies. The aim of this work is to study how myocardial contractility and pump function varies between healthy subjects and patients with a recent MI. Our questions are:

(1) Does myocardial mechanically function differ in patients versus controls?

(2) If so, what are the differences in the proposed biomechanical parameters?

(3) Can these parameters be used to classify myocardial contractile function and if so, which classification method is better?

(4) What are the clinical implications of these parameters?

To address these questions, we firstly carried out an extreme case–control study of LV biomechanical behaviours using the IB/FE method in a patient group with acute STEMI and a control group without history of cardiovascular disease. Secondly, we applied various machine learning and statistical classification methods to identify the potential biomarkers which may reflect myocardial function difference, and then evaluated their performance. Finally, we analysed potential associations between the proposed biomechanical biomarkers and the LV function recovery and remodelling at six months within the patient group.

## 2. Material and methods

### 2.1. Study design

The study involves an ‘extreme case–control’ approach [31] in order to enhance the statistical power of the analysis while using a limited sample size. The specific clinical focus of this work will be patients who have suffered a large, acute STEMI that is associated with reperfusion injury (i.e. no-reflow), which is a major, life-threatening cause of acute LV pump failure post-MI [32,33]. No-reflow is defined as an acute reduction in myocardial blood flow despite a patent epicardial coronary artery, and is independently associated with adverse remodelling and adverse outcome [34]. Twenty-seven healthy subjects and 11 patients with acute-MI were enrolled in this study. Cardiac magnetic resonance (CMR) scans were used to computationally infer biomechanical parameters that would otherwise not be available from *in vivo* measurements.

### 2.2. Mechanical left ventricle model

#### 2.2.1. Imaging-derived left ventricle model

In-house developed Matlab (Mathworks, Inc., Natick, MA, USA) code was used to extract the endocardial and epicardial surfaces at early-diastole when the LV pressure is lowest [21]. Short-axis slices from the LV base to apex and three long-axis slices were chosen for manual segmentation, shown in figure 1*a*. Figure 1*b* shows the reconstructed LV geometry. Details of the CMR scans and LV geometry reconstruction are provided in the electronic supplementary material.

In the MI group, CMR scans at 2 days after acute-MI were chosen for model construction. Short- and long-axis late gadolinium enhancement (LGE) images were combined with cine images to define the infarct region, denoted as *U*^{MI} shown in figure 1*c*–*e*. To avoid abrupt change of material properties from *U*^{MI} to functional myocardium, a transition region adjacent to *U*^{MI} was defined within a distance of 10 mm of the infarct boundary, similar to previous work [12]. The reminder of the LV wall was assumed to be unaffected by the infarction, denoted as *U*^{un}, or the remote region. A Lagrangian field *M*(**X**) was introduced to describe the extent of the infarction throughout the whole LV geometry *U* as *M* is 1 in the infarct region, 0 in the remote region and linearly decreases from *U*^{MI} towards *U*^{un} in the transition region (figure 1*f*).

In-house developed b-spline deformable registration method was used to estimate regional circumferential myocardial strain from four short-axis slices from the basal plane to mid-ventricle; each slice was divided into six regions according to the AHA 17-segments definition [35]. LV volumes at end-diastole and end-systole were calculated using corresponding cine images. For each LV model, the measurements from one scan consisted of 24 regional circumferential strains and the LV cavity volume. Because it is difficult to acquire myofibre architecture *in vivo*, a rule-based myocardial fibre generation method [11] was used to describe the fibre and sheet orientations of the myocardium. Myofibre rotates from −60° to 60° from endocardium to epicardium, and the fibre along the sheet direction rotates from −45° to 45°. Because end-diastolic pressure in MI patients is usually higher (10–25 mmHg) than healthy subjects (5–10 mmHg) [36], a population-based end-diastolic blood pressure of 8 mmHg was assumed for the healthy group, and 16 mmHg was assumed for the MI group as in [12]. The LV systolic blood pressure (SBP) was approximated by the cuff-measured systolic pressure taken just before the CMR scans.

#### 2.2.2. Myocardial mechanics

The IB/FE method [13,14] is employed to model the ventricular dynamics at end-diastole and end-systole (details can be found in the electronic supplementary material). The myocardial stress tensor () is modelled as the summation of the active () and passive () stresses, i.e.
2.1The myofibre stress is
2.2where is the normalized myofibre direction in the current configuration. is the passive response described using an invariant-based Holzapfel–Ogden Law [37],
2.3in which *a*, *b*, *a*_{f}, *b*_{f}, *a*_{s}, *b*_{s}, *a*_{fs} and *b*_{fs} are eight unknown parameters, , **f**_{0} and **s**_{0} are the initial fibre and sheet directions, and **f** and **s** are the current fibre and sheet directions. is the right Cauchy–Green deformation tensor, in which is the deformation gradient. We assume myofibre can only bear load when taut, thus . From (2.3), we derive
2.4where *β*_{s} = 1.0 × 10^{5} Pa is the bulk modulus, *p* is the Eulerian pressure, *μ* = 4 *cP* is the viscosity, **u** is the Eulerian velocity and . (*β*_{s}/*J*)log(*J*^{2}) is used to enforce the incompressibility of the immersed solid in addition to the system-wide incompressibility condition [14].

The active stress is computed as
2.5where *T*_{a} is the active tension computed from
2.6where *T*_{req} is the active tension generated by the myocardium when λ_{f} = 1, i.e. *T*_{req} is the minimum value required to meet the pumping demand at the time of imaging. If the innate ability of the myocyte to contract under the maximum activation is denoted as *T*_{max}, which may be measured through stress CMR, then the difference between *T*_{max} and *T*_{req} reflects the contractility reserve of the myocardium. *C*(λ_{f}, *z*) is the effects of myofilament kinetics described by Niederer *et al.* [38] as
2.7where *z* is the available fraction of actin binding sites that is dependent on the intracellular calcium transient, *z*_{max} is the maximum fraction of actin binding sites at a given λ_{f}, *β*_{0} is a constant and *α* is a measure of the curvature of the force–velocity relationship, *Q*_{i}(*i* = 1, 2, 3) are calculated from
2.8where *A*_{i} and *α*_{i} are constants. A simple model of intracellular calcium dynamics [39] is used to trigger the myocardial contraction simultaneously. In our simulations, *T*_{req} is inversely estimated to model the patient-specific systolic LV dynamics, other parameters (*β*_{0}, *α*, *A*_{i}, *α*_{i}, etc.) involved in active tension generation are adopted from Niederer *et al.* [38]. To evaluate systolic myofilament kinetics, we also assess the value of *C*(λ_{f}, *z*) at end-systole, denoted by *C*^{s}.

The differences of the passive and active myocardial responses in the remote, transition and the infarct regions are modelled by changing the strain energy function and active stress [12,26]. Previous studies showed that tissue stiffness due to MI scar increases by 50-fold compared with the remote myocardium [12,27]. Given the similar biological process leading to a MI scar, we assume that the MI scar of all patients is 50 times stiffer
2.9where *M* ∈ [0, 1] takes 0 in the functional myocardium and 1 in the MI region, with a linear transition between them. We further assume the MI tissue does not contract, thus the active tension in the MI heart is
2.10note that partial contractility within MI region is not considered, but assuming the whole MI region is non-contractible. In addition, we adopt the definition used in [21] to define normalized *T*_{a} and *σ*_{f},
2.11

#### 2.2.3. Material parameter identification

Material parameters, including *a*, *b*, *a*_{f}, *b*_{f}, *a*_{s}, *b*_{s}, *a*_{fs}, *b*_{fs} and *T*_{req}, are determined so that the simulated LV dynamics are in good agreement with corresponding CMR measurements. Specifically, the passive myocardial parameters are inversely determined from *in vivo* data (LV cavity volume and the regional circumferential strain) using a multi-step optimization scheme [20]. The active parameter *T*_{req} is determined by matching the measured LV end-systolic volume and systolic circumferential strain (CS) of the clinical measurements. All other parameters in the active tension generation model are kept the same as in [12]. The strains are calculated with the reference configuration set at end-diastole. For the healthy subjects, we first inflate the LV model to the end-diastolic pressure and estimate the passive parameters, then initiate the systolic contraction to determine *T*_{req}. The objective function is
2.12where *ɛ*_{i}, *V* are the *i*th regional circumferential strain and LV cavity volume from the model, and *ɛ*^{mearsured}_{i}, *V*^{measured} are corresponding measured data. *N* is the total number of the control points. Figure 2*a* shows the optimization of *T*_{req} in a healthy LV model, and figure 2*b* is the strain comparison between the CMR measurements and values from the model predications using the optimized *T*_{req}, with good agreement.

For the MI subjects, we only determine the parameters in the remote regions because the MI region is modelled as non-contractile and 50 times stiffer. The passive parameters are determined similarly as for the healthy models, but *T*_{req} in the remote region is determined by minimizing the objective function
2.13in which *N*_{un} is the number of segmental remote regions. The average strain data from the unaffected region in the MI group is 13 ± 3 out of 24 segments.

Further comparisons on LV cavity volume and strains with *in vivo* CMR measurements are summarized in electronic supplementary material, table S1.

### 2.3. Statistical classification methods

The general characteristics that define the difference between a healthy heart and one that experienced an MI are well known. However, it is less clear whether the functional myocardium and its associated biomechanical factors are sufficient to classify the myocardium from a healthy heart or heart after acute-MI. Here we will apply seven different techniques from multivariate statistics and machine learning to assess the potential of myocardial classification given several biomechanical factors and to identify the methods that yield the highest prediction accuracy.

Our portfolio of methods includes univariate logistic regression, multivariate logistic regression with several predictors and the *k*-nearest neighbours classifier (KNN). Furthermore, linear discriminant analysis (LDA) will be applied, as well as sparse logistic regression with L1 regularization (Lasso). We will evaluate two tree-based methods that exploit the change of entropy in the data, as defined in electronic supplementary material, eqn. (S13). This includes decision trees trained with the C5.0 algorithm, and random forests based on bagging with feature sub-selection. Finally, we will apply a non-parametric Bayesian approach with Gaussian processes and automatic relevance determination (GP-ARD). All these methods are described in more detail in electronic supplementary material, §S0.5.

Another important topic is the identification of relevant factors that exhibit the greatest influence on the classification outcome. In this study, we attempt to find those biomechanical factors that possess the strongest explanatory power for the differentiation of myocardial contractile function from a healthy heart or a heart with acute-MI. Several of the previously mentioned methods provide a measure of factor importance, including the Lasso, Decision-Trees, Random Forests and GP-ARD. We will combine the importance measure of these methods into a single-ranked relevance score to identify the biomechanical factors with the highest explanatory power. The results for both studies are presented in §§3.4 and 3.5.

#### 2.3.1. Method evaluation

We assess the accuracy of correctly predicting myocardial contractile function from either a healthy or MI heart in terms of the true positive (TP), true negative (TN), false positive (FP) and false negative (FN) counts. Positive labels refer to MI heart; negative labels refer to healthy heart. Hence, a *true positive* is the correct identification of an MI heart, a *true negative* is the correct identification of a healthy heart, a *false positive* is the misclassification of a healthy heart as an MI heart and a *false negative* is the misclassification of an MI heart as a healthy heart. These counts are obtained out of sample with leave-one-out-cross validation (LOOCV). With LOOCV, a method predicts the class for one observation *i* ∈ {1, …, *n*} at a time based on the training set of the remaining (*n* − 1) observations, where *n* is the total number of observations in the labelled dataset. From the TP, TN, FP and FN counts, we compute the sensitivity,^{1} specificity^{2} and the total misclassification error^{3} to assess the classification accuracy of a method.

We plot, for all methods included in our study, the sensitivity against the complementary specificity (i.e. 1 minus the specificity). The results will be shown later in figure 10. The convex hull of theses scores presents the receiver operating characteristic (ROC) curve of the ensemble of classifiers we have trained.^{4} By numerical integration, we obtain the area under the ROC curve (AUROC) as an overall indication of the classification performance. An AUROC value of 0.5 indicates random expectation, whereas the maximum value of 1 gives perfect predictive accuracy.

For the non-parametric Bayesian approach, the regularization parameters are optimized by maximizing the marginal likelihood of the training data. For the non-Bayesian approaches, the regularization parameters are obtained based on LOOCV. It is important to note that the out-of-sample data used for evaluating the classification performance must not be used for tuning regularization parameters, to avoid an overoptimistic bias. We, therefore, need two nested LOOCV schemes, one for regularization parameter tuning, the other for method evaluation. This is best illustrated with an example. Considering four data points {1, 2, 3, 4}. For method evaluation, we use an outer LOOCV scheme as follows: training on {1, 2, 3}, evaluating on {4}; training on {1, 2, 4}, evaluating on {3}; training on {1, 3, 4}, evaluating on {2}; training on {2, 3, 4}, evaluating on {1}. For each training set, we use an inner LOOCV scheme for regularization parameter tuning. So for the first fold, {1, 2, 3}, we have: training on {1, 2}, regularization parameter tuning on {3}; training on {1, 3}, regularization parameter tuning on {2}; and training on {2, 3}, regularization parameter tuning on {1}.

## 3. Results

### 3.1. Characteristics of subjects

The CMR findings of the MI group and the healthy group are summarized in table 1. Distributions of age, end-diastolic volume (EDV), SBP, LVEF and CS are shown in figure 3.

### 3.2. Left ventricle mechanics

Figure 4 shows examples of the mechanical modelling of LV dynamics in a healthy subject and a MI patient. Figure 4*a*,*b* is the simulated LV geometries from the healthy control at end-diastole and end-systole, superimposed on CMR cine images, and figure 4*c*,*d* is from the MI model. Figure 4*e*,*f* is the distributions of systolic active tension *T*_{a} and myofibre stress *σ*_{f} in the healthy subject. Since there is no active contraction in the MI region, the MI region is over-stretched to bear the systolic pressure, as shown in figure 4*k*, in which the myofibre strain is positive (in blue), and the remote myocardium region is shortening. The distribution of myofibre stress *σ*_{f} is more homogeneous in the healthy heart compared with the MI heart. Similar pattern is shown in myofibre strain in figure 4*i*,*k*. Figure 4*j*,*l* shows the LV twist relative to the LV base, which linearly increases towards the apex with a maximum apical rotation of around 14° in the healthy model, but only 4° in the MI model.

Table 2 summarizes the average passive parameters of all the healthy and MI subjects. Figure 5 plots the average passive stiffness along the myofibre direction, and shows that the passive myofibre stiffness in the remote regions of the MI patients is much stiffer than the controls. This is a consequence of the myocardium adaptivity to MI and simply modelled using a higher end-diastolic pressure for the MI patients [36]. This increase corresponds with a stiffer myocardium for a given deformation [20].

The average *T*_{req} in the healthy group is 157 ± 25 kPa, and 156 ± 27 kPa in the MI group, as shown in figure 6*a*. No significant difference is found in *T*_{req} for the two groups. Similar results are found for *T*_{a} and systolic myofibre stress *σ*_{f} as shown in figure 6*b*,*c*. However, the average *T*^{norm}_{a} in the healthy group is much lower than the MI group (0.45 ± 0.06 kPa mmHg^{−1}—healthy versus 0.55 ± 0.07 kPa mmHg^{−1}—MI, *p* < 0.01). Similar trend is shown in *σ*^{norm}_{f} (0.35 ± 0.05 kPa mmHg^{−1}—health versus 0.47 ± 0.09 kPa mmHg^{−1}— MI, *p* < 0.01). The systolic myofilament kinetics *C*^{s} is 0.42 ± 0.04 in healthy group, which is slightly lower than 0.44 ± 0.13, *p* = 0.50 in the MI group. Figure 7 shows the distributions of *T*^{norm}_{a}, *σ*^{norm}_{f} and *C*^{s}. Interestingly, even with a wider range of SBP, the standard deviations of *T*^{norm}_{a}, *σ*^{norm}_{f} and *C*^{s} in the healthy group are much smaller compared with the MI group.

From the biomechanical models, we choose *T*_{a}, *T*_{req}, *T*^{norm}_{a}, *σ*_{f} and *C*^{s} as potential biomarkers for myocardial contractile function, but not passive stiffness and diastolic stress because of uncertainties in assumed end-diastolic pressure. A linear correlation analysis is further carried out to decide which features should be included in statistical feature classification and summarized in the electronic supplementary material. The criteria are that if two features are highly correlated in both the healthy and the MI groups, and further related to other features and CMR measurements in a similar way, then the more reliable feature will be selected. For example, *T*_{a} and *σ*_{f} are generally inter-related through the myocardial mechanical model, see equation (2.1), as is also confirmed by the linear correlation analysis. Furthermore, *T*_{a} and *σ*_{f} relate to other features in a similar way, but part of *σ*_{f} is dependent on the passive stress (equations (2.1) and (2.2)). Therefore, *T*_{a} is selected. Although *C*^{s} is correlated to *T*_{a} in both groups, they relate to CMR measurements in different ways; therefore, *C*^{s} is included. Similarly, for *T*_{req} and *T*^{norm}_{a}. We also include EDV and SBP from CMR measurements because both are inputs for modelling the LV contraction, but not EF, systolic strain and end-systolic volume, which can be reproduced from the LV models. In summary, the selected features for classification of myocardial contractile function between the healthy group and the STEMI group are: SBP, EDV, *T*_{a}, *T*^{norm}_{a}, *T*_{req} and *C*^{s}.

### 3.3. Datasets

Three datasets, based on the analysis in §3.2, are evaluated, which differ in the selected biomechanical factors that serve as predictors: the first dataset *D*_{1} includes the factors *T*_{req}, *T*_{a}, SBP, EDV, *C*^{s} and *T*^{norm}_{a}. The second set denoted with *D*_{2} lacks the ratios. The third set *D*_{3} includes *T*_{a}, EDV, *C*^{s} and *T*^{norm}_{a} but not SBP and *T*_{req}, whose effects are included in *T*^{norm}_{a} and *C*^{s}, respectively.

### 3.4. Factor importance

In this section, we discuss the relevance or importance of the various explanatory variables included in the three datasets. For Lasso, the importance measure of an explanatory variable is given by the absolute value of the average regression coefficient associated with that variable. For GP-ARD, the importance measure is expressed as the inverted and normalized length scale associated with the corresponding explanatory variable. The larger the length scale, the larger the change of the corresponding variable has to be to have any effect on the output. For the extreme case of an infinite length scale, any finite change in the corresponding explanatory variable has no effect on the output, and this variable has therefore effectively been switched off. The Decision-Tree provides (figure 8) the usage metric to quantify the importance of an explanatory variable, showing the percentage of times the respective explanatory variable has been selected by the C5.0 algorithm to build the tree. For the Random Forest, the importance measure is the mean decrease in classification accuracy incurred when excluding an explanatory variable from the training set.

For each of these methods, we rank the explanatory variables, and combine the results in an accumulative rank score, which serves as an indicator for the overall factor relevance. These accumulative scores are shown in figure 9, which shows the cumulative ranks based on the importance from 0 to (*p* − 1), where (*p* − 1) is the highest rank. The height of the bars in figure 9 provides a global indication of the association of the respective input variables with the output, or to paraphrase this: the higher the bar in figure 9, the greater is the relevance of the corresponding factor for predicting the classification outcome. It is seen that, overall, the two ratios *C*^{s} and *T*^{norm}_{a} have the strongest explanatory power in predicting myocardial contractile function from healthy or MI hearts.^{5} However, our classification results, discussed below and succinctly summarized in figure 10, show that the best performance is achieved when all the available factors are included in the set of predictors. Hence, no individual factor is completely irrelevant.

### 3.5. Classification performance

We have applied the classification methods mentioned in §2.3, and described in more detail in electronic supplementary material, §S0.5, to the three datasets described in §3.3. The results are shown in table 4 and in figure 10. Table 4 shows the sensitivity, specificity and overall misclassification error. The overall method ranking is shown in the last row of table 4 and is based on the sum of misclassification errors over all datasets. Given these scores, KNN and GP-ARD are the preferred methods. Figure 10 shows the sensitivity–specificity score pairs obtained with the different statistical/machine learning methods, for the three datasets described in §3.3. As described in §2.3.1, the convex hull of these scores, indicated by a dashed line, defines the ROC curve that connects the best methods. Methods that lie below this ROC curve are suboptimal. It is seen that none of the statistical/machine learning methods is optimal for all three datasets. However, KNN and GP-ARD are optimal for two datasets, and close to optimal for the third. In agreement with table 4, they can thus be regarded as the best methods overall. Note, in particular, that the univariate methods either clearly fall below the convex hull, or give the trivial score pair of both sensitivity and complementary specificity equal to zero (which is obtained when patients are always classified as healthy). This finding confirms the need for multivariate methods. As discussed in §2.3.1, the ROC curve corresponds to a combination of the best-performing methods, and the area under the ROC curve (AUROC) is a measure of the overall classification performance. Our study indicates that the AUROC score depends on the dataset and varies between 0.77 and 0.9. This is considerably better than random expectation (0.5) and, in the latter case, close to optimal prediction (1.0).

### 3.6. Associations between biomechanical factors at baseline and left ventricle function atsix-month follow-up

Section 3.4 shows that *T*^{norm}_{a} and *C*^{s} are the strongest explanatory factors in predicting myocardial contractile function between healthy subjects and MI patients, followed by *T*_{req}. Therefore, in this section, the association between *T*^{norm}_{a}, *C*^{s}, *T*_{req} and LV pump function recovery at six months is further analysed. Figure 12 shows the relationship between LV pump function recovery at six months and *T*_{req}, *C*^{s} at baseline in the MI group. It can be seen that a strong linear relationship between *T*_{req}, *C*^{s} at baseline and LVEF changes at six months. With a lower *T*_{req}, the MI patient could have a better LVEF recovery after six months. Thus, we hypothesize that for acute-MI patients with a similar calcium handling dynamics, a lower *T*_{req} suggests a potential to further increase the contractile function, without going into de-compensated states, compared with cases when the myocardial contractility already reaches the maximum. Since the maximum myocardial contractility is limited, less usage of the contractile reserve will give the heart more resilience in the longer term. We also observe that *C*^{s} is positively related to LV EF changes at six months. Since *C*^{s} is associated with the myofilament kinetics, this presumably suggests there are more binding sites available at systole for generating the active tension. However, no correlations are found between *T*_{a}, *T*^{norm}_{a}, *σ*_{f}, *σ*^{norm}_{f} and LV function at six-month follow-up, which are summarized in table 3.

## 4. Discussion

### 4.1. The LV mechanical modelling

The major determinant of long-term survival after an acute-MI is the efficiency of LV pump function. Despite decades of research, the pathophysiology and disease mechanisms of the failing heart remain incompletely understood, engendering a knowledge gap for the development of new therapies. It is believed that novel mathematical tools are required to solve those fundamental patho-physiological questions. In this study, we have modelled the LV dynamics at end-diastole and end-systole in healthy controls and in patients with recent STEMI. The computational contracting LV models are based on *in vivo* CMR data. Using this approach, we inversely determine the passive parameters and contractility for all the LV models. We have shown for the first time that, compared with healthy controls, patients with recent STEMI exhibit increased active tension, i.e. increased contractility and increased active tension, when normalized by SBP.

It is challenging to estimate myocardial passive stiffness inversely, especially with limited measurements *in vivo*. Our previous study [20] showed that although the eight constitutive parameters from the Holzapfel–Ogden Law (equation (2.3)) cannot be uniquely estimated inversely, the mechanical response along myofibre direction for a functional myocardium can be robustly obtained. One of the main limitations in estimating the passive myocardial parameters in this study is the assumed population-based end-diastolic pressure, and the MI group has a higher end-diastolic pressure (16 mmHg) compared to the healthy control groups (8 mmHg). The end-diastolic pressure is one of the key input in the parameter estimation, and a higher pressure is associated with a stiffer myocardium. Without knowing the actual end-diastolic pressure for the MI and healthy groups, the quantitative change in the myocardial stiffness may be subject to some uncertainty. However, previous studies [17] and our own experiments show that the passive stiffness uncertainty arising from end-diastolic pressure has few effects on the contractility (*T*_{req}) estimation because of the matched EDV and the strain data.

The patient-dependent myocardial systolic contractility in our simulations is controlled by the parameter (*T*_{req}), all other parameters of the active tension generation are kept fixed. This allows us to match measured LV dynamics in end-systole, and avoid the complexity of determining many parameters in the electromechanical models from different functional and structural scales. This approach has been widely applied when mathematically simulating LV systolic dynamics [12,21,26,27]. The average required myocardial contractility for the control group is 157 ± 25 kPa in our study, which is comparable with the reported value by Genet *et al.* [21] from five normal human subjects, the peak contractility reported by Arsner *et al.* [29] in one healthy volunteer (139 kPa), and the average contractility reported by Wang *et al.* [40] from six healthy human subjects.

A transition region is defined in MI hearts to avoid abrupt change of material properties from MI to functional myocardium following our previous study [12]. We remark this does not necessarily represent the MI border zone which can also be modelled with partial contractility as in [41]. We did not attempt to model the detailed border zone region effects in this study as we could not extract the border zone and its viability from the CMR images. Thus, we adopted a simplified approach to only consider two regions in the MI hearts: the functional myocardium and non-contractile MI region. Future work to accurately define the border zone and the material properties within and around the scar based on new imaging protocols and detailed experimental tests is required. Indeed, mathematical models have been developed which can map the myocardial viability from LGE images that provide partial contraction inside MI [42].

It has been found that myocardial mechanics are dependent on the underlying fibre architecture [11]. However, because of the difficulties of acquiring detailed subject-specific fibre architecture *in vivo*, such fibre structures are usually not included in computational LV models. Instead, most models, this study included, employ rule-based fibres [7,26,42]. To assess the sensitivity of the results due to small changes of fibre architecture, we further increased the myofibre angle by 10% in a healthy heart, that was, −66° to 66°, compared with the original myofibre rotation from −60° to 60°. We found that *T*_{req} was decreased by 6%, and *T*_{a} was slightly reduced by (≈ 1.8%).

We are aware that our current mechanical model is still necessarily simplified, in particular, with one chamber and the LV dynamics are simulated at two time points only (end-diastole and end-systole). Moreover, a simplified intracellular transient is used to trigger active contraction simultaneously, due to lack of parametrization in cellular level, *C*^{s} may only represent an ‘apparent’ myofilament kinetics under same intracellular calcium dynamics.

However, this is a starting point towards clinical translation in terms of real time prediction, and our combined mechanical, statistical and MRI study already show promising value in evaluating myocardial functions between the healthy subjects and MI patients.

### 4.2. Statistical classification

Our ultimate aim is to build a classifier (case versus control) based on *in vivo* imaging data described in the electronic supplementary material. However, working directly on the images, e.g. representing them as grey-level pixel vectors and building a classifier in this high-dimensional space, leads to the well-known curse-of-dimensionality problem [43]. Standard approaches, therefore, carry out a dimension reduction first. In the simplest case, this can be done with principal component analysis. More advanced methods aim to improve dimension reduction by identifying low dimensional submanifolds of the high-dimensional configuration space that contain the relevant information related to the classification problem at hand. Our present research can be seen as an extension of previous work whereby the low-dimensional configuration space is directly spanned by the parameters of an explicit biomechanical model. This is model-based rather than purely data-driven dimension reduction, with the advantage that for a reliable and accurate model, the reduced configuration space is *a priori* highly likely to contain physiologically relevant information.

We have worked in the biomechanical parameter space with a variety of methodological tools. We have started with a simple univariate analysis, followed by a multivariate approach, where we have used the latter to evaluate the relative explanatory relevance of the various biomechanical parameters with respect to the classification task and to compare the performance improvement obtained with several multivariate classification methods over the univariate approach.

In the univariate approach, by simply comparing the different biomechanical parameters, several parameters are identified for classifying myocardial contractile function from either a healthy heart or a MI heart, including SBP, EDV, *T*^{norm}_{a}, *T*_{req}, *C*^{s} and *T*_{a}. However, the simplified comparison, such as using the S student *t*-test, has its limitations, as shown in figure 7. Only *T*^{norm}_{a} is found to be significantly different between the control group and the MI group. In the second part of the study, we have shown that multivariate statistical methods provide more powerful tools for classifying the difference in myocardial contractile function after acute-MI.

In the second part, we have used several machine learning and computational statistical methods to identify the biomechanical factors that possess the strongest explanatory power for predicting the changes in myocardial contractile function after acute MI. Our study has revealed that *C*^{s} and *T*^{norm}_{a} are the most relevant factors for this classification task, but that no individual factor is completely irrelevant. For details, see §3.4.

In the next evaluation step, we have predicted myocardial contractile function between healthy and MI hearts based on the previously mentioned biomechanical factors with eight different methods. Figure 10 and table 4 summarize the prediction performance and the overall best possible performance given all methods. The univariate approach (Univariate LR) shows the poorest performance, as discussed in §3.5. Among the multivariate approaches, KNN and GP-ARD show the best performance (see §3.5 for more details). GP-ARD allows us to visualize the posterior probability of MI heart as a function of the two most important factors; this is shown in figure 11. The plots in this figure illustrate three things. Firstly, it requires more than one factor or variable to separate the data into the two classes. This confirms that a univariate approach is too restrictive, and that a multivariate approach is needed. Secondly, the separation boundaries are nonlinear. This explains why the nonlinear methods, KNN and GP-ARD, show in general a better performance than their linear counterparts. Thirdly, the folding of the decision boundary is very narrow. This explains why the best performance with KNN is achieved with low values of *k*.

The overall predictive accuracy given all methods can be summarized with the AUROC score, which is displayed in figure 10 for each dataset. As it was previously mentioned, the overall accuracy improves from 0.86 to 0.9 when *C*^{s} and *T*^{norm}_{a} are added to the single factors present in dataset *D*_{2}. This AUROC value is considerably larger than random expectation (0.5) and quite close to perfect prediction (1.0). To rule out that *C*^{s} and *T*^{norm}_{a} are for themselves sufficient predictors for the two heart conditions, we excluded *T*_{req} and SBP and found a significant drop in the AUROC value to 0.77. This implies a synergy, i.e. interaction effect, of the involved factors. Hence all the available factors present in *D*_{1} are important myocardial contractile function indicators that should be used in conjunction to differentiate myocardial contractile function between healthy and MI hearts.

### 4.3. Clinical implication

The estimated overall myocardial contractility *T*_{req}, *T*^{norm}_{a} and *C*^{s}, as suggested in the classification part, might potentially be biomarkers for risk stratification of MI patients. However, since our study has a limited sample size with selected patients, it is not immediately obvious which, if any, of the biomechanical parameters might have clinical prognostic value, and further research is warranted. The linear correlation analysis (figure 12) might suggest that *T*_{req} and *C*^{s} could be potentially used to identify MI patients who may have better recovery by using inotropic treatment if *T*_{req} is in a normal range or by reducing *T*_{req} if it is high.

Future prospective studies should investigate the mechanism of how *T*_{req}, *C*^{s} and *T*^{norm}_{a} link with myocardial contractile function through different pathways. The effects of novel therapies on these biomechanical parameters should also be assessed. Finally, future studies should evaluate whether these to determine the incremental prognostic value of *T*_{req}, *C*^{s} and *T*^{norm}_{a} to identify individual patients at high risk of heart failure post-MI, over and above standard prognostic markers such as natriuretic peptides. The strategy to achieve such improvements could involve identification of high-risk patient subsets, identification of subsets of patients with viable myocardium who might be expected to respond to therapy, and implementation of more intensive therapy. Future heart failure management and therapies may be targeted to restore the overall LV contractile reserve according to the level of myocardial contractility and its reserve, such as either downregulating of adrenergic signalling to reduce contractility or using inotropic treatments to enhance it.

## 5. Conclusion

Using combined personalized computational cardiac biomechanical modelling and statistics analysis, we have studied systolic LV dynamics for one patient group consisting of 11 acute STEMI patients with no-reflow and a group of healthy control subjects, based on CMR imaging. The passive response and active contractile properties of myocardium are determined by matching the simulated LV dynamics (volume and circumferential segmental strains) to the CMR measurements. We find that, compared with healthy controls, patients with STEMI exhibit increased LV wall active tension when normalized by SBP. This suggests that the functional myocardium in the patients are overcompensating in order to preserve the stroke volume. Different machine learning and multivariate statistical analysis methods are applied to identify the biomechanical factors that possess the explanatory power in terms of the changed myocardial contractile function after STEMI. The individual required contractility (*T*_{req}), normalized active tension *T*^{norm}_{a} and the systolic myofilament kinetics *C*^{s} are found to have the strongest explanatory power, and the statistical method KNN has shown the best performance for the classification of myocardial contractile function, followed by GP-ARD. We further observe strong correlations between the biomarkers (*T*_{req}, *C*^{s}) and the changed LVEF at six months from baseline (*r* = − 0.79, *p* < 0.01, *r* = 0.70, *p* = 0.02). We conclude that the patient-specific contractility *T*_{req}, the normalized active tension *T*^{norm}_{a}, and the myofilament kinetics *C*^{s} all have potential clinical values for prognostication on LV contractile status post-MI, their significance merits further study in larger and unselected patient cohorts.

## Ethics

The study was approved by the National Research Ethics Service, and all participants provided written informed consent.

## Data accessibility

Electronic supplementary material is provided. The IB/FE simulation platform can be found from https://github.com/IBAMR/IBAMR. Statistical analysis is done using *R* (https://www.r-project.org/).

## Author's contributions

H.G., D.H., X.Y.L. and C.B. designed the study. K.M. and C.B. conducted the CMR scans and analysed the imaging data. H.G. performed the mechanical modelling, H.G and X.Y.L. analysed the biomechanical results. A.A. performed the statistical classification, A.A and D.H analysed the statistical results. H.G, A.A., X.Y.L., D.H., K.M. and C.B. wrote the manuscript.

## Competing interests

We declare we have no competing interests.

## Funding

This research is supported by the British Heart Foundation Grant (Project grant nos. PG/14/64/31043, PG/11/2/28474), the Engineering and Physical Sciences Research Council Centre for Multiscale Soft Tissue Mechanics (EP/N014642/1), the National Health Service and the Chief Scientist Office. C.B. was supported by a Senior Fellowship from the Scottish Funding Council. X.Y.L. was supported by a fellowship from the Leverhulme Trust (RF-2015-510).

## Acknowledgments

We thank the patients and volunteers who participated in this study and the staff in the Cardiology and Radiology Departments.

## Footnotes

Electronic supplementary material is available online at https://dx.doi.org/10.6084/m9.figshare.c.3827926.

↵1 The sensitivity, also called the true positive rate (TPR) or recall, is the proportion of correctly identified MI hearts, and is defined as TP/(TP + FN).

↵2 The specificity, also called the true negative rate (TNR) or 1 minus false positive rate, is the proportion of correctly classified healthy hearts, defined as TN/(TN + FP).

↵3 The total misclassification error is defined as (FN + FP)/

*n*= (FN + FP)/(TP + FP + TN + FN).↵4 Consider two classifiers, with true positive rates TRP

_{1}and TRP_{2}, and with true negative rates TNR_{1}and TNR_{2}. Now define a family of new classifiers as follows: with probability λ ∈ [0, 1], they use classifier 1, and with probability (1 − λ), they uses classifier 2. The expected true positive and true negative rates of these classifiers are λTRP_{1}+ (1 − λ)TRP_{2}and λTNR_{1}+ (1 − λ)TNR_{2}, respectively; hence, they lie on a straight line connecting the scores of the two original classifiers, (TRP_{1}, TRN_{1}) and (TRP_{2}, TRN_{2}). From the TRP and TNR scores of the ensemble of classifiers, we can thus obtain the convex hull by linear interpolation between the most extreme (TRP, TNR) pairs. This is the ROC curve representing the optimal classifier combinations (figure 10).↵5 This can be seen from figure 11

*a*. The figure shows the classification boundaries between MI patients and healthy controls in the two-dimensional space spanned by two parameters:*C*^{s}and*T*_{req}. The two patient groups can be clearly separated by the nonlinear decision boundaries. However, a successful classification requires information about both values of*C*^{s}and*T*_{req}together. The values for*C*^{s}are distributed across the whole parameter range for both classes; so knowing*C*^{s}alone, without*T*_{req}, will not allow successful classification of myocardial contractile function as MI versus control.

- Received March 16, 2017.
- Accepted July 4, 2017.

- © 2017 The Authors.

Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited.