Introduction

Heart failure (HF), together with its most common co-morbidities, coronary artery disease (CAD) and hypertension, is a major cause of death and hospitalization worldwide. HF is a major concern for medicine and for society in general, from both a public health and a financial perspective. Its prevalence is very high, reaching 10 % among the elderly, and is expected to increase by 46 % over the next two decades in the USA [1]. Current treatment for HF patients, especially those with systolic dysfunction, is mainly palliative and involves diuresis and chronic administration of beta-blockers, angiotensin-converting enzyme (ACE) inhibitors, aldosterone antagonists, and neprilysin inhibitors [2, 3]. In addition, patients’ adherence to treatment is often limited [4] and some become refractory to medication [5]. Patients with ventricular dyssynchrony can benefit from cardiac resynchronization therapy (CRT), with or without an implantable cardioverter defibrillator [2, 6]. Alternative solutions for some end-stage HF patients include left ventricular (LV) assist devices or transplantation [2, 7]. HF patients with diastolic dysfunction and preserved ejection fraction (HFpEF) represent almost half of all HF patients [8]; however, there are no standard diagnostic criteria or treatments for this condition, and HFpEF represents a major unmet medical need [9]. Overall, and despite advances in medical management, the prognosis of HF remains poor, and the number of deaths associated with this condition has not decreased over the last 15 years [1]. The need for investigational new drugs (INDs) and other therapies that might reverse or prevent the development of heart failure is therefore urgent.

Purpose of Translational Experiments

The main objective of translational preclinical studies is to establish the safety and efficacy of a treatment in order to translate results from basic science into therapeutic advances for humans. However, in numerous instances, therapies with efficacy in preclinical models have failed when tested in human clinical trials [10, 11]. In the field of heart failure alone, many examples of translation failure can be cited, including cytokine inhibitors, endothelin antagonists, and vasopeptide blockers. This is a serious shortcoming that has substantially diminished interest within the pharmaceutical industry regarding insights gleaned from animal models. Worse, novel therapies for patients with heart disease are lacking. Undoubtedly, the explanations underlying these failings are multifactorial. However, we believe that, in some cases, the preclinical studies were suboptimal and/or biased and that it was therefore premature to launch large, multicenter, randomized clinical trials in humans, with their tremendous costs.

Examination of those prior trials can be instructive by revealing several reasons that likely contribute to the failure of effective translation [12]. In some cases, strategies tested in preclinical models did not manifest robust cardioprotective benefit. Many were not verified in multiple independent labs. Often, the tested agents were not administered at a clinically relevant point in time, i.e., after injury has occurred; rather, animals were pretreated with drug before injury, a situation with little clinical relevance. Many observations made in mice were not tested in other species, including large animals. In some cases, protocols required of a rigorous clinical trial were not employed [12]. Furthermore, some models—especially for HFpEF—failed to provide evidence of HF (i.e., pulmonary edema and congestion) [13]. In these guidelines, we provide a set of recommendations for researchers carrying out translational experiments with the aim of developing therapeutic agents with greater prospects for success in humans. Translational experiments must accomplish some general objectives [14] as listed below (Table 1):

Table 1 General aims of translational experiments
  • Efficacy. Establishing a causal relationship between treatment exposure and a pharmacodynamic (PD) or salutary effect is an obvious major aim of preclinical experiments. Testing for proof of concept (POC) would, by definition, be performed in simulated HF, recognizing that animal models do not necessarily recapitulate all the features of the human syndrome, including its usual co-morbidities.

  • Safety. While assessing efficacy can be more dramatic and gratifying, recognizing the toxicity of a new treatment is at least as important as demonstrating efficacy. This is a main cause of clinical trial failures and represents a major focus of regulatory agencies. As discussed below, the safety of cardiac drugs can critically depend on co-morbidities, especially background structural heart disease [15].

  • Validity. Recognizing that all models have limitations, choosing an HF animal model that mimics the human condition as closely as possible, with its common patient features and usual co-morbidities (e.g., CAD and hypertension), is essential for the generalization of treatment effects to humans (construct validity). Limitations of construct validity also include the use of treatments, delivery routes, and analytical methods that do not replicate the clinical setting [14].

  • Standardization. The lack of standardized protocols in translational studies is a major obstacle to the proper comparison of preclinical results. Standardization will allow results to be compared and reproduced by other researchers.

  • Proof of concept preclinical studies. It should be recognized that experiments carried out in experimental HF to provide POC for an investigated therapy are formally different from preclinical safety studies which require a more comprehensive approach and must be performed under good laboratory practice (GLP) conditions. Neither the US Food and Drug Administration (FDA) nor the European Medicines Agency require that POC studies adhere to GLP requirements, but they look as closely for any evidence of toxicity in the disease models as they do for salutary activity. To avoid undue repetitions of tests, it may be advisable to carry out the POC experiments under a GLP environment if it can be reasonably anticipated that, if positive, their results will contribute to the marketing authorization (MA) dossier.

Translational Animal Models of Heart Failure

Translational successes in HF are uncommon. Multiple factors are involved in this disappointing outcome, including limitations of many preclinical animal models and/or lack of appropriate use of these models. Much emphasis over the last two decades has been placed on genetically engineered mouse models, which seem to have fallen short of their promise to elucidate true targets for potential HF treatment. Positive results in mouse models should be the starting point of further investigation and not the reason for bold claims of future translational promise. Instead, small animal studies should lead to large animal preclinical studies in order to directly test the safety and efficacy of a treatment.

Small Animal Models

Despite the aforementioned limitations, small rodent models of cardiac injury and HF have been extensively and increasingly used over the last two decades to elucidate the pathologic responses to chronic ischemia and/or pressure overload stress. Their use has advanced our knowledge of adaptive and maladaptive remodeling and, in the process, has unveiled the molecular mechanisms underlying these responses as rational drug targets [16]. A recent statement paper extensively described some of the models used in HF research [17]. Herein, we do not aim to recapitulate the methodological details of creating these models, but rather to focus on mice and rats as translational models to study novel potential therapeutic strategies that can ultimately lead to first-in-man studies.

The most commonly used mouse model of HF is transverse aortic constriction (TAC), originally developed to study compensated left ventricular (LV) hypertrophy in response to sustained afterload [18]. Varying degrees of delayed maladaptive cardiac remodeling, LV hypertrophy, LV dysfunction, and HF occur in this model depending on the degree of aortic narrowing and consequent hypertrophic stress [17]. This model has revealed several molecular targets that delay post-TAC progression of HF [19], although the overall clinical relevance of these targets remains speculative.

More recently, as surgical methods have improved, myocardial ischemia models in mice have been used, including both myocardial infarction (MI) via permanent occlusion of the left anterior descending (LAD) coronary artery and ischemia/reperfusion (I/R) injury through temporary ligation of the LAD artery [20]. Although the I/R injury model is best used to probe for acute cardioprotection, the post-MI HF that develops can be a target to test strategies to prevent or reverse LV dysfunction. Again, these models have been used to identify myocardial biochemical lesions in HF, but only a handful have been further tested as potential targets in large animal models [21]. Genetic mouse models of cardiomyopathy have also been used to test POC for a particular molecular pharmacological approach to HF [22].

Rat models of pressure overload and MI are also extensively used either for gene therapy approaches [23] or for testing small molecules [24] for halting or even reversing HF. Because such models have been used so often, much about the properties and the metabolism of small molecules in this species is already known. Accordingly, these models may be preferred over mouse genetic models, which are more costly and not as amenable to hemodynamic monitoring. In some respects (discussed below), rats do not offer much more in terms of translational significance but can add significant benefit in situations where known small molecules are tested for re-purposing as cardiovascular agents, especially when the rat pharmacokinetics is already known. The utility of the rat for modeling the chronic cardiac response to severe MI and for forecasting therapeutic effects of ACE inhibitors is well established. For instance, this model reproduces the effect of enalapril and captopril in humans with HF, namely restoring filling pressures and contractility [25, 26], attenuating the progressive LV enlargement and dysfunction [27, 28], and prolonging survival [27, 29, 30].

Rabbits with experimental aortic insufficiency (AI) have also been used to model chronic LV volume overload and eventual LV decompensation [17]. Surgery in rabbits is relatively uncomplicated, and they are amenable to invasive hemodynamic and serial prospective echocardiographic monitoring. The model forecasts utility of vasodilators in AI and also provides echocardiographic markers of LV decompensation that are useful for timing drug intervention or replacement of the valve [31].

Limitations of Small Animal Models and Recommendations

Small animal models of HF or cardiac stress, especially genetically engineered mouse models, have elucidated the molecular pathways involved in the responses to such stress as targets for drug intervention. However, these models have several limitations beyond the uncertainty of whether their findings will hold up when tested in larger, possibly more relevant and more easily monitored preclinical models. One limitation is that the mouse cardiovascular system is not necessarily pertinent to human physiology. The rapid heart rate (300–600 bpm), small size, and atypical handling of calcium and other ion currents in the rodent heart can limit extrapolation of findings to larger animals and to humans. Thus, for targets involving calcium handling, one must be aware of this limitation and rely more on non-rodent models in which calcium is handled in a human-like fashion (e.g., rabbits and dogs) [32]. Another limitation of post-TAC, post-MI, and genetic rodent models is that rodents bred for laboratory use are genetically homogeneous if not virtually identical. This can limit successful translation because human patients are genetically diverse. Accordingly, any effect of a treatment or genetic manipulation in a mouse model needs to be robust and reproducible in terms of HF modulation, because the large animal models used for subsequent testing are more genetically diverse. Although larger animal models may offer advantages in terms of human relevance, therapeutic robustness and reproducibility may be more difficult to achieve in them because physiological responses of large animals are likely to vary more.

Overall, there is certainly a place for small animal models in the translational paradigm to elucidate and evaluate novel HF therapies, but the preliminary results in these animals must be corroborated in larger animals that provide additional efficacy and safety evidence necessary to extrapolate to humans.

Large Animal Models

As mentioned above, results in large animal models of HF are important and, in some cases, critical for a more informed prediction of a treatment’s effect because of the opportunity for robust testing in a more genetically diverse mammal that is more similar to humans and in which co-morbidities might be more readily introduced. Sheep, pigs, and dogs have all been used for modeling HF [17]. In these models, HF can be induced by tachycardia (i.e., pacing) [33, 34], ischemia provoked by either coronary artery occlusion [35, 36] or micro-embolization [37], or volume overload [38, 39]. They differ in their pathological characteristics and offer distinct advantages and disadvantages for testing therapies. For instance, the pacing model is very reproducible in its phenotype, but sustaining tachycardia with full instrumentation restricts utility to short-term studies of potential therapies [17, 33]. Models of ischemic HF in large animals probably offer the best simulation of human disease, although robust and reproducible phenotypes may be difficult to obtain, and some species, such as the pig, are prone to arrhythmias [17, 40]. In any case, these large animal studies are the logical next step in testing a small molecule or biological material that might portend salutary or adverse activity once initial evidence has been obtained in small animals. Importantly, when studies proceed to the large animal stage, it is crucial to use a study design that has the most clinical relevance (summarized in Table 2). We point to one example of a recent, large animal study by one of us (JAH), which employed rigorous methodology mimicking that typical of a clinical trial: prespecified endpoints and power calculations, randomization, and strict blinding of personnel [41].
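As an illustration of the prespecified power calculation mentioned above, the following minimal sketch estimates the number of animals per group for a two-sided comparison of two group means. It uses the standard normal approximation, which slightly underestimates the number required by a t-test in small samples. The function name, the choice of ejection fraction as the endpoint, and the example effect size and standard deviation are illustrative assumptions, not values taken from the cited study.

```python
import math
from statistics import NormalDist


def animals_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate animals per group for a two-sided, two-group comparison of means.

    delta: smallest difference in the endpoint considered worth detecting
    sd:    expected between-animal standard deviation of the endpoint
    Uses the normal approximation n = 2 * ((z_alpha + z_beta) * sd / delta) ** 2.
    """
    if delta <= 0 or sd <= 0:
        raise ValueError("delta and sd must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided type I error
    z_beta = NormalDist().inv_cdf(power)           # 1 - type II error
    n = 2 * ((z_alpha + z_beta) * sd / delta) ** 2
    return math.ceil(n)


# Hypothetical example: detect a 5-point difference in ejection fraction
# when the between-animal standard deviation is also 5 points.
print(animals_per_group(delta=5, sd=5))
```

Doubling the detectable difference reduces the required group size roughly fourfold, which is why the effect size must be fixed before the study rather than chosen after the data are seen.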

Table 2 General recommendations for animal models and experimental design in preclinical heart failure studies

Other Considerations

Comorbidities

Patients with HF usually present with CAD and/or hypertension, often in the context of arrhythmias, structural heart disease from prior MI, angina, diabetes, sleep apnea, or anemia, all of which can affect therapeutic outcome [1, 2]. Therefore, it is prudent to take into account the usual and possible co-morbidities when designing a preclinical experiment in trying to replicate the context of the target patient population. A prominent example of this need is the excess arrhythmias and sudden death caused by certain class 1c anti-arrhythmic agents (e.g., encainide and flecainide) used in patients with prior MI and depressed LV function in the Cardiac Arrhythmia Suppression Trial (CAST) [15]. This arrhythmogenic activity of flecainide was forecasted by its behavior in a model of acute focal myocardial ischemia [42].

Validation in Other Species and Strains

The results obtained in one species or even in one strain of animals are often difficult to reproduce in a similar model in another species or strain. For instance, p38 inhibition has a cardioprotective effect in the infarcted mouse heart, but not in the pig [43]. Similarly, survival rate and extent of heart remodeling and LV dysfunction in the mouse after myocardial infarction are strain-dependent [44]. These reports reinforce the need to validate results in a second animal model in which the PD activity of the IND is expressed and calibrated with positive controls.

Induction of Disease

Human disease is induced in animal models in different ways that may significantly affect the outcome of the tested treatment. Induction of dilated cardiomyopathy (DCM) through genetic methods (e.g., knockout or transgenic overexpression) may elicit a different response from administration of cardiotoxic agents, pressure overload, myocardial infarction, or other interventions that induce DCM.

Need for More Than One Model

Efficacy should be documented in more than one preclinical model. All models have limitations that may obfuscate the results of an experiment and may lead to optimistic over-interpretations of a treatment’s potential [17].

Reproducibility

Reproducibility should be documented across several labs working independently. Environmental factors, such as the generally accepted microorganisms found in any animal facility, may confound results. Similarly, small differences in methodology, reagents, infrastructure, or the researcher’s technical skill have resulted in different outcomes for the same intervention [45].

Systems Biology as a Tool to Establish Differences Between Animal Models and Human Disease

All animal models have intrinsic features and limitations. Perhaps systems biology will emerge as an effective means of deciphering these complexities. The term “translational systems medicine” has been coined for strategies that involve the use of basic and clinical approaches combined with high-throughput technologies in genomics, proteomics, nanofluidics, single-cell analysis, and computational strategies [46]. This holistic approach to science in the context of a cross-disciplinary environment may provide the necessary insights to decipher which effects are common to humans and animals [46]. Although this promise has not yet materialized, it will likely lead to even better animal models.

Effective Experimental Design for Preclinical, Translational HF Studies

Comparison to Standard of Care

Initial POC studies usually aim to demonstrate efficacy of a new therapy in treated vs. untreated animals. Once this has been achieved, tested interventions should be compared against current standard of care (SOC). Sometimes, what constitutes SOC can be challenging to define, especially in non-human studies, and it might be uncertain whether the relative effectiveness ranking is the same in animals and humans. It should be transparent to outsiders what the comparator was in each experiment and why it was selected. The components of the comparator interventions need to be carefully described so that the experiments can be confirmed by independent investigators. To illustrate, recent studies have compared experimental therapies to β-adrenergic receptor blockade and included β-blocker arms alone vs. together with the therapy [23, 47]. ACE inhibitors, angiotensin receptor blockers (ARBs), and diuretics that are approved for HF can be added to the study design to test not only for positive or negative drug interactions but also to further calibrate the model. Calibration with drugs approved for HF will guide the future choice of a model. Moreover, testing for safety in a model where targeted PD activity is also evident provides a “veterinary” therapeutic safety multiple to be compared with that of a SOC agent. Unless there are conditions where HF SOC drugs are not used, adding such cohorts to a study design is important because, at least initially, all advanced clinical trials will be conducted to demonstrate at least non-inferiority to the SOC cohort(s) in terms of efficacy and/or tolerability. This could be an issue in large animal studies as it increases the number of animals and space required, together with the cost of the study. Certainly, studies using mouse and rat models are more amenable to adding these SOC cohorts and should precede any large animal studies. To show that a new treatment strategy is more effective than β-blockers, for example, it is necessary to test it in small animals before using a large animal model to justify the increased cost and time commitment in the attempted translation.

Age and Sex

Preclinical models should try to approximate the human disease context as closely as possible. In this regard, age and sex may be important modifiers. There is little empirical evidence about how interventions in animal models perform when tested in different sexes or in animals of different ages, yet animals of only one sex, or of a peculiar if not irrelevant age range, are often chosen without proper justification [48]. In clinical trials, such limits on generalizability across basic demographic characteristics can, of course, restrict labeling at the time of approval.

Negative Controls

Careful consideration should be given to the choice of controls. Depending on the intervention, they may need to involve sham procedures, which may give very different results from no treatment intervention at all. For cell experiments, controls may include various types of cells to increase potential insights into safety aspects.

Randomization and Blinding

In testing therapeutic or preventive interventions in preclinical (or clinical) research, there is no justification for not using a rigorous randomized design. Lack of randomization can allow biases that are very difficult, if not impossible, to control post hoc by analytical manipulations. Moreover, for all preclinical studies, it is essential to ensure that the investigators are blinded to the allocation of interventions to combat potentially powerful allegiance biases. If blinding of the investigators is impossible (e.g., due to easily recognizable secondary changes being introduced into the experimental system by one of the interventions or comparators), this should be clearly described along with the other measures taken to minimize potential subjective bias in reading the results. Bias due to lack of blinding is likely to be more prominent for measurements that entail some subjective assessments.
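The allocation and blinding steps described above can be sketched as follows. This is a minimal illustration of permuted-block randomization with coded group labels, assuming that an unblinded third party holds the seeds and the code key; the function names, arm names, animal identifiers, and seeds are hypothetical.

```python
import random


def block_randomize(animal_ids, arms, seed):
    """Permuted-block allocation: every block of len(arms) animals gets each arm once."""
    rng = random.Random(seed)  # fixed seed so the allocation list can be audited later
    allocation = {}
    for start in range(0, len(animal_ids), len(arms)):
        block = list(arms)
        rng.shuffle(block)
        # zip stops early if the final block is incomplete
        for animal, arm in zip(animal_ids[start:start + len(arms)], block):
            allocation[animal] = arm
    return allocation


def blind_codes(arms, seed):
    """Map each arm to an uninformative label; the key stays with an unblinded third party."""
    rng = random.Random(seed)
    codes = [f"Group {chr(ord('A') + i)}" for i in range(len(arms))]
    rng.shuffle(codes)
    return dict(zip(arms, codes))


def blinded_sheet(allocation, key):
    """Allocation list as seen by the investigators: animal -> coded group only."""
    return {animal: key[arm] for animal, arm in allocation.items()}


# Hypothetical example with 12 animals and three arms.
ids = [f"pig-{i:02d}" for i in range(1, 13)]
arms = ["vehicle", "standard of care", "test article"]
allocation = block_randomize(ids, arms, seed=2024)
sheet = blinded_sheet(allocation, blind_codes(arms, seed=7))
```

Blocking keeps the arms balanced throughout enrollment, which matters when animals are operated on in small batches over weeks or months.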

Timing and Dose–Response

Timing and dose–response assessments should also try to anticipate the eventual extension of successful interventions to human clinical trials and human use. Differences in half-life and perhaps in active metabolites have to be considered when trying to establish or extrapolate exposure-response relationships.

Define Inclusion/Exclusion Criteria

Inclusion/exclusion criteria for relevant variables (e.g., infarct size) should be carefully reported. For example, it should be clarified whether the choice is intended to create a sample that is more uniform for minimizing experimental variability or more widely representative of different conditions and settings.

Analysis Time Points

It is also important to prespecify time points at which results will be analyzed, as well as to justify their choice based on pathophysiological considerations, previous studies, or other evidence. Analyses of multiple time points should have an explicit analysis plan to avoid multiplicity problems arising from uncorrected multiple looks at the data. Analyses should account for baseline measurements and handle missing data properly, avoiding popular but erroneous techniques (such as last observation carried forward).
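One common way to handle the multiplicity problem when the same comparison is made at several prespecified time points is the Holm step-down adjustment of p-values. The sketch below is offered only as an example of such a correction, not as the method required by any guideline; the function name and the example p-values are hypothetical.

```python
def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # multiply the k-th smallest p-value by (m - k + 1), capped at 1
        candidate = min(1.0, (m - rank) * p_values[idx])
        # enforce monotonicity so a larger raw p never gets a smaller adjusted p
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values for the same endpoint at three time points.
raw = [0.01, 0.04, 0.03]
print(holm_adjust(raw))
```

In this example only the first time point would remain below a 0.05 threshold after adjustment, although all three raw p-values were below it.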

Housing, Treatments, and Interventions

Animal Housing and Confounding Physiological Variables

Many factors should be considered in planning adequate and appropriate physical and social environments, including housing, space, and the availability or suitability of enrichments. Several guidelines are available on the subject, both in publications [49] and on websites (www.aaalac.com, iacuc.utk.edu). Furthermore, good laboratory practices should be followed given that the results will be submitted to regulatory agencies such as the US FDA and the European Medicines Agency (EMA), which also issue specific guidelines periodically [50].

Temperature, humidity, lighting, noise, and concentrations of gases and particulates can affect metabolism, physiology, and susceptibility to disease. Body temperature of the animals must be maintained in their physiological range (rodents, 38–39 °C; rabbits, 38–39.6 °C; pigs, 38.3–39 °C). This is especially delicate in rodents because they lose body heat easily (animals may die below 35 °C). Animals with heart failure will have lower temperatures in their extremities, so heating pads or nest material should be provided to keep them comfortable.

Exposure to sound louder than 85 dB or inconsistent animal handling can affect blood pressure in small animals like rats as well as in larger animals such as non-human primates [51, 52]. Thus, changes in blood pressure might reflect environmental factors rather than HF itself. Assessing blood pressure in conscious animals can be challenging if they are not acclimated. Rodents will experience tachycardia when under threat or manipulation stress. Therefore, it is important to perform repeated training and handling before starting the experimental procedure. This is crucial if blood pressure is one of the parameters to study.

Studies of circadian biology for cardiovascular disease indicate that diurnal rhythms (24-h day–night cycle) should also be taken into consideration, as the choice of the animal, nocturnal (rodents) vs. diurnal (larger animals), might affect the disease model and the preferred time of the therapeutic intervention [53, 54]. The major importance of diet has clearly been illustrated in, for example, the high-cholesterol diet rabbit model widely used for experimental atherosclerosis. Such a diet combined with arterial wall injury produced valuable results on plaque rupture in early models of atheroma inflammation [55–57], inflammation-associated atherosclerosis [58], and even in models of vulnerable atherosclerotic plaque [55]. Veterinary advice also should be available to ensure the best diet to maintain animals in optimal health with all nutrient requirements provided.

Anesthesia, Analgesia, and Potentially Confounding Treatments

Careful administration of anesthetic and analgesic drugs provides an appropriate level of hypnosis, muscle relaxation, and analgesia while minimally affecting autonomic function and the cardiovascular system. These drugs must suit the particular needs of the animals and the characteristics of the experimental procedure employed. Opioids should be administered for invasive and painful procedures, such as a thoracotomy to place an aortic band in dogs or a transverse aortic constriction in mice. For example, fentanyl can be used for severe pain as it provides potent analgesia, but only for a short duration; buprenorphine can be administered for moderate pain and for postoperative care as it provides a good level of analgesia with a long duration. Inadequate analgesia during and after surgery can lead to undesirable pathophysiologic responses, such as tachycardia, elevated blood pressure, or high catabolic demand, and will thus increase postoperative mortality and morbidity.

Anesthesia can be provided either with inhaled or intravenous agents. In general, administration of low doses of the anesthetic drug has lower risks and produces less cardiovascular depression. A combination of anesthetic and analgesic drugs (balanced anesthesia) to reduce the amount of anesthetic agents will increase the safety of the procedure. In particular, the use of sevoflurane instead of isoflurane or halothane is recommended due to the adverse effects of the latter, especially in pigs, which are more susceptible to developing malignant hyperthermia and cardiotoxicity when used with β-blockers. When using anesthesia during the experimental period or sample isolation (blood sampling, control echocardiography, etc.), the animals may show different degrees of hypotension, bradycardia, arrhythmias, hypoxia, and an increase in diuresis that could mask heart failure symptoms.

Some anesthetic drugs, including propofol and halogenated volatile anesthetic agents, manifest pre- and post-conditioning cardioprotective properties via different mechanisms of action. Therefore, the choice of the anesthetic agent and the protocol of administration should be carefully considered when evaluating the therapeutic potential of a given compound. No changes in anesthesia should be introduced in the middle of an experimental protocol so as to avoid unknown confounding effects on the experimental model and experimental results.

Treatment Delivery

Delivery of the studied therapeutic product in experimental animals should follow the same pattern as in humans, but doses must be adjusted to the size of the animals [59]. Intramuscular administration is an easy delivery route, requiring only the proper handling of the animal, but it is probably the least frequently used route in humans. Oral administration in small animals can be performed by oral gavage if the drug can be dissolved. If not, it can be given mixed with the chow. In large animals, drugs can be mixed (if powdered or crushed) with the chow or administered with a special pill gun as a whole pill. If given with the chow, it may be convenient to confirm the presence of the drug or its metabolites in the blood to ensure delivery. Chemical stability of the studied drugs must be known before deciding the route of administration. Administration of biological products, such as cells or viral particles, should follow the same route that would be used in humans (e.g., through percutaneous coronary intervention, PCI). Treatment delivery can be confirmed in live animals or after necropsy (see “Confirmation of Treatment Delivery”). Animal models are also useful to guide the development of new delivery methods [60].
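One widely used convention for adjusting doses across species is body surface area normalization with Km factors, in which a dose in mg/kg is multiplied by the ratio of the source species' Km to the target species' Km. The sketch below illustrates that convention only; it is not necessarily the approach of reference [59]. The Km values are the commonly cited ones from the FDA body surface area conversion table and are included here as assumptions that should be checked against the current guidance before use. The function and dictionary names are hypothetical.

```python
# Km factors (body weight in kg divided by body surface area in m^2).
# ASSUMPTION: commonly cited values; verify against current regulatory guidance.
KM_FACTOR = {
    "mouse": 3,
    "rat": 6,
    "rabbit": 12,
    "dog": 20,
    "minipig": 35,
    "human": 37,
}


def convert_dose(dose_mg_per_kg, from_species, to_species):
    """Convert a mg/kg dose between species by body surface area normalization."""
    try:
        km_from = KM_FACTOR[from_species]
        km_to = KM_FACTOR[to_species]
    except KeyError as exc:
        raise ValueError(f"no Km factor for species {exc.args[0]!r}") from None
    return dose_mg_per_kg * km_from / km_to


# Hypothetical example: a 10 mg/kg human dose expressed as a mouse dose.
print(round(convert_dose(10, "human", "mouse"), 1))
```

Surface area scaling is only a starting point; species differences in half-life and active metabolites can still call for pharmacokinetic confirmation of exposure.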

All treatment administrations that require the intervention of the investigator can induce stress in animals. As death can occur in diseased and weak animals, this is an important issue that will require expert handling by the investigator and prior training of the animals.

Co-treatments

Once the efficacy of a new treatment has been established against untreated control animals in POC experiments, additional investigations should be carried out in the context of clinical SOC. New treatments should be tested in the presence of adequate co-medications. This is particularly relevant to biological products because initial clinical trials in humans will likely be carried out in the presence of SOC drugs that might mask the beneficial effect of the tested product. Therefore, co-treatment with standard HF drugs (β-blocker, ACE inhibitors, etc.) should try to mimic the scenario of a future clinical trial.

Endpoints and Analysis

Given the large number of end points that can be selected and the even larger array of options of how to measure them, the protocol in animal studies should prespecify explicitly the end points of interest, how they are to be measured, and how they are to be analyzed (Table 3).

Table 3 Summary of endpoints and analyses to be carried out in preclinical HF models

Cardiac Function and Ventricular Remodeling

Cardiac function can be measured invasively or non-invasively. Invasive measurements provide detailed information on pressure-related parameters, whereas non-invasive measurements are capable of evaluating structural parameters. Echocardiography and magnetic resonance imaging (MRI) are the most frequently used non-invasive modalities for the evaluation of cardiac function. Both modalities provide similar systolic function parameters, such as left ventricular ejection fraction and stroke volume, which are also widely accepted indices in clinical care [61]. While echocardiography offers wide availability and lower costs, MRI has superior spatial resolution that provides more accurate volumetric assessments and less variability [62]. When using echocardiography, the use of three-dimensional echocardiography is recommended for assessing ejection fraction to reduce geometric assumptions made by two-dimensional and M-mode echocardiography and their less reliable fractional shortening index of LV function [63, 64].
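For reference, the systolic indices named above follow directly from end-diastolic and end-systolic measurements. The sketch below uses the standard definitions of stroke volume, ejection fraction, and fractional shortening; the function names and example values are hypothetical, and units only need to be consistent within each call.

```python
def stroke_volume(edv, esv):
    """Stroke volume = end-diastolic volume minus end-systolic volume."""
    return edv - esv


def ejection_fraction(edv, esv):
    """Ejection fraction in percent, from end-diastolic and end-systolic volumes."""
    if edv <= 0:
        raise ValueError("end-diastolic volume must be positive")
    return 100.0 * (edv - esv) / edv


def fractional_shortening(lvidd, lvids):
    """Fractional shortening in percent, from LV internal diameters in diastole and systole."""
    if lvidd <= 0:
        raise ValueError("diastolic diameter must be positive")
    return 100.0 * (lvidd - lvids) / lvidd


# Hypothetical example values (volumes in mL, diameters in mm).
print(stroke_volume(120, 60), ejection_fraction(120, 60), fractional_shortening(4.0, 3.0))
```

Fractional shortening relies on a single linear dimension, which is why it is less reliable than a volumetric ejection fraction when the ventricle is regionally abnormal, as after MI.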

Recent advances enable assessment of myocardial deformation with both MRI and echocardiography, offering opportunities to assess regional wall motion [65]. These approaches provide insight into the mechanical function of the myocardium in myocardial infarction, valve disease, diastolic dysfunction, and ventricular dyssynchrony for patients or animals undergoing CRT [66]. Myocardial deformation imaging can detect subclinical ventricular dysfunction and allows early detection of cardiotoxicity [67]. Invasive metrics of systolic function include left ventricular end-systolic pressure, maximal first derivative of LV pressure (dP/dt max) obtained by a left ventricular manometer, and cardiac output by a Swan-Ganz catheter. All of the above indices are known to be influenced by loading conditions, and several approaches are proposed to adjust for load status [68, 69]. The pressure-volume loop relationship can be determined to accurately evaluate changes in intrinsic, i.e., load-independent, contractility.

A frequently overlooked issue in the non-invasive assessment of cardiac size and performance is the effect of anesthesia. Rodents exposed to deep anesthesia become bradycardic, hypothermic, and hypercarbic owing to respiratory depression, each of which strongly affects cardiac function. Rodents analyzed by MRI are particularly susceptible, as the anesthesia required is deep and ventilator support is often not provided. It is critical that these changes be tracked and accounted for in the ongoing analyses.

Assessment of diastolic function is better established by echocardiography for non-invasive measurement with the use of Doppler-derived parameters, including tissue Doppler imaging. Many indices are available, including the ratio of early (E) to late (A) diastolic filling, E wave deceleration time, and mitral inflow propagation velocity [70]. Passive diastolic function can also be evaluated by end-diastolic pressure–volume loop relationships, although this is an invasive approach and usually requires subsequent euthanasia. Pressure-derived indices have the advantage of higher temporal resolution for assessing the active diastolic function as represented by the time constant of relaxation (tau) and minimal first derivative of LV pressure (dP/dt min).
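The pressure-derived indices can be illustrated with a short sketch. It assumes a uniformly sampled LV pressure trace, estimates dP/dt by finite differences, and estimates tau with a monoexponential, zero-asymptote fit of the form P(t) = P0·exp(−t/tau). That fitting model is one common choice and is an assumption here; the text does not prescribe a particular method. The function names and the synthetic trace are hypothetical.

```python
import math


def dp_dt(pressure_mmhg, dt_s):
    """First derivative of LV pressure (mmHg/s) for a uniformly sampled trace.

    Central differences are used for interior samples and one-sided
    differences at the two ends.
    """
    n = len(pressure_mmhg)
    if n < 3:
        raise ValueError("need at least three samples")
    p = pressure_mmhg
    deriv = [0.0] * n
    deriv[0] = (p[1] - p[0]) / dt_s
    deriv[-1] = (p[-1] - p[-2]) / dt_s
    for i in range(1, n - 1):
        deriv[i] = (p[i + 1] - p[i - 1]) / (2.0 * dt_s)
    return deriv


def tau_monoexponential(relaxation_mmhg, dt_s):
    """Time constant of relaxation (s) from a least-squares line through ln(P).

    The caller is expected to pass only the isovolumic relaxation segment
    (for example, from dP/dt min until pressure nears end-diastolic pressure).
    Assumes a zero pressure asymptote, so every sample must be positive.
    """
    if len(relaxation_mmhg) < 2 or min(relaxation_mmhg) <= 0:
        raise ValueError("need at least two strictly positive samples")
    times = [i * dt_s for i in range(len(relaxation_mmhg))]
    logs = [math.log(p) for p in relaxation_mmhg]
    mean_t = sum(times) / len(times)
    mean_y = sum(logs) / len(logs)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    sxx = sum((t - mean_t) ** 2 for t in times)
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("pressure does not decay over the segment")
    return -1.0 / slope


# Synthetic relaxation segment sampled at 1 kHz with a known tau of 40 ms.
segment = [80.0 * math.exp(-i * 0.001 / 0.040) for i in range(60)]
print(round(tau_monoexponential(segment, 0.001) * 1000.0, 3), "ms")
print(round(min(dp_dt(segment, 0.001))), "mmHg/s (dP/dt min of the segment)")
```

The sampling rate matters in practice: a coarse time step smooths the derivative and underestimates the magnitude of dP/dt min, which is the temporal-resolution advantage of pressure catheters noted above.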

Parameters that reflect functional impairment vary depending on the type of disease. Ejection fraction is more sensitive than dP/dt max in detecting MI-induced systolic dysfunction [69], but it is well maintained until the late stage of mitral regurgitation-induced HF [71]. Thus, any of the above modalities alone is insufficient to independently provide full information on cardiac function, and a combination or complementary approach is usually preferred. Multimodality assessment is recommended to comprehensively evaluate cardiac function.

Tissue Remodeling, Fibrosis, and Inflammation

Tissue remodeling can be assessed by different histological and molecular approaches. Collagen deposition can be assessed by Masson’s trichrome or picrosirius red staining. Picrosirius red also allows quantifying collagen cross-linking by using polarized light. Cardiac fibrosis can also be estimated with MRI. Late gadolinium enhancement allows visualization of the scar region [72], whereas diffuse fibrosis can be determined using the myocardial extracellular volume fraction (ECV) as an indicator of the connective tissue fraction [73]. Cardiomyocyte hypertrophy can be estimated by measuring cell cross-sectional area after staining with wheat germ agglutinin [74]. Myocardial expression of HF markers such as brain natriuretic peptide (BNP), α-skeletal actin, and, in some species, β-myosin heavy chain can be determined by qRT-PCR or by Western blot [75, 76]. Expression of inflammation and fibrosis-related markers can also be quantified in this manner [77]. Additionally, ELISA can be used to detect BNP, troponin, cytokines, and hormones in the blood [78]. The analysis of proteomic and transcriptomic profiles in the affected heart will provide further information on protein and gene modifications [79, 80].
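For readers unfamiliar with how ECV is derived, the sketch below applies the widely used T1-mapping relation, in which ECV equals (1 − hematocrit) multiplied by the ratio of the contrast-induced change in 1/T1 in myocardium to that in blood. The function name and the numerical inputs are hypothetical and serve only to show the arithmetic.

```python
def extracellular_volume_fraction(
    hematocrit: float,
    t1_myo_pre_ms: float,
    t1_myo_post_ms: float,
    t1_blood_pre_ms: float,
    t1_blood_post_ms: float,
) -> float:
    """Myocardial ECV (as a fraction) from pre- and post-contrast T1 values.

    ECV = (1 - hematocrit) * delta_R1(myocardium) / delta_R1(blood),
    where R1 = 1 / T1 and delta_R1 is post-contrast minus pre-contrast.
    """
    if not 0.0 < hematocrit < 1.0:
        raise ValueError("hematocrit must be a fraction between 0 and 1")
    delta_r1_myo = 1.0 / t1_myo_post_ms - 1.0 / t1_myo_pre_ms
    delta_r1_blood = 1.0 / t1_blood_post_ms - 1.0 / t1_blood_pre_ms
    if delta_r1_blood <= 0:
        raise ValueError("post-contrast blood T1 must be shorter than pre-contrast")
    return (1.0 - hematocrit) * delta_r1_myo / delta_r1_blood


# Hypothetical T1 values in milliseconds, for illustration only.
ecv = extracellular_volume_fraction(0.42, 1000.0, 450.0, 1600.0, 300.0)
print(f"ECV = {100.0 * ecv:.1f} %")
```

Because hematocrit enters the calculation directly, it should be measured close to the time of imaging.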

Cardiac Metabolism

Myocardial viability and metabolic function can be assessed by non-invasive imaging [81]. Glucose and fatty acid uptake can be determined by positron emission tomography (PET) using 18F fluorodeoxyglucose (18F-FDG) and 18F-fluoro-6-thia-heptadecanoic acid (18F-FTHA) as radiotracers. Metabolic changes and mitochondrial activity can be determined by magnetic resonance spectroscopy (MRS), which allows non-invasive real-time quantification of pyruvate, phosphocreatine, ATP, and inorganic phosphate [81, 82]. Changes in metabolite concentration can also be determined biochemically, although tissue phosphocreatine and ATP are unstable and are best determined by MRS.

Confirmation of Treatment Delivery

For some treatments, especially biologicals such as those used in cell or gene therapies, it is necessary to verify that the therapeutic agent has been effectively delivered to the targeted organ. The delivery of a viral vector can be confirmed by qPCR, and the expression of the therapeutic gene can be quantified by qRT-PCR or Western blot. The biodistribution and expression in organs other than the target organ or cell type should be analyzed by qRT-PCR and immunohistochemistry. The presence of transplanted cells can be visualized in the live animal by MRI (superparamagnetic iron oxide (SPIO)-labeled cells) or after necropsy by histological methods and confocal microscopy. Cell delivery and engraftment can be quantified by qPCR or Western blot using markers specific for the therapeutic cell population.

Confirm Efficacy via More Than One Measurement

Efficacy should be measured by more than one index when possible. Imaging should be accompanied by target tissue analysis and, where feasible, by systemic biomarker changes, as discussed above, ideally in both small and large animal species.

Secondary Effects of Treatment on Other Organs

Risk for any direct or indirect collateral pharmacodynamic activity can be tested in vitro and in vivo. The pharmaceutical industry and regulators both rely upon data from a battery of in vitro receptor and transporter binding assays in forecasting such activity. In intact animals, and in some animal disease models, circulating markers of organ function and histomorphology provide evidence of any acute safety pharmacology issues, organ dysfunction, or tissue damage. A variety of methodologies and techniques can be used to test for unintended effects of a given intervention, including qPCR, biochemistry, and histological analysis, for direct or immunopathic lesions. Animal disease models may be needed to identify certain risks, such as the arrhythmic activity of class 1c anti-arrhythmic agents and positive inotropes.

Survival

Whenever possible, survival experiments should be prioritized to gain information on efficacy and safety. However, these follow-up studies are expensive, and hence, their utility must be clearly foreseen and accepted by granting agencies and pharmaceutical companies, especially in large animal studies. During the experimental procedure, all animals should follow the welfare protocol that assesses the end points. Animals that have successfully finished the experimental procedure plus follow-up should be euthanized in order to obtain samples for the study. Animals that cannot be further studied should be euthanized following the ethical codes for animal research.

Pharmacokinetics: ADME

Absorption, distribution, metabolism, excretion (ADME) parameters (e.g., area under the curve (AUC) and Cmax) determine exposure in a healthy animal or a disease model and, therefore, the responses that are exposure-dependent in either scenario. Depending on whether there is subsequent hepatic or renal dysfunction associated with the HF or if any co-morbidities or co-therapies were introduced into the model, the blood levels of liver metabolites and/or renally excreted drugs can be significantly affected. Blood and tissue levels may also change over time in a trial in simulated HF [83]. Actual AUC or Cmax exposures achieved in vivo at pharmacodynamic or POC vs. toxic dosages, rather than relative milligrams per kilogram or milligrams per square meter dosages, provide an index of inherent therapeutic safety and the bases for forecasting “veterinary” or clinical safety risks as well as for selecting a prudent “first-in-man” starting dose.
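The exposure metrics discussed here can be sketched in a few lines. The example below computes AUC by the linear trapezoidal rule, reads Cmax and Tmax from the sampled profile, and expresses the margin between a toxic and a pharmacodynamic dosage as a ratio of exposures. The trapezoidal rule is one common non-compartmental choice and is an assumption, as are the function names and the concentration values.

```python
def auc_trapezoid(times_h, conc):
    """Area under the concentration-time curve by the linear trapezoidal rule."""
    if len(times_h) != len(conc) or len(times_h) < 2:
        raise ValueError("need matching time and concentration lists of length >= 2")
    area = 0.0
    for i in range(1, len(times_h)):
        width = times_h[i] - times_h[i - 1]
        if width <= 0:
            raise ValueError("sampling times must be strictly increasing")
        area += 0.5 * (conc[i] + conc[i - 1]) * width
    return area


def cmax_tmax(times_h, conc):
    """Peak observed concentration and the time at which it was observed."""
    peak_index = max(range(len(conc)), key=lambda i: conc[i])
    return conc[peak_index], times_h[peak_index]


def exposure_multiple(exposure_toxic, exposure_pharmacodynamic):
    """Safety multiple expressed as a ratio of exposures (AUC or Cmax), not of doses."""
    if exposure_pharmacodynamic <= 0:
        raise ValueError("pharmacodynamic exposure must be positive")
    return exposure_toxic / exposure_pharmacodynamic


# Hypothetical plasma profile (hours, ng/mL), for illustration only.
times = [0.0, 1.0, 2.0, 4.0, 8.0]
levels = [0.0, 10.0, 8.0, 4.0, 1.0]
print(auc_trapezoid(times, levels))   # 36.0 ng*h/mL
print(cmax_tmax(times, levels))       # (10.0, 1.0)
```

Comparing AUC values obtained in the HF model with those from healthy animals at the same dose is a direct way to detect the disease-related changes in exposure described above.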

Statistics and Power Calculations

Studies should be properly powered to detect meaningful effect sizes. Underpowered studies run the risk of increasing the problem of both false negatives and, given the current research environment with selective reporting, false positives [84]. Moreover, when an effect is detected in an underpowered study, its estimated magnitude is likely to be inflated, even when the effect is genuine [85].
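A minimal sketch of a sample size calculation is given below for the simplest case, a two-sided comparison of means between two equally sized groups. It uses the normal approximation n = 2·((z(1−α/2) + z(1−β))·σ/δ)² per group, which slightly underestimates the number required by an exact t-test calculation. The function name, default values, and example effect sizes are assumptions for illustration.

```python
import math
from statistics import NormalDist


def animals_per_group(delta: float, sd: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Animals needed per group to detect a mean difference `delta`.

    Normal approximation for a two-sided, two-sample comparison with a
    common standard deviation `sd` and equal allocation.
    """
    if delta <= 0 or sd <= 0:
        raise ValueError("delta and sd must be positive")
    if not (0.0 < alpha < 1.0 and 0.0 < power < 1.0):
        raise ValueError("alpha and power must lie strictly between 0 and 1")
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    n = 2.0 * ((z_alpha + z_beta) * sd / delta) ** 2
    return math.ceil(n)


# Hypothetical example: ejection fraction with a between-animal SD of 5 points.
print(animals_per_group(delta=5.0, sd=5.0))   # 16 per group for a 5-point difference
print(animals_per_group(delta=2.5, sd=5.0))   # 63 per group for a 2.5-point difference
```

Halving the detectable difference roughly quadruples the number of animals, which is why the minimal meaningful effect size should be fixed before the study begins.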

Besides proper sample size calculations, some other aspects of the statistical design are essential. Some have argued that the protocol and analysis plan should be prespecified and ideally deposited in public before the study is performed and data are analyzed. This may reduce the opportunity for data dredging (inappropriate data fishing) after the fact. Even in experimental, randomized studies, there is ample room for data dredging that can take place post hoc, including changes in the eligibility criteria; changes in the time, types, and measures for assessment of outcomes; and changes in the analytical model adopted for handling missing data, covariates, or other study-specific caveats. When analyses have not been prespecified, investigators should clarify that this is the case so that the statistical analysis can be seen as exploratory and requiring independent validation in subsequent studies.

Toxicity

Cardiac Toxicity in Animal Models

Mammalian hearts share many physiological, cell signaling, and energetic features. They also share vulnerability to direct cytotoxic, ischemic, and overload insults. For instance, they react similarly to these insults with anatomic, electrophysiological, and energetic “remodeling” during chronic overload or persistent atrial fibrillation [86]. This can render animal disease models not only suitable for POC testing but, in some cases, also for more reliable safety (toxicity) testing. The potential use of animal models for functional safety assessment is increasingly discussed, as new therapeutic modalities and targets are being identified and developed, especially including those for HF [87]. Moreover, histomorphological or functional safety studies performed in a disease model can provide a “veterinary” safety multiple based on the dosage that provokes histologic lesions and/or compromises cardiac function compared to a pharmacodynamic dosage. Animal models of cardiac disease have reliably forecast adverse effects on cardiac physiological function, especially on normal sinus rhythm (e.g., pro-arrhythmogenicity of certain inotropes and anti-arrhythmic agents) [42, 88]. Use of experimental models of acute MI, for example, confirmed the clinically observed detrimental effect of class 1B and 1C anti-arrhythmic drugs in patients with acute MI vs. beneficial effect of beta-adrenoceptor-blocking agents in that context [89–91].

Translational Toxicology Studies

Conventional toxicology studies of an IND for HF usually monitor gross and microscopic tissue integrity in a healthy animal rather than in a genetic or surgical model of HF, mainly due to the possible confounding of background lesions and the cost and inconvenience of the latter. Besides histomorphology and gross pathology, safety endpoints should include mortality (with cause of death), clinical observations, body weight, physical examinations, food and water consumption, clinical pathology (blood and urine), and organ weight. When testing biologicals, such as stem/progenitor cells or viral vectors, additional analyses should investigate the possibility of tumor formation or activation of the immune response. Known toxicities for similar treatments already found in other organs or disease settings should also be tested, taking into account that toxicities may appear late after treatment. Therefore, potential detrimental effects should be tested at different time points.

Conventional toxicity studies are not typically designed to capture cardiac performance, rhythm, contractility, and loading conditions. This is an important limitation of routine testing for cardiac toxicity because adverse functional cardiac side effects are perceived to be more frequent than morphological and biochemical lesions [92]. An adverse effect on function can be an extension, or an indirect consequence, of targeted PD activity or it can be off-target, but sometimes that distinction cannot be made. Some positive inotropic agents like milrinone and vesnarinone promote cardiac death in HF, most likely due to ventricular tachyarrhythmias [93]. Certain anti-arrhythmic agents and other drugs used in HF that prolong the QTc interval carry a risk of torsades de pointes arrhythmia [42, 88]. An HF model best suited to capture this or other arrhythmia liability would ideally recapitulate the underlying structural disease, mechanical and neurohumoral factors, electrolyte abnormalities, ischemia, and co-therapies that make up the context of such toxicity. However, some factors may be more important than others. For example, as opposed to the intact dog, the chronic anesthetized AV-blocked dog is reported to be highly vulnerable (>70 %) to the torsadogenic effect of multiple class III anti-arrhythmic agents (e.g., dofetilide, almokalant, and ibutilide), likely as a result of anatomic and “electrical” ventricular remodeling during prolonged AV block [88, 94, 95]. Normal sinus rhythm and stroke volume can be compromised by focal or global disturbances in conduction of electrical impulses or in myocardial metabolism, which are manifested as disruptions in atrial or ventricular rhythm or contractility. Cardiac output may be normal in that context via the operation of compensatory mechanisms, such as electrical or chamber remodeling that may ultimately be deleterious.

When important behavioral toxicity is detected in a disease model used to provide POC, especially at exposures less than the no observed adverse effect level (NOAEL) in conventional safety studies, such a scenario should be further assessed by a GLP study, as the starting dose in humans will rely on the new NOAEL. Indeed, the safety data may even forecast risk, if not contraindication, in certain patient populations. A prominent example is the capacity of flecainide to raise the defibrillation threshold for cardioversion of programmed electrical stimulation (PES)-induced arrhythmia in dogs with recent myocardial infarction [96]. This is consistent with the label warning against the use of this and other class 1C anti-arrhythmic agents in patients with recent myocardial infarction and ventricular arrhythmias.

Limitations of Animal Models

Although mammals share the same basic molecular biological architecture for coordinated cardiac contraction and the mechanisms that control it, the PD activity and safety of any given cardiac treatment do not always translate across species. For instance, the vasodilator and positive inotrope milrinone, a cardiac and vascular phosphodiesterase (PDE)-III inhibitor with limited approval in HF, is not cardiotonic in the rat, though the rodent expresses the cardiac PDE-III enzyme as does the human [97]. However, it is arrhythmogenic in both the intact rat and in the congestive HF patient, where it promotes sudden death [98]. Another critical translation principle is that safety can be disease-state dependent. The tachy-arrhythmogenicity of milrinone is not apparent in the intact dog even when it is given with an arrhythmogenic dosage of ouabain, but it causes ventricular fibrillation when focal ischemia is present [99, 100]. Similarly, the class 1c anti-arrhythmic agent flecainide is pro-arrhythmic and essentially contraindicated in cardiac patients when they have a history of myocardial infarction, but its use is not restricted if structural heart disease is absent [15]. Accordingly, FDA’s Division of Cardiovascular and Renal Products, for example, expects that cardiotonic and anti-arrhythmic INDs will be tested for any pro-arrhythmia in animal disease models, especially those with background structural heart disease—the context in which excess arrhythmic death was seen in the CAST [15]. These examples illustrate that expression of toxicological cardiac dysfunction and other cardiac pathology depends on the pathophysiological state of the heart.

Regulatory Aspects

The development and adoption of new therapies for cardiac patients are tightly regulated. Compliance with these rules during preclinical investigation is intended to facilitate their translation to the clinic. Two independent but complementary concepts have to be distinguished in the lifecycle management of a medicinal product. The first concept deals with the “relevant criteria” and the “technical requirements” that need to be satisfied and documented. These establish the quality via chemical/biological analysis (GMP quality control testing) and both the safety and efficacy profile per studies in animal models (see above) and in clinical trials. Preclinical safety trials in at least one rodent and non-rodent animal species with PD activity are mandated in addition to tests on safety (functional) pharmacology, genotoxicity, reproductive toxicity, and oncogenicity. In patients, placebo-controlled, randomized, double-blind trials are needed.

The second concept is known as “good practice” (GxP) rules, which are to be followed in the development and commercialization chain. The GxP rules have been developed in at least three important sectors of a medicinal product development. The good laboratory practice (GLP), good manufacturing practice (GMP), and good clinical practice (GCP) rules are thus intended to be implemented deliberately for preclinical testing, for the production and quality control of any investigational or commercial product planned for a human use, and for the conduct of the clinical trials, respectively.

In the USA, non-clinical in vitro, in vivo, and ex vivo studies of toxicity and safety pharmacology are implemented by the Code of Federal Regulations [101], whereas in the EU, the non-clinical development program is implemented in annex I of Directive 2001/83 [102]. For both regions, the GLP mandate is intended to assure validity and reproducibility of results of such studies by requiring that all underlying procedures be defined, monitored, and documented. GLP ensures the generation of verifiable, quality data for the dossier to be submitted for MA (IND in the USA, MA in the EU) consideration. However, GLP validation is not scientific validation, rather a set of rules and requirements to ensure that agreed procedures have been followed (in the USA, per the general provisions of CFR, sub-part A, section 58.3). Such requirements are waived for basic exploratory studies performed to predict efficacy or to identify physical or chemical properties of a test article. Importantly, neither GLP nor GMP is sufficient to verify the relevance of the tests or process performed under this “quality system.” The adequacy of both the manufacturing process or quality control strategy and of the non-clinical testing plan to achieve the legitimate expectation of quality and safety for a medicinal product is primarily dependent on the design and execution of the development plan.

Technical Requirements

Marketing authorization of a medicinal product can only be issued after assessment and review of a “marketing authorization dossier” documenting the following “relevant criteria”: i) the quality profile of the product, which includes the characterization studies, the production/purification process, and the related quality control strategy; ii) the pharmaco-toxicological data to establish the safety profile; and iii) the clinical data to establish the clinical benefits and the adverse effect profile. Based on this composite, the evaluation process will derive a risk/benefit assessment for granting marketing authorization and writing the product literature for both prescribers and patients.

Technical Criteria for Documenting the Quality Profile

Quality aspects cover the characterization and establishment of production and purification processes, ensuring that any batch will be released in a consistent manner with the expected quality, purity, and biological activity. It is strongly recommended, especially for advanced therapy medicinal products (ATMPs), that a dialogue be initiated with the regulators from the very outset of any clinical program to ensure that bench-scale procedures can be successfully translated into an approvable protocol.

Cell Manufacturing

In the case of cell-based therapies, specific conditions apply. The origin of the cells, infectious disease screening, collection, cell culture, expansion, cryopreservation/storage/thawing procedures, and transport conditions should be precisely documented. Specific data to be recorded also vary according to the origin of the cells. In the case of autologous cells, the potential risk that the manufacturing process may lead to propagation of a pathogenic agent initially present in the donor should be assessed. For allogeneic cells, an extensive screening for infectious diseases (HIV, hepatitis, CJD, etc.) should be completed. When using cell scaffolds, these will be evaluated as medical devices according to tissue engineering-specific requirements (e.g., composition, degradation profile, biocompatibility, description of the manufacturing process and characterization of its residuals, sterility, and toxicity). For the combination product, cells will have to be evaluated as any isolated cell product, but cell–material interactions will also have to be thoroughly assessed.

Raw Materials and Other Reagents

All materials and reagents, including those specific to cell culture media, have to be qualified and certified before they enter the GMP-certified production plant. As such, the traceability of all ancillaries and documentation about origin, sterilization procedures, and quality controls for each batch are major requirements, particularly if there is a risk that some of these compounds may persist in the final product. These issues should be worked out from the outset as it is often challenging to collect the relevant information, which may be protected by intellectual property rights. Also, to anticipate what might likely become a mandatory requirement in the years to come, the avoidance of xenogeneic products, the use of fully defined chemical media and synthetic surfaces, and the development of automated systems incorporated in closed bioreactors should all be strongly encouraged [103].

Drug Substances and Drug Products

A robust in-process quality control system, as well as an end-product testing program, are needed to generate a consistent product that meets the criteria required for release of a given batch prior to its administration (be it for a clinical trial or for commercialization). The major criteria include viability, sterility, identity (detection of specific surface markers or functional assays), tumorigenicity and genomic stability, purity (through the minimization of undesirable non-cellular impurities and cell debris), and stability both throughout the process (particularly if an intermediate period of cryopreservation has been used) and under the proposed shipping conditions. An important feature of this quality control system is the reliance on robust, sensitive, and specific analytical tools. Another key challenging criterion is the demonstration of potency, i.e., evidence that the biological activity of the final product meets the intended mechanism of action and therefore the therapeutic effect.

Technical Criteria for Documenting the Safety Profile in Non-clinical Developments

The main objectives of non-clinical (preclinical) tests are to identify the pharmacology and toxicity of an investigational drug and to minimize the risk associated with it in the initial clinical trials. The rationale for the non-clinical tests as well as the choice of the animal model(s) should be discussed and justified.

Statistical Evaluation

Power calculations to estimate sample sizes, randomization, choice of appropriate controls, and blinding are strongly recommended.
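Randomization and blinding can be implemented with very little tooling. The sketch below shows one possible approach, permuted-block allocation with a recorded seed plus neutral group codes for the investigators who assess outcomes. The block design, function names, animal identifiers, and seed are illustrative assumptions and not a procedure mandated by any guideline cited here.

```python
import random


def block_randomize(animal_ids, arms, seed):
    """Allocate animals to arms in permuted blocks.

    Each consecutive block of len(arms) animals receives every arm exactly
    once in random order, which keeps group sizes balanced over time.
    The seed is recorded so that the allocation can be audited.
    """
    rng = random.Random(seed)
    allocation = {}
    block_size = len(arms)
    for start in range(0, len(animal_ids), block_size):
        block = list(arms)
        rng.shuffle(block)
        for animal, arm in zip(animal_ids[start:start + block_size], block):
            allocation[animal] = arm
    return allocation


def blind_codes(arms, seed):
    """Map each arm to a neutral label; the key is held by an independent person."""
    rng = random.Random(seed)
    codes = [f"Group {chr(ord('A') + i)}" for i in range(len(arms))]
    rng.shuffle(codes)
    return dict(zip(arms, codes))


# Hypothetical study with 12 animals and two arms.
animals = [f"animal-{i:02d}" for i in range(1, 13)]
arms = ("vehicle", "treatment")
allocation = block_randomize(animals, arms, seed=2024)
key = blind_codes(arms, seed=2025)
for animal in animals:
    # Operators and outcome assessors see only the coded label.
    print(animal, key[allocation[animal]])
```

Generating the allocation before the first intervention, and keeping the key away from those who measure the end points, addresses both recommendations at once.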

Test Articles

Although the formulation may vary, the active pharmaceutical ingredient (API) tested in non-clinical experiments should be essentially identical to the product intended for use in humans, especially for the initial clinical studies. However, it is not unusual for the production/purification process of the API to be subsequently modified to improve product quality, consistency, and robustness during the normal development course of a medicinal product, particularly for ATMPs. The potential implications of such changes for extrapolation of the animal findings to humans should be considered and fully justified, especially if the impurity or degradation profile differs from that of the batches used in the critical non-clinical trials.

GLP

The basic objective of good laboratory practice (GLP) is to assure that reliable data for forecasting safety of pharmaceutical products are collected [104]. GLP consists of a quality system that encompasses the organizational process and the conditions under which non-clinical safety studies are planned, performed, monitored, recorded, archived, and reported. GLP rules also allow a mutual acceptance of the results and outcome from non-clinical animal tests, avoiding repetition and unnecessary use of animals. The US FDA accepts studies performed per OECD guidelines or as promulgated under 21CFR58.

Human Resources

The non-clinical study plan should include a clear management structure that identifies the study director, the product under investigation, and the specific purpose of the study. The team should ensure compliance with GLP guidelines, conduct audits on the facilities and processes, and ensure that the final report correctly reflects the raw data accumulated during the study.

General Considerations

Facilities should have enough space to conduct the study, including isolated rooms if necessary. Source, date, and conditions of animal arrivals should be recorded. A period of quarantine needs to be respected before experiments can be started. The product under investigation will need clear identification of date of receipt, expiration date, storage conditions, and stability under these conditions, quantities received vs. actually used in studies, batch number, and composition. All reagents must be clearly identified and fully traceable.

Procedures

Written standard operating procedures (SOPs) should be available for the tested products, equipment, and reagents. Relevant information on animals (species, strain, weight, age, sex, and source of supply) and product administration (route, dose, and frequency) should be precisely reported. As for a clinical trial, the experimental design, end points, and methods should be recorded and included in the final report. All data generated during the study should be recorded accurately, signed, and archived for the period specified by the appropriate authorities. A detailed report should be issued at the completion of the study, including extent of compliance with GLP principles. Areas of non-compliance, particularly with biologicals, should be identified and their significance evaluated relative to the overall safety assessment [105].

GMP

Any medicinal product intended to be administered to a human subject, be it for routine treatment or for clinical trial, should be manufactured and released in good manufacturing practice (GMP)-certified establishments under the supervision of a qualified person in accordance with the quality standards appropriate to their intended use and described in the clinical trial or MA dossier. GMP principles have been laid down for many years now, both in the USA and Europe, and the general principles are broadly similar to the GLP rules. They mainly deal with the quality management system, personnel, premises and equipment, documentation, production, quality control as well as other activities surrounding the production system. Whereas some non-clinical studies have been conducted, and accepted, without full compliance with GLP mandates, the GMP rules have to be applied for the production of any batches intended for humans.

iPS Cells as an Alternative

In some cases, the safety and/or efficacy of a new treatment can be first assessed in vitro using induced pluripotent stem cells (iPSCs). Since its introduction in 2007 [106], this ground-breaking technology has been used to model diseases, interrogate drug response and toxicity, and create multiple cell types for therapeutic transplantation. To create iPSCs, somatic cells are first isolated from patient or animal tissues, including skin from punch biopsies, peripheral blood, lipoaspirate, cord blood, and amniotic fluid, among others. These cells are then induced to develop into a pluripotent state similar to embryonic stem cells (ESCs) through ectopic expression of pluripotency transgenes, bypassing the most controversial ethical concerns associated with ESCs. Induction of pluripotency gives these cells the innate ability to be propagated indefinitely and to differentiate into therapeutic cell populations of all three germ layers, as demonstrated in cardiomyocytes (CMs), pancreatic beta-islet cells, hepatocytes, and neuronal cells [107–109].

The emergence of useful assays for the discovery and development of novel cardiovascular therapies depends on the ability to provide more efficient, predictive, and biologically relevant in vitro models to assess the potential on- and off-target toxicities of these new drugs. Once approved, approximately 4 % of drugs are withdrawn from the market due to safety issues [110, 111]. As discussed above, cardiac toxicity is the leading cause of drug attrition during pharmaceutical development and is a primary cause of drug withdrawal after market release, accounting for 42 % of all drugs withdrawn due to safety concerns between 1994 and 2006 [112, 113]. Under current phenotypic drug discovery approaches, a model is identified in which one or more key aspects of a disease process are recapitulated in vitro. Large compound libraries are then screened or new molecules synthesized to identify potential drugs that normalize function or correct/reverse the disease phenotype observed in vitro. Patient-specific iPSC-CMs make an ideal platform for drug discovery assays because of their capacity to recapitulate “disease-in-a-dish” phenotypes [114].

The most extensively studied models are those of arrhythmic disorders caused by mutations in cardiac ion channels (“channelopathies”) [115]. Electrophysiological assessment using multi-electrode arrays (MEAs) and patch-clamp techniques enables identification of prolonged field/action potentials (FP/AP) in iPSC-CMs from patients carrying long QT syndrome (LQTS) type 1 and type 2 mutations in potassium ion channels, which is associated with prolonged QT intervals on the patients’ electrocardiograms [116, 117]. Another class of in vitro-modeled cardiac disorders is the cardiomyopathies that are associated with deterioration in myocardial function and linked to heart failure and sudden cardiac death. One such disorder is dilated cardiomyopathy (DCM), which is characterized by ventricular dilation and systolic dysfunction. DCM iPSC-CMs carrying a cardiac troponin T (cTnT) mutation have demonstrated irregular sarcomeric organization, a scattered distribution pattern of Z bodies, reduction in contractile force, altered regulation of Ca2+ traffic, and decreased tolerance to β1-adrenergic challenge [118]. Like DCM, familial hypertrophic cardiomyopathy (HCM), which is characterized by thickening primarily of the left ventricular wall, has been efficiently modeled using iPSC-CMs. Lan et al. [119] generated iPSCs carrying a thick myofilament myosin heavy chain 7 (MYH7) mutation and demonstrated that iPSC-CMs were significantly enlarged compared to controls and had increased myofibril content, contractile arrhythmias such as delayed afterdepolarizations (DADs), Ca2+ cycling perturbations, and intracellular Ca2+ elevation. Adrenergic stimulation exacerbated Ca2+ transient irregularities and arrhythmias, mirroring the development of arrhythmias in HCM patients under sympathetic stimulation. These abnormalities were counteracted by the β-adrenergic blocker propranolol or the L-type Ca2+ channel blocker verapamil, thus supporting the notion that Ca2+ cycling perturbations and intracellular Ca2+ elevation are central mechanisms for disease development at the cellular level.

A separate study on LEOPARD syndrome, a multi-organ condition in which 80 % of patients present with hypertrophic cardiomyopathy as the most life-threatening aspect, showed that disease-specific iPSC-CMs had significantly increased median surface areas, sarcomeric disorganization, and nuclear localization of nuclear factor of activated T cells (NFATC4) [120]. Patient-specific iPSC-CMs thus may not only provide a powerful platform for the identification of novel cardiovascular therapeutic targets and compounds but also help address the current lack of productivity in the pharmaceutical research and development process. By shifting attrition to the preclinical stage of the heart failure drug development process, these cell lines could not only aid in the early identification of lead drug candidates but also accelerate screening for cardiotoxicity and off-target effects of these pharmacological agents [121].

Ethics

The principle of the three Rs (reduction, replacement, and refinement) should be applied throughout the animal experiments. The number of animals should be reduced as much as possible. A wide search of the literature will help avoid duplication of experiments. Power calculations should be used to determine the minimal number of animals needed for the experiments in order for the results to be meaningful. Tissue from the same animal can be used to assess many structural, functional, and metabolic parameters. To reduce variability, inbred strains of animals may be used, although results should always be validated in another strain or species. In vitro cell culture models, including iPSC-derived cells, can be employed to replace the use of animals in certain conditions, as described earlier. Procedures should be refined to minimize animal suffering, including the use of adequate anesthesia, analgesia, and postoperative care. Aseptic environments will reduce the risk of infection. Following procedures, all animals should be observed in an appropriate recovery cage or room, with continuous monitoring. These recommendations will help reduce mortality after procedures. Animals should be examined regularly for development of any adverse signs and symptoms indicating pain, distress, or discomfort. Any animal with evidence of suffering excessive stress as a result of surgery or intervention should be humanely sacrificed by approved methods.
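As an illustration of the power calculation mentioned above, the following minimal sketch estimates the number of animals per group for a two-group comparison of means. It assumes a two-sided test, equal group sizes, and a standardized effect size (Cohen's d) chosen by the investigator; the function name and the example values are hypothetical and not taken from any cited study. The normal approximation used here slightly underestimates the sample size required by a t-test in small groups, so the result should be treated as a lower bound and confirmed with dedicated software or a statistician.

```python
import math
from statistics import NormalDist


def animals_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate animals per group for a two-sided, two-group comparison.

    effect_size is the standardized difference between group means
    (difference divided by the pooled standard deviation).
    Uses the normal approximation n = 2 * ((z_alpha + z_beta) / d) ** 2.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # two-sided type I error
    z_beta = standard_normal.inv_cdf(power)           # 1 - type II error
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)


# Hypothetical example: a large expected effect (d = 1.0) versus a moderate one (d = 0.5).
print(animals_per_group(1.0))  # 16
print(animals_per_group(0.5))  # 63
```

The example shows how sharply the required group size grows as the expected effect shrinks, which is why the effect size and variability assumed in the calculation should be justified from pilot data or the literature.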

Reporting

The usefulness and reproducibility of translational experiments strongly depend on how robustly they are reported [122]. Full recommendations on how to report animal experiments have been provided in the ARRIVE guidelines [123]. Here, we summarize some important aspects that should be taken into account:

  • Objectives. The study report should clearly indicate the initial hypothesis as well as the primary and secondary objectives.

  • Design and methodology. The experimental design of the study and the methodology used should be clearly described. Sample size, experimental groups, randomization, negative controls, experimental procedures, treatments, animal housing, and statistical methods should be elaborated sufficiently to enable confirmation of results. Ethical permissions should also be specified.

  • Construct validity threats. Limitations of the animal model used in the study and possible threats to the translation of results to humans should be clearly acknowledged.

  • Outcome and analysis. Results should include baseline characteristics and measurements for each animal group. All participating animals should be included in the report. If some of the animals are not included in the analysis, reasons for exclusion should be provided. Negative results should also be included and discussed in the final report.

Useful checklists are provided in the ARRIVE guidelines and also for specific types of study designs, as assembled in the EQUATOR initiative [123, 124].
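The randomization item under design and methodology can be made reproducible and reportable by generating the allocation with a recorded seed. The sketch below is one possible approach under simple assumptions (equal group sizes, no stratification by sex, weight, or litter); the animal identifiers, group names, and seed are hypothetical placeholders.

```python
import random


def allocate(animal_ids, groups, seed):
    """Randomly assign animals to equally sized groups.

    The seed is recorded so that the allocation can be regenerated
    and stated in the study report.
    """
    if len(animal_ids) % len(groups) != 0:
        raise ValueError("number of animals must be a multiple of the number of groups")
    rng = random.Random(seed)      # independent generator, reproducible from the seed
    shuffled = list(animal_ids)    # copy so the caller's list is left unchanged
    rng.shuffle(shuffled)
    per_group = len(shuffled) // len(groups)
    return {
        group: sorted(shuffled[i * per_group:(i + 1) * per_group])
        for i, group in enumerate(groups)
    }


# Hypothetical example: 12 animals split across three arms.
animals = [f"A{i:02d}" for i in range(1, 13)]
allocation = allocate(animals, ["sham", "vehicle", "treatment"], seed=2024)
for group, members in allocation.items():
    print(group, members)
```

Reporting the seed and the allocation procedure alongside the group sizes lets readers verify that assignment was not influenced by the investigator.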

Clinical Significance and Conclusions

Despite advances in recent years, HF prognosis remains poor and the need to develop new therapies is urgent. Several interventions with therapeutic potential have been identified over the last two decades in animal models, mainly mice. However, translation of these results to the clinic has been minimal. Inappropriate or incomplete preclinical studies represent a major hurdle for the development of new therapies for HF. Animal models need to reproduce the characteristics of human patients more faithfully. The introduction of co-morbidities and co-treatments should be considered when establishing the model, and confounding variables should be controlled. Translational studies should address toxicity as well as efficacy because many clinical trials fail due to lack of safety rather than lack of efficacy. A concerted, collective effort is needed to improve the usefulness of preclinical studies and boost the translation of results from animal experiments to humans. Researchers should improve experimental design and statistical analysis and use rigorous methodology. Funding agencies should recognize the need for proper financing of translational experiments, which will likely be more expensive than exploratory projects focused on the generation of knowledge. Academic institutions should value the effort and time invested in a truly translational project. Regulatory agencies could demand stronger preclinical studies prior to entering clinical trials (the fact that the FDA expects positive inotropes and anti-arrhythmic agents to be tested for arrhythmic activity in a cardiac disease model is a step in that direction). Similarly, scientific journals and other stakeholders could encourage the scientific community and the pharmaceutical industry to investigate species differences in adaptive and maladaptive responses to heart disease to increase the utility of animal studies in new cardiac drug discovery. These joint actions would help improve the quality of preclinical studies and reduce failure in subsequent clinical trials.