1 Introduction

Clinical trial design for the evaluation of stem cell therapy has been considered in a series of workshops involving academia, industry and regulators, the Stem Cells as an Emerging Paradigm in Stroke (STEPS) meetings, which aimed to produce recommendations for the translation of this potential treatment modality [1,2,3]. The content and concerns of the four STEPS meetings (three reported and published, the report of a fourth being prepared at the time of writing) reflect the evolution of the field from a focus on animal models through early phase and now later phase clinical research, and the developing experimental data in the field. Issues relating to clinical trial design and progress in the cell therapy field have additionally exercised many investigators [4,5,6,7,8]. In particular, thinking has evolved from the initial concept of cell or tissue replacement as the dominant mediator of biological effect, to an appreciation that much (or indeed all) of the therapeutic benefit from cell therapy might be mediated by indirect effects [6], and most recently to the suggestion that only certain components of cells might be necessary for a therapeutic action [9].

2 Acute Cell Therapy

It is recognised that cells delivered within the first hours after induction of cerebral ischaemia in animal models limit the extent of brain injury, and are associated with improved functional outcome [10,11,12]. Cells of various types and delivered by various routes have been reported to be effective, but some concerns regarding methodological quality and reporting bias have been identified [10]. More specific focus on studies of intravenous delivery of bone marrow-derived cells in rodent middle cerebral artery occlusion (MCAO) has yielded a more consistent evidence base [12]. Typical time to administration has been prior to induction of ischaemia or within a few hours of ischaemia.

Biodistribution studies in mouse hypoxia and focal ischaemia indicated limited distribution of mouse neural stem cells to the brain [13]: intravenously delivered cells were cleared almost entirely within 1 week, while intra-arterial delivery produced higher concentrations in brain, but almost complete clearance within 2 weeks. In rat MCAO, intravenous bone marrow stromal cells were distributed predominantly to the lungs, with transient distribution to the brain, and negligible cell persistence even by 8 days [14]. In another rat MCAO study, no preferential brain uptake in MCAO rats compared to controls was seen, and again distribution was predominantly to lung [15]. In human studies, intravenously delivered bone marrow-derived mononuclear cells (BMMCs) were distributed predominantly to lungs and spleen by 2 and 24 h after injection [16]. It therefore seems unlikely that there will be significant distribution of cells to the CNS after intravascular administration, and any engraftment is likely to be very limited. With respect to outcomes, differences between intravenous and intra-arterial routes have not been observed consistently in animal studies [17]. Therapeutic effects have nonetheless been seen with most cell types in animal models of stroke; it is therefore assumed that “bystander” effects reliant on release of trophic factors and immune modulation underpin the action of stem cells delivered in this manner, and there is experimental evidence of multiple systemic effects of potential relevance. Systemic engraftment of some cell types in the lung may allow more sustained action.

Within the first hours after ischaemic stroke, intravascular delivery of a therapeutic agent is the most feasible option: intravenous delivery is straightforward, and intra-arterial delivery is increasingly an option with the wider use of endovascular thrombectomy. Intravenous delivery is clearly more practical, since it is almost universally available and is possible at any stage after stroke.

Intra-arterial delivery requires appropriately skilled personnel and facilities, and while thrombectomy is becoming more widely available, endovascular treatment is indicated and possible in only a minority of patients, and speed of intervention is critical. It is logistically challenging to add intra-arterial cell delivery to an emergency procedure such as thrombectomy, and it would be entirely impractical to infuse autologous cells due to the requirements for cell harvest and preparation. In addition, animal studies of intra-arterial cell delivery have been complicated by cell clumping and downstream arterial occlusion [18] that produced significant complication rates potentially outweighing any advantage of improved CNS retention of cells, and requiring modification of injection rates to overcome [19, 20]. The practicality of intra-arterial delivery in the later subacute period is also questionable—the logistics and safety of intra-arterial procedures potentially requiring anaesthesia outside the acute time window for thrombectomy are challenging, and targeted delivery may be impossible due to persistent occlusion of the relevant artery. Very slow recruitment was evident in one pilot clinical study of subacute intra-arterial cell delivery, with only 6% of screened patients proving to be suitable and many exclusions due to medical instability and major arterial occlusion precluding vascular access [21]. Whether such an approach is advantageous is unclear: while the evidence above suggests that this may allow some degree of cell delivery to the brain, it is uncertain whether this leads to long term engraftment, or whether this is necessary for a therapeutic effect [22].

Several small clinical studies using intravenously delivered cells have reported findings, but with broad patient inclusion criteria and a wide time window [23,24,25,26,27]. The acute use of autologous cells is extremely challenging, since invasive harvest procedures are likely to be more hazardous in the face of recent thrombolytic and antithrombotic drug treatments, and neurological impairments will limit cooperation by many patients with significant stroke deficits. Two moderately large controlled clinical trials have reported early subacute use, one using autologous bone marrow cells and the other allogeneic cells [28, 29].

The phase 2 Indian multi-centre trial of Prasad and colleagues [28] had a time window of 7–30 days, treating patients with autologous bone marrow mononuclear cells infused intravenously. Randomisation to cell infusion or control was undertaken, and assessors for the outcome measures were blinded to treatment allocation. The control arm patients underwent no invasive procedures—neither cell harvest from bone marrow, nor intravenous infusion.

The MASTERS trial [29] of the Athersys “MultiStem” allogeneic bone marrow-derived multipotent adult progenitor cells aimed to treat patients 24–36 h after stroke onset, but requirements for cell processing in specialist facilities caused the trial to expand the time window to 48 h for practical reasons. Eligible patients were randomised to either cell infusion (400 or 1200 million cells) or placebo. The overall results showed no difference in outcomes compared with control, although sub-group analysis of those treated within 36 h of onset was interpreted as supporting this earlier time window.

No safety issues were identified in these studies, and similar paradigms of intravenous infusion of cells have been deployed in other therapeutic areas including myocardial ischaemia and multiple sclerosis.

As outcome measures, these trials used well-characterised clinical scales that are familiar from other therapeutic modalities in the acute stroke setting: general measures of neurological function (the National Institutes of Health Stroke Scale [NIHSS]) [30, 31], of activities of daily living (the Barthel Index [BI]) [32] and of disability (the modified Rankin Scale [mRS]) [33, 34]. These scales are advantageous in being well understood, widely used in routine practice as well as trial settings, acceptable to regulators, and being applicable to all potential patients irrespective of stroke mechanism, location or size. For an acutely delivered therapy, especially within the first 36 h, detailed patient selection is not practical. An inclusive approach to trial design also improves generalisability of results and would allow widespread adoption of treatments if efficacy is established. The properties of the scales are well understood, and specific training is available for NIHSS and mRS to minimise inter-observer variability. Standardised structured interviews and centralised video interpretation can be used to further reduce mRS variability, with the additional possibility of ensuring blinded independent outcome scoring.

Dosing can be reasonably controlled for allogeneic cell therapies, but is highly variable in autologous cell delivery [25, 27, 28]. The population of cells present in bone marrow mononuclear cell aspirates is also mixed, with a range of cell types including haematopoietic progenitor cells and mesenchymal stem cells, and unless additional ex-vivo culture expansion is undertaken (with the additional time incurred in this step delaying any potential therapeutic use), it is extremely difficult to characterise the cells that are actually delivered.

3 Chronic Stroke Cell Therapy

Early investigation of teratocarcinoma-derived neural cells [35,36,37] or porcine xenografts [37] established the feasibility and basic methodological approach for delivery of cells by direct cerebral implantation. Subsequent trials have adapted these methods for cell delivery [38]. Chronic stroke (arbitrarily proposed to be 6 months or more after the ictus) represents a huge therapeutic need, with several million people worldwide living with long-term neurological disability as a consequence of stroke and incurring high costs of social and medical care [39]. Current medical interventions are limited to secondary prevention of further events, prevention or treatment of physical complications such as spasticity or mechanical joint disruption, and behavioural adaptations to deficits. The challenge is uncertainty about whether there is useful plasticity in the injured brain at late stages after stroke [40], with very limited animal model data of uncertain relevance to this time point in human stroke.

Chronic, stable patients allow trials to adopt more careful selection, planning of procedures, and targeting of cell delivery. Intracerebral implantation of cells ensures that cells can be delivered in a defined dose to a specific location. While the mechanism of cell action in chronic stroke is not fully characterised, it more plausibly includes engraftment and integration than does intravascular administration, in addition to local anti-inflammatory, immunomodulatory and trophic effects, including stimulation of endogenous neurogenesis and angiogenesis [41,42,43]. What proportion of cells survive and engraft long-term is unknown, as is the differentiation fate of these cells.

Two trials using human stem cells have reported recently: the Preliminary Investigation of Stem Cell Effects in Stroke (PISCES-1) trial and the SanBio phase 1 trial. Each included small numbers of patients with no control group, in order to establish safety and tolerability of different doses.

PISCES-1 [44] used ReNeuron’s human foetal cortical neural stem cell line genetically modified with the c-mycER-TAM transgene to allow large-scale cell production for allogeneic cell therapy, CTX0E03 [45,46,47]. The study followed a similar paradigm to that studied in rodent MCAO models of “chronic” stroke, with intraputaminal implantation of doses of up to 20 million cells. In PISCES-1, 11 patients received doses of between 2 and 20 million cells, and no cell-related safety issues were identified over a 2 year follow-up period.

The SanBio trial [48] included 18 subjects administered up to ten million cells placed stereotactically in small deposits around the margin of the infarct, using genetically modified human bone marrow-derived mesenchymal stem cells. Both studies included patients 6–60 months after ischaemic stroke, with a median time to treatment of around 2.5 years. Some improvements in neurological and functional scales over the first 3 months after implantation were described, with static function thereafter. The time scale for neurological change was unexpected and not clearly consistent with cell differentiation and replacement as the major mode of action.

For phase 2 studies, trials of these agents have opted to target patients with motor deficits specifically, and employed more specific neurological scales focusing on motor function as their primary end-points. The PISCES-2 study completed recruitment in summer 2016 and reported preliminary favourable early functional improvements sufficient to justify further clinical trial development (http://4965zs3ha2l125fk78zkozo3.wpengine.netdna-cdn.com/wp-content/uploads/ReNeuron-PISCES-II-data.pdf, accessed 7 July 2017). PISCES-2 selected a population of subacute stroke patients 3–12 months after onset, with major upper limb dysfunction, and its primary endpoint was recovery of useful upper limb function defined on the Action Research Arm Test (ARAT) [49].

4 Trial Design

Rigorous standards of trial design and interpretation are key to the credibility and ultimate adoption into practice of clinical trial results.

4.1 Controls

Clinical investigations of stem cells have hitherto justified small studies without concurrent controls or blinding on grounds of practicality and patient acceptability, but controlled trials will be necessary to advance clinical practice. Randomisation, placebo controls, and blinding are critical to the integrity of trials, but the invasive nature of cell harvest for autologous cell preparation and of several delivery approaches means that some compromise may be required. As noted above in relation to the experience of acute intravenous cell delivery trials, optimal design is feasible for an allogeneic cell approach such as that of the Athersys multipotent adult progenitor cell studies, but blinding becomes problematic when autologous cell harvest is required, as in the trial of Prasad and colleagues. For more invasive intracerebral approaches, ethical issues become significant. The acceptability of placebo surgical procedures is debated. While investigators and regulators encourage this approach in recognition of the importance of placebo effects [50, 51], patient groups have questioned it [52]. Opinions derived from patient experience in other diseases may not be applicable in stroke: in a neurodegenerative process such as Parkinson’s Disease, deferred cell implantation has been considered to be an acceptable offer for those trial participants randomised to the control arm (assuming that test implantation is established to be effective). In stroke, however, the injury is acute and the mechanisms by which cell therapy might be effective are heavily weighted towards the early subacute period, so deferred treatment cannot be reasonably assumed to have the same potential value as early treatment.

Placebo surgery is deemed to be acceptable and is preferred by some regulators. What constitutes an appropriate balance between placebo procedures and reasonable risk is a grey area. The phase 2 SanBio trial uses a partial thickness Burr hole under local anaesthesia, without dural incision or any intraparenchymal injection, allowing a similar procedural duration and blinding the participant and trial team (outside the operating theatre environment). Other trials are likely to adopt the same approach. This has potential advantages, but two alternatives merit consideration. First, the potential harmful effect of surgery cannot be assessed by this approach. Surgery may have negative consequences, including local infection, pain or bleeding, adverse effects of sedative medication for the procedure, and of temporary cessation of preventative antithrombotic medication, as well as many more if procedures are undertaken under general anaesthesia (as was the case in the PISCES trials, for example). An alternative design would be randomisation to surgical implantation or to a non-surgical control group, allowing evaluation of the net effect (both benefit and potential harm) of the procedure, but with the disadvantage of lacking control for a placebo effect.

A second alternative would be a more invasive approach, where the control group undergoes identical intraparenchymal injection of vehicle solution. This would offer additional blinding (except possibly for the surgeon), would also control for potential non-specific (positive) effects of injecting a volume of fluid to introduce a focal lesion in the brain, and allows investigators to distinguish specifically cell-related adverse effects from those of the procedure: for example, do the T2 hyperintensities seen around needle tracts in both the SanBio and PISCES trials represent a specific tissue reaction to cell implantation, or a non-specific reaction to vehicle, or some other aspect of the process? This more invasive approach has scientific merit but is likely to be deemed to expose the control group to unacceptable risk.

It is important to frame discussions with patients in terms that recognise the experimental nature of stem cell administration, since there is a widespread assumption that benefit is expected, or indeed inevitable. Uncritical reporting of early phase clinical trial findings (for example http://www.dailymail.co.uk/health/article-3622589/Major-breakthrough-doctors-REVERSE-symptoms-stroke-Patients-walk-talk-live-normal-life-stem-cell-treatment-3-YEARS-later.html, accessed 7 July 2017) and the widespread unregulated online advertising of supposed “stem cell therapy” clinics contribute to this environment. The potential for harm (including very limited long-term safety data for most cell types) is one component of a complex discussion [53].

4.2 Sample Size Estimation and Endpoints

In the acute setting, clinical trial design is informed by a large body of experience that will allow realistic estimates of credible effect sizes using the same general outcome scales as have been deployed in other trials. The mRS benefits from development over many years to minimise subjective inter-observer variability in scoring, including rater training, independent video assessment [54] and structured interviews [55,56,57,58]. Large databases such as the Virtual International Stroke Trials Archive (VISTA) can be interrogated to model the impact of inclusion and exclusion criteria on expected outcomes in the control population [59]. Statistical methods have been developed to maximise study power (or reduce sample size), particularly through analysis of the entire distribution of the mRS rather than arbitrary dichotomous outcomes [60, 61]. The use of a broad disability scale as a primary endpoint has the advantage of applicability to all types of stroke deficit: inclusion and exclusion criteria can therefore be less restrictive, with advantages for recruitment rates and generalisability.
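The gain from analysing the whole mRS distribution, rather than a dichotomy, can be illustrated with a minimal sketch. The two distributions below are invented purely for illustration and are not drawn from any trial: the hypothetical treatment moves 5% of patients from mRS 5 to mRS 3, leaving the proportion with mRS 0–2 unchanged. The function names are likewise assumptions. The ordinal summary used here is a Mann-Whitney type measure: the probability that a randomly chosen treated patient has a better (lower) mRS than a randomly chosen control, counting ties as one half.

```python
# Hypothetical mRS distributions (proportions for scores 0..6); illustrative only.
CONTROL = [0.10, 0.15, 0.15, 0.20, 0.20, 0.10, 0.10]
TREATED = [0.10, 0.15, 0.15, 0.25, 0.20, 0.05, 0.10]


def dichotomised_difference(control, treated, cutoff=2):
    """Difference in the proportion achieving mRS <= cutoff (treated minus control)."""
    return sum(treated[: cutoff + 1]) - sum(control[: cutoff + 1])


def prob_treated_better(control, treated):
    """P(treated score < control score) + 0.5 * P(scores equal).

    A value of 0.5 means no shift; values above 0.5 favour treatment.
    """
    total = 0.0
    for t_score, p_t in enumerate(treated):
        for c_score, p_c in enumerate(control):
            if t_score < c_score:
                total += p_t * p_c
            elif t_score == c_score:
                total += 0.5 * p_t * p_c
    return total


print("Difference in mRS 0-2:", dichotomised_difference(CONTROL, TREATED))
print("P(treated better):", prob_treated_better(CONTROL, TREATED))
```

In this invented example the dichotomised analysis registers no difference at all, whereas the ordinal measure departs from 0.5, which is the sense in which a dichotomy discards information about shifts elsewhere in the scale.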

Sample size is likely to be moderate or large for a credible range of treatment effects, particularly with subacute interventions. While there is an analogy with acute trials in terms of design, a less appealing analogy is the failure of all clinical trials in acute stroke other than those involving reperfusion, a source of much soul-searching in the neuroprotectant field, among other notable translational failures [62, 63]. Many of the deficiencies that were postulated to underpin the failure of neuroprotectant drug trials might also apply to cell therapies: overestimates of effect size leading to trials that were too small, insufficient phenotypic detail to select relevant patients with biological targets, and a tendency to seek “responder populations” by tortured analysis of small phase 2 trials and their subgroups with consequent restrictive and (with hindsight) misdirected patient selection criteria in repeat phase 2b/3 trials. Nonetheless, there is now a track record of success for these general approaches in acute stroke through both thrombolytic drugs and thrombectomy, and the design features are recognised by clinicians and regulators.

Trials in chronic stroke face potentially greater challenges as the methodology is less standardised, largely a reflection of the absence of positive clinical trials in the rehabilitation and regeneration field [64]. Trials have elected to target what is effectively a human model system, motor deficits of upper, or both upper and lower, limb. Similar approaches have been taken in trials of rehabilitation strategies such as constraint-induced movement therapy [65]. While there is logic in the approach, since motor deficits are common after stroke and scales for motor function assessment are available, potential difficulties are illustrated by previous motor rehabilitation studies, which have been characterised by slow recruitment through the need to target patient populations with very specific deficits, and high drop-out rates. Despite enthusiasm for specific motor function scales such as the Fugl-Meyer scale among specialists in rehabilitation [66, 67], the scale is less widely understood among stroke physicians and its acceptability to regulators is unclear. Familiarity with more specialised scales such as the ARAT is lower still, and specific equipment and training are required to perform the assessment. There is also uncertainty over the extent of change on ARAT that is meaningful to patients [68].

Motor function change represents only one component of a complex multi-dimensional deficit. Whilst emphasis has been placed on inclusion of adjunctive physical therapy to minimise between-site variation in multicentre trials, defining the minimum necessary duration and intensity of therapy input has proved to be challenging [69], and the content of therapy programmes may vary widely. Physical therapy represents only one of several therapy inputs that a patient is likely to receive, and even description of rehabilitative inputs has proved challenging, let alone quantification.

Combination of several different outcome scales has been proposed in order to better capture the multi-dimensional nature of stroke recovery [70], and has been an effective strategy in some acute trials such as the National Institute of Neurological Disorders and Stroke (NINDS) trial of thrombolysis [71]. This statistical approach may strengthen findings when there is a common direction of effect, but may weaken study power if effects diverge.

Sample size ultimately depends on the variance of the outcome measure and the magnitude of the effect of the intervention. An uncomfortable reality for the field is that the costs, invasive nature of the interventions, and complexity of cell supply, are all likely to place practical constraints on trial size in cell therapies. The magnitude of the effect is difficult to estimate, and cannot be extrapolated reliably from animal studies where cell delivery has typically been much earlier than has been attempted (or is likely achievable) in the clinic. The schematic representation of different processes contributing to brain injury after stroke as a series of waves of differing size, latency and duration [72] is a useful conceptual framework, which recognises the dominant effect of very early cell necrosis due to severe ischaemia. Imaging studies support the very short time window during which the greatest part of an ischaemic lesion becomes damaged irreversibly. Thereafter there may be smaller contributions to the final infarct from late processes such as inflammation and apoptosis, but these appear to be minor contributors to the physical extent of an infarct. Clinically it has been difficult to demonstrate either delayed infarct growth (other than oedema), or any measurable neurological consequence of this, although there may be a relationship [73]. Interventions delivered at 36 h after stroke onset and likely to have a predominantly anti-inflammatory action may thus be anticipated to have a small treatment effect and to require very large sample sizes for convincing demonstration of any therapeutic action. The failure of other anti-inflammatory treatment strategies [74,75,76,77,78] may reflect ineffective drugs, late delivery, or trials that were too small, but equally it may signify that this mechanism has insufficient impact on tissue fate at late stage to be detected with anything other than an extremely large trial.
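The dependence of sample size on outcome variance and effect magnitude can be made concrete with the standard normal-approximation formula for comparing two means, n per arm = 2(z(1−α/2) + z(1−β))²σ²/δ². This is a textbook approximation offered as a sketch, not the method used by any trial discussed here; the function name and the example values are assumptions chosen for illustration.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(effect, sd, alpha=0.05, power=0.80):
    """Approximate patients per arm for a two-arm comparison of means.

    effect: assumed true difference in means (delta), same units as sd
    sd:     assumed standard deviation of the outcome in each arm
    alpha:  two-sided type I error rate
    power:  desired probability of detecting the assumed effect
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_beta = standard_normal.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) * sd / effect) ** 2)


# Illustrative (hypothetical) effect sizes expressed in SD units.
for delta in (0.5, 0.25):
    print(f"effect = {delta} SD -> {n_per_arm(delta, 1.0)} per arm")
```

Because n scales with 1/δ², halving the assumed effect roughly quadruples the trial: under these assumptions an effect of 0.5 SD requires 63 patients per arm, and an effect of 0.25 SD requires 252.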

4.3 Patient Selection

Sample size is greatly inflated if trials include patients with no relevant biological target for a therapeutic intervention [79]. Recent acute trials have highlighted that modest sample sizes can produce highly significant and persuasive results when the relevant target population is selected, for example with selection of MCA occlusion and appropriate volumes of viable tissue on perfusion imaging, for both intravenous thrombolysis and for thrombectomy [80,81,82]. The relevant clinical phenotype has not been clearly defined for acute systemic cell therapy, but demonstration of a worthwhile clinical effect is likely to follow similar principles to other acute stroke trials. In chronic stroke trials based on motor recovery paradigms, variable outcomes are well recognised and are not reliably predictable from clinical scales alone [83], especially for the more severe deficits that are likely to be over-represented in any clinical trial of an invasive therapy. Selection of patients with potential motor response might be possible using combinations of clinical scales, brain imaging (for example to define the integrity of the corticospinal tract) and excitability of the motor system using transcranial magnetic stimulation (TMS) or similar [84,85,86,87]. Refining practical methods for patient selection that could be applied consistently across multiple clinical centres in a trial is a challenge: very complex and time-consuming imaging analysis may be difficult and may not be feasible for all scanners [88], and methods such as TMS are poorly standardised and not widely available. The adverse consequences of insufficiently informed patient selection are, however, clear, and can be documented both by head-counts of neutral and failed trials, and by the huge accompanying cost to the academic and pharmaceutical industry communities [89]. In addition, insufficient levels of phenotypic detail will impede any attempt to better define a potential “responder” population for future studies. Inclusion of imaging at least sufficient to offer an opportunity to enhance mechanistic understanding would be advantageous both for individual trials and for the field as a whole. In an analogous setting where clinical scales offered limited insight, multiple sclerosis, the identification of an imaging biomarker was the key that unlocked the door to disease modifying therapy development, by enabling clinical trials of reasonable size to be undertaken. Stem cell therapies would benefit from the same approach.
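The inflation of sample size caused by enrolling patients without a relevant biological target can be sketched with a simple dilution argument. The assumptions here are ours rather than the source's: a fraction f of enrolled patients possess the target and gain a mean benefit δ, the remainder gain nothing, the outcome variance σ² is roughly unchanged by the mixture, and the usual normal approximation for a two-arm comparison of means applies.

```latex
\bar{\delta} = f\,\delta,
\qquad
n_{\text{all}} \;\approx\; \frac{2\,\left(z_{1-\alpha/2} + z_{1-\beta}\right)^{2}\sigma^{2}}{(f\,\delta)^{2}}
\;=\; \frac{n_{\text{target}}}{f^{2}}
```

Under these assumptions, if only half of those enrolled have the target (f = 0.5), the required sample size per arm is about four times that of a trial restricted to the target population; at f = 0.25 it is about sixteen times.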

4.4 Time Windows

The nature of the intervention places constraints on cell supply for trials, as noted above, and this introduces additional limits on time windows. Experimental models of stroke indicate a complex and dynamic environment after ischaemia, and biological targets change both over time and anatomically. Underlying disease states may modify cell therapy effects and are rarely investigated in model systems. Patients undergo complex rehabilitation interventions, receive a variety of secondary preventative medications, and are prone to complications that reflect physical disabilities as well as systemic effects of stroke such as immunodepression, all of which might influence recovery patterns. To deviate far from the experimental evidence underpinning a particular cell therapy adds another confounding factor, and it would appear logical to limit at least this element, which is under the control of the trial investigators. Acute intervention in animal models has been overwhelmingly delivered in early acute stages, and “chronic” intervention at most a few weeks after stroke in rodents. Thrombolytic therapy for stroke could not show benefit when delivered an average of 4.5 h after onset [90]. It was only when the NINDS trial adhered rigidly to the narrower time window of 3 h based on animal model data (and insisted on even earlier treatment in 50% of participants by design) that benefit was eventually evident [91]. We ignore the lessons of animal model time windows at our peril.

5 Summary and Conclusions

Two distinct paradigms for cell therapy in stroke are under investigation.

Acute systemic administration of cells follows a pathway reasonably well informed by other acute trials, but has to consider what might be a credible effect size and an appropriate time window for delivery. Allogeneic cells appear to have an advantage over autologous cells in this setting. Only limited patient selection is feasible, but generalisability of results is straightforward.

Chronic stroke allows for more targeted intracerebral administration of cells, but the invasive nature of the procedure places constraints on trial design and size, and the human model of motor system recovery is less standardised in the absence of successful clinical trial interventions. Patient selection, employing complex imaging and other advanced methods, and development of biomarkers will likely be critical to success.