Introduction

Hemifacial spasm is characterized by unilateral involuntary tonic and clonic contractions of a number of facial muscles. Bilateral involvement is rare (Jamjoon et al. 1990). It occurs twice as often in women as in men, with an overall prevalence of 10/100,000, and it usually appears in the fourth to seventh decade of life. In some populations, such as Asian populations, the prevalence is much higher (Auger and Whisnant 1990). The usual cause of hemifacial spasm is a vessel touching the facial nerve near its origin from the brainstem. However, up to 25% of normal controls also have vascular loops compressing the seventh cranial nerve (Tan et al. 1999). The diagnosis of hemifacial spasm is based on clinical observation and medical history, but radiological imaging can be helpful to exclude alternative organic causes (Nagata et al. 1992; Wang and Jankovic 1998; Kenney and Jankovic 2008).

Hemifacial spasm is a chronic disease and spontaneous remission is rare. Quality of life (QoL) can be significantly impaired due to the unusual appearance and the excessive closure of one eye. As a consequence, hemifacial spasm interferes with the patient’s professional and social life, with relevant health and possibly economic implications. The invasive approach of microvascular decompression results in a long-term cure rate of 88–97%, but postsurgical morbidity such as unilateral hearing loss and facial weakness, in addition to the risk of intracranial hemorrhage, remains a concern (Barker et al. 1995; Frei et al. 2006).

The treatment of hemifacial spasm with botulinum toxin (BoNT) was pioneered by Elston in 1985 (Elston 1986). Since then, BoNT has become the symptomatic treatment of choice for hemifacial spasm (Jost and Kohl 2001). The effect of BoNT is due to the blockade of acetylcholine release at the neuromuscular junction, which reduces or prevents excessive muscular contractions and leads to normalization of muscle activity. It is still not clear why BoNT merely diminishes innervation and, to the same degree, muscle force, whereas the pathologic innervation impulses resolve completely. The effects of BoNT, including its undesirable effects, are reversible, and the mean duration of action in hemifacial spasm is 2.6–4 months (Jost and Kohl 2001). In individual patients the effect of BoNT can last up to 1 year.

Evidence from randomized, controlled studies supporting BoNT use in hemifacial spasm is very limited (Jost and Kohl 2001; Costa et al. 2005; Simpson et al. 2008). However, there is a large number of open-label studies and case reports documenting the beneficial effect of BoNT treatment (see appendix of this paper). This may have discouraged efforts to study BoNT in properly controlled clinical studies, especially in placebo-controlled trials.

The lack of evidence from controlled studies raises the question of how to assess the treatment effect of BoNT from open-label trials and how to compare the treatment outcomes of different clinical studies. Although BoNT is widely accepted as the treatment of first choice for hemifacial spasm, further research is required to optimize therapeutic regimens and to compare BoNT with other treatment options like surgical interventions. Furthermore, there is a need to define consistent assessment tools and rating scales which can be used to evaluate the effects of BoNT not only in research projects but also in routine clinical practice, especially to ensure a standardized quantification of long-term treatment effects. Because three preparations of botulinum toxin A exist (and one of botulinum toxin B, although it is not approved for use in hemifacial spasm), there is great interest in the advantages and disadvantages of each medication. In daily practice, a number of patients report different treatment results with different botulinum toxin A preparations, and these differences cannot be predicted. It is our experience from such comparative studies, conducted predominantly in essential blepharospasm, that rating scales are often insufficient to reveal differences in effectiveness, especially in patients with minor complaints.

In contrast to blepharospasm (Wabbels et al. 2011) there is no consensus on main clinical rating scales for hemifacial spasm. Duration of effect and global rather than disease-specific rating scales have been used to assess the outcome in most of the studies with BoNT. The present review describes different approaches to evaluate the efficacy of BoNT in patients with hemifacial spasm and provides a literature search of published clinical studies. The paper concludes with a discussion and suggestions how to best assess the efficacy of hemifacial spasm therapy.

Rating scales for hemifacial spasm and their use in BoNT trials

In many trials with BoNT in patients with hemifacial spasm, duration of improvement or relief is stated as the primary variable for efficacy. However, there is often no clear definition of how duration is evaluated, and the method varies from study to study. Duration of improvement is mostly assessed subjectively as the interval from treatment to patient-reported waning of effect or until the patient requests another treatment (Taylor et al. 1991; Sampaio et al. 1997; Jitpimolmard et al. 1998; Chen et al. 1996; Gupta et al. 2003).

On the other hand, there is also a wide diversity of rating scales used to assess treatment outcome in the relevant clinical trials with BoNT in patients with hemifacial spasm (Table 1). In accordance with a previous review of rating scales to assess blepharospasm (Wabbels et al. 2011), these scales for hemifacial spasm can be classified into three general categories (1) clinical scales, (2) activities of daily living/functional ability status scales, and (3) global rating scales. In this section the different types of rating scales are described considering their advantages and disadvantages. The clinical value of rating scales used in the main clinical studies with BoNT will be discussed.

Table 1 Studies of BoNT treatment of hemifacial spasm (limited to controlled studies of any size and open-label studies with ≥50 patients)

Overview of clinical scales

Clinical scales were developed to facilitate objective classification of clinical symptoms and function by independent assessors. Some studies use videotape recordings before and after treatment for objective scoring or cross-over evaluations by two assessors (Chen et al. 1996; Yoshimura et al. 1992; Van den Bergh et al. 1995). Other clinical scales are based on numerical ratings with related descriptors for symptom frequency, severity and/or functional disability.

Although clinical scales are used in a considerable number of clinical studies with BoNT (Table 1), there is no evidence that a specific rating scale has become established for hemifacial spasm. Moreover, many of these studies enrolled both patient populations with blepharospasm and hemifacial spasm. This might be the reason why rating scales for blepharospasm historically developed in the 1980s and early 1990s were also applied for hemifacial spasm patients.

The Elston functional scale which was originally designed for the rating of blepharospasm was implemented in one of the pivotal studies with Dysport® (Elston 1992, see Table 1). The Fahn scale, a more complex rating scale also developed for blepharospasm (Fahn 1985), was applied as a quantitative measure of the modification of the clinical status in a randomized, single-blind, parallel group comparison of Dysport® versus Botox® in patients with blepharospasm and hemifacial spasm (Sampaio et al. 1997).

Interestingly, none of the randomized controlled studies or those enrolling at least 50 patients summarized in Table 1 used the Jankovic rating scale (Jankovic and Orman 1987), which is probably the most adequate clinical scale to assess the severity and frequency of blepharospasm (Wabbels et al. 2011).

Nevertheless, there is an obvious need for symptom assessment that is as objective as possible, and therefore several clinical scales were implemented by different research groups as shown in Table 2 (Park et al. 1993; Chen et al. 1996; Tan et al. 2004; Tunç et al. 2008). Park et al. (1993) used a rating scale of facial and orbicularis muscle spasm similar to the Jankovic rating scale for blepharospasm severity. Although all four of these scales have an ordinal numbering from 0 = no signs/no spasm to 4 = severe signs/severe spasm, the descriptors for each of the scores are not necessarily the same. This makes the efficacy results of different clinical studies hardly comparable. The advantage of all the above-mentioned five-point rating scales is their simplicity, so that they can easily be implemented in a multicenter trial with several assessors. The main drawbacks are the lack of disease specificity and the fact that neither the patient’s subjective impression of functional and social impairment nor the course of hemifacial spasm over time is taken into account. Furthermore, it has to be considered that a one-point reduction on a five-point scale corresponds to approximately 25% improvement. This renders such numerical rating scales inappropriate to describe subtle changes of disease severity, which may, however, be relevant from the patient’s point of view or for comparison between different BoNTs.
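The coarseness of a five-point scale can be made explicit with a small calculation. The following is only an illustrative sketch; the function name `step_fraction` is hypothetical and not taken from any of the cited studies. It expresses a one-point change as a fraction of the full scale range and, for comparison, does the same for a VAS read in whole percent.

```python
def step_fraction(n_points: int) -> float:
    """Fraction of the full scale range covered by a one-point change."""
    if n_points < 2:
        raise ValueError("a rating scale needs at least two points")
    # A scale with n points (e.g., grades 0..4) spans n - 1 steps.
    return 1 / (n_points - 1)


five_point = step_fraction(5)     # grades 0..4
vas_percent = step_fraction(101)  # VAS read in whole percent, 0..100

print(f"five-point scale: {five_point:.0%} of the range per step")
print(f"0-100% VAS:       {vas_percent:.0%} of the range per step")
```

Note that the 25% refers to the full scale range. Relative to an individual baseline the same one-point step can look very different: a change from grade 2 to grade 1, for instance, halves that patient's score.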

Table 2 Five-point rating scales for hemifacial spasm in BoNT trials

Clinical scales in trials with BoNT

Three randomized, controlled studies using clinical scales as efficacy parameters will be described in more detail in the following section. These studies were assessed in an evidence-based review (Simpson et al. 2008) and classified as Class II (Yoshimura et al. 1992; Sampaio et al. 1997) and Class III (Park et al. 1993) according to the American Academy of Neurology criteria (http://www.aan.com).

Only one study (Yoshimura et al. 1992) fulfilled the criteria to be included in the Cochrane review of BoNTA therapy for hemifacial spasm (Costa et al. 2005). Each of the 11 enrolled patients cycled through the following four treatment arms: an arbitrary dose based on clinical experience of between 2.5 and 10 U of Botox®, half the dose, double the dose, and placebo. Clinical status was quantified using a ten-point scale as follows: frequency (0–3), where 1 = 0–10, 2 = 11–20, and 3 = >20 spasms per minute or sustained spasms for more than 10 s; number of muscles involved (0–4), among the frontalis, orbicularis oculi, muscles active about the angle of the mouth, and platysma; severity (0–3), from 1 = mild, 2 = moderate, and 3 = severe. Differences of 0.5 score points were considered significant and of 2 or more were regarded as substantial. The clinical scale was sensitive enough to clearly discriminate the three BoNTA doses from placebo and there was a tendency for substantial response to increase with higher doses of BoNTA.
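As an illustration of how such a composite ten-point score is assembled from its three components, the following minimal sketch may be helpful. It is not the authors' implementation. All function and variable names are hypothetical, and two points are assumptions because the description above does not specify them: a frequency score of 0 is assigned only when no spasms are observed, and non-integer rates between 10 and 11 spasms per minute are scored as 2.

```python
# The four muscle groups named in the scale description.
MUSCLE_GROUPS = (
    "frontalis",
    "orbicularis oculi",
    "angle of mouth",
    "platysma",
)


def frequency_score(spasms_per_minute: float, sustained_over_10s: bool = False) -> int:
    """Frequency component (0-3)."""
    if spasms_per_minute < 0:
        raise ValueError("spasm rate cannot be negative")
    if sustained_over_10s or spasms_per_minute > 20:
        return 3
    if spasms_per_minute > 10:
        return 2
    if spasms_per_minute > 0:
        return 1
    return 0  # assumption: 0 means no spasms observed


def composite_score(spasms_per_minute, involved_muscles, severity,
                    sustained_over_10s=False) -> int:
    """Sum of frequency (0-3), number of muscles involved (0-4), severity (0-3)."""
    muscles = set(involved_muscles)
    unknown = muscles - set(MUSCLE_GROUPS)
    if unknown:
        raise ValueError(f"unknown muscle group(s): {sorted(unknown)}")
    if severity not in (0, 1, 2, 3):
        raise ValueError("severity must be 0, 1 (mild), 2 (moderate) or 3 (severe)")
    return (frequency_score(spasms_per_minute, sustained_over_10s)
            + len(muscles)
            + severity)


# Example: 15 spasms/min, two muscle groups involved, moderate severity.
print(composite_score(15, ["frontalis", "orbicularis oculi"], 2))
```

Because the three components are simply added, the same total can arise from quite different clinical pictures, which is worth keeping in mind when totals are compared between patients.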

Another study (Park et al. 1993) applied a five-point rating scale of intensity of facial and orbicularis muscle spasm (Table 2) resembling the Jankovic rating scale for blepharospasm severity, although the descriptors are not completely identical. Upon inclusion in the study, 86.1% of 101 patients with hemifacial spasm were rated as grade 3 or 4. After BoNTA injection, 46.0% were rated as grade 0 and 52.4% as grade 1. This means that the majority of patients showed an improvement of at least 2 points on the spasm intensity scale.

Sampaio et al. (1997) recruited 49 patients in a randomized study with parallel group design to compare Dysport® and Botox® (dose ratio 4:1). Although duration of action was chosen as the primary endpoint, the modification of the clinical status was also assessed by the Fahn rating scale for blepharospasm as described above. Unfortunately, the publication only shows the data for the primary variable and not for the clinical status. It remains unclear whether the primary variable chosen would have been sensitive enough to detect a difference between the two BoNTA preparations.

Considering these three studies as well as the others of Table 1, it seems appropriate to rate hemifacial spasm similar to blepharospasm to cover the aspect of impaired function. However, in contrast to blepharospasm there is only one eye affected, and the sole rating of functional disability without taking into account psychosocial factors may not adequately reflect the burden imposed on patient’s well-being.

Overview of activities of daily living/functional ability status scales

Ratings by independent assessors have the advantage of an objective assessment of disease-related function in the evaluation of treatment outcomes, but they take into account neither the patient’s perspective nor the variability of complaints over time. The approach to evaluate health-related quality of life (HRQoL) was developed to fill this gap. The concept of HRQoL has been applied in ophthalmology and neurology (Bremond-Gignac et al. 2002; Meyers et al. 2000); however, studies in patients with hemifacial spasm are rare. One study showed that HRQoL improved in patients with hemifacial spasm treated with BoNT as measured by a visual analog scale (VAS) (Schnider et al. 1999).

Tsai et al. (2005) assessed improvement of functions of daily living in their study enrolling 48 patients with blepharospasm and hemifacial spasm. The score comprises six variables: reading, watching TV, housework, working, driving, and going out alone. The items of the score used in the study of Tsai et al. are similar to those of the Blepharospasm Disability Index (BSDI) (Goertelmeyer et al. 2002; Roggenkämper et al. 2006). Each variable is rated from 0 to 4, and the ratings are summed to give the overall disease severity score. In the study of Tsai et al., the disease severity score was significantly lower 6 weeks after treatment with BoNTA compared to baseline. The score has two main disadvantages: the study of Tsai et al. is the only one using it and does not differentiate between patients with blepharospasm and those with hemifacial spasm, and the disease severity score is not validated. The validation of HRQoL scales requires evidence of a positive correlation with established instruments such as the SF-36 questionnaire (36-item short-form health survey questionnaire) in the respective patient population.
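The arithmetic of such a sum score can be sketched in a few lines. This is an illustration only, with hypothetical names; it assumes that all six items must be rated (how the original study handled missing items is not stated here) and that higher ratings indicate greater impairment, as suggested by the lower scores reported after treatment.

```python
from typing import Mapping

# The six activities of daily living named in the score description.
ITEMS = (
    "reading",
    "watching TV",
    "housework",
    "working",
    "driving",
    "going out alone",
)


def disease_severity_score(ratings: Mapping[str, int]) -> int:
    """Sum of six item ratings (each 0-4), giving a total between 0 and 24."""
    missing = [item for item in ITEMS if item not in ratings]
    if missing:
        # Assumption: incomplete questionnaires are rejected rather than imputed.
        raise ValueError(f"missing item(s): {missing}")
    for item in ITEMS:
        if ratings[item] not in (0, 1, 2, 3, 4):
            raise ValueError(f"rating for {item!r} must be an integer from 0 to 4")
    return sum(ratings[item] for item in ITEMS)


# Hypothetical patient, before and after treatment (made-up values).
baseline = {item: 3 for item in ITEMS}
week_6 = {item: 1 for item in ITEMS}
print(disease_severity_score(baseline), disease_severity_score(week_6))
```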

The SF-36 was developed to assess multiple health-related domains (physical functioning, role physical, bodily pain, general health, vitality, social functioning, role emotional, and mental health) (Ware et al. 1993). The study of Reimer et al. (2005) used the SF-36 amongst other HRQoL questionnaires. It was demonstrated that hemifacial spasm is accompanied by substantial HRQoL impairment, but functional improvement due to treatment with BoNTA does not necessarily translate into HRQoL gains.

In a very recent study, the SF-36 questionnaire was implemented in a randomized, double-blind study to compare two different formulations of BoNTA in blepharospasm and hemifacial spasm (Quagliato et al. 2010). In both treatment groups, patients with blepharospasm showed improvement in the emotional aspects domain 16 weeks after treatment with BoNTA compared to baseline. In contrast, there were no differences in SF-36 scores before and after treatment in patients with hemifacial spasm. Since patients with hemifacial spasm frequently suffer from emotional and related mental problems rather than physical disability, generic HRQoL instruments may not capture the full impact on patient’s HRQoL. This emphasizes the need for a validated, disease-specific questionnaire to assess HRQoL in patients with hemifacial spasm.

Tan et al. (2004) developed a HRQoL instrument specific for hemifacial spasm by adopting the subscale classification of an existing questionnaire for Parkinson’s Disease (PD), the PDQ-39 scale which has been validated in many languages (Peto et al. 1995). The hemifacial spasm questionnaire, HFS-30 consists of seven domains: mobility, activities of daily living, emotional well-being, stigma, social support, cognition, and communication (Table 3). The research group created 30 questions based on their experience and interviews with patients. As some of the questions in PDQ-39 were relevant for hemifacial spasm as well, 14 of these were included in HFS-30. All of the items are scored on five-point scales ranging from 0 (never) to 4 (always). Huang et al. (2009) used the HFS-30 to validate a Chinese version of the hemifacial spasm questionnaire. They added a new domain including five items for bodily discomfort and one item in the stigma domain (HFS-36). From an open, prospective study in 103 patients with hemifacial spasm the authors concluded that HRQoL was significantly improved after treatment with BoNTA assessed by HFS-36 and SF-36, but compared to SF-36, the HFS-36 scale was more sensitive and specific to evaluate HRQoL in hemifacial spasm.

Table 3 HRQoL questionnaire HFS-30 (Tan et al. 2004)

However, Tan et al. recognized some shortcomings of the HFS-30 questionnaire: due to its length its practicability is limited, and neither its discriminant validity between patients and controls nor its correlation with a generic HRQoL scale has been proven. The HFS-7 questionnaire is a short and simple clinical tool to assess HRQoL in hemifacial spasm (Tan et al. 2005). The selection of items was based on the experience with the HFS-30 questionnaire, of which seven items were chosen from the domains mobility, activities of daily living, emotional well-being, and stigma (Table 4). Six of these items were selected based on the results of a previous study (Tan et al. 2004), because they had been shown to be most sensitive to response to BoNT. In a case control study enrolling 85 patients with hemifacial spasm and matching healthy controls, it could be demonstrated that every item in HFS-7 is able to discriminate between patients and controls. Furthermore, the HFS-7 scale closely correlated with the SF-36 summary score, in particular with the emotional and social domains (Tan et al. 2005; Tan and Seah 2007).

Table 4 HRQoL questionnaire HFS-7 (Tan et al. 2005)

Activities of daily living/functional ability status scales in trials with BoNT

Tan et al. (2004) examined the validity and reliability of the self-rating HRQoL questionnaire HFS-30 (Table 3) in 80 patients with hemifacial spasm. Furthermore, the correlation with a neurologist assessment of disease severity and response to BoNTA treatment was investigated 6–8 weeks after injection on a five-point scale (0 = no effect, 1 = mild effect, 2 = moderate effect, <50% improvement, 3 = moderate effect, >50% improvement, 4 = marked effect, almost complete resolution).

There was a significant positive correlation of the HFS-30 score before treatment with the severity of hemifacial spasm, in particular for the mobility, activities of daily living, and stigma subscales. Only social support scores had a poor correlation. This means that the more severe the hemifacial spasm, the greater the patient’s perceived impairment of HRQoL. HFS-30 scores rated by patients 6–8 weeks after treatment correlated with the physician’s assessment of response to BoNTA, i.e., the better patients responded to therapy as judged by the treating physician on the five-point global scale described above, the better was their self-rating on the HRQoL questionnaire. In particular, the subscales of stigma, emotional well-being, and social support demonstrated a significant correlation. The results of Tan et al. (2004) support the observation that BoNT can improve HRQoL in patients with hemifacial spasm. The validity and reliability of the HFS-30 questionnaire should be examined in larger patient populations and in controlled, preferably double-blind studies.

The same research group applied the validated disease-specific HRQoL scale HFS-7 (Table 4) in a prospective study with hemifacial spasm patients (Tan et al. 2008). The aim of this study was to examine whether a better level of knowledge of the disease would lead to an improved HRQoL and treatment response. Only 25% of the patients were considered by the authors to have a high knowledge of their disease. Patients with a good knowledge of hemifacial spasm reported higher, i.e., more severe HFS-7 scores before treatment, but experienced a significantly greater improvement in HFS-7 total score and HFS-7 subscore after BoNT injection. Although not mentioned by the authors, a possible explanation for this result could be that patients who are affected more severely are more interested in obtaining information about their disease.

It can be concluded from these studies that there is a need to focus on activities of daily living when evaluating treatment outcomes in patients with hemifacial spasm. In fact, however, only one group of researchers has worked on the development of disease-specific scales, these scales have not been applied in other clinical studies, and randomized, controlled BoNT trials using the HFS-30 or HFS-7 questionnaire are still lacking. In a retrospective analysis of patients with hemifacial spasm who underwent microvascular decompression, a modified version of the HFS-7 questionnaire was applied, and the authors conclude that patients experienced a significant and prolonged improvement in postoperative HRQoL (Ray et al. 2010). However, there are no studies comparing BoNT treatment with surgical measures.

Overview of global rating scales

Global rating scales are not disease-specific but are general and simple tools to facilitate patient’s rating of treatment effects. A couple of BoNT trials in patients with hemifacial spasm apply a VAS from 0 to 100% to rate improvement after treatment in addition to the duration of the effect (Jitpimolmard et al. 1998; Rieder et al. 2007; Barbosa et al. 2010). Other studies use simple four-point rating scales with short descriptors, e.g., excellent, moderate, mild, no improvement or worse (Poungvarin et al. 1995a, b; Chang et al. 1999; Defazio et al. 1990).

The advantage of global rating scales is that they do not cover single aspects or defined symptoms of a disorder, but provide a general judgement of treatment effects. This means that they reflect the patient’s overall assessment of the disease state and thus cover those aspects which are most important from the patient’s perspective. The difficulty is to conclude from changes in global rating scales to what extent these are clinically relevant for the patient. Unfortunately, in many studies this is not defined prospectively. Depending on the patient’s interpretation of the descriptors, global rating scales are prone to subjective variability. Furthermore, patients might have difficulty remembering the baseline disease status when they have to assess the treatment effect after several weeks.

Global rating scales in trials with BoNT

One of the pivotal studies with Dysport® used patient assessed improvement on a VAS as the primary variable for efficacy (Jitpimolmard et al. 1998). Peak improvement was subjectively determined using a VAS and reported in percentages (0–100%). The treatment was considered unsuccessful if peak improvement was below 20%. The response rate was 97%, and the mean peak improvement ranged from 72.7 to 80.1%. Long-term results showed no significant difference over the series of the first-to-twelfth treatments.

Poungvarin et al. (1995a, b) applied a global four-point rating scale to evaluate response after BoNTA treatment in a double-blind, placebo-controlled, cross-over study (n = 55) and in an open study in a very large patient population with hemifacial spasm (n = 592). Evaluation of efficacy was based on the patient’s self-reported rating 2 weeks after treatment: 1 = excellent (more than 50% improvement), 2 = moderate (25–50% improvement), 3 = mild (less than 25% improvement), 4 = no improvement or worsening. The applied rating scale was sensitive enough to distinguish between BoNTA and placebo treatment in the double-blind study; however, there is no information on how severely the patients were affected before treatment.
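The category boundaries of this four-point scale can be written down as a simple mapping. The sketch below is purely illustrative and uses hypothetical names. It assumes that a numeric percentage of improvement is available, whereas in the studies described the patients reported the category themselves, and it assigns exactly 25% and exactly 50% to the "moderate" category in line with the stated ranges.

```python
def global_rating(improvement_percent: float) -> int:
    """Map a percentage improvement to the four-point global rating (1 = best)."""
    if improvement_percent > 100:
        raise ValueError("improvement cannot exceed 100%")
    if improvement_percent > 50:
        return 1  # excellent: more than 50% improvement
    if improvement_percent >= 25:
        return 2  # moderate: 25-50% improvement
    if improvement_percent > 0:
        return 3  # mild: less than 25% improvement
    return 4      # no improvement or worsening


LABELS = {1: "excellent", 2: "moderate", 3: "mild", 4: "no improvement or worse"}

for pct in (80, 50, 10, 0):
    print(pct, LABELS[global_rating(pct)])
```

The mapping also shows how much information is lost: every improvement between 51% and 100% falls into the same category.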

Another very recent open, prospective, parallel group study compared Botox® (n = 78) and Dysport® (n = 55) in 133 patients with hemifacial spasm over a treatment period of 6 years (Kollewe et al. 2010). The mean ratio of BoNTA dosages in the Botox® and Dysport® groups was 1:2.56. In addition to the duration of effect, a 0–3 scale (global clinical improvement scale, GCI) was used to measure the treatment effect. No significant differences in efficacy could be found between the two BoNTAs; however, it is questionable whether the GCI scale would have been sensitive enough to detect small differences between BoNTAs, as there are only four possible ratings for improvement.

Discussion and conclusions

The aim of this review was to identify the relevant clinical studies with BoNT in hemifacial spasm and to evaluate the rating scales implemented to assess treatment effects. Several studies with BoNT simply use duration of effect or duration of improvement as the primary efficacy variable, but in many cases there is neither a definition of what is meant by “effect” or “improvement” nor a specification of how duration is measured. In consequence, results for duration of BoNT effects from different clinical studies are not comparable. Duration is usually assessed subjectively by the patients, i.e., as the return of a similar degree of spasm, the patient-reported waning of effect, or the interval between the injection and the patient’s request for another treatment. Measurement of duration may therefore be imprecise, as in our experience patients tend to return for the next treatment before the effect has completely ceased.

An important intention of this review is to emphasize that measuring duration of effect is not adequate to assess treatment outcome, and that validated rating scales could offer a more precise option to define the outcome in studies with BoNT. Although BoNT is widely accepted as first-line treatment of hemifacial spasm, it has to be borne in mind that there are no controlled clinical trials with a high level of evidence. Clinical studies with BoNT are still required to optimize therapeutic concepts, to refine dosing as well as application schedules, and to reduce the proportion of non-responders. Furthermore, there are no comparative studies of BoNT with surgical interventions, and new therapeutic concepts will have to prove their value in comparison to BoNT. Standardization of rating scales for hemifacial spasm will provide a sound scientific basis for outcome assessment in future research projects. Moreover, the assessment of HRQoL is becoming increasingly important to evaluate the effectiveness of treatment regimens, and it has to be questioned whether former studies have measured those aspects which are most relevant to patients. Finally, it has to be considered that hemifacial spasm is a chronic disease and long-term treatment effects should be assessed with standardized rating scales.

However, with regard to clinical scales used for hemifacial spasm the picture is very heterogeneous. In contrast to the Jankovic rating scale, which has become the most widely used current clinical scale for blepharospasm (Wabbels et al. 2011), there is no established rating scale for hemifacial spasm. Due to the lack of a disease-specific instrument, some research groups simply applied clinical scales which were historically developed for blepharospasm (e.g., the Elston functional scale or the Fahn rating scale). As listed in Table 2, numerous ordinal ratings anchored by descriptors, mostly five-point scales ranging from 0 (no symptoms) to 4 (most severe symptoms), have been applied in BoNT trials. However, the descriptors vary between different research groups, e.g., the rating 2 can be defined as “mild, noticeable fluttering, no functional impairment” in one study (Park et al. 1993) or as “moderate disability, no functional impairment” in another study (Tan et al. 2004). This makes a standardized interpretation of treatment effects impossible and emphasizes the need to come to a consensus on a clinical scale to rate severity and frequency of hemifacial spasm similar to the Jankovic rating scale for blepharospasm.

However, it has already been discussed in a previous review focusing on blepharospasm that clinical scales in the form of ordinal, numeric ratings have some disadvantages (Wabbels et al. 2011). A significant drawback is the lack of sensitivity due to the limited number of possible ratings. Although clinical scales were able to differentiate between BoNT and placebo in two controlled trials enrolling patients with hemifacial spasm (Yoshimura et al. 1992; Park et al. 1993), it remains at least questionable if these tools would be sensitive enough to detect slight differences between effective treatments. Evidence from controlled studies with different BoNT formulations or comparative trials of BoNT and surgical measures is missing. Furthermore, due to their limitations with regard to sensitivity clinical scales may not be appropriate to demonstrate improvement in patients whose function is only mildly impaired. In the study of Park et al. (1993) 86% of the patients had moderate or severe hemifacial spasm before BoNT injection, and 98% experienced an improvement of at least 2 score points on the five-point rating scale after treatment. In contrast, we frequently encounter patients in our daily practice, who show just mild functional disability but suffer from significant discomfort and psychosocial problems due to disfigurement by this disorder.

Recommendation (1): Despite some drawbacks, the Jankovic rating scale is widely accepted as a standard tool for evaluation of blepharospasm. For the objective rating of spasm severity and frequency it does not make a relevant difference if one or two eyes are affected. Therefore, the Jankovic rating scale is appropriate to assess functional impairment in patients with hemifacial spasm, but the scale is not sufficient to cover all facets of the disorder.

In our literature search, we could not identify a clinical rating scale to describe cheek involvement, and in consequence none of the studies summarized in Table 1 address this feature of hemifacial spasm. For patients with blepharospasm severity of eyelid spasms leading to visual impairment is the most prominent factor of the disorder. According to our clinical experience in patients with hemifacial spasm, eyelid spasms can be often treated more effectively than cheek involvement which remains a source of embarrassment for patients. Therefore, a clinical rating of muscle spasms of the cheek should be developed to assess this symptom being of relevance for patient’s judgement of severity. Otherwise, there could be inexplicable discrepancies between patient’s and physician’s rating of improvement after treatment.

Recommendation (2): We suggest adding a rating for severity of cheek involvement, which is not covered by the Jankovic rating scale. For the sake of consistency with the established five-point scale from 0 to 4 for severity of eyelid spasms, the rating could be as follows: 0 = no spasm, 1 = mild, barely noticeable spasm, only recognized by the patient, 2 = mild, noticeable spasm, 3 = moderate spasm including the corners of the mouth, 4 = severe spasm with involvement of the whole cheek. Frequency of cheek involvement can be assessed comparably to the rating of eyelid spasm according to the Jankovic rating scale.

To take into account patient’s perception of the disease status several studies applied global rating scales such as a VAS (0–100%) for self-rated improvement after treatment. In comparison to the ordinal, numeric scales discussed above the VAS has the advantage that it comprises a broader range of ratings. Nevertheless, it can be anticipated that the assessment of improvement by a patient will considerably depend on current mood and personal expectations with regard to the administered treatment. Furthermore, it might be very difficult for a patient to exactly remember the baseline status of the disorder, when improvement is rated several weeks after treatment. To overcome the latter problem the suggestion of Wabbels et al. (2011) could be followed to use the percentage of normal function scale (Brin et al. 1995) rather than rating “improvement” on the VAS. The percentage of normal function scale captures the reduction of normal (i.e., 100%) function on a VAS from 0 to 100%. This VAS can be provided in a patient diary to be completed at baseline and at several time points after the respective treatment. As a general and simple tool it can be easily implemented in large, multicenter studies.

Recommendation (3): An additional VAS would be a suitable tool to capture the patient’s global rating of disability. To be consistent with the other items, where higher values correspond to more severe disease, we recommend a VAS ranging from 0% (no complaints) to 100% (suffering extremely). However, because global rating scales are prone to patient subjectivity, they are recommended only as a secondary parameter and not as the primary efficacy variable in clinical trials.

The communication on the website of the Hemifacial Spasm Association (HFSA), an international online community, illustrates that thousands of patients with hemifacial spasm suffer from embarrassment, frustration, and depression. To address this issue, other outcome parameters have to be measured in addition to visual disturbance and functional disability. To rate impairment of daily activities in patients with blepharospasm, the BSDI was developed, which has been applied in a number of recent BoNT studies (Wabbels et al. 2011; Roggenkämper et al. 2006). In contrast, for hemifacial spasm essentially only one group of researchers has worked on disease-specific HRQoL scales (Tan et al. 2004, 2005), and these have not become generally established in clinical trials with BoNT.

An ideal HRQoL scale should on the one hand be short and concise and on the other hand demonstrate good validity, reliability, and sensitivity for detecting a change when it has occurred. The HFS-30 is a more complex questionnaire comprising 30 items in 7 subscales (Table 3). It was designed to overcome the obvious lack of a validated, disease-specific scale for evaluating BoNT response in hemifacial spasm. As a second step, the same research group demonstrated the validity of a short and simple HRQoL instrument consisting of only seven questions (HFS-7, Table 4). The HFS-7 scale correlated closely with the emotional and social domains of the SF-36 questionnaire (Tan et al. 2005; Tan and Seah 2007). Many vision-related activities such as reading, watching television, and driving had a significant impact on HRQoL as perceived by patients with hemifacial spasm, and it could be demonstrated that BoNT treatment improved these symptoms.

Although there was a significant positive correlation of the HFS-30 and HFS-7 scores with the severity of hemifacial spasm, it remains to be investigated whether these questionnaires really capture those aspects which are most important to patients. A possible limitation of HRQoL scales arises from the fact that the emotional and psychological state of patients may vary from day to day. Furthermore, research on HRQoL has demonstrated that self-reported health status differs across gender and other sociodemographic and socioeconomic characteristics, such as race, marital status, education, and income (Cherepanov et al. 2010).

Up to now, all studies with HFS-30 and HFS-7 have had an open, prospective design. Consequently, the question remains unanswered whether these tools would be suitable to differentiate between several effective treatments. To address this issue, disease-specific HRQoL instruments have to be implemented in controlled, double-blind clinical trials with a sufficient number of patients. It would be especially interesting to select patients who present with only mildly impaired function in order to investigate whether HRQoL scales are appropriate to cover those aspects of hemifacial spasm related to psychosocial well-being which cannot be adequately assessed by clinical scales.

Recommendation (4): The BSDI, comprising six daily activities (Goertelmeyer et al. 2002; Roggenkämper et al. 2006), is also applicable to patients with hemifacial spasm, but we suggest the following modifications. To focus on those aspects which are most relevant to patients, two items out of six should be selected by the patient. Assessment of these items should be performed on a VAS instead of an ordinal, numeric scale from 1 to 4, because the VAS offers a broader range of possible ratings.

Recommendation (5): In addition, questions 4–7 of the HFS-7 questionnaire (Table 4) could be used to quantify the psychosocial burden of hemifacial spasm. Questions 1–3 of HFS-7 are already covered by the items which can be selected from the BSDI.

In our literature search, we identified a large number of clinical studies with BoNT in hemifacial spasm (Table 1); however, the evidence from controlled trials is very limited. Many different evaluation criteria have been applied, which makes a comparison of treatment outcome across studies nearly impossible. BoNT has become the established pharmacotherapy of choice for hemifacial spasm, but the challenge remains to reach a consensus on how to rate treatment effects. An ideal instrument to assess the efficacy of BoNT should on the one hand be simple to apply and on the other hand cover functional aspects of the disease as well as patient-rated impairment and improvement after therapy.

As clinical scales cannot cover all aspects of the disorder, further research is needed to refine disease-specific HRQoL questionnaires so that they address those items known to be relevant for patients. The HFS-7 scale developed by Tan et al. seems to be an appropriate questionnaire, but it has only been validated in an Asian patient population. The question remains whether HFS-7 really covers those items of hemifacial spasm which are the most important for patients with a different cultural background. More clinical studies have to be conducted to validate a brief and simple HRQoL instrument in a broader range of patient populations.

A single rating scale will not be adequate to cover all complex aspects of the disorder. Consequently, a combination of several scales should be employed for the assessment of hemifacial spasm. We recommend combining the following scales: (1) the Jankovic rating scale for severity and frequency of eyelid spasm, (2) an additional rating scale for severity and frequency of cheek involvement, (3) a VAS for the patient’s global rating, (4) the BSDI for rating of disability, but only a reduced selection of items chosen by the patient, and (5) questions 4–7 of the HFS-7 to quantify the psychosocial burden of hemifacial spasm. Table 5 shows an assembly of this proposal. These suggestions for the development of a standardized rating for hemifacial spasm represent only a first approach, which needs to be worked out further with a panel of experts. As a further step, the recommended combination of rating scales has to be validated and implemented in clinical studies.
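As an illustration of how the five recommended components might be recorded together at one study visit, the following Python sketch defines a simple record with range checks. It is a minimal sketch under stated assumptions, not a validated instrument: the class name `HemifacialSpasmAssessment` and all field names are hypothetical, the 0 to 4 range for the frequency ratings and the 0 to 100 range for the two patient-selected BSDI items are assumptions, and the HFS-7 answers are stored as raw values because their scoring is not specified here. No total score is computed, since no weighting of the components has been proposed.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


def _check_range(name: str, value: float, low: float, high: float) -> None:
    """Raise ValueError if value lies outside the closed interval [low, high]."""
    if not low <= value <= high:
        raise ValueError(f"{name} must be between {low} and {high}, got {value}")


@dataclass(frozen=True)
class HemifacialSpasmAssessment:
    """One visit's ratings for the five recommended components (hypothetical layout)."""
    eyelid_severity: int             # (1) eyelid spasm severity, 0-4
    eyelid_frequency: int            # (1) eyelid spasm frequency, assumed 0-4
    cheek_severity: int              # (2) proposed cheek severity, 0-4
    cheek_frequency: int             # (2) cheek frequency, assumed 0-4
    global_vas_percent: float        # (3) 0 % = no complaints, 100 % = suffering extremely
    bsdi_selected_items: Dict[str, float]            # (4) two patient-chosen items, VAS assumed 0-100
    hfs7_questions_4_to_7: Tuple[int, int, int, int]  # (5) raw answers, scoring not assumed

    def __post_init__(self) -> None:
        for name in ("eyelid_severity", "eyelid_frequency",
                     "cheek_severity", "cheek_frequency"):
            _check_range(name, getattr(self, name), 0, 4)
        _check_range("global_vas_percent", self.global_vas_percent, 0, 100)
        if len(self.bsdi_selected_items) != 2:
            raise ValueError("exactly two BSDI items must be selected by the patient")
        for item, rating in self.bsdi_selected_items.items():
            _check_range(f"BSDI item {item!r}", rating, 0, 100)
        if len(self.hfs7_questions_4_to_7) != 4:
            raise ValueError("answers to HFS-7 questions 4-7 are required")


# Example visit; the item names and all values are illustrative placeholders.
visit = HemifacialSpasmAssessment(
    eyelid_severity=2, eyelid_frequency=3,
    cheek_severity=3, cheek_frequency=2,
    global_vas_percent=60.0,
    bsdi_selected_items={"reading": 55.0, "driving": 70.0},
    hfs7_questions_4_to_7=(1, 2, 0, 3),
)
```

Keeping the components as separate fields reflects the point made above that a single rating cannot cover all aspects of the disorder.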

Table 5 Proposed comprehensive scale for the estimation of treatment results of Hemifacial Spasm