Introduction

Stereotactic radiosurgery (SRS) and endovascular techniques are commonly used for treating brain arteriovenous malformations (bAVMs). They are usually used as adjuncts to microsurgery but may also be used as primary treatment options (either alone or in combination). Judicious patient selection requires a clear prediction of the efficacy and complication profile for each of these techniques applied to the individual patient. The Spetzler-Martin and supplemented Spetzler-Martin grading systems are well-established and commonly used for outcome prediction following microsurgical intervention [34, 64]. However, several studies have shown they may not be accurate in predicting outcome following SRS and/or endovascular therapy [40, 45,46,47, 50, 53, 58], stemming from the fact that these classification schemes do not incorporate relevant parameters specific to SRS or endovascular treatment. Therefore, validated technique-specific classification schemes play an important role in patient selection. This paper is dedicated to modern classification schemes designed for SRS and endovascular techniques.

Stereotactic radiosurgery

Factors affecting outcome

Stereotactic radiosurgery (SRS) is an effective therapeutic alternative to microsurgical resection, especially for deep AVMs. Obliteration rates of 50–90% are reported [4, 6, 9, 10, 17, 37, 39, 51, 52]. However, not all AVMs are obliterated after SRS. Furthermore, the latency period between SRS and complete AVM obliteration carries the risk of AVM rupture [19, 52, 54]. Increased age, greater AVM volume, subependymal or brainstem location, low minimum radiation dose, and proximal, para- or intra-nidal aneurysms are among the factors related to higher rates of AVM rupture during the latency period [30, 44, 49, 62].

Meanwhile, SRS can cause tissue injury in the adjacent brain leading to temporary and/or permanent symptomatic complications including seizure, headache, focal neurological deficit, and radiation-induced brain injury [4, 15, 16, 20, 72]. AVMs with larger nidi, specific arterial involvement (lenticulostriate or Heubner arteries), those having greater numbers of feeders, and patients with previous sensory deficit or seizure are more prone to SRS-related morbidities [21, 37, 66]. AVM volume and total volume receiving > 12 Gy of radiation are also associated with higher risk of permanent post-SRS complications [15,16,17].

Regarding SRS for brain AVMs, “excellent” outcome is defined as complete AVM obliteration with no decline from the pre-op neurological status [14]. Some factors related to “excellent” outcome include small AVM volume, non-eloquent location, low number of draining veins, higher marginal or maximum radiation dose, and younger age [47, 66].

Application of surgical grading scales to SRS

The Spetzler-Martin grading system has been widely used and validated to predict neurological outcome after microsurgical resection. Few studies have shown the accuracy of the Spetzler-Martin grading system for outcome prediction after AVM radiosurgery [67]. On the other hand, several studies showed this grading scale lacks the accuracy needed to predict outcomes after SRS [40, 45,46,47, 50, 53, 58]. A closer look at the main criteria of the Spetzler-Martin grading system and the subsequently proposed Spetzler-Ponce three-tier classification system reveals that these systems do not include the main outcome-related factors relevant to SRS (see above) [42, 47, 53, 65]. For example, while the Spetzler-Martin grading system includes lesion size, it lacks the precision in volume assessment needed for SRS. Any AVM measuring < 3 cm in diameter is considered “small” according to the Spetzler-Martin grading scale. However, a bAVM measuring 5 mm in diameter is < 0.07 cm3 in volume whereas a bAVM with a diameter of 2.5 cm is more than 8 cm3 in volume (> 100 times larger). Therefore, the Spetzler-Martin grading system underappreciates the total treatment volume, which is important in SRS planning and overall treatment outcome.
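The volume comparison above can be checked with a short calculation. The sketch below assumes a perfectly spherical nidus, which is a simplification made only for this illustration; the function name `sphere_volume_cm3` is hypothetical.

```python
import math


def sphere_volume_cm3(diameter_cm: float) -> float:
    """Volume (cm^3) of a sphere with the given diameter (cm).

    Assumption: the nidus is treated as a perfect sphere.
    """
    radius_cm = diameter_cm / 2.0
    return (4.0 / 3.0) * math.pi * radius_cm ** 3


small = sphere_volume_cm3(0.5)  # 5 mm diameter
large = sphere_volume_cm3(2.5)  # 2.5 cm diameter

# Both lesions are "small" (< 3 cm) on the Spetzler-Martin scale,
# yet their volumes differ by more than two orders of magnitude.
print(f"{small:.3f} cm^3 vs {large:.2f} cm^3, ratio {large / small:.0f}x")
```

Under this assumption the two volumes are roughly 0.065 cm3 and 8.2 cm3, a ratio of 125, consistent with the "> 100 times larger" figure quoted above.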

On the other hand, the Spetzler-Martin grading system considers eloquence of the AVM location [64]. However, experience shows that cortical eloquent locations are less prone to be associated with unfavorable SRS complications compared to deeper locations such as thalamus and brainstem [15, 35]. Besides, treatment nature, complications, and patient selection criteria are essentially different between microsurgery and SRS. Therefore, the Spetzler-Martin grading system cannot be a reliable tool to predict outcomes after radiosurgery.

SRS-based classification schemes

Earlier models to predict obliteration rate

I. K-index (1997)

Karlsson et al. developed the first SRS-specific outcome measure for AVMs undergoing SRS known as the K-index [31]. Based on the fact that obliteration rate is linearly correlated with the minimum radiation dose delivered to the lesion and the lesion size, the K-index is calculated as follows:

$$ \mathrm{K}-\mathrm{index}=\mathrm{minimum}\ \mathrm{dose}\ \left(\mathrm{Gy}\right)\times \mathrm{AVM}\ \mathrm{volume}\ \left({\mathrm{cm}}^3\right). $$

Karlsson et al. showed that the obliteration rate increases in a linear fashion up to a K-index of 27 and plateaus at 80% beyond this point [31].

II. Obliteration prediction index (OPI) (1997)

The OPI was developed after analyzing a total of 436 patients from two different centers in Canada and the UK [59]. OPI is calculated using the following formula:

$$ \mathrm{OPI}=\frac{\mathrm{Marginal}\ \mathrm{dose}\ \left(\mathrm{Gy}\right)}{\mathrm{AVM}\ \mathrm{diameter}\ \left(\mathrm{cm}\right)}. $$

Using the least squares method, the authors provided a formula to calculate the probability of lesion obliteration:

$$ P=1-A\times {e}^{-B\times \mathrm{OPI}}, $$

where P is the probability of lesion obliteration, and A and B are 1.15 ± 0.14 and 0.114 ± 0.07, respectively. The relationship between the OPI and the probability of obliteration was exponential and plateaued at OPI values > 20–25 [59].
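As an illustration of how the OPI and the fitted probability curve behave, the following minimal sketch evaluates both formulas using the central estimates of A and B quoted above. The function names are hypothetical, and clamping the result to the range 0–1 is an assumption added here (the raw formula becomes negative at very low OPI values); it is not part of the published model.

```python
import math

A = 1.15   # central estimate from the text (± 0.14)
B = 0.114  # central estimate from the text (± 0.07)


def opi(marginal_dose_gy: float, diameter_cm: float) -> float:
    """Obliteration prediction index: marginal dose (Gy) / AVM diameter (cm)."""
    if diameter_cm <= 0:
        raise ValueError("diameter must be positive")
    return marginal_dose_gy / diameter_cm


def obliteration_probability(opi_value: float) -> float:
    """P = 1 - A * exp(-B * OPI), clamped to [0, 1] (clamping is an assumption)."""
    p = 1.0 - A * math.exp(-B * opi_value)
    return min(1.0, max(0.0, p))


# Example: 20 Gy marginal dose delivered to a 2-cm AVM gives an OPI of 10.
example_opi = opi(20.0, 2.0)
print(example_opi, round(obliteration_probability(example_opi), 2))
```

With these constants, the curve flattens above an OPI of roughly 20–25, in line with the plateau described in the text.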

The central problem with the K-index and OPI models is that they do not include AVM features relevant to SRS-related obliteration [41, 47]. Also, these scales are relatively old and based on biplanar angiographic data rather than more recent 3D angiographic images. Additionally, radiation dose is included in both scales. Therefore, they are not true pre-operative scales based exclusively on patient/lesion characteristics. Finally, these scales only predict successful obliteration and are silent about the risk of complications and permanent symptomatic neurological deficits.

Models to predict SRS-related complications

I. Symptomatic post-radiosurgery injury expression (SPIE) scale (2000)

The SPIE scale was introduced by Flickinger et al. in 2000 [15]. This scale was developed to predict permanent neurological complications. The SPIE scale is based on two variables: (1) the total tissue volume receiving ≥ 12 Gy (which was found to estimate the risk of radiation-induced imaging changes) and (2) AVM location (SPIE score) which was scored according to regression coefficients for each location by normalizing them to a scale of 0–10. The frontal lobe had the lowest score (0) and pons/midbrain had the highest score (10). The 12-Gy-volume could be estimated before the radiosurgery session to allow prediction of the complication risk prior to the actual procedure. The authors suggested the following formula to estimate the probability of necrosis:

$$ P\ \left(\mathrm{necrosis}\right)=\frac{e^B}{\left(1+{e}^B\right)} $$

where B = − 7.8713 + 0.7506 × (SPIE score) + 0.0734 × V12, and V12 is the volume receiving ≥ 12 Gy of radiation. The main disadvantages of this grading system are (1) the low number of patients examined (n = 85), (2) the large number of location categories (n = 11), and (3) inaccurate prediction of SRS complications for very small lesions: for very small (< 1 cm3) lesions of the brainstem, the predicted chance of symptomatic radiation necrosis is 40%, which is an unrealistic figure. Additionally, no prediction of the obliteration rate is provided.
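The logistic formula above can be evaluated directly. The sketch below is illustrative only: the function name is hypothetical, and the input check on the SPIE score simply reflects the 0–10 range stated in the text. It also reproduces the approximately 40% predicted risk for a very small brainstem lesion (SPIE score 10, V12 close to 0) mentioned as a limitation.

```python
import math


def spie_necrosis_probability(spie_score: float, v12: float) -> float:
    """P(necrosis) = e^B / (1 + e^B), with B as given in the text.

    spie_score: location score on the 0-10 scale (0 = frontal, 10 = pons/midbrain)
    v12: tissue volume receiving >= 12 Gy
    """
    if not 0 <= spie_score <= 10:
        raise ValueError("SPIE score must lie between 0 and 10")
    if v12 < 0:
        raise ValueError("V12 must be non-negative")
    b = -7.8713 + 0.7506 * spie_score + 0.0734 * v12
    # 1 / (1 + e^-B) is algebraically identical to e^B / (1 + e^B)
    return 1.0 / (1.0 + math.exp(-b))


# Very small pontine lesion: high predicted risk despite negligible V12
print(round(spie_necrosis_probability(10, 0.0), 2))
# Frontal lesion with the same V12: predicted risk close to zero
print(round(spie_necrosis_probability(0, 0.0), 4))
```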

II. Pittsburgh radiosurgery-based AVM scale (RBAS) and its modifications (2002–2008)

Pollock and Flickinger proposed this radiosurgery-based system in 2002 to overcome the shortcomings of the SPIE scale [47]. This scale was originally introduced in 1997 under the name of Pittsburgh AVM radiosurgery (PAR) grading scale [46], and the authors refined it in 2002 [47]. Their original scale takes into account all the proven variables affecting the overall outcome of AVMs undergoing SRS, i.e., (1) lesion volume, (2) lesion location, (3) number of draining veins, (4) patient age, and (5) prior embolization. The AVM score is calculated using the following formula:

$$ {\displaystyle \begin{array}{c}\mathrm{PAR}\ \mathrm{AVM}\ \mathrm{Score}=0.13+(0.1)\times \mathrm{AVM}\ \mathrm{volume}\ \left({\mathrm{cm}}^3\right)+\\ {}(0.03)\times \mathrm{Age}\ \left(\mathrm{years}\right)+(0.64)\times \mathrm{Location}+\\ {}(0.35)\times \mathrm{Number}\ \mathrm{of}\ \mathrm{draining}\ \mathrm{veins}+(0.67)\times \mathrm{Prior}\ \mathrm{embolization}\ \left(0=\mathrm{no};1=\mathrm{yes}\right)\end{array}} $$

Lesion location is coded as follows: 0 is assigned to lesions in frontal or temporal lobes, 1 is used for parietal, occipital, intraventricular, corpus callosum, or cerebellar lesions, and 2 for basal ganglia, thalamic, or brainstem AVMs. The AVM score correlated significantly with patient outcome (R2 = 0.92). However, due to its complexity, the authors proposed a simplified version. The simplified version retained the same three-tier system for location coding but included only age, lesion volume, and lesion location as variables to calculate the score [47]:

$$ \mathrm{RBAS}\ \mathrm{AVM}\ \mathrm{score}=(0.1)\times \mathrm{AVM}\ \mathrm{volume}+(0.02)\times \mathrm{Age}+(0.3)\times \mathrm{location} $$
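To make the difference between the five-variable and three-variable formulas concrete, the following sketch computes both for the same hypothetical patient. The function and parameter names are illustrative assumptions; the coefficients and the 0/1/2 location coding are those given above.

```python
def par_avm_score(volume_cm3: float, age_years: float, location: int,
                  draining_veins: int, prior_embolization: bool) -> float:
    """Original five-variable PAR AVM score (coefficients as quoted in the text)."""
    if location not in (0, 1, 2):
        raise ValueError("location must be coded 0, 1, or 2")
    return (0.13
            + 0.1 * volume_cm3
            + 0.03 * age_years
            + 0.64 * location
            + 0.35 * draining_veins
            + 0.67 * (1 if prior_embolization else 0))


def simplified_rbas_score(volume_cm3: float, age_years: float, location: int) -> float:
    """Simplified three-variable RBAS AVM score with three-tier location coding."""
    if location not in (0, 1, 2):
        raise ValueError("location must be coded 0, 1, or 2")
    return 0.1 * volume_cm3 + 0.02 * age_years + 0.3 * location


# Hypothetical patient: 4 cm^3 parietal AVM (location = 1), age 40,
# two draining veins, no prior embolization.
print(round(par_avm_score(4.0, 40, 1, 2, False), 2))
print(round(simplified_rbas_score(4.0, 40, 1), 2))
```

The two scores are on different numerical scales, so they should not be compared against the same cutoffs.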

The RBAS AVM score was validated by several studies for Gamma knife [1, 5, 40, 51, 52] and Linac technologies [1, 56, 74, 75], deep AVMs [2, 51], and pediatric AVMs [5, 74]. In 2008, Pollock et al. modified the AVM score calculator by reducing the number of variables from 5 to 3 and changing it from a three-tier scale for AVM location to a two-tier scale as follows:

$$ \mathrm{Modified}\ \mathrm{RBAS}\ \mathrm{AVM}\ \mathrm{score}=(0.1)\times \mathrm{AVM}\ \mathrm{volume}+(0.02)\times \mathrm{Age}+(0.5)\times \mathrm{location}, $$

where location score was 1 for basal ganglia, thalamus, and brainstem, and 0 for the rest of the brain. The following AVM score cutoffs were used to stratify the progressively declining outcome of patients undergoing SRS: ≤ 1, 1.01–1.50, 1.51–2.00, and > 2, with a score ≤ 1 predicting a 90% chance of lesion obliteration with no neurological decline. This modified scale did not differ from the original scale in terms of accuracy of predicting excellent outcome while it was simpler than the original model (including having no y-intercept) [48]. Another advantage of the modified system was that it has been validated for both biplanar angiography and stereotactic MRI [52]. Interestingly, the RBAS score is independent of treatment dose. A higher radiation dose does not alter the rate of excellent outcome: although it is associated with higher obliteration rates, it causes a concurrent increase in post-SRS complications, so the proportion of patients with “excellent outcome” is unchanged. The modified RBAS scale was also externally validated by Wegner et al. [70].
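A minimal sketch of the modified RBAS calculation and its four cutoff categories is given below. The function names and the returned category labels are illustrative assumptions; rounding the score to two decimals before applying the cutoffs is also an assumption, made here to match the two-decimal boundaries (1.01, 1.51) quoted in the text.

```python
def modified_rbas_score(volume_cm3: float, age_years: float, deep_location: bool) -> float:
    """Modified RBAS AVM score with two-tier location coding.

    deep_location: True for basal ganglia, thalamus, or brainstem; False otherwise.
    """
    location = 1 if deep_location else 0
    return 0.1 * volume_cm3 + 0.02 * age_years + 0.5 * location


def rbas_category(score: float) -> str:
    """Map a modified RBAS score to the cutoff categories listed in the text."""
    s = round(score, 2)  # assumption: cutoffs are applied at two-decimal precision
    if s <= 1.0:
        return "<= 1"
    if s <= 1.5:
        return "1.01-1.50"
    if s <= 2.0:
        return "1.51-2.00"
    return "> 2"


# Hypothetical patients
print(rbas_category(modified_rbas_score(3.0, 30, False)))  # small lobar AVM, young patient
print(rbas_category(modified_rbas_score(5.0, 40, True)))   # larger thalamic AVM
```

Because dose does not appear in the formula, the score can be computed before treatment planning from patient age and imaging alone.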

III. Heidelberg score (2012)

The Heidelberg group proposed a pre-operative AVM scoring system based on two important outcome predicting variables: age and AVM diameter (Table 1) [43]. The Heidelberg score is an integer-based system in which each lesion could get a score of 1, 2, or 3. The authors showed that with every increase in score, the obliteration rate is decreased by a factor of 0.447. The authors examined the proposed scale on 293 patients and reported a higher accuracy compared to the dichotomized RBAS score (≤ 1.5 versus > 1.5). However, their proposed scale has not been externally validated and was not compared to the non-dichotomized RBAS score.

Table 1 The Heidelberg score [43]

IV. Virginia radiosurgery AVM scale (2013)

Starke et al. developed a new scale in 2013 based on multivariate predictors of favorable outcome for brain AVMs undergoing SRS [67]. They identified the following predicting variables of excellent outcome after analyzing 1012 patients undergoing SRS: age < 65 years, AVM volume < 2 cm3 and 2–4 cm3, non-eloquence of the lesion location, no previous history of hemorrhage, and no prior embolization. They simplified the scaling system by removing age and prior embolization, based on the fact that omitting these variables did not reduce accuracy (Table 2). According to this system, AVMs could get a score between 0 and 4, with higher scores associated with less favorable outcomes (grades 0 and 1 have almost 80% chance for favorable outcome). The authors showed that the Virginia system is more accurate than the RBAS. This system is simpler than the RBAS score and is analogous to the Spetzler-Martin system. However, it has not been externally validated so far.
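The structure of an additive integer scale of this kind can be sketched as follows. The point allocation used here (0, 1, or 2 points for volume < 2, 2–4, or > 4 cm3; 1 point for eloquent location; 1 point for prior hemorrhage) is an assumption chosen to be consistent with the three retained variables and the 0–4 range described above; the actual allocation is defined in Table 2, which should be consulted before any use. The function name is hypothetical.

```python
def virginia_score(volume_cm3: float, eloquent_location: bool,
                   prior_hemorrhage: bool) -> int:
    """Additive 0-4 score in the style of the Virginia radiosurgery AVM scale.

    ASSUMED point allocation (not taken from the text; see Table 2):
      volume < 2 cm^3 -> 0, 2-4 cm^3 -> 1, > 4 cm^3 -> 2
      eloquent location -> +1
      history of hemorrhage -> +1
    """
    if volume_cm3 < 0:
        raise ValueError("volume must be non-negative")
    if volume_cm3 < 2.0:
        points = 0
    elif volume_cm3 <= 4.0:
        points = 1
    else:
        points = 2
    points += 1 if eloquent_location else 0
    points += 1 if prior_hemorrhage else 0
    return points


# Hypothetical examples spanning the range of the scale
print(virginia_score(1.5, False, False))
print(virginia_score(3.0, True, False))
print(virginia_score(5.0, True, True))
```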

Table 2 The Virginia score [67]

V. Proton-beam SRS (PSRS) AVM score (2014)

Several studies have shown the efficacy of proton-beam therapy in obliterating AVMs [33, 60, 63, 68]. Hattangadi-Gluth et al. proposed the PSRS AVM score [25]. They showed PSRS score was more accurate than the modified RBAS score for lesions undergoing PSRS:

$$ \mathrm{PSRS}\ \mathrm{AVM}\ \mathrm{Score}=(0.26)\times \mathrm{Nidus}\ \mathrm{volume}\ (cc)+(0.7)\times \mathrm{Location}\ \mathrm{score}, $$

where the location score is assigned as in the modified RBAS score [48] (1 for basal ganglia, thalamus, and brainstem, and 0 for other locations). Although the authors mention a significant correlation between the score and outcome, they do not provide a grading scale based on cutoff points similar to those of the RBAS score.
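For completeness, the PSRS formula can be evaluated in the same way as the RBAS variants. The sketch below uses a hypothetical function name; because no cutoff points were published, it returns only the raw score.

```python
def psrs_avm_score(nidus_volume_cc: float, deep_location: bool) -> float:
    """PSRS AVM score = 0.26 x nidus volume (cc) + 0.7 x location score.

    deep_location: True for basal ganglia, thalamus, or brainstem (location score 1);
    False for all other locations (location score 0).
    """
    if nidus_volume_cc < 0:
        raise ValueError("volume must be non-negative")
    return 0.26 * nidus_volume_cc + 0.7 * (1 if deep_location else 0)


# Hypothetical examples: a 5-cc thalamic AVM and a 2-cc lobar AVM
print(round(psrs_avm_score(5.0, True), 2))
print(round(psrs_avm_score(2.0, False), 2))
```

Compared with the modified RBAS score, volume carries a larger weight (0.26 versus 0.1 per cm3) and age is not included.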

Comparison of different SRS-based classification schemes

Pollock et al. published a comprehensive comparative analysis of different SRS-related AVM grading scales using the data of 381 patients undergoing Gamma knife SRS [55]. They compared Spetzler-Martin, modified-RBAS, Virginia, Heidelberg, and PSRS scores. They concluded that modified-RBAS and PSRS are the most accurate systems to predict post-SRS outcome for bAVMs.

Endovascular-based classification schemes

Endovascular therapy is one of the main therapeutic interventions for brain AVMs. It is usually used as an adjunct to radiosurgery and/or microsurgery. However, a small percentage of these lesions (10–20%) are definitively cured with endovascular therapy [11, 12, 18, 21]. Few studies have shown a good correlation between the Spetzler-Martin grade of the lesion and outcome of endovascular embolization [32]. On the other hand, several studies failed to demonstrate such a correlation [8, 11, 24, 32]. It is also conceivable that the Spetzler-Martin grading scale has prongs that reflect the important aspects of microsurgical resection, and does not appreciate some major anatomical and technical aspects important in endovascular obliteration of AVMs [24, 64]. For example, the number and diameter of arterial pedicles of a brain AVM may have more importance during endovascular treatment than during microsurgical resection. Several studies have mentioned some of the factors related to complete obliteration of AVMs such as age, AVM diameter, Spetzler-Martin grade, nidus morphology, preprocedural hemorrhage, venous drainage pattern, and number of branches embolized [21, 23, 29, 32, 36, 57, 71, 73]. However, a comprehensive validated classification scheme is needed to help surgeons predict the risks and success rates when encountering a candidate brain AVM for endovascular therapy. Such a classification scheme has not been introduced to date. Nevertheless, there have been several efforts towards developing a grading system for bAVMs specific to endovascular therapy.

I. Viñuela-Guglielmi grading system (1995)

This is probably the first classification scheme developed specifically for AVMs undergoing endovascular therapy. This system is based on the number of arterial feeders to the AVM, size (diameter) of the lesion (< 2, 2–4, and > 4 cm), and the presence of pial versus perforating feeders. A low-grade AVM is a small lesion fed by a single non-perforating artery, whereas a high-grade AVM is a large (> 4 cm) lesion fed by > 3 feeders, at least one of which is a perforator [69]. This grading system is relatively old and does not include other outcome-related factors. Additionally, the external validity of this grading system was never assessed.

II. Sheikh et al. grading system (2000)

This grading system was proposed based on four factors: (1) number of feeders, (2) origin of feeders, (3) feeder type, and (4) the presence or absence of stenotic venous outflow (Table 3) [61]. Regarding feeder origin, a feeder was considered “proximal” when it was supratentorial, and branching from the first or second divisions of the main arteries of the circle of Willis. Also, two types of feeders were defined: (1) end-on feeder which is a feeder solely ending on the AVM nidus and (2) transit feeder which passes through the nidus to feed adjacent brain. The authors recommend using this five-tier grading system to predict morbidity and mortality. However, the internal and external validity of this grading system was never assessed.

Table 3 Major endovascular-based bAVM grading scales

III. Toronto score (2001)

This system was proposed for small (< 3 cm) bAVMs by Willinsky et al. in 2001 (Table 3) [71]. The scheme is an angioarchitectural system based on (1) nidus size, (2) number and (3) type of feeders, and (4) number of draining veins. Grades range from 0 to 6. The authors tested this scale on 81 patients and showed it has superior accuracy in predicting cure rate compared to the Spetzler-Martin grade. Also, a good correlation between lesion score and complication rates was found. No patient with a score of 0–2 suffered from complications, whereas 10% of patients with a score of 5 or 6 did (overall complication rate was 9%). Limited applicability is the main shortcoming of this system (small AVMs only). Also, the predictive power of this scale was low, as the complication rate in high-grade lesions was similar to the overall complication rate. No study has assessed the external validity of this classification scheme.

IV. Puerto Rico grading scale (2010)

Feliciano et al. proposed the Puerto Rico grading scale to predict the risk of complications and long-term outcome of endovascular treatment for brain AVMs [13]. The authors developed this grading system after reviewing previous publications on the factors affecting the outcome of endovascular therapy [21, 24, 26, 32, 36]. The components of this grading system include number of AVM feeders, eloquence of the AVM location, and presence versus absence of arteriovenous fistula. Feliciano et al. did not test the validity and accuracy of their proposed classification scheme. However, Bell et al. retrospectively studied the applicability of the Puerto Rico score in 126 patients [3]. They found that a Puerto Rico grade ≤ 2 reliably predicts successful lesion obliteration with isolated endovascular therapy, whereas grades ≤ 3 are strongly associated with cure after multimodality treatment and favorable neurological outcome. They also found a stepwise increase in complications with increasing Puerto Rico grade [3].

V. Buffalo score (2015)

The Buffalo score was designed to predict the risk of complications after endovascular treatment for bAVMs [11]. The authors selected three variables relevant to endovascular embolization of bAVMs (location eloquence, diameter of arterial pedicles to the nidus, and number of pedicles) and created a classification scheme to predict the complication profile of endovascular therapy (Table 3). It is a relatively simple scale designed to mimic the structure of the Spetzler-Martin grading system, yielding grades from 1 to 5. The authors tested this system on 50 patients with bAVM undergoing endovascular embolization with an intention to cure. NBCA and Onyx were both used in this series to show that the Buffalo score is independent of the embolysate. Importantly, obliteration rate (10%) was not predicted by either the Buffalo score or the Spetzler-Martin grade. However, the Buffalo score strongly correlated with complication rate (p < 0.0001; grades 1 and 2, 0%; grade 3, 14%; grade 4, 50%; and grade 5, 75%). The Spetzler-Martin grade failed to reliably predict complications (p = 0.28). Shortcomings of this study include the small sample size and failure of the proposed score to predict lesion obliteration.

VI. AVM embocure score (AVMES) (2015)

The authors conducted a retrospective study on 39 patients undergoing endovascular therapy (using Onyx) with an intention to cure to find significant factors related to angiographic cure [38]. This system is based on (1) nidus size, (2) number of feeders and (3) draining veins, and (4) vascular eloquence (Table 3). AVMES ranges from 3 to 10. The authors introduce the novel concept of “vascular eloquence.” According to their definition, a vascular eloquent artery is a small and short branch of a larger artery whose injury or occlusion could lead to neurological deficit [38]. The authors showed that an increasing AVMES is associated with a lower rate of obliteration and a higher risk of complications. They emphasize the simplicity of AVMES and the relevance of its prongs to endovascular treatment as the advantages of this grading system. However, the score is not very precise, as scores 4 and 5 had a similar obliteration/complication profile. Likewise, all scores > 5 had a similar obliteration/complication profile. Based on analysis of the area under the receiver operating characteristic curve, the authors propose that AVMES accurately predicts both obliteration and complication rates.

VII. Rothschild-Montreal grading scale for deep AVMs (2017)

Robert et al. proposed a location-based grading system for deep supratentorial AVMs (basal ganglia, centrum semiovale, and midbrain) undergoing endovascular therapy [57]. Studying a group of 134 patients, they found the following factors to be related to complete obliteration: (1) diameter < 3 cm, (2) lateral type (see Table 4), (3) Spetzler-Martin grade < 3, (4) compact nidus, (5) absence of concomitant anterior and posterior circulation contribution to the nidus, and (6) unique venous outflow. Their proposed grading system used the Spetzler-Martin grade as one of the prongs, yielding a score of 0–10 (Table 4). The authors showed a correlation between the AVM score and obliteration rate. However, the precision of this scale is questionable, as scores 3 and 4, and scores 9 and 10, have similar obliteration rates. Furthermore, no correlation was reported between the grade and complication rate [57].

Table 4 Rothschild-Montreal grading system for endovascular treatment of deep AVMs [57]

Comparison of different endovascular-based classification schemes

So far, none of the proposed grading systems for endovascular therapy has gained widespread popularity, and few studies have been performed to assess the external validity of these classification schemes. For example, Gupta et al. sought to retrospectively assess the validity of Spetzler-Martin, Buffalo, and Puerto Rico scores to predict complications in 39 patients [22]. Although the series had an acceptable complication profile, the authors concluded that none of the tested grading systems could reliably predict complications of endovascular therapy.

In a large recent multicenter retrospective study, Jin et al. published the results of the validity assessment for Spetzler-Martin, Puerto Rico, Buffalo, and AVMES grading systems to predict various outcome aspects of endovascular therapy for bAVM [28]. Comparing the Spetzler-Martin and Puerto Rico scales regarding the long-term neurological outcome, the authors concluded that the Puerto Rico scale was superior. As for short-term procedural complications, the Puerto Rico and Buffalo scores were superior to Spetzler-Martin and AVMES. While the authors did not compare different grading scales for predicting obliteration rate, they stated that the AVMES scale is “medium efficient” (AUC = 0.757).

With the advent of novel endovascular techniques (such as transvenous access to the AVM), the outcomes of this treatment modality are improving [7, 27]. While several grading scales have been proposed, none of them has gained widespread popularity. Further large studies are needed to develop a simple and efficient grading scale for predicting outcomes after endovascular therapy. An ideal system would be applicable both to multimodality management paradigms and to treatment consisting of endovascular intervention alone.

Conclusion

Microsurgical resection, SRS, and endovascular therapy for bAVMs continue to evolve, and so do the treatment outcomes and complication profiles. This gradual change in treatments and outcomes will surely change the significant parameters affecting patient outcomes. On the other hand, more and more bAVM cases are being treated using multimodality management. This phenomenon calls for comprehensive classification schemes predicting the results of combination therapy. The roles of genetic, molecular, and hemodynamic factors in the natural history of bAVMs are also being elucidated. Future classification schemes may also include such criteria to increase accuracy and facilitate decision-making for this challenging pathology.