Introduction

Cortical language mapping in humans can now be performed with a variety of imaging and stimulation techniques. Functional magnetic resonance imaging (fMRI), magnetoencephalography (MEG) and direct cortical stimulation (DCS), the gold standard for testing cortical structures related to neurological function, are applied particularly widely. In addition, another tool, which was originally introduced for eliciting motor evoked potentials in 1985 [1], is increasingly used in patients and healthy volunteers: transcranial magnetic stimulation (TMS). When combined with a neuronavigation unit and applied with high frequency (repetitive navigated TMS = rTMS), it allows for individualized assessment of the cortical distribution of language-related areas during an overt-speech task [2–7]. While DCS is performed intraoperatively or postoperatively after the implantation of subdural grid electrodes, rTMS shares the advantage of non-invasiveness with fMRI and MEG, which in principle allows for safe application in healthy volunteers with limited contraindications [8–10].

fMRI and rTMS rest on fundamentally different mechanisms. In fMRI, the different oxygenation states of the hemoglobin molecule give rise to the blood oxygenation level dependent (BOLD) contrast, which enables the visualization of task-dependent brain activation [11], whereas rTMS can inhibit certain cortical regions [12]. During rTMS, the induced electrical field can change the transmembrane potential and results in a virtual lesion [12]. It is assumed that this works through synaptic inputs [13]. Recently, several publications have investigated the correlation of cortical rTMS language maps with those generated intraoperatively by DCS [4, 7, 14–16] and preoperatively by fMRI [14, 15, 17] or MEG [7] in patients suffering from intracranial lesions. However, since rTMS represents a comparatively new method for language mapping, data regarding the correlation of rTMS language maps to fMRI results in healthy subjects have been rare so far. Therefore, the present trial systematically explores the relationship between fMRI and rTMS results for future application in neurosurgical patients within a cohort of 40 healthy, right-handed subjects who underwent mapping by both modalities. In this context, the most common tasks were used for each method (fMRI: word-generation = WGEN task; rTMS: object-naming = ON task).

Materials and methods

Ethics

The experimental setup was approved by the local ethics commission (registration no. 2793/10) and was conducted in accordance with the Declaration of Helsinki. Written informed consent was obtained from all subjects prior to the examinations.

Subjects

For the comparison of cortical mapping by fMRI and rTMS, 40 healthy volunteers (20 males and 20 females) were enrolled in the present study. Inclusion criteria were age above 18 years, right-handedness, German as mother tongue and written informed consent. Exclusion criteria were age below 18 years, bilateral or left-handedness, subjects who grew up bilingual, previous seizures, general rTMS exclusion criteria (e.g., pacemaker, cochlear implant, deep brain stimulation) [9, 10] or intracranial pathologies on magnetic resonance imaging (MRI) scans.

All volunteers were scheduled for anatomical and functional MRI during the same scanning session. Subsequently, rTMS of the left hemisphere was carried out within the next 2–4 weeks in our department.

Anatomical MRI

First, each volunteer received a structural MRI scan on a 3-T MR scanner (Achieva 3 T, Philips Medical Systems, The Netherlands B.V.) using an eight-channel phased-array head coil. The protocol contained a three-dimensional (3D) gradient echo sequence (repetition time (TR)/echo time (TE): 9/4 ms, 1 mm³ isovoxel covering the whole head, 6 min 58 s acquisition time) without intravenous contrast administration. Furthermore, a T2 FLAIR (TR/TE: 12,000/140 ms, inversion time: 2,500 ms, 30 slices with 1 mm gap, voxel size 0.9 × 0.9 × 4 mm³, 3 min acquisition time) was obtained to exclude incidental cerebral pathologies. Individual imaging data were screened by an experienced neuroradiologist prior to rTMS. Then, the 3D gradient echo sequence was transferred to the rTMS system for cortical mapping.

Functional MRI

For BOLD fMRI, echo planar imaging was performed with the following parameters: α = 90°, TR = 2,500 ms and TE = 35 ms. Within 2 min 53 s, 64 dynamic sets were acquired, each consisting of 32 contiguous axial 4-mm slices with an in-plane resolution of 2.75 × 2.75 mm². Parallel imaging (SENSE) was used to diminish susceptibility-related artifacts (SENSE factor 2). After scanning, the obtained data were post-processed using the IViewBOLD package on an external workstation (Extended MR Workspace, Philips Medical Systems, The Netherlands B.V.). Statistical parametric maps were reconstructed using the general linear model after motion correction and spatial smoothing (2D Gaussian filter with 4-mm FWHM, kernel 7 × 7 pixels). We chose a hemodynamic delay of 2 × TR. Only positively correlated clusters larger than 40 voxels, with a t-value above 2.5 and an intensity threshold of 250, were considered activated areas. The validity of the results was then evaluated by reviewing the time-intensity diagrams of the activated voxels. All fMRI scans were conducted in combination with the anatomical sequences prior to rTMS, and evaluation of fMRI data was performed by an experienced neuroradiologist.

During fMRI, each volunteer was instructed to perform a WGEN paradigm (covert speech). The prompts were displayed on a screen at the back end of the scanner and were visible via a mirror in front of the head coil for 2500 ms each. The volunteers were asked either to generate words that begin with a given capital letter (e.g., “R”: “relationship”) or to add verbs to the presented nouns (e.g., “newspaper”: “read”). Both tasks were presented in an irregular order within each block.

A total of 160 scans were acquired while the subjects underwent ten blocks of alternating rest and task performance. A thorough baseline trial (screening of objects without simultaneous scanning) was carried out prior to fMRI data acquisition to prepare the volunteers for the tasks, and all subjects were instructed to limit head motion as far as possible. This baseline trial was carried out until the individual volunteer had indicated that he/she understood the task and felt well-prepared for the acquisition phase. During and after scanning, each volunteer was briefly asked whether he/she was capable of performing the task properly as an attempt to ensure active participation.

For later comparison of fMRI and rTMS data, the activated areas in fMRI were anatomically located using the coronal, sagittal and axial anatomical slices, which were fused with the BOLD sequences. Then, these areas were assigned to the cortical parcellation system (CPS), which was described by Corina and coworkers [18, 19]. In this context, Fig. 1 displays the anatomical localization of the cortical areas of the CPS, including the abbreviations of the cortical regions (Table 1). Furthermore, we calculated an fMRI activation rate (AR) by dividing the total number of positive BOLD signals of each CPS region by the number of volunteers who underwent fMRI mapping.
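The AR calculation can be illustrated with a minimal Python sketch. It assumes that each volunteer contributes at most one positive BOLD signal per CPS region; the function name `activation_rates` and the input layout (one list of positive CPS regions per volunteer) are hypothetical and chosen for illustration only.

```python
from collections import Counter


def activation_rates(positive_regions_per_subject):
    """Return the fMRI activation rate (AR) for each CPS region.

    positive_regions_per_subject: one iterable of CPS region labels per
    volunteer, listing the regions with a positive BOLD signal.
    Assumption: a region counts at most once per volunteer.
    """
    n_subjects = len(positive_regions_per_subject)
    if n_subjects == 0:
        raise ValueError("at least one mapped volunteer is required")
    counts = Counter()
    for regions in positive_regions_per_subject:
        counts.update(set(regions))  # set() removes duplicate labels
    return {region: counts[region] / n_subjects for region in counts}


if __name__ == "__main__":
    # Hypothetical toy data for four volunteers.
    example = [["opIFG", "vPrG"], ["opIFG"], [], ["vPrG", "opIFG"]]
    print(activation_rates(example))
```

Regions without any positive signal do not appear in the returned dictionary and correspond to an AR of 0.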

Fig. 1

Cortical parcellation system (CPS). This figure visualizes the anatomical areas of the CPS, as described in Corina et al. (2005). The numbers refer to the legend given in Table 1. Furthermore, the light grey areas depict anterior CPS regions (numbers 18, 34, 37), whereas the dark grey regions reflect posterior CPS regions (numbers 4, 6, 17, 31, 32) as described in the methods section

Table 1 Cortical parcellation system (CPS)

Navigated transcranial magnetic stimulation

Setup

All mappings were performed with the eXimia NBS system, version 3.2.2 or its successor version 4.3, and a NEXSPEECH® module (Nexstim Oy, Helsinki, Finland). The usage of the two different software versions did not affect the results. For stimulation, we used a biphasic figure-of-eight coil. The common setup for rTMS mapping trials in patients and healthy subjects has been repeatedly described in the past [2, 3, 5–7, 20–22].

In short, co-registration of each volunteer's head and the corresponding MRI data was conducted after the 3D gradient echo MRI sequence had been imported. With the help of a stereotactic camera and reflectors fastened to the subject's head with an elastic strap, this allowed for visualization of the stimulated cortical area and the magnetic coil in real time during mapping. In addition, the induced electrical field was visualized over the 3D brain reconstruction, and all intracranial spots of stimulation were saved for later examination. The individual volunteers’ resting motor threshold (RMT) was determined by motor mapping of the cortical representation of the contralateral abductor pollicis brevis muscle [23–25]. The individual RMT was then used as a basic value for the rTMS mapping procedure, which was conducted in combination with an ON task consisting of 131 colored photographs of everyday objects (overt speech). The photographs of the ON task portrayed familiar living as well as non-living things (e.g., snake, plane, chair), similar to the objects selected in the Snodgrass and Vanderwart pictures [26]. The inter-picture interval (IPI; interval between the presentation of two consecutive objects) was set to 2500 ms by default, whereas the display time (DT; presentation time of each object on the screen) was 700 ms. Furthermore, the picture-to-trigger interval (PTI; interval between the presentation of an object on the screen and the onset of the rTMS burst) was set to 300 ms for all volunteers [27–29]. Currently, the ON task represents the most common task during rTMS language mapping [2–6, 22, 30].

The rTMS intensity and frequency were individualized according to the following rules: First, trains of 5 Hz/5 pulses, 7 Hz/5 pulses and 7 Hz/7 pulses with 100 % of the individual RMT were consecutively applied to the vPrG and opIFG [3]. Then, the frequency/pulse number that elicited the highest number of naming errors was chosen and applied during mapping. However, if naming was not clearly interrupted, the intensity was increased to 110–120 % RMT, whereas it was decreased to 80–90 % RMT when significant stimulation-induced pain was reported by the volunteer. Moreover, if no distinct difference in naming impairment was observed between the different settings, the most comfortable frequency/pulse number was chosen. This protocol has already been used successfully in various previous rTMS-based mappings in healthy volunteers and brain tumor patients [2, 4–6, 16, 30].
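To illustrate how the train settings relate to the picture timing, the following minimal Python sketch lists the pulse times of one rTMS train relative to picture onset. It assumes evenly spaced pulses with the first pulse delivered exactly at the PTI of 300 ms; the function name `pulse_times_ms` is hypothetical and is not part of the stimulation software.

```python
def pulse_times_ms(frequency_hz, n_pulses, pti_ms=300.0):
    """Return pulse onsets (ms after picture onset) for one rTMS train.

    Assumption: pulses are evenly spaced at 1000 / frequency_hz ms and the
    first pulse coincides with the picture-to-trigger interval (PTI).
    """
    if frequency_hz <= 0 or n_pulses < 1:
        raise ValueError("frequency must be positive and n_pulses >= 1")
    interval_ms = 1000.0 / frequency_hz
    return [pti_ms + i * interval_ms for i in range(n_pulses)]


# The three train settings tested in the protocol (frequency in Hz, pulses).
TRAIN_SETTINGS = [(5, 5), (7, 5), (7, 7)]

if __name__ == "__main__":
    for frequency_hz, n_pulses in TRAIN_SETTINGS:
        times = pulse_times_ms(frequency_hz, n_pulses)
        print(f"{frequency_hz} Hz/{n_pulses} pulses: "
              f"last pulse at {times[-1]:.0f} ms after picture onset")
```

Under these assumptions, a 5 Hz/5 pulse train starts 300 ms and ends 1100 ms after picture onset, so each of the three trains is completed well within the 2500 ms IPI.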

Mapping procedure

Prior to stimulation, baseline testing (presentation of objects without simultaneous stimulation) was performed two times to individualize the set of objects according to the subject’s vocabulary. The volunteers were instructed to name the presented objects in German as quickly and precisely as possible, and all misnamed objects were discarded from the sequence. During mapping, the remaining stack of objects was presented time-locked to rTMS, and the stimulation coil was randomly moved over the left hemisphere within the IPI. To achieve maximum field induction, the coil was placed perpendicular to the skull [2, 3, 31], and cortical sites were stimulated three times each with a distance of approximately 10 mm between two stimulation sites. For later analysis, all baseline performances and stimulation trials were video recorded [2, 3, 5, 6, 20]. The spatial extent of rTMS was restricted with respect to the orIFG, polSTG, polMTG, aMTG, inferior temporal gyrus (ITG) and frontopolar regions (Fig. 1, Table 1), because stimulation of these areas has proven to be too painful in previous studies [4, 30]. With this in mind, we performed rTMS language mapping within the following CPS regions: 2, 4, 6, 7, 9, 10, 12–18, 21, 22, 30–34, 36 and 37 (Fig. 1, Table 1).

Data analysis

Analysis of video material was performed by the first author (MD with linguistic/mapping/video analysis experience since 2011), who was supervised by the last author (MD with linguistic/mapping/video analysis experience since 2009). In addition, a collaborating trained linguist was available for the evaluation of unclear cases. All detected naming errors were documented and categorized into the following six error groups according to the previous literature [3, 5, 19, 22, 30]: no responses, performance errors, hesitations, neologisms, semantic paraphasias and phonologic paraphasias. The evaluation of mapping videos was conducted strictly blinded to the sites of cortical stimulation and blinded to the results of fMRI language mapping.

Then, all detected naming errors sorted by error type were assigned to the CPS for systematic comparison with fMRI results (Fig. 1, Table 1). Furthermore, the 2/3 rule, which has already been applied in previous publications of our group [14, 15], was applied to each CPS region: a CPS region was defined as positive if at least two out of three stimulations caused a naming error [14, 15, 32, 33]. Since the 2/3 rule was applied to three consecutive stimulations at the same cortical spot, all parameters and the coil angulation were the same for these stimulations. Error rates (ERs) were calculated for each CPS region (ER = number of errors divided by the number of stimulations), which allowed for further analysis of the rTMS data with predefined error rate thresholds (ERT = ER at which a certain CPS region was defined as positive in terms of rTMS). The ERTs ranged from 0 % to 50 % in steps of 5 % (ERs: >0 %, ≥5 %, ≥10 %, ≥15 %, ≥20 %, ≥25 %, ≥30 %, ≥35 %, ≥40 %, ≥45 % and ≥50 %). In addition, for the 2/3 rule and for each ERT, we divided the number of subjects that showed naming errors in a specific CPS region by the number of volunteers who were stimulated at this particular site.
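The ER, ERT and 2/3 rule definitions can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the analysis code used in the study: all function names are hypothetical, and the sketch assumes that a CPS region counts as positive under the 2/3 rule as soon as any single stimulation spot within it shows errors in at least two of its three stimulations.

```python
# Error rate thresholds in percent: 0 (strict ">0 %"), then 5, 10, ..., 50.
ERT_PERCENT = list(range(0, 55, 5))


def error_rate(n_errors, n_stimulations):
    """ER = number of errors divided by the number of stimulations."""
    if n_stimulations <= 0:
        raise ValueError("a region needs at least one stimulation")
    return n_errors / n_stimulations


def positive_by_ert(n_errors, n_stimulations, ert_percent):
    """True if the region's ER reaches the given threshold.

    The 0 % threshold is strict (ER > 0 %); all others are inclusive.
    Integer arithmetic avoids floating-point edge cases at the threshold.
    """
    if n_stimulations <= 0:
        raise ValueError("a region needs at least one stimulation")
    if ert_percent == 0:
        return n_errors > 0
    return n_errors * 100 >= ert_percent * n_stimulations


def positive_by_two_thirds(spots):
    """True if any spot shows errors in at least two of three stimulations.

    spots: iterable of 3-tuples of booleans (error yes/no per stimulation).
    Assumption: one qualifying spot makes the whole CPS region positive.
    """
    return any(sum(bool(e) for e in spot) >= 2 for spot in spots)


if __name__ == "__main__":
    # Hypothetical region: 3 errors in 20 stimulations, i.e. ER = 15 %.
    print(error_rate(3, 20))
    print([t for t in ERT_PERCENT if positive_by_ert(3, 20, t)])
    print(positive_by_two_thirds([(True, False, False), (True, True, False)]))
```

In this toy example, a region with 3 errors in 20 stimulations is positive for the thresholds >0 %, ≥5 %, ≥10 % and ≥15 %, but negative from ≥20 % upwards.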

Statistics

All statistical data analysis was performed using the GraphPad Prism software (GraphPad Prism 6.04, La Jolla, CA, USA). Regarding subject and mapping-related characteristics, mean values ± standard deviation (SD), median, minimum and maximum values are reported.

To explore the agreement between fMRI and rTMS mapping results, we calculated Cohen’s kappa coefficient for different comparisons (GraphPad Prism QuickCalcs, La Jolla, CA, USA). In detail, kappa values were obtained for all CPS regions together, but also for anterior (opIFG, trIFG, vPrG) and posterior areas (anG, aSMG, mSTG, pSMG, pSTG) separately (Fig. 1, Table 1) [14]. Furthermore, since no responses and performance errors proved to be among the most prominent and objective naming error categories during recent rTMS mapping trials [2–6, 20, 22], we decided to conduct kappa statistics for all error categories together, but also for no responses, performance errors, and no responses plus performance errors separately. All analyses were performed for the above-mentioned ERTs and the 2/3 rule on the basis of the ERs [15]. Regarding the interpretation of Cohen’s kappa statistics, we refer to Landis and Koch (1977), who suggested the following limit values [34, 35]:

  • Kappa <0: Less than chance agreement

  • Kappa 0.01–0.20: Slight agreement

  • Kappa 0.21–0.40: Fair agreement

  • Kappa 0.41–0.60: Moderate agreement

  • Kappa 0.61–0.80: Substantial agreement

  • Kappa 0.81–0.99: Almost perfect agreement
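For illustration, Cohen’s kappa for two binary mappings and the interpretation bands listed above can be sketched in Python as follows. This is a minimal sketch, not the GraphPad implementation: the function names are hypothetical, each input element is assumed to be one paired positive/negative rating (for example, one CPS region in one subject), and the handling of the values 0 and 1, which fall between the listed bands, is an assumption.

```python
def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equally long sequences of binary ratings."""
    a = [bool(x) for x in ratings_a]
    b = [bool(x) for x in ratings_b]
    if len(a) != len(b) or not a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(a)
    # Observed agreement: fraction of pairs rated identically.
    p_observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement from the marginal positive rates of both methods.
    p_a = sum(a) / n
    p_b = sum(b) / n
    p_expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined for identical constant ratings")
    return (p_observed - p_expected) / (1 - p_expected)


def interpret_kappa(kappa):
    """Map kappa to the bands listed above.

    Assumption: 0 is grouped with 'slight' and 1 with 'almost perfect'.
    """
    if kappa < 0:
        return "less than chance agreement"
    if kappa <= 0.20:
        return "slight agreement"
    if kappa <= 0.40:
        return "fair agreement"
    if kappa <= 0.60:
        return "moderate agreement"
    if kappa <= 0.80:
        return "substantial agreement"
    return "almost perfect agreement"


if __name__ == "__main__":
    fmri = [1, 1, 0, 0]  # hypothetical fMRI results (1 = positive)
    rtms = [1, 0, 0, 0]  # hypothetical rTMS results for the same regions
    k = cohens_kappa(fmri, rtms)
    print(k, interpret_kappa(k))
```

In the toy example, observed agreement is 0.75 and chance agreement is 0.5, which gives a kappa of 0.5 (moderate agreement).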

Results

Subject characteristics

In total, 40 healthy volunteers (20 males and 20 females) were enrolled in the present study according to the inclusion criteria. All enrolled subjects were purely right-handed according to their own statements. The median age of the cohort was 25 years, and it ranged between 23 and 60 years.

fMRI and rTMS mapping

Spatial identification of language-related areas in fMRI data was possible in all subjects, and all volunteers showed exclusive left-hemispheric or combined left- and right-hemispheric activation with clear domination of the left hemisphere.

Regarding rTMS, mapping of the left hemisphere was possible in all volunteers, and no adverse events except transient headache in a few subjects occurred. Concerning rTMS mapping parameters, the mean RMT was 36.8 ± 6.6 % of the stimulator output, and mapping was performed with a mean intensity of 101.3 ± 4.9 % RMT (range: 100–120 % RMT). Cortical stimulation was carried out with 5 Hz/5 pulses in 14 subjects (35.0 %), with 7 Hz/5 pulses in another 14 subjects (35.0 %) and with 7 Hz/7 pulses in the remaining 12 volunteers (30.0 %).

Comparison of fMRI and rTMS mapping

All errors

Regarding the comparison of fMRI and rTMS in terms of all naming errors together, Table 2 displays individual kappa values for all regions, anterior regions and posterior regions, respectively, as well as for all ERTs and the 2/3 rule. For all regions together, a median kappa of 0.10 was achieved (range: 0.02–0.24; Table 2). The best agreement was calculated for the 2/3 rule (0.24, Table 2), but it was still only in the range of fair agreement. For anterior CPS regions, median kappa was 0.02 (range: −0.06–0.11), whereas it was 0.05 (range: 0.00–0.16) for posterior regions (Table 2).

Table 2 Agreement between rTMS (all errors) and fMRI language maps measured by Cohen’s kappa

Within Figs. 2 and 3, which illustrate the comparison of fMRI’s AR and the number of subjects with errors divided by the number of stimulated subjects for the rTMS data, row A shows the summed-up results of all naming errors caused by rTMS. We obtained the highest percentages within the anterior areas, especially for the 2/3 rule and an ERT of 15 % (Fig. 2a). fMRI mapping showed the highest ARs within anterior areas as well (Fig. 2).

Fig. 2

Language maps of fMRI and rTMS [2/3 rule and error rate threshold (ERT) of 15 %]. The figure shows the distribution of fMRI’s activation rates (ARs, image on top of the figure) and the quotient of subjects with naming errors and subjects stimulated by rTMS, depending on the 2/3 rule and an ERT of 15 %. The results of rTMS are further subdivided into different naming error categories (a: all errors; b: no-response errors; c: performance errors; d: no-response and performance errors together)

Fig. 3

Language maps of fMRI and rTMS [error rate thresholds (ERTs) of 20 % and 25 %]. The figure shows the distribution of fMRI’s activation rates (ARs, image on top of the figure) and the quotient of subjects with naming errors and subjects stimulated by rTMS, depending on an ERT of 20 % and 25 %. The results of rTMS are further subdivided into different naming error categories (a: all errors; b: no-response errors; c: performance errors; d: no-response and performance errors together)

No responses

Concerning no-response errors elicited by rTMS, the comparison to fMRI data led to a median kappa of 0.00 (range: −0.01–0.18) for all regions together (Table 3). Furthermore, the median kappa was 0.01 (range: −0.04–0.04) for anterior and 0.01 (range: −0.02–0.17) for posterior CPS regions (Table 3). Correspondingly, all kappa values for no-response errors were in the range of slight to less than chance agreement.

Table 3 Agreement between rTMS (no responses) and fMRI language maps measured by Cohen’s kappa

When applying the 2/3 rule to the results of rTMS mapping, the quotient of subjects with errors and stimulated subjects was again the highest within anterior areas (Fig. 2b).

Performance errors

The comparison of rTMS to fMRI maps for performance errors distributed over all stimulated areas led to a median kappa of 0.00 (range: −0.01–0.21) with the highest value for ER >0 % (0.21, Table 4), indicating fair agreement. For anteriorly and posteriorly located CPS regions, median kappa was 0.00, respectively; it ranged between −0.03 and 0.14 for anterior regions and between −0.03 and 0.10 for posterior regions (Table 4).

Table 4 Agreement between rTMS (performance errors) and fMRI language maps measured by Cohen’s kappa

Regarding the comparison of fMRI’s AR and the number of subjects with errors divided by the number of stimulated subjects for the rTMS data, we generally obtained the highest percentages within the vPrG (Figs. 2c and 3c).

No responses and performance errors together

For the combination of no responses and performance errors, median kappa was 0.02 (range: −0.01–0.23) for all CPS regions together (Table 5). In this context, the highest value was achieved for the 2/3 rule (0.23, Table 5), which reflects fair agreement. Regarding anterior regions, median kappa was 0.00 (range: −0.03–0.17), whereas it was 0.03 (range: −0.04–0.12) for posteriorly located CPS regions (Table 5).

Table 5 Agreement between rTMS (no responses + performance errors) and fMRI language maps measured by Cohen’s kappa

The quotient of subjects with errors and stimulated subjects for these two naming error categories together was again higher within anterior areas when using the 2/3 rule (Fig. 2d). For the ERTs of 15 %, 20 %, and 25 %, we found spots with high percentages in anterior as well as in posterior areas (Figs. 2d and 3d).

Discussion

Comparison of fMRI and rTMS mapping

The present trial investigates the relationship between fMRI and rTMS language mappings within a cohort of healthy subjects. In this context, the tasks most commonly used in the neurosurgical context were applied (fMRI: WGEN task; rTMS: ON task). As a prerequisite to the further application in patients, comparison of such non-invasive approaches in healthy volunteers seems to be mandatory.

According to the present results and the values of Cohen’s kappa coefficient, not more than slight to fair agreement between the mapping results of fMRI and rTMS in healthy subjects was observed (Figs. 2 and 3, Tables 2, 3, 4 and 5). In contrast to common correlation measurements, Cohen’s kappa coefficient does not distinctly evaluate a certain technique in the light of another one that might be defined as the more established or superior one. Instead, it intends to assess the agreement between at least two methods while both of them are defined as equivalent. Since DCS—which represents the gold standard in terms of functional cortical mapping—was not used in the present trial, we aimed to assess the agreement between two methods regarded as equal, which motivated us to use Cohen’s kappa coefficient.

So far, there are only a limited number of studies available that systematically compare the mapping results of these two non-invasive techniques. In this context, a recent comparison trial came to the conclusion that the two modalities should complement each other, particularly when mapping anteriorly located language areas [36]. The possibility of combining the fMRI and rTMS language mapping in different ways has also been utilized and recommended for application in patients [15, 29, 37]. Basically, the most reliable way to compare two techniques is the comparison of each of them to the respective gold standard method. Concerning our application purpose, this would have been DCS during awake surgery [32, 33, 38, 39]. Since this approach was not feasible in our present study because of its high invasiveness, we need to be careful about calling one of the two tested non-invasive techniques the more reliable one since the results differ. The present data merely tell us whether—with respect to Cohen’s kappa coefficient—the mapping results of fMRI and rTMS gathered among healthy volunteers agree when comparing the data on a pooled intra-subject level. However, there have already been comparisons of both techniques with the gold standard of mapping, DCS, during awake surgery, which is reflected by different patient studies that showed a good correlation of preoperative rTMS mapping with the results of intraoperative DCS [4, 7, 16, 21]. Moreover, at least two studies have already described a disagreement of fMRI and rTMS mapping in patients suffering from brain lesions while revealing a good correlation of rTMS and DCS at the same time [14, 17]. In contrast, a review of studies that compared the results of fMRI and DCS concluded that fMRI appears not to be appropriate for preoperative mapping [40]. This has also been stated by a recent study once again [41].

When summing up all mapping results of fMRI and rTMS, respectively, and creating a map based on Corina’s CPS [18, 19], a spatial overlap regarding ARs and ERs can be observed for certain brain regions (Figs. 2 and 3). In this context, the distribution of cortical sites obtained by fMRI and rTMS mapping resemble each other in appearance, particularly when applying the 2/3 rule or an ERT of 15 % to the rTMS results (Fig. 2). In these cases, we found the highest ARs and quotients of subjects with errors and stimulated subjects within anterior areas (Fig. 2). It has already been reported that fMRI seems to be comparatively sensitive within anterior areas [42], and a high correlation of rTMS and DCS within anterior areas in terms of sensitivity has also been shown in a previous trial [4]. Furthermore, regarding the rTMS 2/3 rule and ERT of 15 %, we could show higher rTMS percentages than ARs in the vPrG (Fig. 2). This seems to be due to the fact that subjects merely imagine a word when performing fMRI mapping in the context of a WGEN task (covert speech), while they are advised to voice it during the rTMS mapping procedure (overt speech). However, as already mentioned above, when comparing the two techniques by Cohen’s kappa coefficient, we could only find slight to fair overall agreement of results (Tables 2, 3, 4 and 5).

BOLD contrast vs. virtual lesion

Any discussion and comparison of the two techniques has to take their distinct mechanisms into account. The two different states of the hemoglobin molecule and the resulting change in its magnetic characteristics are the basic principles of fMRI; thus, the resulting BOLD contrast enables the visualization of different brain functions [11]. In this context, fMRI in combination with cognitive tasks has allowed for the localization and lateralization of different aspects of language function [43–48]. However, in patient cohorts with changed intracerebral anatomy and metabolism, it is assumed that the underlying mechanisms of fMRI’s dependency on oxygen extraction essentially impair the technique’s accuracy [14, 49, 50]. Thus, additional non-invasive language mapping modalities such as rTMS seem to be beneficial.

The rather slight to fair agreement between fMRI and rTMS according to Cohen’s kappa coefficient might be closely related to the fact that different components are visualized by the two modalities: Whereas fMRI displays more or less all language-related activity within a certain time frame, rTMS is able to elicit different kinds of errors and, therefore, is most likely able to interfere with different language subfunctions or networks [5, 6]. In this context, the categorization of naming errors during rTMS into various error categories and their clustering, which have already previously been established within different invasive and non-invasive brain stimulation approaches [3, 19, 39], might allow for evaluation of different networks, even during the same task. The systematic assessment of different language subfunctions and networks is, self-evidently, possible in fMRI as well, but it seems that different tasks become necessary for that purpose since fMRI language mapping primarily shows graded activation in a “yes-no” fashion that does not allow for distinct error classification [47].

However, rTMS is currently regarded as too sensitive for the distinct spatial localization of language-eloquent brain function [4, 15–17]. Hence, we have to distinguish between positive cortical regions, as detected by rTMS, that are language-eloquent or merely language-involved. Accordingly, rTMS is currently more accurate for the preoperative mapping of language-negative cortical sites [4, 7, 33]. Cortical regions identified as language-positive by rTMS comprise at least parts of the respective individual’s language network, but their exact hierarchical function has to be further investigated. However, rTMS operates with the same basic principle as the gold standard of mapping, DCS, does during awake surgery, namely the ‘virtual lesion’ model [51]. Until some years ago, most of our knowledge of human language function in healthy brains was based on fMRI studies [47, 52]. Meanwhile, rTMS offers the opportunity of applying the virtual lesion model to healthy brains as well [2, 5]. Moreover, as mentioned above, the application of rTMS is safe, and this non-invasive technique yields valuable results for the further development of our knowledge of human brain function.

Multimodal concepts

The advantages of making use of the virtual lesion model in healthy volunteers seem to enable another kind of application. In this context, TMS primarily allows for cortical mapping, but knowledge about subcortical pathways is crucial for basic researchers and neurosurgeons as well, especially with regard to the completion of network models. Correspondingly, the impact of different techniques, such as diffusion tensor imaging fiber tracking (DTI FT), is steadily increasing. The feasibility of DTI FT based on TMS data for detection of the corticospinal tract has already been shown [53–55]. Meanwhile, rTMS-based DTI FT of language-related pathways also offers the possibility of visualizing these highly important structures [56].

The application of more than one technique seems to be the future of human brain mapping. The preference for multimodality has already been expressed concerning the preoperative mapping of patients suffering from brain lesions [4, 7, 33, 55]. The same applies to the mapping of healthy volunteers. With this in mind, the present discussion should not serve as a kind of a competition between fMRI and rTMS mapping. The opposite is true: particularly non-invasive mapping techniques, such as fMRI and rTMS, should be used to complement each other to refine the techniques’ properties and results and to compensate for their individual disadvantages. A recently published study of our group has already presented two protocols for the combination of rTMS and fMRI language mapping in patients suffering from left-sided perisylvian brain lesions [15]. Although we still cannot validate the results of the two protocols by comparing them to the gold standard technique, they could also be applied to the results of the two techniques in healthy subjects. However, the present study revealed different and partly contrary results of fMRI and rTMS mapping. For instance, the authors of another study observed differing results in fMRI and DCS, stating that these modalities would never completely agree since the underlying ways of functioning are too different [57]. This could, consequently, also be argued concerning the differing results of the present fMRI and rTMS modality comparison.

Stimulation protocol and task selection

In the present study, we used a well-established rTMS mapping protocol that has already been used in previous trials of our and other groups [2, 4–6, 16, 30]. In this context, it has already been shown repeatedly that different stimulation settings might also result in different mapping results [31, 58]. However, we distinctly decided to use the stimulation protocol described above since it has proven to be one of the most common ones in mapping in recent years while being safe and reliable. We are aware of the fact that a clear change in the parameters used might influence the mapping results, but the present study was not designed to evaluate and compare different protocols.

Moreover, when talking about language mapping, the choice of tasks is still subject to debate. In this context, it has been shown that the task type affects the results of rTMS mapping [59]. However, few data are available on the optimal or standardized task opportunities; therefore, we decided to use the fMRI language task that has been most commonly applied to date and that is most comparable to the ON task [57].

Additionally, we used covert speech tasks for fMRI, whereas an overt speech task was used for rTMS. In general, fMRI in combination with an overt speech task is difficult to achieve because of movement artifacts and is comparatively uncommon in the clinical setting, whereas covert speech tasks are not feasible for rTMS mapping. This might lead to disagreement between the two methods in certain brain areas owing to absent or reduced activation in motor speech areas. However, it has already been shown that fMRI activation during covert and overt speech tasks can be in good spatial accordance [60] and that the WGEN task activates the classical cortical areas for language processing [43–46]. Although combined application of several tasks that engage different aspects of language function during fMRI might lead to more reliable maps, the WGEN task has proven to be comparatively robust, and its preferential application is based on its significant overlap with other tasks while yielding even higher activation intensities [46]. In contrast, ON tasks during fMRI, for example, do not robustly localize and lateralize human language function, and they routinely coactivate the visual cortex and other cortical networks (e.g., imagery pathways) that are not crucial for language [44]. Thus, although applying an ON task during both fMRI and rTMS language mapping might seem reasonable at first sight, this approach might not capture the language network properly. Consequently, we decided against using the same task and instead independently chose the most widely used and established task for each modality. As mentioned before, we specifically aimed to compare fMRI and rTMS language mapping results acquired during the tasks that are most frequently used for each method in the neurosurgical context.
Agreement between fMRI and rTMS results might be improved by comparing rTMS maps to the fMRI patterns derived from imaging with at least two tasks because this has been shown to depict language-related areas even better [44, 46]. The overall number of tasks that can be applied in the clinical routine is limited, but the present study intends to motivate more elaborate comparison studies among healthy volunteers using different tasks.

Limitations

Our study also has some limitations, which might have influenced the results of fMRI and rTMS, respectively. In earlier studies that compared the results of rTMS language mapping to those obtained by intraoperative DCS, the positive predictive value (PPV) of rTMS language mapping was quite low [4, 14, 15]. Hence, as already mentioned above, some of the areas rated language-positive by rTMS might only be involved in language processing and comprehension without being language-eloquent in a strict sense. However, since the number of subjects in the present study is comparatively large, we assume that the language-positive regions, as determined by rTMS, at least belong to language networks.

As another limitation, we used a PTI of 300 ms for the mappings of the present study, whereas a recently published study has shown that a PTI of 0 ms might be more favorable [21]. Despite several studies on the standardization and improvement of rTMS protocols, the technique is not yet fully mature and has to be refined further [3, 15, 21, 31].

Regarding evaluation of fMRI language maps, finding the optimal value for thresholding during analysis is a common challenge [44]. In the present study, clusters with a positive correlation >40 voxels, a t-test value >2.5 and an intensity threshold of 250 were considered activated areas. Although this approach is well established at our institution, it remains a comparatively arbitrary level of thresholding, which represents a limitation of our study and of most other fMRI language mapping studies. Previous literature has suggested calculating activation within different regions of interest instead of across the whole cortex, which might reduce noise [44, 61]. However, according to other authors, visual rating (with more arbitrary thresholding) can be regarded as being as reliable as quantitative methods based on single regions of interest [62]. In principle, a statistical and standardized method for optimal thresholding (e.g., systematic multiple thresholding comparisons) might overcome this limitation in the future, but it seems that reliable algorithms still have to be developed [44].
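To make the effect of such cutoffs concrete, the following minimal sketch applies a t-value cutoff and a cluster-extent cutoff to a toy statistical map. It is an illustration under stated assumptions, not the analysis pipeline used in this study: the sparse dictionary representation, the function name `surviving_clusters`, and the 6-connectivity neighbourhood are hypothetical choices, and the intensity threshold of 250 is not modeled because its definition depends on the analysis software.

```python
from collections import deque

T_THRESHOLD = 2.5      # t-value cutoff reported in the text
MIN_CLUSTER_SIZE = 40  # clusters must exceed this number of voxels

# Assumption: face-neighbour (6-)connectivity; the text does not specify this.
NEIGHBOURS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def surviving_clusters(t_map, t_threshold=T_THRESHOLD, min_size=MIN_CLUSTER_SIZE):
    """Return voxel-coordinate sets for clusters that pass both cutoffs.

    t_map is a hypothetical sparse map: dict of (x, y, z) -> t-value.
    """
    above = {voxel for voxel, t in t_map.items() if t > t_threshold}
    seen = set()
    clusters = []
    for start in sorted(above):
        if start in seen:
            continue
        # Breadth-first search collects one connected supra-threshold cluster.
        cluster = {start}
        seen.add(start)
        queue = deque([start])
        while queue:
            x, y, z = queue.popleft()
            for dx, dy, dz in NEIGHBOURS:
                nb = (x + dx, y + dy, z + dz)
                if nb in above and nb not in seen:
                    seen.add(nb)
                    cluster.add(nb)
                    queue.append(nb)
        if len(cluster) > min_size:
            clusters.append(cluster)
    return clusters


# Toy data: a 4 x 4 x 3 block (48 voxels) and a separate 2 x 2 x 2 block (8 voxels).
toy = {(x, y, z): 3.0 for x in range(4) for y in range(4) for z in range(3)}
toy.update({(x, y, z): 4.0 for x in range(10, 12) for y in range(2) for z in range(2)})
toy[(20, 20, 20)] = 1.0  # below the t cutoff, so it never enters a cluster

kept = surviving_clusters(toy)
print(len(kept), [len(c) for c in kept])
```

In this toy map only the larger block survives, although the smaller block has the higher t-values. Lowering the hypothetical `min_size` argument would retain it, which illustrates how strongly the resulting map depends on the chosen cutoffs.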

Although we tried to select a comparatively homogeneous group of healthy volunteers, the age range is quite large. The median age is low, but the wide range might still have affected our results; to date, the influence of age on one or both of the two techniques is not clear. In addition, the application of the CPS with its relatively large error margins might be seen as a limitation of the present study [18]. However, we used the CPS for the statistical comparisons because it allows for systematic comparison of different, predefined brain regions. Future studies should nevertheless evaluate optimized methods for such comparisons. In this context, the lack of mapping results for certain CPS regions, for example, the lower parts of the temporal lobe, is another limitation of our study and of rTMS mapping per se. Applying rTMS to these regions induces pain in most subjects; hence, these regions are not accessible to rTMS in most cases [2, 5]. On the one hand, this might be an advantage of fMRI mapping; on the other hand, we did not find more fMRI activations within these regions, at least in the present study.

Conclusions

In the present multimodal approach, we performed language mapping by fMRI (WGEN task) and rTMS (ON task) in 40 healthy, right-handed volunteers. For the tasks selected, the results of the two techniques differed widely, and we observed a slight to fair level of agreement between the two methods according to Cohen’s kappa coefficient. Moreover, the results showed that the agreement partially depends on the ERTs applied to the rTMS results. Since verification of the results against the gold standard technique, DCS, is not feasible in healthy volunteers, we need to be careful about calling one of the two techniques superior. However, this study yields valuable results regarding the comparison of these frequently used, non-invasive language mapping techniques for the most common language tasks applied in the neurosurgical routine to date.
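For readers less familiar with the agreement statistic, the sketch below computes Cohen’s kappa, kappa = (p_o - p_e) / (1 - p_e), for two binary ratings of the same brain regions, where p_o is the observed and p_e the chance-expected proportion of agreement. The function name `cohens_kappa` and the ten example ratings are hypothetical and serve illustration only; they are not data from this study.

```python
def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equally long sequences of categorical ratings."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: proportion of regions rated identically by both methods.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the marginal proportions, summed over labels.
    labels = set(ratings_a) | set(ratings_b)
    p_expected = sum(
        (ratings_a.count(label) / n) * (ratings_b.count(label) / n)
        for label in labels
    )
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when both raters use a single label")
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical ratings for ten regions (1 = language-positive, 0 = negative).
fmri_positive = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
rtms_positive = [1, 0, 0, 1, 1, 0, 0, 0, 1, 0]

kappa = cohens_kappa(fmri_positive, rtms_positive)
print(round(kappa, 3))
```

In this made-up example the two methods agree in 6 of 10 regions, but 52% agreement would already be expected by chance, so kappa is only about 0.17. Under the commonly used Landis and Koch convention (assumed here to be the scale behind the wording "slight to fair"), values up to 0.20 are labeled slight and values from 0.21 to 0.40 fair.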