
1 Introduction

MEG has the ability to provide information about the temporal activity of brain signaling with excellent temporal resolution (ms), and good spatial resolution (mm range for single source localization and cm range for source discrimination (Supek and Aine 1993, 1997)), and therefore has a unique potential as a tool to investigate brain activity. Furthermore, since MEG offers the capability of providing comprehensive information concerning brain signaling it can also be used for characterizing the fine temporal dynamics of signals underlying cognitive deficits in clinical populations. However, to date, there has been a lack of accepted standards within the MEG community as to what types of analyses are optimal for which types of studies. It is understood that with a given set of assumptions and parameters, the analysis methods each have unique strengths and weaknesses, depending on how they are used (for some examples see (Liljestrom et al. 2005)). Yet a systematic understanding of these methods remains limited. This is in part due to the mathematically ill-posed nature of the inverse problem for source reconstruction of MEG data (i.e., the reconstruction of the current distribution inside the brain based on measurements made outside the head). To solve the inverse problem, constraints need to be applied to obtain a unique solution (Baillet et al. 2001). These constraints vary between analysis methods (Hämäläinen et al. 1993), thereby making certain analysis techniques more appropriate for particular research questions, and making it challenging to choose one or a few analysis methods as “best” in most cases as has occurred in other neuroimaging fields (e.g. fMRI, PET). To further complicate the standardization of MEG data analysis techniques, the various MEG systems have different types of sensor pick-up coils, different number of sensors, and a variety of filtering methods and analysis software, much of which is proprietary.

Each of the four broad categories of inverse procedures, namely equivalent current dipole (ECD), minimum norm (L1 and L2 norms), beamformer, and Bayesian, has limitations, as discussed below. Critics of the earlier dipole modeling approaches emphasize the difficulties in: (1) accurately localizing more than one or a few point current dipoles; (2) using point current dipoles to localize extended sources; and (3) determining the number of sources to be included in the search a priori (Liu et al. 1998; Fuchs et al. 1999; Uutela et al. 1999; Huang et al. 1998, 2006; Lin et al. 2006; Mattout et al. 2006; Mosher et al. 1992). Our greatest concern for the multidipole, spatiotemporal modeling methods is that under-estimation of the number of true sources can compromise location and timecourse accuracy for the identified sources (Supek and Aine 1997; Greenblatt et al. 2005). This is because multidipole modeling methods attempt to account for the entire measured signal via a set number of sources, and the omission of one source will generally change the position and/or magnitude of other sources to account for the signal from the omitted source. This is not true for the minimum norm, beamformer, or Bayesian methods. We later discuss the Calibrated Start Spatio-Temporal (CSST) dipole modeling technique, and show how it can accurately localize (mm spatial resolution) simple and complex source configurations.

In contrast, critics of the minimum norm-based (Hämäläinen et al. 1994) approaches state that: (1) the results often appear smeared, even for point current sources and at times may become split across lobes which produce spurious or ghost sources leading to imprecise estimated dynamics (David et al. 2002; Michel et al. 2004; Lin et al. 2006); (2) the solution is biased toward superficial source locations leading to the application of depth weightings by some groups (Ioannides et al. 1990; Lin et al. 2006); (3) the smeared or broadened effect becomes more pronounced with a decrease in signal-to-noise, potentially leading to false positive sources (Wischmann et al. 1995); and (4) it is severely under-determined thereby requiring the use of regularization methods to restrict the range of possible solutions.

Although the linearly-constrained minimum variance (LCMV) beamformer (Vrba and Robinson 2000) has higher spatial resolution than minimum norm-based methods when cortical sources are focal, its underlying assumption is that neural sources are incoherent. Coherent signals will cause the beamformer to fail in finding locations of other coherent sources due to partial cancellation (Hui et al. 2010), which is a potential problem for cognitive data where coherence typically abounds. For example, in working memory studies, activity tends to synchronize across many widespread brain regions for seconds (Aine et al. 2003). Fortunately, several groups have recently introduced variants of the beamformer that can reportedly deal with coherent sources, with some restrictions [e.g. Dalal et al. (2006); Brookes et al. (2007, 2011); Diwakar et al. (2011); Moiseev et al. (2011)]. However, both beamformer and minimum norm techniques have some difficulty in examining functional connectivity or cortical interactions, given the robust cross-talk present in the data (Hui and Leahy 2006; Hui et al. 2010). The general advantage of minimum norm and beamformer methods is that they require less analysis time.

Finally, there are Bayesian methods (Jun et al. 2005; Schmidt et al. 1999; Wipf et al. 2010). The current drawback of these methods is that they have not yet been widely applied to empirical data. In part this may be due to a need for large computational resources, since some versions utilize a Markov Chain Monte Carlo approach to generate sets of activity parameters that are distributed according to the posterior distribution (Schmidt et al. 1999). However, proponents state that the Bayesian approach combats the issue of ill-posedness by offering a general formulation of regularization constraints. In addition, the Bayesian approach provides statistical performance tools, including the estimation error covariance and the marginal probability density of the measurements (Brooks et al. 2005).

Recently, the strong interest in functional connectivity that has arisen in the MEG field has investigators combining some of the above mentioned localization methods with other types of analyses to determine which and how sources of activity are temporally related. Functional connectivity has historically been assessed in sensor space (e.g. de Pasquale et al. 2010), but new methods are being developed to determine functional connectivity in source space. For example, Brookes et al. (2011) have used a beamformer localization method, along with a Hilbert transform to derive the analytic signal, to which independent component analysis (ICA) is applied, in order to identify the functional networks of activity. The oscillatory and DMN simulations that have been created and described in Sect. 2.6 could be used to further characterize the strengths of such an analysis procedure.

Given the above, we have established the MEG-SIM website containing both a series of realistic simulated data sets and empirical data sets for testing purposes (http://cobre.mrn.org/megsim/). Through a partnership formed between the Mind Research Network (MRN), Massachusetts General Hospital, University of Minnesota/Veterans Affairs in Minneapolis, University of New Mexico, and Los Alamos National Laboratory, we acquired MEG data using three different MEG systems (VSM MedTech 275, Elekta-Neuromag 306, 4-D Neuroimaging 3600) and three different sensory paradigms (visual, auditory and somatosensory) for each of 9 participants. A grant from NIMH (R21MH080141) then allowed us to create realistic simulated data derived from the real noise contained in the collected empirical data. A web portal was established so others can access both the simulated and empirical datasets with the hope of furthering algorithm performance assessment and development through the MEG-SIM website. We refer to the testbed as ‘realistic’ simulated data because: (1) colored noise is used in most examples (i.e., simulations are embedded in spontaneous data containing correlated noise); (2) the simulated timecourses and source locations are based on findings from empirical data; (3) focal and extended cortical patches are created from MRIs of individual participants (i.e., the SNR and orientation of sources differ across participants); and (4) in some cases each of the unique single trials and continuous data, mimicking actual data acquisition, are provided.

We assert that if an algorithm fails to identify the simulated sources and timecourses under realistic conditions (e.g., similar SNR as empirical data with real artifacts occurring at random intervals), then one cannot realistically expect to obtain correct results in empirical data. If an algorithm provides reasonable solutions to simulations then it is standard practice to next apply the algorithm to simple sensory empirical data where the literature provides information on the expected locations and timecourses of sources (e.g., non-human primate studies) before attempting analysis of cognitive datasets, where the literature is not yet well established. We have designed the simulated datasets to provide a wide range of realistic examples emulating brain activity. We specifically tried to design these simulations such that one analysis approach would not be favored. We hope developers will utilize these data to further develop and refine MEG analysis methods. Similarly, we hope that users of the algorithms will compare and contrast their favored approaches with others. Because we are avid users of a semi-automated, multidipole, spatiotemporal approach [Calibrated Start Spatio-Temporal or CSST; (Ranken et al. 2002, 2004)], many of the solutions shown herein are from the CSST algorithm to demonstrate the efficacy of these simulations. Because the empirical datasets were covered in depth in Aine et al. (2012), we only briefly describe those that are available at the MEG-SIM website in Sect. 3 of this chapter.

2 Simulated Datasets

2.1 Software

The simulated data were primarily created using MRIVIEW and MEGAN software, both of which are made available at the MEG-SIM website. MRIVIEW (Ranken and George 1993; Ranken et al. 2002) is a software tool for integrating volumetric MRI head data with functional information (e.g., EEG, MEG, fMRI—see chapter in this volume by Ranken for further details on MRIVIEW). A Forward Simulator is included in MRIVIEW for creating multiple focal or distributed-source regions of arbitrary size and orientation, allowing users to create a vast array of simulated datasets. We have used these tools previously for simulating epileptic spikes that were then embedded in spontaneous activity from patients (Stephen et al. 2003a, 2005).

MEGAN (E. Best) organizes the data from the different MEG systems into a consistent data format, netMEG, a self-documenting and highly portable file, written using netCDF format. This netCDF file is imported into MRIVIEW. The simulated sensor measurements are obtained by summing the forward fields from all of the simulated sources. White noise, simulated noise or real noise from MEG acquisitions can then be added to the calculated forwards to generate simulations of empirical MEG data. More information about MEGAN can be found in Aine et al. (2012).
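The summation of forward fields described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the MRIVIEW or MEGAN implementation: the function name `simulate_sensors`, the nested-list data layout, and the restriction to Gaussian white noise (rather than simulated or real noise) are all hypothetical choices made for brevity.

```python
import random


def simulate_sensors(topographies, timecourses, noise_sd=0.0, seed=0):
    """Sum the forward fields of all sources and add white noise.

    topographies[s][ch]: forward field of source s at sensor ch for unit strength.
    timecourses[s][t]:   strength of source s at time sample t.
    Returns data[ch][t]. Layout and names are illustrative assumptions.
    """
    rng = random.Random(seed)
    n_sensors = len(topographies[0])
    n_samples = len(timecourses[0])
    data = []
    for ch in range(n_sensors):
        row = []
        for t in range(n_samples):
            # Linear superposition: each source contributes topography * strength.
            field = sum(topo[ch] * tc[t]
                        for topo, tc in zip(topographies, timecourses))
            row.append(field + rng.gauss(0.0, noise_sd))
        data.append(row)
    return data


if __name__ == "__main__":
    topo = [[1.0, 2.0], [0.5, -1.0]]   # 2 sources x 2 sensors (toy values)
    tcs = [[1.0, 0.0], [2.0, 4.0]]     # 2 sources x 2 time samples
    print(simulate_sensors(topo, tcs))
```

Because the forward problem is linear in source strength, replacing the white-noise term with a recorded noise segment of the same shape would follow the same pattern.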

CSST (Calibrated Start Spatio-Temporal) is a multidipole, spatiotemporal modeling approach to source localization that has been automated, i.e., it takes the traditional starting parameter guess(es) out of the hands of the investigator. CSST uses the Nelder-Mead non-linear downhill simplex procedure to perform a spatial search (Nelder and Mead 1965) and utilizes information based on a singular value decomposition (SVD) of the data matrix for determining an approximate number of sources to be localized (a range of source models is then chosen by the investigator). CSST runs multiple instances of the downhill simplex search from random combinations of MR-derived starting locations from within the head volume on a Linux PC cluster. CSST has been used extensively with both Neuromag 122 and CTF 275 MEG systems (Stephen et al. 2003a, b, 2005, 2006; Aine et al. 2000, 2010) as well as the Neuromag Vectorview 306-system (Stephen et al. 2012; Susac et al. 2010, 2011; Golubic et al. 2011). CSST has also been thoroughly tested on EEG data.
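The multi-start strategy described above can be illustrated with a toy example. The sketch below is not the CSST code: it implements a compact textbook variant of the Nelder-Mead downhill simplex and restarts it from random locations inside a cube, whereas CSST draws MR-derived starting locations from within the head volume and fits several dipoles at once. The cost function `toy_cost`, with one global and one local minimum, is a hypothetical stand-in for the residual between measured and forward-calculated fields, and the SVD-based estimate of the number of sources is not shown.

```python
import random


def nelder_mead(f, x0, step=10.0, iters=500):
    """Compact downhill simplex search (reflection, expansion, contraction, shrink)."""
    n = len(x0)
    simplex = [list(x0)] + [
        [x0[j] + (step if j == i else 0.0) for j in range(n)] for i in range(n)
    ]
    for _ in range(iters):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        centroid = [sum(p[j] for p in simplex[:-1]) / n for j in range(n)]

        def along(t):
            # Point on the line through the centroid and the worst vertex.
            return [centroid[j] + t * (worst[j] - centroid[j]) for j in range(n)]

        reflected = along(-1.0)
        if f(reflected) < f(best):
            expanded = along(-2.0)
            simplex[-1] = expanded if f(expanded) < f(reflected) else reflected
        elif f(reflected) < f(simplex[-2]):
            simplex[-1] = reflected
        else:
            contracted = along(0.5)
            if f(contracted) < f(worst):
                simplex[-1] = contracted
            else:
                # Shrink every vertex halfway toward the best one.
                simplex = [best] + [
                    [best[j] + 0.5 * (p[j] - best[j]) for j in range(n)]
                    for p in simplex[1:]
                ]
    return min(simplex, key=f)


def toy_cost(p, a=(10.0, -20.0, 30.0), b=(-30.0, 25.0, 40.0)):
    """Hypothetical cost with a global minimum at a and a local minimum at b."""
    da = sum((pi - ai) ** 2 for pi, ai in zip(p, a))
    db = sum((pi - bi) ** 2 for pi, bi in zip(p, b))
    return min(da, db + 5.0)


def multistart_fit(cost, n_starts=20, half_width_mm=80.0, seed=0):
    """Run the simplex search from random starts and keep the lowest-cost fit."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_starts):
        start = [rng.uniform(-half_width_mm, half_width_mm) for _ in range(3)]
        fit = nelder_mead(cost, start)
        if best is None or cost(fit) < cost(best):
            best = fit
    return best


if __name__ == "__main__":
    print(multistart_fit(toy_cost))
```

Running many independent searches reduces the chance that the reported solution is a local minimum reached from an unlucky starting guess, which is the motivation for removing the starting guess from the investigator's hands.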

2.2 Physiologically Plausible Simulations

The initial simulated datasets were constructed using two different-sized patches of cortex determined via MRI (~4 and ~20 mm²) and two different source strengths (30 and 50 nAm). We used these values because our previous empirical results suggest that those current strengths are typical of what is encountered in visual and auditory studies [e.g. Table 2 in Aine et al. (2006) and Fig. 4 and Table 3 in Aine et al. (2005)]. In addition, the empirical visual paradigm used to acquire data at each MRN partner site utilized small and large stimuli (1.0° and 5.0° visual angle) designed to activate ~4 mm² of tissue and ~20 mm² of tissue in primary visual cortex, according to the cortical magnification factors presented in Rovamo and Virsu (1979). We attempted to equate the simulated and empirical parameters since the goal was to produce both focal and extended activity. This is necessary to evaluate analysis methods where source extent is believed to be dealt with less effectively (e.g. dipole modeling). The somatosensory study used electrical stimulation of the index finger and median nerve to produce focal versus extended sources. The auditory study used individual pure tones and bursts of white noise to evoke focal versus extended activity. Additional justification for parameter choices can be found in Aine et al. (2012).

2.3 Simulated Visual Data

The locations, timing, and extent of the simulated sources (see Table 1 for Sets 1–5) were generated based on our previous basic visual (Stephen et al. 2002) and visual working memory studies (Aine et al. 2006). Set 3 differs from Set 1 in having synchronous late activity. Sets 1.B and 3.B differ from 1.A and 3.A in dipole strengths (i.e., larger cortical patches). Note that these latencies are modeled after empirical visual studies but they were embedded in the noise file so that ~200 ms was treated as prestimulus baseline. DLPFC (dorsolateral prefrontal cortex) and AC (anterior cingulate) were treated as ramping activity peaking later in time. Definitions of areas are: V1 = visual area 1; V2 = visual area 2; V3 = visual area 3; I. LOG = inferior lateral occipital gyrus; IPS = intraparietal sulcus; S. LOG = superior lateral occipital gyrus; RHC = right hippocampus. We varied the synchronicity of sources to allow developers to determine an algorithm’s sensitivity to fine temporal changes. Parameters that vary within and across datasets include: number of sources, focal versus extended sources, source strengths, degree of synchrony of sources, and noise level or type of noise (white noise or spontaneous noise). The first 5 sets were produced for 5 participants using individual cortical geometries, different SNRs, and empirical noise data from both the CTF Omega 275 and Neuromag Vectorview 306 MEG systems. Although it was a goal to simulate these cases for the 4-D Neuroimaging Magnes 3600 system as well, funds for this project ended before we could do so. Timecourses were usually modeled using 3 Gaussians (e.g., early spike-like activity followed by later slow-wave activity) as typically found in many visual and auditory MEG studies (Portin et al. 1999; Aine et al. 2003, 2005, 2012; Vanni et al. 2004; Kovacevic et al. 2005).
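A timecourse built from three Gaussians can be sketched as follows. The peak latencies, widths, and amplitudes used in the example are hypothetical placeholders, not the values listed in Table 1, and the parameterization by full width at half maximum (FWHM) is an assumption made for convenience.

```python
import math


def gaussian(t_ms, peak_ms, fwhm_ms, amp_nam):
    """Gaussian component parameterized by peak latency, FWHM, and peak amplitude."""
    sigma = fwhm_ms / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return amp_nam * math.exp(-0.5 * ((t_ms - peak_ms) / sigma) ** 2)


def timecourse(times_ms, components):
    """Sum of Gaussian components; each component is (peak_ms, fwhm_ms, amp_nam)."""
    return [sum(gaussian(t, *c) for c in components) for t in times_ms]


if __name__ == "__main__":
    # Hypothetical example: early spike-like activity followed by slow-wave activity.
    components = [(280.0, 30.0, 30.0), (380.0, 120.0, 15.0), (520.0, 150.0, 10.0)]
    times = [float(t) for t in range(0, 800, 5)]
    tc = timecourse(times, components)
    print(max(tc))
```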

Table 1 Onset latencies and amplitudes of sources in different visual areas used for each simulated dataset. Reprinted from Aine et al. (2012) with permission from Springer

In the simulated example shown in Fig. 1, a Freesurfer-segmented gray matter/white matter boundary for the simulations was imported into MRIVIEW (Fig. 1a), although the segmentation may also be accomplished within MRIVIEW. The simulated activation timecourses (signal) are shown in Fig. 1b. In each case, 100 single trials of real spontaneous background activity were averaged together as the noise trial for each of the 5 participants and for each of the MEG systems (Fig. 1c). Then the signal was embedded within the averaged noise file (Fig. 1d). For all simulated datasets on the web portal, a spherical head model was used for the simulations and modeled data; however, a boundary element model (BEM) is also available in MRIVIEW.

Fig. 1

A Freesurfer-segmented gray matter/white matter boundary for the simulations (shown in red) was imported into MRIVIEW from which patches (a) of simulated activity (b) were generated. 100 passes of spontaneous activity or noise (c) were identified using CTF software (Data Editor) and averaged together using MEGAN. The simulated activity was embedded within the averaged noise file (d) and saved in netCDF format (i.e., a netMEG file in MEGAN). Reprinted from Aine et al. (2012) with permission from Springer
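The averaging and embedding steps for a single channel can be sketched as below. This is a minimal illustration, not the MEGAN procedure: the function names are hypothetical, and Gaussian white noise stands in for the real spontaneous activity used in the datasets.

```python
import random
import statistics


def average_trials(trials):
    """Average trials[trial][sample] across trials, sample by sample."""
    return [sum(samples) / len(trials) for samples in zip(*trials)]


def embed_signal(signal, noise):
    """Embed the noise-free simulated signal in the averaged noise."""
    return [s + n for s, n in zip(signal, noise)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Stand-in for 100 single trials of spontaneous activity on one channel.
    noise_trials = [[rng.gauss(0.0, 1.0) for _ in range(600)] for _ in range(100)]
    averaged_noise = average_trials(noise_trials)
    # Averaging N uncorrelated trials reduces the noise SD by about sqrt(N).
    print(statistics.pstdev(averaged_noise))
```

Real spontaneous activity is temporally and spatially correlated, so the reduction obtained with recorded data need not follow the square-root rule exactly.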

Table 2 shows actual source locations, CSST estimated source locations, and errors when either noise was absent (no-noise) or empirical noise was present for visual simulated data Set 4. CTF head-centered coordinate system is used, where -x points out the back of the head, +y points out the left ear, and +z points out the top of the head. Average error across the 6 sources was 0.1 mm for the no-noise condition and 6.8 mm for the real noise condition. Standard deviation (SDev) is shown for estimated solutions for real-noise simulated data. This table demonstrates that the presence of real noise significantly affects source localization accuracy; however, our CSST solution for the real noise condition was still good for this complicated dataset, and inconsistent with previous critiques of dipole-modeling approaches that state dipole methods cannot accurately localize more than a few point sources of activity. Further, Table 3 lists CSST output when varying the model order (i.e. number of fitted dipoles) for a 3-dipole simulated dataset. The solutions (1–4 Dipoles) shown are for real spontaneous noise. Timecourses (shown as absolute values, bottom) are from the 4-dipole fit to 3-source data. In Table 3 the entries P1, P2, and P3 correspond to the Pk 1, Pk 2, Pk 3 timecourses. Notice that the noise timecourse is low-amplitude and without structure. As this table shows, under-modeling (1- and 2-dipoles) results in large localization errors. In contrast, localization errors are often reduced when over-modeling by 1 dipole (i.e., 4-dipoles for this 3-source dataset). Fortunately, noise sources are often easy to identify by a lack of timecourse structure and low amplitude (lower right panel).

Table 2 Actual and CSST estimated (“no-noise” and “real-noise”) locations for a 6-source, realistic simulation
Table 3 Sample output from an automated routine for determining best-fits to 3-source simulated data
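The localization errors reported in tables of this kind are Euclidean distances between actual and estimated source coordinates. A minimal sketch is given below; it assumes that coordinates are in mm in a common head-centered frame and that each estimated source has already been paired with its actual source. The function names and the example coordinates are hypothetical.

```python
import math


def localization_errors(actual_mm, estimated_mm):
    """Euclidean distance (mm) between each actual and estimated source location."""
    return [math.dist(a, e) for a, e in zip(actual_mm, estimated_mm)]


def mean_error(errors):
    """Average localization error across sources."""
    return sum(errors) / len(errors)


if __name__ == "__main__":
    actual = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]      # toy coordinates
    estimated = [(3.0, 4.0, 0.0), (10.0, 0.0, 2.0)]
    errors = localization_errors(actual, estimated)
    print(errors, mean_error(errors))
```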

Set 6 (remaining sets are not shown in Table 1) includes late activity (e.g., 400–600 ms) that was synchronous across four cortical sites (V1, I. LOG, IPS, and DLPFC), as is seen in working memory studies (Aine et al. 2006). The upper left panel of Fig. 2 displays the locations of the cortical patches (cortical patches are located at the cross-hairs) while the timecourses assigned to the cortical patches are shown beneath the MRIs. The averaged waveforms (128 trials with signals embedded in real spontaneous noise) seen across the 275 channels of the CTF MEG system are shown in the middle left column. CSST source locations are shown in the upper right panel (see tabled values). The table shows the coordinates of the actual sources, the estimated source locations, and the errors using Euclidean distance. Net source orientation errors were 42.0° for V1, 58.2° for I. LOG, 20.9° for IPS and 48.0° for the DLPFC sources. However, summarizing absolute orientation error is challenging since the original sources consisted of patches of cortex with the orientation of the patch activity conforming to the cortical folds. The middle right panel shows the estimated timecourses and source locations. The average localization error across all 4 sources was 6.7 mm with the greatest error for the I. LOG source. The cross-correlations between timecourses are shown in the bottom row of Fig. 2. We examined early activity first (200–350 ms; bottom left panel), which shows that V1 activity correlated highly with I. LOG, the two regions showing the initial spike-like activity (~280 ms). IPS and DLPFC timecourses were also highly correlated, with near zero-lag. The maximal correlation coefficients of the other pairs of sources were lower in value and were not near zero-lag. In contrast, the late activity (350–600 ms; bottom right panel) shows higher zero-lag correlation coefficients for activity between the 4 brain regions (i.e., late activity was synchronous across brain regions) with IPS and DLPFC revealing the highest correlation coefficient. This dataset is also suitable for examining coherence either between sensors or between reconstructed sources.
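A lagged cross-correlation between two source timecourses can be sketched as follows. The normalization (a Pearson coefficient computed over the overlapping samples at each lag) is an assumption and may differ from the one used to produce Fig. 2; the function names are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb) if va > 0 and vb > 0 else 0.0


def cross_correlation(x, y, max_lag):
    """Return {lag: r}, where a positive lag means that y lags behind x."""
    n = len(x)
    out = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = x[:n - lag], y[lag:]      # pairs x[t] with y[t + lag]
        else:
            a, b = x[-lag:], y[:n + lag]     # pairs x[t - lag] with y[t]
        out[lag] = pearson(a, b)
    return out


def peak_lag(x, y, max_lag):
    """Lag (in samples) at which the correlation coefficient is maximal."""
    cc = cross_correlation(x, y, max_lag)
    return max(cc, key=cc.get)


if __name__ == "__main__":
    x = [math.exp(-0.5 * ((t - 50) / 8.0) ** 2) for t in range(200)]
    y = [math.exp(-0.5 * ((t - 55) / 8.0) ** 2) for t in range(200)]  # x delayed by 5
    print(peak_lag(x, y, 20))
```

A peak at zero lag indicates synchronous activity, while a peak at a non-zero lag indicates that one timecourse leads the other by that many samples.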

Fig. 2

Simulation results for a 4-source model (Set 6) where all sources became synchronous during the later interval (see upper left panels for source locations (cross-hairs) and timecourses of the sources). Amplitudes and peak latencies were jittered across each of 128 single trials. The averaged waveforms seen at the sensor level for the CTF system are shown beneath the input timecourses. Upper right table shows CSST actual locations and errors associated with modeled source locations. The middle panel shows location and timecourse plots of the CSST solutions. Bottom row shows cross-correlations between source timecourses for an early interval (left) when there was some asynchrony across sources and a later interval (right) when all sources became synchronous. Adapted from Fig. 5 Aine et al. (2012) with permission from Springer

Next, single-trial datasets were created with and without oscillatory activity, with some reflecting functional connectivity in a working memory task, which are suitable for additional types of analyses (i.e., time-frequency analyses, Granger Causality, etc.). In this case, sources embedded within 128 single trials of noise were jittered about their mean latency and amplitude. This dataset (Set 7) is similar to Set 6 (VSM-CTF MEG System). Again, the four cortical sites were: (1) primary visual cortex (V1); (2) inferior lateral occipital gyrus (I.LOG); (3) intraparietal sulcus (IPS); and (4) dorsolateral prefrontal cortex (DLPFC). The cortical patch current strengths were initially assigned values similar to those we observe in our visual working memory studies (30–50 nAm peaks) using the MRIVIEW Forward Simulator (Ranken and George 1993; Ranken et al. 2002) but were then randomly jittered about those values by up to ±50 % across the single trials. Peak latencies were also jittered across each trial by a randomly selected value up to ±FWHM/2. To allow for source analysis of averaged evoked responses, the 128 single trials were then averaged together and written out to the netCDF file format. Therefore each of the 128 single trials plus the averaged file is available at the MEG-SIM website, in netCDF format.
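The single-trial jitter can be sketched as below for one source with a single Gaussian component. Uniform sampling of the jitter and the specific latency, FWHM, and amplitude values are assumptions made for illustration; the text specifies only the jitter limits (±50 % in amplitude, ±FWHM/2 in peak latency) and the number of trials.

```python
import math
import random


def gaussian(t_ms, peak_ms, fwhm_ms, amp_nam):
    """Gaussian component parameterized by peak latency, FWHM, and peak amplitude."""
    sigma = fwhm_ms / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return amp_nam * math.exp(-0.5 * ((t_ms - peak_ms) / sigma) ** 2)


def jittered_trials(times_ms, peak_ms, fwhm_ms, amp_nam, n_trials=128, seed=0):
    """Generate single trials with jittered amplitude and latency, plus their average."""
    rng = random.Random(seed)
    trials, params = [], []
    for _ in range(n_trials):
        amp = amp_nam * (1.0 + rng.uniform(-0.5, 0.5))              # up to +/-50 %
        peak = peak_ms + rng.uniform(-fwhm_ms / 2.0, fwhm_ms / 2.0)  # up to +/-FWHM/2
        params.append((amp, peak))
        trials.append([gaussian(t, peak, fwhm_ms, amp) for t in times_ms])
    average = [sum(col) / n_trials for col in zip(*trials)]
    return trials, params, average


if __name__ == "__main__":
    times = [float(t) for t in range(0, 300)]
    trials, params, average = jittered_trials(times, 100.0, 20.0, 30.0)
    # Latency jitter broadens and lowers the averaged peak relative to 30 nAm.
    print(max(average))
```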

In Set 8, oscillatory activity was added to Set 7 timecourses (Fig. 3). For the time-locked oscillatory activity, V1, I. LOG, and IPS oscillated between 30 and 60 Hz (gamma band) across the 128 trials while IPS and DLPFC oscillated between 14 and 28 Hz (beta band). Oscillatory activity for DLPFC was delayed by 20 ms relative to IPS, and IPS gamma activity was delayed by 10 ms relative to IPS beta activity (see schematic in Fig. 3a). The delays were meant to reflect normal time delays between visual areas (Stephen et al. 2002). Gamma activity mimicked local circuitry activity between V1, I. LOG, and IPS while beta activity mimicked long-range connections between IPS and DLPFC. For both beta and gamma oscillations, the amplitudes were set at 10 nAm and were then jittered between 5 and 15 nAm across the 128 trials. Note that the latencies, and therefore the phase of the oscillations, were kept constant between brain regions, and also between trials. As with the other simulated data sets, the timecourses were constructed within MRIVIEW, however, they had to be constructed independently; i.e., one timecourse contained the evoked response plus real noise while the other timecourse contained the oscillations without noise. The two timecourses were then added together using a Matlab script. Again, to allow for source analysis of the averaged responses, the 128 single trials were averaged together to create a single averaged dataset, and were written out to a netCDF file (datasets for two subjects were created).
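The delayed oscillations can be sketched for the beta-band pair as follows. Several details are assumptions made for illustration only: the oscillation is a pure sinusoid, the frequency and amplitude are drawn uniformly once per trial, and IPS and DLPFC share the same frequency and amplitude within a trial. Only the band limits (14–28 Hz), the amplitude range (5–15 nAm), and the 20 ms DLPFC delay are taken from the text.

```python
import math
import random


def oscillation(times_ms, freq_hz, amp_nam, delay_ms=0.0):
    """Sinusoid sampled at times_ms, shifted later in time by delay_ms."""
    return [amp_nam * math.sin(2.0 * math.pi * freq_hz * (t - delay_ms) / 1000.0)
            for t in times_ms]


def beta_trial(times_ms, rng):
    """One trial of IPS and DLPFC beta activity; DLPFC lags IPS by 20 ms."""
    freq = rng.uniform(14.0, 28.0)   # beta band limits from the text
    amp = rng.uniform(5.0, 15.0)     # 10 nAm on average across trials
    ips = oscillation(times_ms, freq, amp)
    dlpfc = oscillation(times_ms, freq, amp, delay_ms=20.0)
    return ips, dlpfc


if __name__ == "__main__":
    times = list(range(0, 400))      # 1 ms steps
    ips, dlpfc = beta_trial(times, random.Random(0))
    print(ips[30], dlpfc[50])        # equal, since DLPFC is IPS delayed by 20 ms
```

The oscillatory timecourse produced this way would then be added to the evoked-plus-noise timecourse for the same trial.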

Fig. 3

Simulated visual working memory with long-range beta band and short-range gamma band oscillatory activity (see (a) schematic). DLPFC and IPS oscillated at 15–20 Hz while IPS, I. LOG, and V1 oscillated at 30–80 Hz. IPS generated both beta and gamma band oscillations. b The averaged input signal without noise is shown followed by sample single-trials and the averaged data as seen at the sensors of the CTF system. c CSST location estimates and their associated timecourses (d) are shown. e Time-frequency representations using Morlet wavelets for the CSST solutions shown above. Frequency was normalized to the Nyquist frequency = ½ × sampling frequency (600 Hz). Oscillatory activity was given 10 nAm on average across trials. Reproduced from Aine et al. (2012), with permission from Springer

Figure 3b shows the input signal at the sensor level across sources before oscillatory activity or noise was added. Sample single trials are shown where peak amplitudes (of both the evoked and oscillatory activity), peak latencies (of the evoked activity only), and frequency of the oscillatory activity were jittered across trials so each single trial is unique. The average of the 128 single trials is shown beneath. Figure 3c and 3d show the output of the CSST algorithm. CSST provides both the locations of the dipoles and the reconstructed timecourses of activity. Table 4 contains the results of this analysis for the two visual/working memory datasets that were created for the first subject (i.e., single trials averaged with and without oscillatory activity). Our results show that CSST can accurately reconstruct both temporal and spatial characteristics of the simulated datasets, even with noisy and oscillating sources. Time-frequency plots are shown in Fig. 3e for gamma and beta bands. Gamma band activity is primarily seen in dipoles located in V1, I.LOG and IPS, which is consistent with the simulated data. No gamma activity was provided to DLPFC and correspondingly, gamma activity during this interval of time is essentially non-existent. It appears that the initial spike-like activity in the timecourse has a predominantly beta component to it as seen in the V1 and I.LOG beta band plots. IPS and DLPFC, in contrast, reveal beta band activity throughout the interval, which is consistent with the simulated data. This shows how our realistic simulated oscillatory activity datasets can be used for testing various frequency analyses and inverse procedures. Again, these data also come with all 128 unique individual trials for investigators wishing to apply single trial analysis methods.

Table 4 CSST results for simulated datasets with 4 visual sources based on averaged waveforms without oscillatory activity (top) and with oscillatory activity (bottom) for Subject #1

Many MEG/EEG investigators are familiar with more traditional analyses of functional connectivity such as that provided by coherence analysis. Here we show that coherence analysis can be conducted both at the sensor and the source level using our simulated datasets. For example, a sensor near V1 which showed a large evoked response was chosen as the sensor of interest (see Fig. 4a, sensor #273 encircled by a green ring). Next the averaged simulation file (Set 7) was imported into Matlab where “mscohere” was used to determine the coherence of sensor 273 with every other sensor in the MEG array for the frequency range 30–60 Hz. This coherence analysis was repeated for the simulation in which oscillations had been added to the sources as described above (Set 8). Results show a clear increase in coherence between sensors which had gamma band oscillations added to nearby sources. Coherence analyses were also carried out at the source level for Set 8 (Fig. 4b). In this example, coherence in the beta band was examined between sources (i.e., output from CSST). Beta oscillatory activity was added to the DLPFC and IPS sources, and the bottom panel of Fig. 4b shows the resulting coherence between these two sources (IPS is the reference source shown in white and its coherence (normalized magnitude) with DLPFC is represented by red color). It turns out that the initial spike-like activity of the timecourses also has a beta band component, as indicated by the coherence between reference source V1 (shown in white in the upper panel of Fig. 4b) and I.LOG. Recall that the time-frequency plots shown in Fig. 3e also revealed this information (see beta activity for V1 and I.LOG).
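Magnitude-squared coherence at a single frequency can be sketched with a direct Fourier sum, as below. This is not the Matlab “mscohere” routine: the sketch uses non-overlapping rectangular segments and evaluates one frequency at a time. The toy signals, sampling rate, and segment length are hypothetical choices.

```python
import cmath
import math
import random


def msc(x, y, freq_hz, fs_hz, seg_len):
    """Magnitude-squared coherence of x and y at freq_hz, averaged over segments."""
    kernel = [cmath.exp(-2j * math.pi * freq_hz * n / fs_hz) for n in range(seg_len)]
    sxx = syy = 0.0
    sxy = 0j
    for start in range(0, len(x) - seg_len + 1, seg_len):
        fx = sum(x[start + n] * kernel[n] for n in range(seg_len))
        fy = sum(y[start + n] * kernel[n] for n in range(seg_len))
        sxx += abs(fx) ** 2              # auto-spectrum of x
        syy += abs(fy) ** 2              # auto-spectrum of y
        sxy += fx * fy.conjugate()       # cross-spectrum
    return abs(sxy) ** 2 / (sxx * syy)


def toy_pair(n_seg=64, seg_len=100, fs_hz=600.0, f0_hz=40.0, noise_sd=0.5, seed=1):
    """Two noisy channels sharing a 40 Hz component with random phase per segment."""
    rng = random.Random(seed)
    x, y = [], []
    for _ in range(n_seg):
        phase = rng.uniform(0.0, 2.0 * math.pi)
        for n in range(seg_len):
            s = math.sin(2.0 * math.pi * f0_hz * n / fs_hz + phase)
            x.append(s + rng.gauss(0.0, noise_sd))
            y.append(s + rng.gauss(0.0, noise_sd))
    return x, y


if __name__ == "__main__":
    x, y = toy_pair()
    print(msc(x, y, 40.0, 600.0, 100), msc(x, y, 100.0, 600.0, 100))
```

Coherence is high at the shared frequency and low where the two channels contain only independent noise. The same function applies whether `x` and `y` are sensor waveforms or reconstructed source timecourses.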

Fig. 4

a Sensor level coherence analysis with no oscillatory activity applied to underlying sources (top) and with oscillatory activity applied to underlying sources (bottom). b Source level coherence analysis relative to the white source (V1 Top, IPS Bottom) of Beta band activity. Level of coherence is indicated by the colorbar

For the final visual simulated dataset (Set 9), the same data as Set 8 was created for the Neuromag 306 system with different noise trials and sensor configuration relative to the CTF 275 system. In this case, a Matlab program utilized the netCDF toolbox for manipulating the opening and closing of the netCDF files containing the individual evoked waveforms and the individual oscillatory waveforms, which were created at cortical locations as similar as possible to Set 7. The simulated data were again created using MRIVIEW and MEGAN. Matlab was used to import the timecourses of the individual areas of evoked activity which were then jittered (in the same way as discussed above) and combined with randomly selected instances of Neuromag 306 noise which was read into Matlab using Fieldtrip functions (http://fieldtrip.fcdonders.nl/). One hundred single trials were created containing evoked and oscillatory activity. This was automated by the process of generating single trials described previously for Set 8. The 100 single trials were then averaged together and saved to a netCDF file, to be used with CSST analyses, and to a Neuromag 306 FIF file to be used with Curry, a commercial software package (Compumedics Neuroscan, Charlotte, NC http://www.neuroscan.com/) for the sLORETA and SWARM analyses (Wagner et al. 2007) discussed below.

2.4 Preliminary Examples of Analysis Algorithm Output for Visual Simulated Data

First, for comparison, multidipole, spatiotemporal source localization was conducted for Subject #2 using the CSST algorithm for simulated data Sets 8 and 9 (CTF and Neuromag systems, respectively). Table 5 shows the results from these analyses. A location was considered “not found” if it was ≥50.0 mm from the true source. Once again CSST determines the locations of the active cortical areas with a good degree of accuracy. We do find obvious differences between the CSST dipole fits for the two different subjects (compare Tables 4 and 5) and, within the same subject, between the two MEG systems (Table 5). This was not surprising since: (1) the simulations were created using each subject’s MRI, so the exact location of the cortical patch differs somewhat between subjects, which will result in different waveform distributions at the sensor level for the different MEG systems; and (2) the V1 source was given a smaller initial amplitude (30 vs. 50 nAm) in Subject #2, making it more difficult to identify. Furthermore, there is also a slight variation in the noise trials chosen since the noise trials were taken from the empirical datasets (therefore noise varied across the MEG systems).

Table 5 CSST results for Subject #2 for both CTF (Set 8) and Neuromag (Set 9) MEG systems

We next report the results of two L2 minimum norm-based current distribution analyses, sLORETA and SWARM, available in Curry for the datasets made for Subject #2. In current distribution models, the cortex is divided up into a large number of elements, which form the solution space. Since the primary source of the MEG signal is assumed to be associated with postsynaptic currents, a current dipole is assigned to each of the many tens of thousands of tessellation elements (the user chooses the exact number depending upon the desired resolution). Additionally, since the problem is under-determined (i.e., there are fewer equations than unknowns), the weighted least-squares criterion requiring that the prediction error be minimized must be augmented with an additional constraint to select the best current distribution among those capable of explaining the data. In the case of the basic L2 minimum norm approach, the criterion selects the solution that minimizes the power (L2-norm) of the dipole moments. After adding noise normalization, statistical significance of current estimates relative to the level of noise can be determined using “dynamic statistical parametric” maps; sLORETA is a variation of this approach (Pascual-Marqui et al. 1994, 1999; Dale et al. 2000; Pascual-Marqui 2002; Wagner et al. 2004, 2008), while SWARM (Wagner et al. 2007, 2008) is an sLORETA-based method that provides current estimates instead of probabilities. Simulated data were read into the Curry software package using either DS files (for the CTF simulations) or FIF files (for the Neuromag simulations). This allowed Curry to assign the correct coordinate system when importing the data and provided access to the digitized fiducials in the files, which were used for accurate alignment with the subject's MRI, also imported into Curry.
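To make the basic L2 minimum norm criterion concrete, the following Python sketch computes a Tikhonov-regularized minimum norm estimate, j = Lᵀ(LLᵀ + λI)⁻¹b, for a toy under-determined problem. It is only an illustration under stated assumptions: the 2-sensor by 4-source lead field, the regularization value, and all function names are hypothetical, and the noise normalization and weighting used by sLORETA and SWARM in Curry are not included.

```python
def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*a)]

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def solve(a, b):
    """Solve a x = b (square a, vector b) by Gaussian elimination
    with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def minimum_norm(leadfield, data, lam):
    """Regularized L2 minimum norm estimate: L^T (L L^T + lam I)^-1 b."""
    n_sensors = len(leadfield)
    lt = transpose(leadfield)
    gram = matmul(leadfield, lt)
    for i in range(n_sensors):
        gram[i][i] += lam
    weights = solve(gram, data)
    return [sum(lt[j][i] * weights[i] for i in range(n_sensors))
            for j in range(len(lt))]

# Toy problem: 2 sensors, 4 candidate sources (fewer equations than unknowns).
leadfield = [[1.0, 0.5, 0.2, 0.0],
             [0.0, 0.3, 0.6, 1.0]]
j_true = [0.0, 1.0, 0.0, 0.0]
data = [sum(row[k] * j_true[k] for k in range(4)) for row in leadfield]
j_hat = minimum_norm(leadfield, data, lam=1e-9)
predicted = [sum(row[k] * j_hat[k] for k in range(4)) for row in leadfield]
norm_true = sum(v * v for v in j_true) ** 0.5
norm_hat = sum(v * v for v in j_hat) ** 0.5
```

The estimate explains the data as well as the true focal source does but has a smaller L2 norm, because the minimum norm constraint spreads the current across several elements. This is one reason current distribution solutions tend to look more distributed than the underlying sources.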

Figure 5 shows preliminary results of the sLORETA and SWARM analyses carried out using the Curry software package. For both sLORETA and SWARM, the CTF simulations yield results that are more distributed in the IPS/I.LOG/V1 areas than those for the Neuromag simulations, which show more focal solutions. This is not particularly surprising, since planar gradiometers are more sensitive to signals directly below the sensors. We additionally provide the results at two different thresholds to show that some activation may not be seen if the threshold is too high; e.g., compare the CTF sLORETA results in Fig. 5, where the DLPFC area of activity is lost at the higher cutoff. Figure 5 also shows that sLORETA was unable to find DLPFC activity at either cutoff in the Neuromag data. In addition, it is possible to extract timecourse activation from the SWARM analysis. Although Curry software provides timecourse extraction via “CDR dipoles”, an ECD method, it also contains the functionality to save the SWARM results into a Matlab file format for further investigation. We utilized the latter method. As a first step to show how timecourses can be extracted from the SWARM data, we chose to identify areas of activation as simply as possible. To this end, after importing the Curry output into Matlab, we identified the areas of highest activation in the SWARM data. We then plotted the timecourses at those locations (right portion of Fig. 5); the only constraint was that the independent sources be greater than 2.0 cm apart, a value we chose empirically so that different sources were resolvable. Note that the added oscillations (e.g., beta and gamma band activity) can be easily identified. We have less experience with these two L2 minimum norm-based analyses; therefore, these results should be considered preliminary and no tables of error values are offered. We present a preliminary report here hoping to encourage others to investigate these analyses further using the same simulations. It is clear, however, that these simulated datasets are already providing a reasonable challenge for a variety of analysis methods.
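The simple selection rule described above (take the locations of highest activation, subject to sources being more than 2.0 cm apart) might be sketched in Python as follows. Only the 2.0 cm separation comes from the text. The greedy strategy, the function name, the `max_sources` parameter, and the example positions and amplitudes are assumptions made for illustration, not the Matlab procedure that was actually used.

```python
import math

def pick_sources(positions_cm, peak_amplitudes,
                 min_separation_cm=2.0, max_sources=4):
    """Greedily select the indices of the strongest locations, skipping any
    location that is not more than min_separation_cm from every location
    already selected."""
    order = sorted(range(len(positions_cm)),
                   key=lambda i: peak_amplitudes[i], reverse=True)
    chosen = []
    for i in order:
        if all(math.dist(positions_cm[i], positions_cm[j]) > min_separation_cm
               for j in chosen):
            chosen.append(i)
        if len(chosen) == max_sources:
            break
    return chosen

# Hypothetical source-space positions (cm) and peak current estimates.
positions_cm = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (3.0, 0.0, 0.0),
                (3.0, 1.0, 0.0), (6.0, 0.0, 0.0)]
peak_amplitudes = [10.0, 9.0, 8.0, 7.0, 1.0]
selected = pick_sources(positions_cm, peak_amplitudes, max_sources=3)
```

Here the second and fourth locations are strong but lie within 2.0 cm of an already selected location, so they are treated as part of the same source and skipped. The timecourses at the selected indices would then be the ones plotted.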

Fig. 5

a sLORETA results using Curry at two different cutoff values (30 and 50 %) for the same active cortical areas mixed with spontaneous noise files from the CTF and Neuromag systems. b SWARM results using Curry at two cutoff values for the same active cortical areas and noise files used in (a). c Timecourse reconstructions from SWARM using simulated datasets in (b) (both CTF and Neuromag). Reproduced from Aine et al. (2012), with permission from Springer

2.5 Simulated Somatosensory and Auditory Datasets

Simulated somatosensory and auditory datasets are also available at the web portal. Simulating median nerve stimulation provides one of our simplest cases. This activity consists of contralateral primary somatosensory (SIcontra), contralateral secondary somatosensory (SIIcontra), and ipsilateral secondary somatosensory cortex activity (SIIipsi). In addition, an auditory dataset provides a simple example of initial synchronous, bilateral activity in auditory cortex. This set also includes asynchronous activation of the temporo-parietal junction and cingulate cortex (4 cortical sources). For additional details on these datasets please refer to Aine et al. (2012).

2.6 Preliminary Work on a Default Mode Network Dataset

Our newest and most preliminary simulation focuses on resting state data; that is, we have developed a simulated default mode network (DMN) based on what is typically found in the MEG/EEG and fMRI literature. For example, we used a low alpha oscillation, and the approximate locations for simulated activity included prefrontal cortex (PFC)/medial prefrontal cortex, posterior cingulate cortex (PCC), and right and left anterior parietal lobes (Brookes et al. 2011; Allen et al. 2014). This first attempt exaggerates the probable size of some of the nodes for initial testing purposes, and may underestimate others. Four 20 mm diameter patches (approximately spherical) were located as shown in Fig. 6a within MRIVIEW. Each was given a 10 Hz oscillation with, at this time, no relative phase lag. Simulations with oscillation amplitudes of 20, 100, and 200 nAm were created and combined with resting state data from the Neuromag 306 MEG system. The simulations were saved as both continuous files and averaged files, in both netCDF and FIF formats. The simulation with the 100 nAm oscillations was then analyzed with three different methods: CSST and SWARM (the latter from within Curry software), both discussed previously, and ICA. For the ICA analysis, EEGlab (Delorme and Makeig 2004; http://sccn.ucsd.edu/eeglab/) was used to separate the data into 102 independent components (ICs), using only the Neuromag magnetometer data from the simulations because of the current capabilities of the EEGlab software. Next, the 5 largest alpha band contributors were determined by the EEGlab software and combined, as shown in Fig. 6b. The result is a typical EEG/MEG DMN pattern, as expected (Hui et al. 2010; Brookes et al. 2011).
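As a rough illustration of two steps in this paragraph, the Python sketch below generates 10 Hz oscillations with no phase lag at the three stated amplitudes and then ranks a set of timecourses by their alpha band power. It does not reproduce the MRIVIEW simulation or the EEGlab component ranking. The 200 Hz sampling rate, the 1 s duration, the 8 to 12 Hz band limits, the extra 20 Hz timecourse, and the function names are all assumptions.

```python
import math

def oscillation(freq_hz, amplitude_nam, sfreq_hz, n_samples, phase=0.0):
    """Sinusoidal source timecourse in nAm (phase=0.0 means no phase lag)."""
    return [amplitude_nam * math.sin(2 * math.pi * freq_hz * t / sfreq_hz + phase)
            for t in range(n_samples)]

def band_power(signal, sfreq_hz, low_hz, high_hz):
    """Sum of normalized squared DFT magnitudes over bins in [low_hz, high_hz],
    computed with a naive discrete Fourier transform."""
    n = len(signal)
    power = 0.0
    for k in range(1, n // 2):
        freq = k * sfreq_hz / n
        if low_hz <= freq <= high_hz:
            re = sum(signal[t] * math.cos(2 * math.pi * k * t / n)
                     for t in range(n))
            im = -sum(signal[t] * math.sin(2 * math.pi * k * t / n)
                      for t in range(n))
            power += (re * re + im * im) / (n * n)
    return power

SFREQ_HZ = 200.0   # assumed sampling rate
N_SAMPLES = 200    # 1 s of data, giving 1 Hz frequency resolution

# Amplitudes of 20, 100, and 200 nAm are from the text; the 20 Hz
# timecourse is a hypothetical non-alpha comparison signal.
timecourses = {
    "alpha_20": oscillation(10.0, 20.0, SFREQ_HZ, N_SAMPLES),
    "alpha_100": oscillation(10.0, 100.0, SFREQ_HZ, N_SAMPLES),
    "alpha_200": oscillation(10.0, 200.0, SFREQ_HZ, N_SAMPLES),
    "beta_200": oscillation(20.0, 200.0, SFREQ_HZ, N_SAMPLES),
}
alpha_power = {name: band_power(sig, SFREQ_HZ, 8.0, 12.0)
               for name, sig in timecourses.items()}
ranked = sorted(alpha_power, key=alpha_power.get, reverse=True)
```

Ranking by band-limited power places the 20 Hz timecourse last even though its amplitude is large, which is the sense in which the largest "alpha band contributors" differ from the largest components overall.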

Fig. 6

a Distributed source locations created within MRIVIEW to simulate DMN activity. Each was given a 10 Hz oscillation with a 100 nAm amplitude. b ICA analysis showing a pattern of activity similar to that seen in the literature for DMN. c SWARM analysis using Curry software. d CSST analysis

In addition, as seen in Fig. 6c, SWARM accurately reconstructs the DMN pattern, with some additional sources of activation. CSST, with a 6-dipole fit, also locates these distributed sources well (Fig. 6d), although the anterior parietal lobe sources are skewed medially, possibly due to the influence of the large PCC source. SWARM and CSST analyses were conducted on averaged data. As mentioned previously, this simulation and its analysis are preliminary.

3 Empirical Datasets

Empirical MEG/MRI data were acquired for 9 participants at MRN, Massachusetts General Hospital, and University of Minnesota/Veterans Affairs in Minneapolis. Data from 5 of the participants are available on the MEG-SIM website. Data were acquired using the VSM MedTech 275, Elekta-Neuromag 306, and 4-D Neuroimaging 3600 systems, with 3 different sensory paradigms (visual, auditory and somatosensory) for each participant. Most participants had repeat testing conducted the following day; these data are also available. General characteristics of the sensory studies were mentioned in Sect. 2.2, while detailed information is presented in Aine et al. (2012).

4 Discussion

One objective of the MEG-SIM portal is to offer developers of MEG methods an extensive testbed of realistic simulated and empirical data for quantifying the strengths and limitations of each analysis method, with the goal of method standardization. This will aid in the refinement and further development of algorithms. A second objective follows from the fact that some analysis procedures are better suited for certain types of studies than others. The set of realistic simulated data provided at the web portal (http://cobre.mrn.org/megsim/) includes sample datasets emulating sensory and working memory-related processes across visual, auditory, and somatosensory modalities. By working with these datasets, users of MEG analysis procedures should be able to make informed decisions as to which analysis tools are best suited for their research goals.

The recent creation of continuous and single trial simulated datasets permits testing of a wider variety of MEG analysis tools. Construction of continuous data that mimic the differences between epochs of real data allows analysis techniques such as ICA to be used individually or in conjunction with various source modeling techniques to identify functional networks. These results can then be compared with traditional source analysis conducted on averaged data, both at the source and sensor levels. With the addition of oscillations to the simulated datasets, the accuracy of functional connectivity measures between various brain areas using different analysis methods can also be investigated. In response to requests, system-specific formats have been added, with identical cortical areas and strengths of activation. For example, some of the simulated datasets described here are now available in a variety of file formats, including netCDF, Neuromag FIF, CTF DS and Curry (Compumedics, Neuroscan). We hope that the creation of these new datasets and formats, including novel continuous and DMN simulations, will foster algorithm performance comparisons and facilitate cross-site collaborations. We also hope that these examples provide sufficient evidence of the flexibility of the simulations we created, and we encourage others not only to use the simulations that are currently available but also to suggest additional simulations that may have widespread interest within the community.