Introduction

The electroretinogram (ERG) at high temporal frequencies (typically above 30 Hz) elicited by a heterochromatically modulating stimulus is often used to obtain an objective estimate of the individual spectral luminosity function (identical to the individual V λ in humans; [17]). The stimulus and procedure are similar to the heterochromatic flicker photometric (HFP) method used in psychophysical measurements to determine the individual V λ: two lights of different wavelengths are modulated in counterphase and with equal contrast. The radiance of one of the two lights is varied until the perceived flicker (in the psychophysical measurements) or the ERG response is minimal. The results of the psychophysical and the ERG measurements resemble each other closely. The individual variability that can be found with the psychophysical HFP measurements is also reflected in the ERG data [3, 8]. Furthermore, psychophysical sensitivities to cone isolating stimuli are found to correlate with electroretinographical data at high temporal frequencies [9, 10]. The resemblance between psychophysical and electroretinographical HFP data indicates that the high temporal frequency ERGs reflect the activity of the luminance pathway, of which the magnocellular system is the physiological basis. Recent work has indeed shown that ERG responses to high temporal frequency (>30 Hz) sine-wave stimuli with combined luminance and red–green chromatic modulation are mainly luminance driven [11, 12].

The individual variability in the spectral luminosity function is to a large extent determined by the ratio of L- to M-cone input strengths. In agreement with the notion that high temporal frequency ERGs mainly reflect activity of the luminance channel, the L-/M-input ratio to the high temporal frequency ERG in trichromats is correlated with the psychophysical L/M ratio in the luminance channel [9]. Because deuteranopes and protanopes in principle have opposite L/M ratios (infinite for deuteranopes and zero for protanopes) or L-fractions [L/(L+M); unity for deuteranopes and zero for protanopes], it can be expected that protanopes and deuteranopes can be distinguished on the basis of the L/M ratio or L-fraction in the high-frequency ERG. Psychophysical and electrophysiological data on the luminance system of trichromats reveal L/M ratios that are generally larger than one and can be as large as 10:1 with full-field stimuli [9, 13]. Data from deuteranopes and trichromats may therefore be difficult to distinguish based on luminance sensitivity. We hypothesize that protanopes can be distinguished from deuteranopes and trichromats based on 36 Hz ERG data. The distinction between deuteranopes, deuteranomalous trichromats and normal trichromats with high L/M ratios will be more difficult when based solely on high-frequency ERGs to heterochromatic modulations, so additional information is needed. One way to provide this information is to measure the effects of chromatic adaptation: the spectral sensitivity is expected to change with different monochromatic backgrounds in trichromats but not in dichromats [14]. We, however, found in previous studies that the L/M ratio (and thus the spectral sensitivity) can be quite stable in normal trichromats [13].

We previously found that the second harmonic components to 12 Hz stimuli (i.e. components at 24 Hz) also show resemblances with properties of the luminance channel [15]. Therefore, the 24 Hz components to 12 Hz stimuli may also be used to distinguish different types of colour vision.

The fundamental components of ERG responses to 12 Hz combined luminance and chromatic modulation have been shown to reflect L-/M-cone opponency similar to that found in the red–green chromatic channel, which is based on parvocellular retino-geniculate responses [11, 12, 16]. Dichromats lack L/M opponency. Thus, the absence of cone opponency may result in substantially different fundamental components in the 12 Hz ERG in dichromats compared to trichromats. Indeed, we previously found that the 12 Hz ERGs in dichromats were substantially different from those measured in trichromats [11, 12]. Therefore, the 12 Hz ERG may be an important addition in the objective diagnosis of colour vision deficiencies. Whether this can be extended to anomalous trichromats is unclear.

The purpose of the present study was to test the proposal that the ERG responses to heterochromatic stimuli can be used to determine whether a subject has di- or trichromatic colour vision and to identify deficient trichromacy. Furthermore, the ERGs are used to quantify the strengths of L- and M-cone driven signals and of the signals reflecting post-receptoral activity using the ERG to combined luminance and chromatic modulation at two temporal frequencies (12 and 36 Hz). We therefore measured the responses in trichromats, dichromats and anomalous trichromats. We deliberately chose to perform the studies in a practice with access to untrained subjects with normal colour vision and with colour vision deficiencies but with little background in the technique of non-standard ERG recordings, in order to identify the methodological and analysis aspects that may need attention and improvement.

Methods

Subjects

In total, 98 subjects participated in these experiments. Their colour vision was established with the anomaloscope (HMC; Heidelberg, Germany) using the system’s program in manual mode and as recommended by Hermann Krastel (personal communication). In total, 38 deuteranopes (age 50 ± 10 years; mean ± S.D.), 16 deuteranomalous trichromats (age 42 ± 8 years), 13 protanopes (age 30 ± 5 years), 19 protanomalous trichromats (age 55 ± 6 years) and 22 normal trichromats (age 28 ± 6 years) participated in this study. The anomalous quotients were between 1.4 and a value >2 in deuteranomalous subjects and between 0.1 and 0.7 in protanomalous subjects. Dichromats displayed larger ranges although not necessarily a match was obtained for the full range between 0 and ∞. All dichromats and anomalous trichromats were males. Of the trichromats, 6 were females and 16 were males. The subjects were referred to the UBM Institute and received an ophthalmological examination and did not have a retinal disorder or any systemic disease that may have an effect on the physiology of the retina (such as diabetes). Possibly, mainly subjects with severe colour vision deficiencies were referred to UBM, so that the number of subjects cannot be considered to be an unbiased sample of the population with colour vision deficiencies. Furthermore, subjects that show dichromacy with the foveal stimuli presented with the anomaloscope may be anomalously trichromatic with larger stimuli [17]. This may have introduced some variability in the results with the anomaloscope. Finally, the normal subjects were generally younger than most colour vision deficient subjects. Age-related effects on e.g. lens optical density were considered to be negligible in the age range of subjects.

The experiments adhered to the tenets of the Declaration of Helsinki and were approved by the local ethics committee. Informed written consent was obtained from the subjects.

Stimuli

Full-field stimuli were delivered with a ganzfeld stimulator (Q450SC, Roland Consult, Brandenburg, Germany). The stimuli have been described previously [11]. Briefly, only the red and the green light emitting diodes (LEDs) of the ganzfeld stimulator were activated, each with a mean luminance of 100 cd/m². The outputs of the LEDs were modulated sinusoidally at two temporal frequencies: 12 and 36 Hz. The red and green LEDs were modulated in counterphase. The contrast in the green and red LEDs (C G and C R, respectively, expressed in terms of Michelson contrast: \(C = \frac{{L_{ \hbox{max} } - L_{ \hbox{min} } }}{{L_{ \hbox{max} } + L_{ \hbox{min} } }}\), where L max and L min are the maximal and minimal luminance emitted by the respective LED) was varied whilst the sum of the contrasts was fixed at unity (C G + C R = 1). The stimulus condition is expressed as the fraction of red contrast (F R) in the total contrast: \(F_{\text{R}} = \frac{{C_{\text{R}} }}{{C_{\text{R}} + C_{\text{G}} }}\). Thus, when F R = 0, the output of the red LED was constant at 100 cd/m², whereas the green LED was modulated with C G = 1. The reverse was the case for F R = 1. For F R = 0.5, the red and green LEDs were modulated with equal contrasts of 0.5. For the 36 Hz stimuli, 9 values for F R were employed: 0, 0.1, 0.2, 0.3, 0.35, 0.4, 0.5, 0.7, 0.8 and 0.9. At 12 Hz, in total 11 stimulus conditions were used: the same values of F R as for the 36 Hz stimuli plus 0.45 and 0.6. We calculated L- and M-cone contrasts for the different conditions, using the spectral outputs of the LEDs as specified by the manufacturer.
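The stimulus definition above can be sketched in a few lines of Python. This is only an illustration of how C R, C G and the counterphase waveforms follow from F R; the function names and the 2048 Hz time base are assumptions made for the example and do not describe the stimulator's software.

```python
import numpy as np

MEAN_LUMINANCE = 100.0  # cd/m^2 per LED, as stated in the text


def led_contrasts(f_r, total_contrast=1.0):
    """Return (C_R, C_G) for a red fraction F_R with C_R + C_G fixed."""
    c_r = f_r * total_contrast
    c_g = total_contrast - c_r
    return c_r, c_g


def led_waveforms(f_r, freq_hz, t):
    """Luminance of the red and green LEDs over time (counterphase sinusoids)."""
    c_r, c_g = led_contrasts(f_r)
    carrier = np.sin(2 * np.pi * freq_hz * t)
    red = MEAN_LUMINANCE * (1.0 + c_r * carrier)
    green = MEAN_LUMINANCE * (1.0 - c_g * carrier)  # opposite sign: counterphase
    return red, green


def michelson(luminance):
    """Michelson contrast (Lmax - Lmin) / (Lmax + Lmin) of a waveform."""
    return (luminance.max() - luminance.min()) / (luminance.max() + luminance.min())


# Example: F_R = 0.35 at 12 Hz, one second sampled at an assumed 2048 Hz.
t = np.arange(0, 1.0, 1 / 2048)
red, green = led_waveforms(0.35, 12.0, t)
c_red_measured = michelson(red)
c_green_measured = michelson(green)
```

Recomputing the Michelson contrast from the generated waveforms returns the requested C R and C G, which is a quick check that the F R parametrisation and the contrast definition agree.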

ERG recording

ERGs were recorded simultaneously from both eyes with fibre electrodes (Zari electrode). The pupils of the subjects were not dilated because the subjects also participated in colour vision tests involving the anomaloscope. Furthermore, we wanted to test whether the ERG method could be used in routine practice. Reference and ground electrodes were gold cup electrodes filled with electrode paste and attached to the ipsilateral temple and forehead, respectively. The ERGs were acquired using the RetiPort system (Roland Consult, Brandenburg, Germany). The ERGs were amplified and band-pass filtered between 1 and 300 Hz. The signals were digitized at 1024 or 2048 Hz sampling frequency.

Results

The recordings were subjected to fast Fourier transformation (FFT) to extract the amplitudes and phases of the fundamental (first harmonic) components. From the 12 Hz responses, the second harmonic components (i.e. the 24 Hz response component) were also extracted. Figures 1, 2 and 3 show the response amplitudes and phases as a function of the red fraction, F R, for the 12 and 36 Hz responses measured in the two eyes of three subjects. Data are shown for a deuteranope (Fig. 1), a protanope (Fig. 2) and a trichromat (Fig. 3). The curves are fits of models to the data. The models were introduced previously [11] and are further described below. The 36 Hz responses of all subjects showed a linear relationship between response amplitude and F R, with the absence of a response at a characteristic value of F R. The response phase changed by about 180° at the minimum. All 36 Hz responses were used for further analysis. The first harmonic components of the 12 Hz responses were generally noisier because the response amplitudes were smaller and because the 12 Hz components were more disturbed by low-frequency (1/f) noise, which is probably influenced by blinks and eye movements. The second harmonic components to 12 Hz stimuli were less disturbed by this low-frequency noise, and a clear trend of response amplitude as a function of F R was found for most observers.
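The harmonic extraction described above can be illustrated with a minimal NumPy sketch. It assumes that the analysed record spans an integer number of stimulus cycles, so that 12 and 24 Hz fall exactly on FFT bins, and it uses a cosine phase convention; the function name and the synthetic test signal are hypothetical and are not the RetiPort implementation.

```python
import numpy as np


def harmonic_component(signal, fs_hz, freq_hz):
    """Amplitude and phase (degrees, cosine convention) at freq_hz.

    Assumes the record contains an integer number of cycles of freq_hz,
    so that the frequency coincides with one FFT bin.
    """
    n = len(signal)
    spectrum = np.fft.rfft(signal) / n
    k = int(round(freq_hz * n / fs_hz))      # index of the bin at freq_hz
    amplitude = 2.0 * np.abs(spectrum[k])    # single-sided amplitude
    phase_deg = np.degrees(np.angle(spectrum[k]))
    return amplitude, phase_deg


# Synthetic record: 1 s at 1024 Hz, 3 units at 12 Hz plus 1 unit at 24 Hz.
fs = 1024.0
t = np.arange(0, 1.0, 1 / fs)
erg = (3.0 * np.cos(2 * np.pi * 12 * t + np.radians(40.0))
       + 1.0 * np.cos(2 * np.pi * 24 * t - np.radians(70.0)))

amp1, ph1 = harmonic_component(erg, fs, 12.0)   # fundamental of a 12 Hz stimulus
amp2, ph2 = harmonic_component(erg, fs, 24.0)   # second harmonic
```

With real recordings, the same call would be applied to each stimulus condition to obtain one amplitude and one phase per value of F R.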

Fig. 1

Response amplitudes (left plots) and phases (right plots) measured in a deuteranopic subject. The upper row displays the first harmonic response characteristics to 36 Hz stimuli as a function of the stimulus’ red fraction (F R). The middle and lower rows show the first and second harmonic responses to 12 Hz stimuli, respectively. The black circles and the red inverted triangles are the data for the right and left eyes, respectively. The solid black and the dashed red curves are model fits to the data. The models are described in the text

Fig. 2

Response data for a protanope. The format of the data is the same as for Fig. 1

Fig. 3

Response data for a trichromatic subject. The format of the data is the same as for Fig. 1

In the model, it was assumed that the 36 Hz responses were the result of a vector addition of L- and M-cone driven signals. The L- and M-cone contrasts (quantified as the Michelson contrast of excitation modulation of the L- and M-cones) were calculated and were used as a quantification of the cone input strength. In the model, the L- and M-cone input strengths were multiplied by gain factors (G L and G M). Furthermore, the modulation phases of the cone excitation were determined (either in phase with the red or with the green LED). The model assumed that the phases of the cone driven signals (P L and P M) changed by 180° at the amplitude minimum (the silent substitution condition for the respective cone type) and either followed the red or the green LED. Estimates of the four free parameters (G L, G M, P L and P M) were obtained by fitting the model to the data. The fits were performed in vector space by converting the responses into vectors. The lengths of the vectors encoded the response amplitudes, and the angles with the positive X-axis encoded the response phases. In the fitting procedure, the sum of squared distances in vector space between measured and modelled responses was minimized using the solver routine of the Excel 2010 program. Good fits were obtained for all 36 Hz data and for all subjects. The results of the fits for the data shown in Figs. 1, 2 and 3 are shown as the curves in the upper plots. Observe that the fits could also be performed for the anomalous trichromats, although the above-mentioned calculations of cone contrast are not correct for them because the cone fundamentals of these observers deviate from those of dichromats and normal trichromats. As with the normal pigments, hybrid pigments are expected to have a linear dependency on cone contrast. However, the silent substitution conditions will be different. The excitations of the hybrid pigments can therefore be described by a vector addition of the normal pigment excitations. As a result, model fits are also possible for anomalous trichromats and will give intermediate results.
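The vector-addition model can be sketched as follows. Two points are assumptions of this sketch and not part of the original analysis. First, the cone-contrast coefficients `L_COEF` and `M_COEF` are hypothetical placeholders (the real values depend on the LED spectra); they are chosen so that the L-cone contrast is zero near F R = 0.4. Second, instead of an iterative solver, the sketch exploits the fact that the model is linear in the complex quantities G·exp(iP) once the 180° phase flip is carried by the sign of the cone contrast, so the same sum of squared vector distances can be minimized with a complex linear least-squares fit.

```python
import numpy as np

# HYPOTHETICAL signed cone-contrast weights (red-LED weight, green-LED weight).
# They are placeholders and are not derived from the manufacturer's LED spectra.
L_COEF = (0.6, 0.4)   # L-cone contrast crosses zero at F_R = 0.4
M_COEF = (0.2, 0.8)   # M-cone contrast crosses zero at F_R = 0.8


def cone_contrasts(f_r):
    """Signed L- and M-cone contrasts; the sign encodes the 180 deg phase flip."""
    f_r = np.asarray(f_r, dtype=float)
    c_red, c_green = f_r, 1.0 - f_r          # LED contrasts, C_R + C_G = 1
    c_l = L_COEF[0] * c_red - L_COEF[1] * c_green
    c_m = M_COEF[0] * c_red - M_COEF[1] * c_green
    return c_l, c_m


def fit_vector_model(f_r, amplitude, phase_deg):
    """Least-squares fit in vector space; returns gains, phases and F_L."""
    c_l, c_m = cone_contrasts(f_r)
    measured = np.asarray(amplitude) * np.exp(1j * np.radians(phase_deg))
    design = np.column_stack([c_l, c_m]).astype(complex)
    solution, *_ = np.linalg.lstsq(design, measured, rcond=None)
    z_l, z_m = solution                       # complex gains G * exp(i * P)
    g_l, g_m = abs(z_l), abs(z_m)
    return {
        "G_L": g_l,
        "G_M": g_m,
        "P_L": np.degrees(np.angle(z_l)),
        "P_M": np.degrees(np.angle(z_m)),
        "F_L": g_l / (g_l + g_m),
    }


# Synthetic, noise-free example with G_L = 8, G_M = 2, P_L = 30 deg, P_M = 50 deg.
f_r = np.linspace(0.0, 1.0, 11)
c_l, c_m = cone_contrasts(f_r)
true = (8.0 * c_l * np.exp(1j * np.radians(30.0))
        + 2.0 * c_m * np.exp(1j * np.radians(50.0)))
fit = fit_vector_model(f_r, np.abs(true), np.degrees(np.angle(true)))
```

On noise-free synthetic data the fit recovers the generating parameters. With measured data the residual would additionally indicate how well the two-signal description holds.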

From these fits, an estimate of the L-gain fraction (\(F_{\text{L}} = \frac{{G_{\text{L}} }}{{G_{\text{M}} + G_{\text{L}} }}\)) was obtained. The mean fractions (+SD) for the subjects belonging to the five groups are displayed in the upper plot of Fig. 4. The values of F L were largest for deuteranopes and deuteranomalous trichromats. The values were smallest for protanopes followed by protanomalous trichromats. The values for the normal trichromats were larger than those for protanopes and protanomalous trichromats but only slightly smaller than those for the deuteranopes and deuteranomalous trichromats. This probably reflects the fact that L/M ratios are larger than unity for most trichromats particularly when full-field stimuli are used [13]. The values for F L were not normally distributed within the different observer groups. We therefore performed Mann–Whitney rank sum tests to compare the results from the different groups. After Bonferroni correction for multiple testing, we found that the differences between the different groups were significant (defined at the 0.05 level divided by the number of tests, owing to the Bonferroni correction) except those between D and DA and between DA and T.
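As a worked illustration of the corrected significance level, suppose (this is an assumption; the number of tests is not stated here) that all pairwise comparisons among the five groups were tested. The number of tests m and the Bonferroni-corrected threshold would then be:

\[
m = \binom{5}{2} = 10, \qquad \alpha_{\text{corrected}} = \frac{\alpha}{m} = \frac{0.05}{10} = 0.005
\]

A different number of comparisons would change the threshold accordingly.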

Fig. 4

Estimated values (means + 1 SD) of the L-cone fraction (F L; upper and middle plots) obtained from the first harmonic components in the 36 Hz responses (upper plot) and the second harmonic responses in the 12 Hz data (middle plot). Because the values of F L are not normally distributed, a nonparametric test was performed. All values were found to be significantly different except between D and DA and between DA and T. The lower plot shows the luminance fraction (F Lum; means + 1 SD). The data are given separately for D deuteranopes, DA deuteranomalous trichromats, P protanopes, PA protanomalous trichromats and for T normal trichromats. The values of F Lum are normally distributed. A t test revealed a significant difference between protanopes and trichromats

The second harmonic components in the responses to the 12 Hz stimuli were assumed to be the result of a similar interaction of L- and M-cone driven signals. However, the phase data were not informative and did not change with F R (lower right plots in Figs. 1, 2, 3). We propose that, if the two signals are approximately 180° apart, the phase difference in the second harmonic components will be about 360°. This phase difference cannot be detected with the Fourier transform, which returns phases only modulo 360°. We therefore only used the amplitude data and a scalar addition of the L- and M-cone signals, with two free parameters G L and G M from which, again, the L-fraction, F L (\(= \frac{{G_{\text{L}} }}{{G_{\text{M}} + G_{\text{L}} }}\)), was estimated. A scalar addition implies zero phase difference between the L- and M-cone driven signals. The results are displayed in the middle panel of Fig. 4. They are very similar to those of the first harmonic components with 36 Hz stimuli. Thus, the additional assumption of no phase difference between the L- and M-cone driven signals did not influence the results.
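The phase-doubling argument can be checked numerically. The sketch below assumes, purely for illustration, that the second harmonic is generated by a squaring nonlinearity; the actual retinal mechanism is not specified in the text. Under that assumption, a 180° shift of the 12 Hz drive leaves the 24 Hz phase unchanged.

```python
import numpy as np

fs = 1024.0
t = np.arange(0, 1.0, 1 / fs)   # 1 s record, integer number of 12 Hz cycles


def second_harmonic_phase(phase_deg, freq_hz=12.0):
    """Phase (deg) of the 2f component after an ASSUMED squaring nonlinearity."""
    drive = np.cos(2 * np.pi * freq_hz * t + np.radians(phase_deg))
    distorted = drive ** 2                   # even-order distortion creates 2f
    spectrum = np.fft.rfft(distorted)
    k = int(round(2 * freq_hz * len(t) / fs))
    return np.degrees(np.angle(spectrum[k]))


ph_a = second_harmonic_phase(20.0)    # drive phase 20 deg -> 2f phase 40 deg
ph_b = second_harmonic_phase(200.0)   # drive shifted by 180 deg -> 400 = 40 deg
```

Both calls return the same phase because the 360° difference is removed by the modulo-360° representation, which is why only the amplitudes were modelled.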

The first harmonic (12 Hz) components to the 12 Hz stimuli were assumed to be the vector additions of the 36 Hz responses, representing luminance activity, and an L-/M-cone opponent response. The cone opponent responses were assumed to have constant amplitudes and phases for all values of F R. The rationale for this is that the red–green chromatic response originates from a L–M opponency with equal L- and M-cone strength [9], resulting in equal output strengths and phases for all stimulus conditions (see Fig. 2b in Kremers et al. [11]). Again, there were four free parameters: the gains and the absolute phases of the luminance and chromatic responses. The luminance gain fractions (\(F_{\text{Lum}} = \frac{{G_{\text{Lum}} }}{{G_{\text{Chrom}} + G_{\text{Lum}} }}\)) were estimated from the fitting parameters. F Lum gives an estimate of the relative strengths of the two pathways in determining the fundamental component in the 12 Hz response. The mean F Lum values (+SD) for the five subject groups are shown in Fig. 4 lower panel. We only compared the data from the trichromats with those from the two groups of dichromats. The results were found to be normally distributed. T tests revealed that the data from protanopes were significantly different (after Bonferroni correction) from those of the trichromats (p = 0.008).
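A minimal sketch of this two-pathway description is given below. It assumes that the luminance contribution follows a template with the shape of the 36 Hz response across F R and that the chromatic contribution is a constant vector; the template used here (a line through zero at F R = 0.45, scaled to unit maximum) is hypothetical. The estimate of F Lum depends on how the template is normalized, so the normalization is part of the assumption. As in a vector-space fit, the sum of squared distances is minimized, here with a complex linear least-squares step.

```python
import numpy as np


def fit_12hz_model(luminance_template, measured):
    """Fit measured = z_lum * template + z_chrom by complex least squares."""
    template = np.asarray(luminance_template, dtype=complex)
    target = np.asarray(measured, dtype=complex)
    design = np.column_stack([template, np.ones_like(template)])
    solution, *_ = np.linalg.lstsq(design, target, rcond=None)
    z_lum, z_chrom = solution
    g_lum, g_chrom = abs(z_lum), abs(z_chrom)
    return {
        "G_Lum": g_lum,
        "G_Chrom": g_chrom,
        "P_Lum": np.degrees(np.angle(z_lum)),
        "P_Chrom": np.degrees(np.angle(z_chrom)),
        "F_Lum": g_lum / (g_lum + g_chrom),
    }


# HYPOTHETICAL luminance template: linear in F_R, sign flip at F_R = 0.45.
f_r = np.linspace(0.0, 1.0, 11)
template = (f_r - 0.45) / 0.55

# Synthetic 12 Hz fundamental: luminance gain 3 (20 deg), chromatic gain 1 (110 deg).
synthetic = (3.0 * np.exp(1j * np.radians(20.0)) * template
             + 1.0 * np.exp(1j * np.radians(110.0)))
fit12 = fit_12hz_model(template, synthetic)
```

For the synthetic example the luminance fraction is 3 / (3 + 1) = 0.75; a purely luminance-driven response would give a value of 1.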

In Fig. 5, individual values of F Lum obtained from the first harmonic components to 12 Hz stimuli are plotted versus the individual values of F L obtained either from the 36 Hz stimuli (upper plot) or from the second harmonic components to 12 Hz stimuli (lower plot). Although the data from the protanopes and the protanomalous trichromats in particular differed from those of the other subject groups, the distributions overlap. Furthermore, the intra-group variability of F L for protanomalous trichromats was large: the values were as low as those of the protanopes for some subjects and similar to those of the deuteranopic and trichromatic observers for two others.

Fig. 5

Individual values of F Lum as a function of F L obtained from the first harmonics in the 36 Hz responses (upper plot) and the second harmonic components in the 12 Hz responses (lower plot)

Discussion

The goal of the present work was to explore whether ERG responses to heterochromatic stimuli of counterphase modulating red and green LEDs can be used to determine L-fractions (F L) and to quantify the fraction of the signals reflecting activity of the luminance channel (F Lum) in trichromats, anomalous trichromats and dichromats, so that the type of colour vision can be determined objectively using non-invasive electrophysiological techniques. Furthermore, we wanted to come to suggestions for improvements of the methods and analysis so that the technique can be used in a routine manner.

The results of the high temporal frequency responses (36 Hz and second harmonics to 12 Hz stimuli) showed that the L/M ratios can be determined and that there are differences between the observer groups. The calculations of L/M ratios were based on the assumption of the presence of normal cone fundamentals. Of course, this is not the case for the anomalous trichromats. Therefore, the ratios should not be considered as true ratios for these subjects. However, the method can still be used, assuming that the responses of the hybrid pigments in these subjects can be obtained from a linear combination of those from normal L- and M-cone pigments. Indeed, the data show that the anomalous trichromats are intermediate between di- and trichromats. It should be noted that the normal fundamentals are also subject to inter-individual variability caused by genetic variability in the cone opsins [18–20] or by differences in preretinal filtering (e.g. in the macular pigment or the lens [21]). These factors have also been suggested as possible causes of differences between small and large field Rayleigh matches [22].

Despite the difference in results between the groups, there was a considerable overlap in the results for the different groups. Notably deuteranopes, deuteranomalous trichromats and normal trichromats were difficult to distinguish. Furthermore, protanomalous trichromats showed a wide range of results, some resembling those of the protanopes; others more like those of deuteranomalous subjects (see Fig. 5). We would like to emphasize that the pupils of the participants were not dilated because the subjects had to undergo other colour vision tests. This probably had an impact on the signal-to-noise ratio (SNR) of the recordings because less light entered the eye and because the amount of light is also variable. Particularly, the 12 Hz data were impaired compared to previous measurements with this stimulus paradigm in which the subjects’ pupils were dilated [11, 15, 23, 24]. We therefore expect that the results will improve further when dilation of the pupils is implemented in the clinical procedure. Future experiments are needed to test this proposal. The use of an artificial pupil will also be helpful in diminishing the variability of light but not necessarily in increasing the SNR due to the decreased retinal illuminance. Finally, the use of age-matched norms should ultimately be developed.

In the calculations of the stimulus conditions, we used the spectral information of the LEDs as provided by the manufacturer. Possibly, these emission spectra were not completely correct. For many deuteranopes, we found that the minima in the 36 Hz responses and in the second harmonic components to the 12 Hz stimuli were actually at a value of F R below about 0.4, which is the stimulus condition that results in a silent substitution for the L-cones. The most extreme value of F L considered by the model was unity, resulting in a minimum identical to the silent substitution of the L-cones. Better fits would be obtained for even larger values of F L. The results for the deuteranope shown in Fig. 1 (upper left and lower left plots) illustrate this observation. These results may indicate that the actual LED emission spectra differed from the specified ones. Furthermore, because the pupils were not dilated, intrusion from rod-driven responses in particular cannot be completely ruled out. We found that rod responses with full-field stimuli are probably relatively small. However, if the far periphery is not stimulated directly (as might be the case with non-dilated pupils), then rod-driven responses may be relatively large because of stimulation through stray light. This effect is particularly large at 12 Hz (Aher and Kremers, unpublished data). Possibly, if stimuli that silence rods (and possibly S-cones) were used, employing the silent substitution method, the disturbing influence might be less. Introducing silent substitution conditions will, however, decrease the gamut of the stimulator. Another possibility would be the use of focal stimuli on a larger background that desensitizes rods and S-cones. Smaller central stimuli may have an additional advantage: It has been found previously that ERGs in trichromats elicited by the far peripheral retina are strongly L-cone dominated [13, 25]. This L-cone dominance may reduce the difference between trichromats and deuteranopes or deuteranomalous trichromats. The L/M ratio is smaller when smaller stimuli are used [13]. We indeed previously found that the responses of trichromats to the present stimulus paradigm might differ more from those of deuteranopes when smaller stimuli were used [24].

In the present study, the effects of chromatic adaptation have not been explored. Previously, it was found that repeating the measurements at different backgrounds may have large effects on high-frequency ERG data from trichromats in comparison with those from dichromats [17, 22, 26]. Thus, the state of adaptation may be another parameter that can be used for distinguishing subjects belonging to the different groups. The difference with the 12 Hz ERG data, introduced in the present study, is that the 12 Hz ERG may reflect direct activity of cone opponent signals [11, 12, 16], whereas the adaptation effect is based upon a differential effect of the state of adaptation on ERGs that reflect luminance mechanisms.

In conclusion, stimulus conditions can possibly be improved by using LEDs with optimized emission spectra, by silencing rods and S-cones and by making the stimuli smaller. It then has to be determined, if the SNRs of the ERGs are still sufficiently large to yield reliable data.

The luminance fractions (F Lum) in the first harmonic components to the 12 Hz data were smaller for the trichromats than for the other observers. This difference was significant for the comparison between trichromats and protanopes. We, however, found that these data were often particularly noisy. Surprisingly, particularly data from trichromats were often too noisy to be included in the analyses. Again, the reason can be found in the absence of pupil dilation. These data are interesting because they may complement the high-frequency data. Therefore, it will be important to improve the data obtained from these conditions. The 12 Hz components particularly suffer from eye movements and blinks. Stricter artefact rejection criteria in combination with explicit instructions for the observers, to suppress blinks and eye movements as much as possible, may greatly improve the quality of the data and thus also of the fits.

The working hypothesis was that F L and F Lum are different in the different subject groups. Theoretically, F L was expected to be zero in protanopes and unity in deuteranopes. The different trichromatic subjects were expected to be intermediate, with values for protanomalous trichromats being smaller and those for deuteranomalous trichromats being closer to unity. Furthermore, F Lum was expected to be smaller in trichromats and larger in dichromats, with intermediate values for anomalous trichromats. The group averages are in agreement with this working hypothesis.

A non-invasive electrophysiological technique has the advantage in comparison with psychophysical methods that the results of the measurements do not depend on subjective settings and that the method is possibly less dependent on the cooperation of the subjects (e.g. in young children, elderly subjects, subjects with a mental retardation or in malingerers). A purpose of the work was to study whether this electrophysiological method can be used in a setting where the responses are measured by non-specialist examiners. The data suggest that an electrophysiological method to distinguish between trichromats, dichromats and anomalous trichromats may be possible, but that improvements (some of which are mentioned above) are needed to reach this goal. It has been shown that VEP recordings may also be used to determine the type of colour vision [27]. It might be interesting to study the VEPs of subjects with normal and deficient colour vision elicited by the stimuli used in the present study.

The results of our study show that the quality of the 36 Hz data is good to determine the ratios of L- to M-cone input to ERGs. The second harmonic components to 12 Hz stimuli yield very similar results. Thus, it is possible to obtain all necessary information from 12 Hz data only. The second harmonic data do not give information about response phases of L- and M-cone driven responses. This information has, however, limited use when the task is to determine what type of colour vision is present.

This exploratory study showed that an objective determination of L/M ratios, colour vision and colour vision deficiency using non-invasive electrophysiological techniques may be possible after several technical improvements.