Introduction

In daily life, we are surrounded by somatosensory stimuli; not a day passes without touching something. Although somatosensory stimuli convey information about “when”, “where” and “what” somatosensory events occur (Gibson 1962), anticipating such events is difficult because somatosensory sensation is evoked only after physical contact occurs. However, expectations of somatosensory events may be prompted by information from other modalities.

In particular, previous studies have reported an influence of visual information on somatosensory processing. For example, visual stimuli modulate spatial (e.g., Spence 2002, 2010) and temporal identification (e.g., Fujisaki and Nishida 2009; Spence et al. 2001) for somatosensory stimuli. In addition, previous studies have reported that prior visual stimuli affected judgments about types of somatosensory stimuli (e.g., Spence et al. 1998, 2000). Expectations about types of somatosensory stimuli generated by prior visual stimuli can be important for avoiding unpleasant somatosensory events and approaching pleasant somatosensory events. For example, affective visual stimuli modulate the perception of somatosensory stimuli and early somatosensory evoked potential (SEP; Montoya and Sitges 2006). Furthermore, this visuotactile interaction is important in social interactions. Facial expressions of others are signals that indicate the meanings of various kinds of touching. In fact, the emotional facial expressions of others have been shown to affect tactile perception and SEP (Ravaja et al. 2017).

As described above, visual stimuli affect somatosensory processing. One factor that may promote such a visuotactile link is spatial proximity. In passive touch, we feel contact when an object approaches and touches the body. Under this circumstance, we are often able to look at the approaching object before it touches the body. Thus, the approach of visual stimuli may contribute to visuotactile links. In fact, the processing of a stimulus presented near the body is facilitated by visual information (e.g., Reed et al. 2006; Abrams et al. 2008). Moreover, recent electrophysiological studies have reported that prior visual stimuli approaching the body facilitate spatial expectations for subsequent somatosensory events (e.g., Kimura and Katayama 2015, 2017a). In these studies, participants were instructed to perform a simple reaction time task in response to a somatosensory stimulus to the wrist; the left (or right) wrist usually received the stimulus (congruent; 80%) and the opposite wrist rarely received it (incongruent; 20%). Before the somatosensory stimuli were presented, three LEDs serving as visual stimuli flashed in different patterns; e.g., approaching the wrist where the congruent somatosensory stimuli were presented, or not approaching, i.e., remaining motionless (Experiment 1 in Kimura and Katayama 2015). The comparison of the event-related brain potentials (ERPs) elicited by the different patterns showed that incongruent somatosensory stimuli in the approaching pattern elicited larger P3 amplitudes than those in the motionless pattern. Previous studies reported that P3 amplitude reflects the intensity of a deviation from expectations (e.g., Donchin 1981; Duncan-Johnson and Donchin 1977; Katayama and Polich 1998). Therefore, this result showed that approaching visual stimuli enhanced spatial expectations about the location of subsequent somatosensory stimuli.

Furthermore, this approaching effect influenced temporal expectations for subsequent somatosensory stimuli (Kimura and Katayama 2017b). In this study, somatosensory stimuli were invariably presented to one wrist (left or right) and prior visual stimuli either approached this wrist or remained motionless. The interval between the onset of the prior visual stimuli and that of the subsequent somatosensory stimuli (stimulus onset asynchrony; SOA) was manipulated; i.e., regular timing (SOA: 1000 ms; 75%), early-deviation timing (SOA: 500 ms; 12.5%), or late-deviation timing (SOA: 1500 ms; 12.5%). A comparison of ERPs indicated that early-deviation somatosensory stimuli in the approach pattern elicited larger N1 amplitudes than regular somatosensory stimuli in this pattern. In addition, the early-deviation somatosensory stimuli under the approach pattern elicited larger N1 amplitudes than the same stimuli under the motionless pattern. These results revealed that temporal expectations for subsequent somatosensory stimuli were enhanced by the approach of visual stimuli.

As described above, the approach of non-emotional visual stimuli (i.e., LEDs) facilitates spatial (“where”) and temporal (“when”) expectations regarding subsequent somatosensory events. Thus, the approach of visual stimuli may also influence expectations regarding the type of subsequent somatosensory stimuli (“what”) even if the visual stimuli do not contain emotional information. Based on these visuotactile links and the effects of approaching visual stimuli (Kimura and Katayama 2015, 2017a, b), we hypothesized that non-emotional visual stimuli approaching the body modulate not only spatial (“where”) and temporal (“when”) expectations, but also expectations regarding types of subsequent somatosensory stimuli (“what”).

To test this hypothesis, we recorded electroencephalograms (EEGs) while participants were performing a simple reaction time task in response to somatosensory stimuli, and analyzed their ERPs. Each trial was composed of three red or blue circles as visual stimuli followed by a somatosensory stimulus (either a single electrical pulse or a train of pulses) delivered to only one wrist (left or right) in each block. The specific colors of visual stimuli and types of somatosensory stimuli were combined to form congruent and incongruent trials. In the congruent trials, one combination was presented (80% of trials), whereas in incongruent trials, the other combinations were presented (20% of trials). For example, if a combination of three red (blue) circles and a single (train) somatosensory stimulus was used in the congruent trials, then the reverse combination was used in the incongruent trials. In addition, whether the visual stimuli approached the body was manipulated in separate conditions. Under the approach condition, three visual stimuli were presented sequentially, moving toward the wrist where a somatosensory stimulus was presented, whereas under the neutral condition, visual stimuli located at an equal distance from both wrists were presented three times, the same number as in the approach condition. In each condition, a somatosensory stimulus was invariably presented after the third visual stimulus. Participants were told the location of the somatosensory stimuli and the combination ratio between visual and somatosensory stimuli before each block; thus, the approach of the visual stimuli and the type of somatosensory stimuli were irrelevant to this simple reaction time task.

We focused on ERPs, especially contingent negative variation (CNV), N2b and P3 as an index of expectation caused by approaching visual stimuli for the type of somatosensory stimuli. These ERPs are known to have a high level of sensitivity to prediction and deviation from it. For example, CNV is related to temporal expectations prompted by a prior stimulus regarding a subsequent stimulus (Walter et al. 1964). In the present study, the somatosensory stimulus was invariably presented with the same timing after the third visual stimulus (i.e., SOA: 1000 ms) under both conditions. Therefore, we predicted that CNV would be elicited between the third visual stimulus and the somatosensory stimulus, and would not differ between the approach and neutral conditions if participants were able to expect the timing of the somatosensory stimulus.

Moreover, N2b reflects the processing of stimulus deviation when participants voluntarily hold a mental image (Sams et al. 1983). When using a somatosensory stimulus, N2b is elicited by the deviation of stimulus type (e.g., Kekoni et al. 1996, 1997). Therefore, we expected that incongruent trials would elicit N2b.

Finally, P3 is elicited by a stimulus that deviates from expectations. When the stimulus is received, it is compared with past stimuli received in the same context. If this stimulus deviates from past stimuli received in this context, the neural representation of the stimulus context is updated and P3 is elicited (context updating theory; Polich 2007). In addition, in a test of spatial and temporal expectations of somatosensory stimuli, deviant somatosensory stimuli elicited P3 (Kimura and Katayama 2015, 2017b). Therefore, we predicted that the incongruent trials would elicit P3 components if participants expected a particular type of somatosensory stimuli depending on the color of prior visual stimuli. Moreover, if P3 amplitudes elicited in the approach condition were greater than those in the neutral condition, it would be possible to argue that the approach of visual stimuli enhanced expectations regarding types of somatosensory stimuli, as it did spatial and temporal expectations (Kimura and Katayama 2015, 2017b).

Method

Participants

Eighteen undergraduate and graduate students (12 females, 6 males; 19–22 years of age) participated in the experiment. Two participants were left-handed and the others were right-handed, according to their self-report. All participants had normal or corrected-to-normal vision. This experiment was approved by the Kwansei Gakuin University (KGU) Research Ethics Review Board under the KGU Regulations for Research with Human Participants. Written informed consent was obtained from all participants, and their rights as experimental subjects were protected.

Stimuli and procedure

Somatosensory stimuli were generated by an electrical stimulus generator (Nihon Kohden Corporation, SEN-7203) and were presented to the participants’ wrists via electric isolators (Nihon Kohden Corporation, SS-203J) and Ag/AgCl electrodes with a diameter of 1.0 cm. The anode electrodes were placed on the participants’ wrists, and the cathode electrodes were placed 3.0 cm from the anodes toward the elbow. The stimuli were single block pulses of 0.2 ms in duration. The intensity was set to twice each participant’s threshold and never caused pain. The absolute threshold was measured by six iterations of the up-and-down method per participant. The average intensity of the stimuli across all participants was 3.1 mA. Two types of somatosensory stimuli were presented to participants. The single somatosensory stimulus was one single pulse, whereas the train somatosensory stimulus was a train of ten single pulses with a 5 ms stimulus onset asynchrony (SOA). These stimuli were presented to only one wrist (right or left) in each block. The order of the location (left or right) of the stimulus presentation was counterbalanced across blocks.

Red (1.18 cd/m²) and blue (0.71 cd/m²) circles were presented as visual stimuli at three locations on a desk. These circles and a background gray color (2.54 cd/m²) were projected via an LCD projector (EPSON, EB-1430WT). The diameter of each circle was 3.8 cm. These circles were placed at equal distances (8.0 cm each) between the arms (see below). The duration of the presentation of visual stimuli was 200 ms.

Each trial was composed of three visual stimuli followed by one somatosensory stimulus, and the specific colors of visual stimuli and types of somatosensory stimuli were combined to form congruent and incongruent trials. In the congruent trials, one combination was presented (80% of trials), whereas in incongruent trials, other combinations were presented (20% of trials). For example, if combinations of three red (blue) circles and a single (train) somatosensory stimulus were used in the congruent trials, then the reverse combination was used in the incongruent trials. These combinations were counterbalanced across participants. The interval (SOA) from the stimulus onset of the first visual stimulus to the second visual stimulus, from the second visual stimulus to the third visual stimulus, and from the third visual stimulus to the somatosensory stimulus was invariably set to 1000 ms. The interval between trials was either 1000 or 1200 ms at random with equal probability.

Two conditions were distinguished by the presentation pattern of the visual stimuli and were administered in separate blocks. In the approach condition, the visual stimuli were presented sequentially, moving toward the wrist where the somatosensory stimulus was presented (i.e., right, center, and left, or the reverse order), and the subsequent somatosensory stimulus was presented to that wrist. In the neutral condition, the visual stimuli were presented in the center three times with the same timing, and then the somatosensory stimulus was presented to the wrist (Fig. 1). The order of these two conditions was randomized across participants.

Fig. 1 The procedure of (a) the approach condition and (b) the neutral condition

Each block was composed of 44 trials (32 congruent trials, 8 incongruent trials, and 4 catch trials), which took approximately 4 min. Four blocks were presented for each condition. The interval between blocks was 2 min, and after the second block, the participants rested for 5 min before starting the remaining half of the blocks. The order of conditions was randomized between participants.
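The block structure described above can be sketched in code. The following Python snippet is a minimal illustration, not the presentation script used in the experiment: the equal split of colors within each trial type, the colors assigned to catch trials, the function name `build_block`, and the dictionary keys are all assumptions introduced here.

```python
import random

SOA_MS = 1000                  # onset-to-onset interval between successive stimuli
ITI_CHOICES_MS = (1000, 1200)  # inter-trial interval, chosen with equal probability


def build_block(mapping, rng):
    """Return one shuffled block of 44 trials.

    mapping gives the congruent pairing, e.g. {"red": "single", "blue": "train"}.
    Assumption: each trial type is split equally between the two colors.
    """
    first, second = sorted(mapping)
    # Incongruent trials use the reverse color-to-stimulus pairing.
    swapped = {first: mapping[second], second: mapping[first]}
    # (trial kind, pairing, trials per color): 32 congruent, 8 incongruent, 4 catch
    plan = [
        ("congruent", mapping, 16),
        ("incongruent", swapped, 4),
        ("catch", {first: None, second: None}, 2),
    ]
    trials = []
    for kind, pairing, n_per_color in plan:
        for color in (first, second):
            for _ in range(n_per_color):
                trials.append({
                    "kind": kind,
                    "color": color,
                    "somato": pairing[color],
                    "visual_onsets_ms": (0, SOA_MS, 2 * SOA_MS),
                    # Catch trials have no somatosensory stimulus.
                    "somato_onset_ms": None if kind == "catch" else 3 * SOA_MS,
                    "iti_ms": rng.choice(ITI_CHOICES_MS),
                })
    rng.shuffle(trials)
    return trials


block = build_block({"red": "single", "blue": "train"}, random.Random(0))
```

Passing a seeded `random.Random` instance keeps the trial order reproducible, which is a convenience of this sketch rather than something stated in the procedure.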

In the experimental room, the participants were asked to sit at a desk and to place their arms on the desk 32.0 cm apart. Figure 2 shows the positioning of the visual and somatosensory stimuli. The position of visual stimuli was between the arms at equal intervals (8.0 cm each). Similarly, the distance between the electrode for the somatosensory stimuli on the left (right) arm and position of left (right) visual stimuli was 8.0 cm. The participants were asked to gaze at the fixation point, to control their eye movements, and not to move their eyes and bodies more than necessary in each condition.

Fig. 2 The positions of stimulus presentation. The circles indicate the positions of visual stimuli, and the squares indicate the positions of somatosensory stimuli

In addition, the participants were instructed to respond by pressing a button with the left (or right) foot when the somatosensory stimuli were presented, and not to respond when somatosensory stimuli were not presented (i.e., the catch trials). In each condition, the participants used the foot on the same side as the stimulated wrist in half of the blocks (two blocks) and the foot on the opposite side in the other half (two blocks). Moreover, they were told before each block to which wrist somatosensory stimuli would be presented and the frequency of the combinations of visual stimulus color and type of somatosensory stimulus. The response foot was changed between blocks.

Recording and analyses

EEG data were recorded by BrainAmp (Brain Products, Germany) and an electrode cap (Easycap GmbH, Germany) using Ag/AgCl electrodes at 32 sites (Fp1, Fp2, F7, F3, Fz, F4, F8, FT7, FC3, FCz, FC4, FT8, T7, C3, Cz, C4, T8, TP7, CP3, CPz, CP4, TP8, P7, P3, Pz, P4, P8, O1, Oz, O2, A1, and A2), according to the modified 10–20 System. The reference electrode was on the tip of the nose, and the ground electrode site was AFz. The data from all channels were recorded using Brain Vision Recorder software (Version 2.0, Brain Products). The electrode impedances were kept below 5 kΩ. A bandpass filter of 0.1–200 Hz was used at recording. The sampling rate was 1000 Hz.

To analyze the EEG data, the EEGLAB toolbox (Delorme and Makeig 2004) and ERPLAB toolbox (Lopez-Calderon and Luck 2014) on MATLAB (MathWorks Inc) were used. The data were digitally low-pass filtered at 30 Hz (6 dB/octave) using an IIR Butterworth analog simulation filter. Artifacts derived from eye movements and eye blinks were rejected using an automatic EEG artifact detector based on the joint use of spatial and temporal features (ADJUST) of the EEGLAB toolbox (Mognon et al. 2011).

ERP epochs were extracted within a time range of 1000 ms (including a 200 ms pre-stimulus baseline) for N2b and P3 analysis. Trials with an error or those in which the EEG signal variation exceeded ± 100 µV were automatically discarded. Additionally, trials with RTs shorter than 200 ms or longer than 1500 ms and trials with incorrect responses were excluded from the analysis (1.1% of trials were rejected based on these criteria). After artifact rejection, the numbers of remaining trials ranged from 118 to 128 (0–7.9% of trials rejected) for congruent stimuli and from 24 to 32 (0–25% of trials rejected) for incongruent stimuli. Moreover, to identify N2b and P3 latencies, we created difference waves by subtracting congruent trial ERPs from incongruent trial ERPs for each condition (Luck 2014). The N2b time window was defined as ± 40 ms around the N2b peak latency of the difference waves (240–320 ms), and the P3 time window as ± 75 ms around the P3 peak latency of the difference waves (400–550 ms).
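The difference-wave step and the peak-centered measurement windows can be illustrated with a short Python sketch. This is not the EEGLAB/ERPLAB pipeline used in the study: the waveform is synthetic toy data, the peak search ranges (200–350 ms and 350–700 ms) are assumptions chosen only for this example, and all function names are hypothetical.

```python
def difference_wave(incongruent, congruent):
    """Subtract the congruent ERP from the incongruent ERP, sample by sample."""
    return [i - c for i, c in zip(incongruent, congruent)]


def peak_latency_ms(wave, times_ms, search_ms, polarity):
    """Latency of the most negative or most positive sample in search_ms."""
    lo, hi = search_ms
    in_range = [(v, t) for v, t in zip(wave, times_ms) if lo <= t <= hi]
    pick = min if polarity == "negative" else max
    return pick(in_range)[1]


def window_around(peak_ms, half_width_ms):
    """Symmetric measurement window centered on a peak latency."""
    return (peak_ms - half_width_ms, peak_ms + half_width_ms)


def mean_amplitude(wave, times_ms, window_ms):
    """Mean of all samples whose latency falls inside window_ms (inclusive)."""
    lo, hi = window_ms
    values = [v for v, t in zip(wave, times_ms) if lo <= t <= hi]
    return sum(values) / len(values)


def bump(t, centre, half_width, height):
    """Triangular toy component; zero outside centre +/- half_width."""
    return height * max(0.0, 1.0 - abs(t - centre) / half_width)


# Toy data: 1000 ms epoch at 1000 Hz with a 200 ms pre-stimulus baseline.
times_ms = list(range(-200, 800))
congruent = [0.0 for _ in times_ms]
incongruent = [bump(t, 280, 40, -2.0) + bump(t, 475, 75, 5.0) for t in times_ms]

diff = difference_wave(incongruent, congruent)
n2b_window = window_around(peak_latency_ms(diff, times_ms, (200, 350), "negative"), 40)
p3_window = window_around(peak_latency_ms(diff, times_ms, (350, 700), "positive"), 75)
n2b_amp = mean_amplitude(diff, times_ms, n2b_window)
p3_amp = mean_amplitude(diff, times_ms, p3_window)
```

Because the toy peaks were placed at 280 ms and 475 ms, the resulting windows coincide with the 240–320 ms and 400–550 ms ranges reported in the text; with real data the peaks would be taken from the observed difference waves.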

In addition, to investigate CNV as an index of the timing cues, epochs were extracted within a time window of 1200 ms (the baseline was the 200 ms preceding the onset of the third visual stimulus, i.e., −200 to 0 ms, and the onset of the somatosensory stimulus was at 1000 ms). Trials with an error or those in which the EEG signal variation exceeded ± 100 µV were automatically discarded. After artifact rejection, the numbers of remaining trials were 148–160 (0–7.5% of trials rejected) for the approach condition and 130–160 (0–18.8% of trials rejected) for the neutral condition. The mean CNV amplitude was obtained from a latency window of 500–1000 ms. The appropriate latency window was defined based on observation of the resultant ERP waveforms.
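As a rough illustration of the CNV measurement, the Python sketch below baseline-corrects each epoch, discards epochs that exceed the amplitude limit, averages the rest, and takes the mean amplitude in the 500–1000 ms window. It is a simplified sketch on synthetic data, not the toolbox routine used in the study. In particular, it assumes that the ± 100 µV criterion is applied to the absolute baseline-corrected amplitude (the text does not specify this), and all function names are hypothetical.

```python
def baseline_correct(epoch, times_ms, baseline_ms=(-200, 0)):
    """Subtract the mean of the pre-stimulus interval from every sample."""
    lo, hi = baseline_ms
    baseline = [v for v, t in zip(epoch, times_ms) if lo <= t < hi]
    offset = sum(baseline) / len(baseline)
    return [v - offset for v in epoch]


def has_artifact(epoch, limit_uv=100.0):
    """Assumed criterion: any sample beyond +/- limit_uv marks the epoch as bad."""
    return any(abs(v) > limit_uv for v in epoch)


def average(epochs):
    """Point-by-point average across epochs."""
    return [sum(samples) / len(epochs) for samples in zip(*epochs)]


def cnv_mean_amplitude(epochs, times_ms, window_ms=(500, 1000)):
    """Return (mean amplitude in window_ms, number of epochs kept)."""
    corrected = (baseline_correct(raw, times_ms) for raw in epochs)
    kept = [e for e in corrected if not has_artifact(e)]
    erp = average(kept)
    lo, hi = window_ms
    values = [v for v, t in zip(erp, times_ms) if lo <= t <= hi]
    return sum(values) / len(values), len(kept)


# Toy data: 1200 ms epoch at 1000 Hz, time-locked to the third visual stimulus.
times_ms = list(range(-200, 1000))
# A slow negative drift after stimulus onset on top of a 10 uV DC offset.
clean = [10.0 - 0.01 * max(t, 0) for t in times_ms]
noisy = list(clean)
noisy[times_ms.index(300)] = 150.0  # simulated artifact

amplitude, n_kept = cnv_mean_amplitude([clean, clean, noisy], times_ms)
```

In this toy case the epoch containing the simulated artifact is dropped, so two of the three epochs contribute to the average.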

A two-way repeated measures analysis of variance (ANOVA) on reaction times (RTs) in response to the electrical stimuli was conducted with the two conditions (approach and neutral) × two stimulus congruencies (congruent and incongruent). Moreover, the N2b and P3 mean amplitudes were each assessed with a three-way repeated measures ANOVA [2 conditions × 2 stimulus congruencies × 3 electrodes (Fz, Cz, and Pz)]. These electrodes were chosen to check the distribution of N2b and P3 amplitude at the midline. These ANOVAs were conducted by applying Greenhouse–Geisser corrections to the degrees of freedom when appropriate (Greenhouse and Geisser 1959). Effect sizes are reported as partial eta squared (ηp2). Post hoc comparisons were made using Shaffer’s modified sequentially rejective multiple test procedure, which extends Bonferroni t tests in a stepwise fashion (Shaffer 1986). In addition, the mean CNV amplitudes at Cz, where the CNV was elicited at maximum amplitude, were compared between conditions by paired t test. The effect size was calculated as Cohen’s d (Cohen 1988). The significance level was set at p < .05 for all statistical analyses.
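The effect-size measures named above can be illustrated with a few lines of Python. This is a sketch under stated assumptions, not the analysis code used in the study. Partial eta squared is recovered from an F value and its degrees of freedom with the standard identity ηp2 = (F × df_effect) / (F × df_effect + df_error). For the paired comparison, Cohen's d is computed here as the mean difference divided by the standard deviation of the differences; this is one common variant, and the text does not state which one was used. The function names and the toy data are hypothetical.

```python
import math
from statistics import mean, stdev


def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its dfs."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def paired_t_and_d(x, y):
    """Paired t statistic, its degrees of freedom, and Cohen's d.

    Assumption: d is the mean difference divided by the SD of the
    differences; other definitions of d for paired data exist.
    """
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    sd = stdev(diffs)  # sample SD (n - 1 in the denominator)
    t_value = mean(diffs) / (sd / math.sqrt(n))
    d_value = mean(diffs) / sd
    return t_value, n - 1, d_value


# Check against a value of the form reported in the Results: F(1, 17) = 6.25
eta = partial_eta_squared(6.25, 1, 17)

# Toy paired data, purely illustrative
t_value, df, d_value = paired_t_and_d([1.0, 2.0, 3.0, 4.0], [0.0, 0.0, 1.0, 1.0])
```

Applied to F(1, 17) = 6.25, the first function gives a value that rounds to 0.27, and applied to F(1, 17) = 23.45 it gives a value that rounds to 0.58, in line with the effect sizes reported for the RT analysis.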

Results

Behavioral data

RTs were collapsed across stimulus types because RTs did not differ by type of stimulus. Table 1 shows the mean RTs of all participants. The ANOVA revealed that the RTs in the approach condition were shorter than the RTs in the neutral condition [F(1, 17) = 6.25, p = .02, ηp2 = 0.27]. Additionally, RTs to the incongruent stimuli were longer than those to the congruent stimuli [F(1, 17) = 23.45, p < .001, ηp2 = 0.58]. However, their interaction was not significant (p > .10).

Table 1 Mean RTs (ms) for somatosensory stimuli and standard errors of RTs in each condition

Electrophysiological data

N2b

Figure 3 shows the grand averages for ERPs elicited during the approach (blue lines) and the neutral (red lines) conditions from Fz, Cz, and Pz, as well as the difference waves (pale blue and red lines) that were created by subtracting congruent ERPs from incongruent ERPs. Figure 4 illustrates (a) the topographic map at the time range of N2b (240–320 ms), and (b) the N2b mean amplitude in each condition. In both conditions, negative-going ERP components were elicited after 200 ms and positive-going ERP components were elicited after 400 ms. The ANOVA revealed that the N2b mean amplitude of incongruent stimuli was greater than that of congruent stimuli [F(1, 17) = 6.06, p = .02, ηp2 = 0.26]. Moreover, the main effect of electrode was significant [F(2, 34) = 32.28, p < .001, ηp2 = 0.66]. However, the main effects of condition and interactions were not significant (ps > .10).

Fig. 3 Grand average ERP waveforms for each condition (left and center panels), and difference waves that were created by subtracting congruent ERPs from incongruent ERPs (right panels) at the Fz, Cz, and Pz electrode sites (N = 18). The light gray area denotes the time range of N2b (240–320 ms), and the dark gray area denotes the time range of P3 (400–550 ms)

Fig. 4 (a) The topographic map of the N2b time range (240–320 ms), and (b) mean N2b amplitude in both conditions (N = 18). The error bars indicate the standard errors (SEs) of the means across participants

P3

Figure 5 shows (a) the topographic map at the time range of P3 (400–550 ms), and (b) the P3 mean amplitude in each condition. The ANOVA for the mean P3 amplitudes revealed a significant interaction of condition and stimulus congruency [F(1, 17) = 6.77, p = .02, ηp2 = 0.28]. Post hoc comparisons indicated that the incongruent stimuli elicited larger P3 amplitudes than the congruent stimuli in both conditions (ps < .05). In addition, the incongruent stimuli under the approach condition elicited larger P3 amplitudes than the incongruent stimuli under the neutral condition (p < .05). However, P3 amplitudes elicited by congruent stimuli were not significantly different between conditions (p > .10). These results revealed that P3 amplitudes elicited by the incongruent stimuli were larger for the approach condition than for the neutral condition. Furthermore, all main effects were significant [condition: F(1, 17) = 5.53, p = .03, ηp2 = 0.25; stimulus congruency: F(1, 17) = 16.44, p < .001, ηp2 = 0.49; electrode: F(2, 34) = 15.07, p < .001, ηp2 = 0.47]. However, the interactions of condition and electrode, of stimulus congruency and electrode, and of condition, stimulus congruency and electrode were not significant (ps > .10).

Fig. 5 (a) The topographic map of the P3 time range (400–550 ms), and (b) mean P3 amplitude in both conditions (N = 18). The error bars indicate the standard errors (SEs) of the means across participants

CNV

Figure 6 illustrates the grand average CNV elicited in both conditions at Cz, where the CNV was elicited at maximum amplitude. The gray area indicates the time range of CNV (500–1000 ms). Comparisons between conditions by paired t test of mean amplitude of CNV revealed no significant difference (p > .10).

Fig. 6 Grand average ERP waveforms for both conditions at Cz (N = 18). The gray area indicates the time range for CNV (500–1000 ms)

Discussion

The purpose of the present study was to examine whether visual stimuli approaching the body influence expectations about types of subsequent somatosensory stimuli. Our results showed that N2b and P3 components were elicited by incongruent stimuli under the approach and the neutral conditions. Because N2b is elicited by detection of deviation of stimulus types when somatosensory stimuli are presented (e.g., Kekoni et al. 1996, 1997), this result shows that participants detect deviation of somatosensory stimulus types in each condition. Because P3 is elicited by unexpected stimuli (e.g., Donchin 1981; Duncan-Johnson and Donchin 1977; Katayama and Polich 1998), this result indicates that participants expect somatosensory stimulus types according to visual stimuli in both conditions.

Moreover, incongruent somatosensory stimuli under the approach condition elicited larger P3 amplitudes than under the neutral condition. Previous studies also showed a larger P3 in the approach condition, indicating that the approach of visual stimuli enhanced the expectation for the following somatosensory stimuli (Kimura and Katayama 2015, 2017a). Therefore, the results of this study indicate that visual stimuli approaching the body facilitate expectations about types of subsequent somatosensory stimuli presented at the approached location.

The results showed that the RTs to incongruent trials were longer than those to congruent trials. This result was the same as that of a previous study using a similar paradigm (Kimura and Katayama 2015). In addition, the RTs under the approach condition were shorter than those under the neutral condition. This result indicates that visual stimuli approaching the body facilitate detection of the occurrence of somatosensory stimuli.

The amplitude of CNV did not differ between the conditions, which is consistent with the previous study that used a similar paradigm (Kimura and Katayama 2015). This result indicates that the timing of the presentation of somatosensory stimuli was expected under both conditions, regardless of whether visual stimuli approached.

In the present study, the participants performed a simple reaction time task in response to somatosensory stimuli. Participants were told the location of the somatosensory stimuli before each block, and the type of the stimulus was irrelevant to the task. Thus, the visual stimuli functioned only as a temporal cue for the subsequent somatosensory stimuli. Nevertheless, ERPs related to deviations from expectations were enhanced in the approach condition, indicating that the approach of visual stimuli automatically facilitates expectations about types of subsequent stimuli.

Considered together with the previous studies that examined spatial (Kimura and Katayama 2015) and temporal (Kimura and Katayama 2017b) expectations for subsequent somatosensory stimuli, the present study indicates that visual stimuli approaching the body facilitate not only spatial (“where”) and temporal (“when”) expectations but also expectations regarding types of somatosensory stimuli (“what”).

Previous studies reported that prior visual stimuli influence processing of subsequent somatosensory stimuli (e.g., Spence 2010; Fujisaki and Nishida 2009; Spence et al. 1998). In the studies that investigated types of somatosensory stimuli, the influence of emotion-laden visual stimuli was examined because the processing of types of somatosensory stimuli is important to avoid unpleasant somatosensory events (Montoya and Sitges 2006) and to live in society (Ravaja et al. 2017). In the present study, the visual stimuli were color patches without emotional content; nevertheless, visual stimuli approaching the body influenced expectations regarding types of subsequent somatosensory stimuli. This result suggests that visual stimuli approaching the body function as cues to the type of subsequent somatosensory stimuli even if the visual stimuli do not have emotional content.

The results of this experiment expand our understanding of the mechanisms of expectations about subsequent somatosensory events created by prior visual information.

Conclusion

In summary, the present study revealed that approaching visual stimuli modulate the P3 amplitude elicited by subsequent somatosensory stimuli, indicating that non-emotional visual stimuli approaching the body influence expectations about types of subsequent somatosensory stimuli, even though the type of stimulus is irrelevant to the ongoing task. The present study thus supports the view that the approach of visual stimuli toward the body enhances expectations about types of subsequent somatosensory events, even if the approach is irrelevant to the task.