Introduction

Coordinated rhythmic movement is not only required in daily life (e.g., eating, dressing, or cooking) but also a well-established subject for studying the integration of perception and action. It has been studied extensively since Kelso proposed coordination dynamics (e.g., Kelso, 1984; Kelso, Holt, Kugler, & Turvey, 1980; Kugler, Kelso, & Turvey, 1980). Accordingly, rhythmic bimanual coordination can be modeled as two oscillators, and relative phase, an angular expression of the spatial–temporal difference between the two oscillators, has conventionally been used to define the pattern of coordination.

Haken, Kelso, and Bunz (1985) developed the “HKB model”, which describes phase stability and phase transitions. Two intrinsic coordination patterns exist as attractors in coordination dynamics: in-phase (0° relative phase) and anti-phase (180° relative phase). At a low frequency (about 1 Hz), both the in-phase and anti-phase coordination patterns are stable. However, if the cycling frequency increases and exceeds a critical value (about 3–4 Hz), the anti-phase pattern loses its stability and eventually switches to the in-phase pattern, whereas the in-phase pattern remains largely immune to the frequency change. Therefore, the in-phase pattern is considered more stable than the anti-phase pattern.
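In its standard form (Haken et al., 1985), the model describes the dynamics of relative phase φ with the potential function

$$V(\phi) = -a\cos\phi - b\cos 2\phi,$$

where the ratio b/a decreases as movement frequency increases. Both φ = 0° and φ = 180° are minima of V when b/a is large (low frequency); once b/a falls below the critical value of 0.25, the minimum at 180° vanishes and only the in-phase attractor remains, which corresponds to the phase transition described above.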

Conventionally, intrinsic bimanual coordination patterns are defined kinesthetically (Kelso, 1984; Kelso, Scholz, & Schöner, 1986). When the wrists or fingers are flexing and extending rhythmically, the co-activation of homologous muscle groups results in simultaneous flexion–extension and, therefore, in-phase coordination. Correspondingly, anti-phase coordination is defined as homologous muscle groups contracting in an alternating or asymmetrical fashion, as when one wrist or finger is flexing while the other is extending. This muscular constraint has largely accounted for the stability of bimanual coordination as well as the phase transitions (e.g., Carson, Riek, Smethurst, Párraga, & Byblow, 2000; Kelso, 1984; Kelso et al., 1986; Temprado et al., 2003). Typically, coordination movements recruiting homologous muscles (i.e., kinesthetically defined in-phase) are more stable than those recruiting non-homologous muscles (i.e., kinesthetically defined anti-phase), and phase transitions occur from anti-phase to in-phase as movement frequency increases, but not vice versa.

The egocentric frame of reference has also been used to define kinesthetic bimanual coordination (Pickavance, Azmoodeh, & Wilson, 2018; Swinnen, 2002; Swinnen, Dounskaia, & Duysens, 2002; Swinnen, Jardin, Meulenbroek, Dounskaia, & Den Brandt, 1997; Temprado et al., 2003). Using the longitudinal axis of the body as a reference, mirror-symmetrical movement (e.g., both wrists move towards the body center simultaneously) is denoted as in-phase and is more stable than anti-phase, which is defined as asymmetrical movement with respect to the body center (e.g., one wrist moves towards the body center while the other simultaneously moves away), especially as movement frequency increases (e.g., Li, Levin, Carson, & Swinnen, 2004; Li, Levin, Forner-Cordero, Ronsse, & Swinnen, 2009). Of note, egocentric in-phase movement usually, but not always, involves the co-activation of homologous muscles. For instance, when a supine hand and a prone hand move toward and away from one another simultaneously in the transverse plane, non-homologous muscles are co-activated to produce the egocentric in-phase movement. Mechsner, Kerzel, Knoblich, and Prinz (2001) and Temprado et al. (2003) studied symmetrical and parallel finger abduction/adduction movements by manipulating hand posture (supine/prone) and found superior performance for the mirror-symmetrical movements. Similarly, Brandes, Rezvani, and Heed (2017) used a mirror to create visual symmetry of finger movements on one side and found that the visually perceived symmetrical movements remained more stable and accurate regardless of hand posture and the actual direction of finger movement. Therefore, bimanual coordination appears to be perceptually driven, with coordination performance being better when the movement is perceived as mirror-symmetrical, although the evidence is mixed as to whether the symmetry must be perceived kinesthetically or visually.

Alternatively, bimanual coordination can be defined using an allocentric frame of reference, which makes no reference to body-related information but instead relies on external visual cues describing the spatial–temporal relationship between the two oscillators (e.g., Pickavance et al., 2018; Swinnen, 2002; Swinnen et al., 1997, 2002; Temprado et al., 2003; Bingham, 2001, 2004a, 2004b; Bogaerts, Buekers, Zaal, & Swinnen, 2003; Snapp-Childs, Wilson, & Bingham, 2011). Regardless of the muscles used, if the two limbs oscillate in the same direction at the same speed (e.g., two hands simultaneously reach forward or backward), in-phase coordination is produced; if they oscillate in opposite directions at the same speed (e.g., one hand reaches forward while the other reaches backward), anti-phase coordination is produced. Based on the allocentric constraint, coordination patterns involving isodirectional movements are consistently more stable than those that are not isodirectional (e.g., Buekers, Bogaerts, Swinnen, & Helsen, 2000; Swinnen et al., 1997).

In summary, studies have shown that the stability of coordination patterns is subject to: (1) the muscular constraint, by which coordination movements utilizing homologous muscles are more stable; (2) the egocentric constraint, by which mirror-symmetrical movements with respect to the body center are more stable; and (3) the allocentric constraint, by which isodirectional movements are more stable. The first two constraints concern the use of kinesthetic (body-related) information for the control of bimanual coordination; essentially, they coincide when homologous muscles are used to produce symmetrical movements. The last constraint concerns the use of visual information (the spatiotemporal locations of the oscillators) for the control of bimanual coordination, in which the limb movements are represented visually in the external world. The emerging question is whether the coordination pattern becomes less stable when the kinesthetically perceived information is inconsistent with the visually perceived information. As seen in Fig. 1, when the two wrists move horizontally toward each other, homologous muscle groups produce mirror-symmetrical but non-isodirectional movements; consequently, the kinesthetically perceived in-phase coordination (Fig. 1a) is actually the visually perceived anti-phase coordination (Fig. 1d), and the kinesthetically perceived anti-phase coordination (Fig. 1b) is the visually perceived in-phase coordination (Fig. 1c).

Fig. 1

An illustration of intrinsic bimanual coordination patterns defined using different information. a Kinesthetically perceived in-phase. b Kinesthetically perceived anti-phase. c Visually perceived in-phase. d Visually perceived anti-phase

Numerous attempts have been made to understand the control mechanism of bimanual coordination when visual and kinesthetic information are inconsistent. Bogaerts et al. (2003) asked participants to execute cyclical back-and-forth line drawings with both hands either along the same axis (i.e., both along the x-axis or the y-axis) or along different axes (i.e., one along the x-axis and the other along the y-axis), producing kinesthetic in- or anti-phase patterns with the visual perception of isodirectional or non-isodirectional movements. They reported that, in general, visually perceived isodirectionality dominated coordination stability regardless of the hand movements. Temprado et al. (2003) conducted a series of experiments manipulating motion planes (transverse and sagittal) and limb postures (prone and supine) to make visual and kinesthetic information consistent or inconsistent. They found evidence that the co-activation of homologous muscles (kinesthetic information) dominates within-person coordination, while isodirectional movement (visual information) governs interpersonal coordination. More recently, Pickavance et al. (2018) manipulated visual–kinesthetic consistency by separating allocentric (i.e., visual 0° and 180°) and egocentric (i.e., kinesthetic in- and anti-phase) information. Participants produced kinesthetic in- or anti-phase in the frontal plane while watching visual in- or anti-phase on a display. They found that kinesthetic/egocentric and visual/allocentric information contributed independently to the stability of the intrinsic bimanual coordination patterns, with the former having larger effects. Clearly, the evidence is inconclusive as to whether visual or kinesthetic information is used to maintain coordination stability when the two are inconsistent.

It should be noted that previous studies examining coordination stability with inconsistent visual and kinesthetic information have adopted a single movement frequency (e.g., Bogaerts et al., 2003; Temprado et al., 2003; Pickavance et al., 2018). However, we know from the HKB model that movement frequency is a control parameter that causes coordination stability to change (phase transition). Therefore, the study of coordination stability under visual–kinesthetic inconsistency must involve manipulation of movement frequency. In addition, energy consumption is a relevant and worthwhile consideration in assessing coordination stability, because stable coordination patterns typically require less energy to maintain than unstable patterns (Hoyt & Taylor, 1981; Sparrow & Newell, 1998), and the energy costs associated with performing a bimanual coordination task at various frequencies have been shown to decrease with motor learning (Galna & Sparrow, 2006).

Consequently, the purpose of this study was twofold. First, we wanted to examine whether inconsistency between visual and kinesthetic information would impact the stability of intrinsic bimanual coordination as movement frequency increased. We hypothesized that coordination would be more stable and energy-efficient when visual and kinesthetic information were consistent rather than inconsistent, especially at high frequency. Second, we wanted to determine which information (visual or kinesthetic) could be used reliably and efficiently to stabilize the coordination pattern as movement frequency increased. Based on previous studies, we postulated that kinesthetic information (i.e., the muscular constraint) would largely account for the stable and efficient control of intrinsic bimanual coordination patterns as frequency increased.

Methods

Participants

Thirty healthy adults aged between 20 and 40 years (mean age = 24.93 ± 4.70 years) were recruited on and off campus at the University of Wyoming through a flyer. All were right-handed as determined by a short version of the Edinburgh Handedness Inventory (Oldfield, 1971), had normal or corrected-to-normal vision, and were naïve to the experimental questions and tasks. Participants were free from any known neurological defects or motor disabilities and refrained from caffeine intake for 24 h prior to testing. Informed consent was obtained from each participant, and the study was approved by the Institutional Review Board at the University of Wyoming, Laramie. Participants were randomly assigned to perform the bimanual coordination tasks in one of three conditions, with ten people in each condition (see Table 1 for the age and gender distribution of each group).

Table 1 Age and gender distribution for each group (mean ± SD)

Apparatus

A computer–joystick system was used. The participant sat on a height-adjustable chair facing a Dell 19-inch square PC screen at eye level. The screen, driven by an HP PC at a resolution of 1280 × 1024 and a refresh rate of 60 Hz, was placed 70 cm from the participant on the top level of a custom-built cart. Two Logitech Force 3D joysticks were fixed to a wooden board on the lower level and were connected to the PC via USB. The board was set either horizontal or vertical relative to the surface of the cart, depending on the group to which the participant was assigned. Therefore, the participant moved the joysticks either in the frontal plane (left–right movement) or in the sagittal plane (up–down movement).

The computer displayed two white dots against a black backdrop, aligned either one above the other or side by side. The top–bottom configuration corresponded to the wooden board being placed horizontally on the cart’s surface, with the top dot controlled by the left joystick and the bottom dot by the right joystick. The side-by-side configuration corresponded to the board being placed vertically, so that the left dot was controlled by the left joystick and the right dot by the right joystick. The movement amplitude of the dots was 300 pixels (about 11.5 cm) in both configurations, and the dots were 60 pixels (about 2.3 cm) in diameter. Stimulus presentation (both video and audio), movement data recording, and performance analyses were handled by a custom MATLAB toolbox written by ADW, incorporating the Psychtoolbox (Wilson, Tresilian, & Schlaghecken, 2011). Participants were connected to a metabolic cart (via facemask and flow sensor) and a 12-lead electrocardiogram (ECG) monitor during the entire experiment. Physiological data, including oxygen consumption (VO2), respiratory rate (RR), and respiratory exchange ratio (RER), were measured using an MGC Diagnostics Breeze Suite Ultima Series metabolic cart (Ultima™ CardiO2® gas exchange analysis system, MGC Diagnostics, St. Paul, MN, USA). A 12-lead ECG device (Mortara Instrument X12+, Mortara Instrument, Milwaukee, WI, USA) was used to monitor participant heart rate (HR).

Procedure

Participants were randomly assigned to one of three groups (n = 10 each). Each participant first sat for at least 5 min to acquire resting physiological data before moving the joysticks to produce rhythmic in- and anti-phase movements of the dots on screen. They did so at low (0.50 Hz), high (2.50 Hz), and self-selected frequencies with the elbows fixed on the table, while wearing the VO2 mask and 12-lead ECG device throughout each 30-s trial (see Fig. 2 for an illustration of the experimental apparatus). Two memory-foam pads were provided for participants to rest their elbows on the table while performing the coordination task. They were instructed to produce the demanded coordination patterns using only wrist flexion and extension, and a trial was repeated if a participant lifted an elbow from the pad. A cardboard sheet extending from the bottom edge of the computer screen blocked the view of the hands. Participants were instructed to attend to the moving dots on the screen at all times while moving the joysticks underneath the cardboard; their goal was to reproduce the demonstrated coordination patterns on the screen regardless of how they moved the joysticks. An audio metronome corresponding to the demanded movement frequency (high or low) was played during each trial to guide the participant to produce the bimanual coordination pattern at the demanded frequency. When participants were asked to produce the bimanual coordination pattern at their self-selected frequency, the metronome was turned off and they paced the bimanual movement at their preferred speed. The demanded movement frequencies (low and high) and bimanual coordination patterns (visual in-phase or anti-phase) were presented to each participant in random order, while both coordination patterns at the self-selected frequency were always assessed last, in random order.

Fig. 2

An illustration of the experimental apparatus. Joysticks were attached to a wooden board that could be set either vertical or horizontal relative to the cart’s surface. The computer display was set on the top level of the cart at eye height. Participants were guided to attend to the computer display while wearing a VO2 mask and a 12-lead ECG device during the experiment

At the low and high frequencies, participants were first shown a 10-s visual demonstration of the target relative phase, followed by a 30-s practice trial with visual feedback and the audio metronome turned on. Participants then performed six 30-s trials at each target phase without the visual feedback while listening to the audio metronome. No demonstration was given at the self-selected frequency; participants performed the practice trial and then the experimental trials as before, but at their self-selected pace. A 30-s rest was given between trials in each frequency condition.
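As an illustration of how such a demonstration can be generated, the sketch below computes the positions of the two dots as sinusoids separated by the target relative phase. This is a minimal sketch with illustrative variable names and values; it is not the actual toolbox code.

```matlab
% Minimal sketch (not the actual toolbox code): dot positions for a 10-s
% demonstration of a target relative phase at a demanded frequency.
fs        = 60;                % display refresh rate (Hz)
f         = 0.50;              % demanded movement frequency (Hz)
targetPhi = 180;               % target relative phase in degrees (0 = in-phase)
amp       = 150;               % half of the 300-pixel movement amplitude
t         = 0:1/fs:10;         % 10-s demonstration
dot1 = amp * sin(2*pi*f*t);                        % first dot
dot2 = amp * sin(2*pi*f*t - deg2rad(targetPhi));   % second dot, lagging by targetPhi
```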

By manipulating the spatial mapping between the motion of the dots on screen and the motion of the hands/joysticks, three task conditions/groups were created: “Consistent Information with Spatial Mapping” (“Info + Spatial +”), “Consistent Information without Spatial Mapping” (“Info + Spatial −”), and “Inconsistent Information with Spatial Mapping” (“Info − Spatial +”) (see Fig. 3 for an illustration of the group settings and tasks). In the “Info + Spatial +” condition, the base of the joysticks was set vertical, facing the participant (see Fig. 2); participants grasped the joysticks with the top of the stick in the palm and moved them in the sagittal plane to produce the coordination patterns on the screen, where the two dots, displayed side by side, moved up and down. With this configuration, the vertical in-phase coordination pattern was seen on the screen by flexing and extending both wrists simultaneously, whereas the vertical anti-phase coordination pattern was seen by flexing one wrist while extending the other. Therefore, not only was the motion of the hands spatially mapped to the motion of the dots, but the visual information about coordination was also consistent with the kinesthetic information about coordination. In the “Info + Spatial −” and “Info − Spatial +” conditions, the base of the joysticks was set horizontal, facing the ceiling; participants grasped the joysticks as usual (fingers around the stick) and moved them in the frontal plane (left–right movement). The coordination pattern was demonstrated by two moving dots in the top–bottom configuration. In the “Info + Spatial −” condition, the spatial mapping between the motion of the dots and the motion of the hands was altered, so that the in-phase coordination pattern was seen on the screen when the two hands moved simultaneously inward/outward (flexing/extending), and the anti-phase coordination pattern was seen when one hand moved inward (flexing) while the other moved outward (extending). Therefore, the visual information about coordination was consistent with the kinesthetic information about coordination. In the “Info − Spatial +” condition, the direction of hand movement was spatially mapped to the direction of dot movement on the screen horizontally, so that the in-phase or anti-phase coordination patterns were seen on the screen by moving the two hands simultaneously in the same or opposite directions, respectively. Since one hand was flexing while the other was extending when the two hands moved in the same direction (left/right), the visual information about coordination was inconsistent with the kinesthetic information about coordination.
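To make the three mappings concrete, the sketch below expresses them as sign conventions between joystick displacement and on-screen dot displacement. The function name and sign conventions are our illustration of the manipulation just described, not the study’s actual code; note also that the joysticks moved in the sagittal plane in the first condition and in the frontal plane in the other two.

```matlab
function [dotL, dotR] = mapJoysticksToDots(joyL, joyR, group)
% Illustrative sketch of the display mapping in each group (assumed sign
% conventions for illustration only; not the study's actual code).
switch group
    case 'Info+Spatial+'    % sagittal joystick motion shown directly as vertical dot motion
        dotL = joyL;   dotR = joyR;
    case 'Info+Spatial-'    % one axis mirrored, so simultaneous flexion/extension of the
        dotL = joyL;   dotR = -joyR;   % two wrists appears on screen as visual in-phase
    case 'Info-Spatial+'    % direct left-right mapping, so visual in-phase requires one
        dotL = joyL;   dotR = joyR;    % wrist to flex while the other extends
    otherwise
        error('Unknown group label');
end
end
```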

Fig. 3

An illustration of the task groups. The “Info + Spatial +” group had “consistent information with consistent spatial mapping”, in which the dots exactly simulated the movements of the hands. The “Info + Spatial −” group had “consistent information with inconsistent spatial mapping”, in which the relative direction of the dots was opposite to the relative direction of the hands. The “Info − Spatial +” group had “inconsistent information with consistent spatial mapping”, in which the dots exactly simulated the movements of the hands. Participants grasped the joysticks using a palm grip; however, the head part of the joystick was grasped in the “Info + Spatial +” condition, while the stick part was grasped in both the “Info + Spatial −” and “Info − Spatial +” conditions

Data analysis

To examine whether the produced movement frequencies deviated from the demanded frequencies, the produced movement frequencies were extracted for each participant, and a mixed-design analysis of variance (ANOVA) was performed on the produced frequencies, treating task condition/group (“Info + Spatial +”, “Info + Spatial −”, “Info − Spatial +”) as a between-subject variable and prescribed movement frequency (low, self-selected, high) as a within-subject variable.

For the movement data, the two 60-Hz position time series from each trial were filtered using a second-order low-pass Butterworth filter with a cut-off frequency of 10 Hz and numerically differentiated to yield velocity time series. These were used to compute a time series of relative phase, the key measure of coordination between the two hands. We then computed the proportion of time-on-task (PTT) to assess bimanual coordination performance (both accuracy and stability) over the course of a trial. Specifically, we computed the proportion of each continuous relative phase time series (trial) that fell within the target phase ± a 20° tolerance. A mixed-design ANOVA was performed on the mean PTT data to examine the effects of movement frequency (low, high, self-selected), target phase (visual in-phase, visual anti-phase), and task condition/group (“Info + Spatial +”, “Info + Spatial −”, “Info − Spatial +”), as well as their interactions. In addition, we computed the distribution of relative phase between 0° and 180° using 20° bins for each participant, trial, target phase, and frequency. Visual examination of the relative phase distributions revealed how the stability of the target phase changed with movement frequency.
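A minimal sketch of this computation is given below (an assumed implementation for illustration; the function name and the phase-plane normalization are ours, not the authors’ MATLAB toolbox).

```matlab
function PTT = proportionTimeOnTask(posL, posR, targetPhase)
% Sketch of the relative-phase and PTT computation described above
% (assumed implementation; posL/posR are 60-Hz position series from one
% trial, targetPhase is 0 or 180 degrees).
fs = 60;
[b, a] = butter(2, 10/(fs/2));                 % 2nd-order low-pass, 10-Hz cutoff
xL = filtfilt(b, a, posL);   xR = filtfilt(b, a, posR);
xL = xL - mean(xL);          xR = xR - mean(xR);       % center each series
vL = gradient(xL) * fs;      vR = gradient(xR) * fs;   % numerical differentiation
% Phase of each oscillator as the normalized phase-plane angle
phiL = atan2(vL / max(abs(vL)), xL / max(abs(xL)));
phiR = atan2(vR / max(abs(vR)), xR / max(abs(xR)));
relPhase = rad2deg(phiL - phiR);                          % continuous relative phase (deg)
err = abs(mod(relPhase - targetPhase + 180, 360) - 180);  % angular distance to target
PTT = mean(err <= 20);                                    % proportion within the 20-deg tolerance
end
```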

For the oxygen consumption data, VO2 (ml/kg/min), RR in breaths per minute, RER, and HR in beats per minute were measured at rest and immediately after each trial. VO2 quantifies the energy demand of physical tasks. RR and HR were expected to increase as the energy demand (oxygen consumption) increased (Burton, Stokes, & Hall, 2004), but they vary substantially between individuals. RER was derived as the ratio VCO2/VO2, where VCO2 is the measure of carbon dioxide production (American College of Sports Medicine, 2016). Although RER reliably provides information about which energy substrate (i.e., fat or carbohydrate) is utilized to fuel physical activity, people may slow or hold their breath when completing certain tasks even when instructed to breathe normally (Brookings, Wilson, & Swain, 1996; Carroll, Turner, & Hellawell, 1986; Carroll, Turner, & Rogers, 1987). Thus, we averaged the oxygen consumption (VO2) over the last four trials for each participant in each condition and performed a mixed-design ANOVA on the mean VO2 to examine the effects of movement frequency (low, high, self-selected), target phase (visual in-phase, visual anti-phase), and task condition/group (“Info + Spatial +”, “Info + Spatial −”, “Info − Spatial +”), as well as their interactions.

Finally, to determine the relationship between coordination performance and energy consumption, a Spearman’s rank-order correlation was performed on the ranked PTTs and VO2 values, collapsed across target phases, at each frequency separately for each group.
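For illustration, the correlation for one group at one frequency can be obtained as follows (a sketch with placeholder data; the variable names are assumptions).

```matlab
% Sketch: Spearman rank-order correlation between PTT and VO2 for one group
% at one frequency (placeholder data; 10 participants x 2 target phases = 20 values).
meanPTT = rand(20, 1);   % placeholder mean PTT values
meanVO2 = rand(20, 1);   % placeholder mean VO2 values (ml/kg/min)
[rho, p] = corr(meanPTT, meanVO2, 'Type', 'Spearman');
```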

The statistical significance level for all ANOVAs, the corresponding post hoc analyses, and the correlations was set at α = 0.05.

Results

Produced frequency

The mean produced frequencies are shown in Table 2. The ANOVA revealed only a significant effect of prescribed movement frequency (F(2,54) = 454.42, p < 0.001, ηp² = 0.94), and the post hoc analysis indicated that the produced frequencies differed significantly among the three prescribed levels (all p’s < 0.001).

Table 2 Mean produced frequency (mean ± SD) at each prescribed movement frequency, separated by task conditions/groups

Performance (mean PTT)

As can be seen in Fig. 4, participants in all groups generally spent most of the time at the visually perceived target phases at the low and self-selected frequencies, with the in-phase pattern outperforming the anti-phase pattern. However, at the high frequency, the “Info + Spatial +” and “Info + Spatial −” groups lost stability (i.e., less time at the target phase and more time at non-target phases) in producing the visually perceived anti-phase pattern, while the “Info − Spatial +” group lost stability in producing the visually perceived in-phase pattern.

Fig. 4

Relative phase distributions for each group at the low (0.50 Hz), self-selected (measured as 1.08 Hz), and high (2.50 Hz) frequencies in producing the visually defined in-phase (0° relative phase) and anti-phase (180° relative phase) patterns. Blue bars represent the distribution of relative phase at the low frequency, orange bars at the self-selected frequency, and grey bars at the high frequency. Fluctuation was observed when producing kinesthetic anti-phase at the high frequency: the kinesthetic anti-phase pattern became unstable but had not yet switched to in-phase

The three-way mixed-design ANOVA on mean PTTs yielded main effects of group (F(2,27) = 5.19, p < 0.05, ηp² = 0.28), frequency (F(2,54) = 53.45, p < 0.001, ηp² = 0.66), and target phase (F(1,27) = 18.74, p < 0.001, ηp² = 0.41). As seen in Table 3 and Fig. 5a, performance of the visually perceived in-phase pattern was, in general, significantly better (more accurate and stable) than that of the visually perceived anti-phase pattern (visual in-phase: 0.81 ± 0.14; visual anti-phase: 0.76 ± 0.15; p < 0.001). As revealed by post hoc Tukey HSD tests, performance of the “Info + Spatial +” group was significantly better than that of the “Info + Spatial −” group (“Info + Spatial +”: 0.82 ± 0.10; “Info + Spatial −”: 0.76 ± 0.17; p < 0.01), and performance at the low and self-selected frequencies was better than that at the high frequency (low: 0.84 ± 0.06; self-selected: 0.82 ± 0.08; high: 0.70 ± 0.21; both p’s < 0.001).

Table 3 Mean proportion of time-on-task (PTT ± SD) at each movement frequency, separated by task conditions/groups
Fig. 5

Mean proportion of time-on-task as a function of Group, Phase, and Movement Frequency. a The performance of each group on the visually defined intrinsic bimanual coordination patterns. The visual and kinesthetic intrinsic patterns were identical for the “Info + Spatial +” and “Info + Spatial −” groups, since they had consistent information; thus, visual in- and anti-phases were equal to kinesthetic in- and anti-phases for these two groups. b The performance of the “Info − Spatial +” group on the kinesthetically defined intrinsic patterns. *Significant difference between frequencies; †significant difference between phases

Significant two-way interactions (group-by-frequency, F(4,54) = 3.77, p < 0.01, ηp² = 0.22; group-by-target phase, F(2,27) = 69.25, p < 0.001, ηp² = 0.84) and a three-way interaction (group-by-frequency-by-target phase, F(4,54) = 74.03, p < 0.001, ηp² = 0.85) were also detected.

The significant three-way interaction indicated that performance varied depending on group, target phase, and frequency. According to the HKB model, movement frequency should impact anti-phase performance more than in-phase performance. Therefore, we examined the effect of frequency at each target phase for each group by performing simple main effect analyses, followed by post hoc Tukey HSD tests. For the “Info + Spatial +” and “Info + Spatial −” groups, a significant frequency effect was detected only when performing the visual anti-phase coordination (“Info + Spatial +”: F(2,108) = 5.30, p < 0.01, ηp² = 0.10; “Info + Spatial −”: F(2,108) = 81.11, p < 0.001, ηp² = 0.60). Specifically, for the “Info + Spatial +” group, performance at the low frequency was significantly better than at the high frequency (p’s < 0.001), and for the “Info + Spatial −” group, performance at the low and self-selected frequencies was better than at the high frequency (p’s < 0.001), with no difference found between the first two. In contrast, for the “Info − Spatial +” group, a frequency effect was detected only when performing the visual in-phase coordination (F(2,108) = 92.12, p < 0.001, ηp² = 0.63), where performance at the high frequency was significantly poorer than at the low and self-selected frequencies (both p’s < 0.001), with no difference detected between the latter two. These results suggest that as long as the visual and kinesthetic information are consistent (as in the “Info + Spatial +” and “Info + Spatial −” groups), increasing movement frequency destabilizes the anti-phase coordination; however, when the visual and kinesthetic information are inconsistent (as in the “Info − Spatial +” group), maintaining the visual in-phase (not anti-phase) coordination becomes extremely challenging at the high frequency.

Since in-phase coordination is generally more accurate and stable than anti-phase coordination, as stated in the HKB model, we then examined the effect of target phase (stability of coordination) at each frequency for each group by performing simple main effect analyses followed by post hoc Tukey HSD tests. For the “Info + Spatial +” group, performance of the visually perceived in-phase coordination remained better than that of the visually perceived anti-phase coordination across all frequencies (low: F(1,81) = 12.86, p < 0.001, ηp² = 0.14; self-selected: F(1,81) = 13.03, p < 0.01, ηp² = 0.14; high: F(1,81) = 27.12, p < 0.001, ηp² = 0.25). The “Info + Spatial −” group performed the visually perceived in-phase coordination better than the visually perceived anti-phase coordination only at the self-selected and high frequencies (self-selected: F(1,81) = 16.71, p < 0.001, ηp² = 0.17; high: F(1,81) = 199.48, p < 0.001, ηp² = 0.71), with no difference between target phases detected at the low frequency (p > 0.05). For the “Info − Spatial +” group, no difference was detected between performance of the visually perceived in- and anti-phase coordination patterns at the low and self-selected frequencies (p > 0.05); however, a significant phase difference was observed at the high frequency, with the visual anti-phase coordination being more accurate and stable than the visual in-phase coordination (F(1,81) = 187.24, p < 0.001, ηp² = 0.70).

It should be noted that the visual and kinesthetic information were consistent in the “Info + Spatial +” and “Info + Spatial −” groups; therefore, the visually perceived in- and anti-phase were equivalent to the kinesthetically perceived in- and anti-phase, respectively, in these two groups. However, this was not the case for the “Info − Spatial +” group. With the visual–kinesthetic inconsistency and the spatial mapping of the movements (motion of the dots and motion of the joysticks/hands), the visually perceived in-/anti-phase were actually the kinesthetically perceived anti-/in-phase, respectively. Therefore, we relabelled the data using the kinesthetically perceived coordination and performed a separate two-way ANOVA for the “Info − Spatial +” group, treating the newly labelled target phase (kinesthetic in-phase, kinesthetic anti-phase) and frequency (low, self-selected, high) as within-subject variables (see Fig. 5b). The results showed that the kinesthetically perceived in-phase coordination was significantly better and more stable than the kinesthetically perceived anti-phase coordination only at the high frequency (kinesthetic in-phase: 0.88 ± 0.07; kinesthetic anti-phase: 0.46 ± 0.12; F(1,27) = 193.92, p < 0.001, ηp² = 0.88). Finally, the group difference at each frequency for the kinesthetically perceived in- and anti-phase was compared using simple main effect analyses. No group difference was detected for the kinesthetically perceived in-phase at any frequency (all p’s > 0.05); however, a group difference was found at the high frequency for the kinesthetically perceived anti-phase (F(2,81) = 38.61, p < 0.001, ηp² = 0.49), with the “Info + Spatial +” group outperforming (p < 0.001) the other two groups (no difference between the latter two, p > 0.05).

Energy consumption (mean VO2)

The energy consumption data generally corroborated the movement data. As shown in Table 4 and Fig. 6a, the ANOVA yielded a main effect of frequency (F(2,54) = 98.41, p < 0.001, ηp² = 0.78), showing that more energy was consumed at the high frequency than at both the self-selected and low frequencies (low: 4.36 ± 0.72; self-selected: 4.75 ± 0.88; high: 6.16 ± 1.27; both p’s < 0.001). A significant two-way interaction of group-by-target phase (F(2,27) = 10.79, p < 0.001, ηp² = 0.44) and a three-way interaction of group-by-frequency-by-target phase (F(4,54) = 7.29, p < 0.001, ηp² = 0.35) were also detected.

Table 4 Mean VO2 (ml/kg/min) ± SD at each movement frequency separated by task conditions/groups
Fig. 6

Mean VO2 (ml/kg/min) as a function of Group, Phase, and Movement Frequency. a The VO2 level of each group at the visually defined intrinsic bimanual coordination patterns. The visual and kinesthetic intrinsic patterns were identical for the “Info + Spatial +” and “Info + Spatial −” groups, since they had consistent information; thus, visual in- and anti-phases were equal to kinesthetic in- and anti-phases for these two groups. b The VO2 level of the “Info − Spatial +” group at the kinesthetically defined intrinsic patterns. *Significant difference between frequencies; †significant difference between phases

The effect of target phase at each frequency for each group was likewise examined by simple main effect analyses with post hoc Tukey HSD tests. The “Info + Spatial +” group showed a phase difference only at the high frequency (F(1,81) = 5.63, p < 0.05, ηp² = 0.06), with more oxygen consumed to perform the visually perceived in-phase than anti-phase coordination. The “Info + Spatial −” and “Info − Spatial +” groups showed significant phase differences at all frequencies (“Info + Spatial −”: low, F(1,81) = 5.31, p < 0.05, ηp² = 0.06; self-selected, F(1,81) = 7.79, p < 0.01, ηp² = 0.09; high, F(1,81) = 34.37, p < 0.001, ηp² = 0.30; “Info − Spatial +”: low, F(1,81) = 3.91, p = 0.05, ηp² = 0.05; self-selected, F(1,81) = 5.80, p < 0.05, ηp² = 0.07; high, F(1,81) = 66.88, p < 0.001, ηp² = 0.45). The “Info + Spatial −” group consumed more energy performing the visually perceived anti-phase than in-phase coordination across all frequencies, while the “Info − Spatial +” group consumed more energy performing the visually perceived in-phase than anti-phase coordination across all frequencies.

Finally, we again relabelled the data using the kinesthetically perceived coordination and ran a two-way ANOVA for the “Info − Spatial +” group only, treating the newly labelled target phase (kinesthetic in-phase, kinesthetic anti-phase) and frequency (low, self-selected, high) as within-subject variables (see Fig. 6b). The results showed that more energy was consumed when performing the kinesthetically perceived anti-phase than in-phase coordination (kinesthetic in-phase: 4.42 ± 0.85; kinesthetic anti-phase: 5.32 ± 1.45; F(1,9) = 37.28, p < 0.001, ηp² = 0.81). More energy was also spent when performing at the high frequency than at either the low or self-selected frequency (both p’s < 0.001), with no difference between the latter two frequencies (p > 0.05). As for group differences, the simple main effect analysis detected a significant group effect only at the high frequency, showing that energy consumption was relatively higher for the “Info + Spatial +” group in producing the kinesthetically perceived in-phase coordination (compared with the “Info + Spatial −” group, p < 0.05; compared with the “Info − Spatial +” group, p < 0.001). No other group difference was detected, suggesting that, other than when performing the kinesthetic in-phase coordination at the high frequency, all groups spent the same level of energy producing both intrinsic coordination patterns. A possible explanation for why the “Info + Spatial +” group showed higher energy consumption than the other two groups at the high frequency is that more energy is required to move the joysticks in the sagittal plane at this speed. Compared to moving the joysticks in the frontal plane, more trunk and lower-body movement was observed when participants moved the joysticks vertically without lifting their elbows, which could reflect an anatomical constraint.

Correlation between performance and energy consumption

As demonstrated in Table 5, a significant negative correlation was found for the “Info + Spatial −” and “Info − Spatial +” groups at the high frequency (the former: rs(20) = −0.59, p < 0.01; the latter: rs(20) = −0.69, p < 0.01), and no correlation was found for the “Info + Spatial +” group at any frequency (p > 0.05). These results suggest that accurate and stable coordination performance at the high frequency is associated with lower energy consumption for both the “Info + Spatial −” and “Info − Spatial +” groups; however, this relationship becomes weaker when visual and kinesthetic information are kept consistent with the addition of spatial mapping.

Table 5 Spearman’s rank-order correlation between ranked mean proportion of time-on-task and ranked mean VO2 (ml/kg/min)

Discussion

The study of bimanual coordination suggests that two intrinsic coordination patterns exist as attractors of coordination performance, with the in-phase coordination being more stable and less affected by increasing movement frequency than the anti-phase coordination. However, the perception of intrinsic coordination patterns can be ambiguous, because different frames of reference are used to define them in the visual and kinesthetic domains. Thus, it remained unknown whether visually or kinesthetically perceived information is used to maintain the intrinsic coordination patterns.

The current study evaluated the stability of intrinsic bimanual coordination patterns as well as the associated energy consumption at various movement frequencies by manipulating the consistency between visual and kinesthetic information in a computer-joystick bimanual task. The results showed that the kinesthetic information was largely used to maintain the stability of intrinsic coordination patterns at high movement frequency, which could be an energy-conserving solution. However, spatial mapping alone seemed to be beneficial for keeping the visually perceived in-phase and anti-phase coordination patterns equally stable at low movement frequency, and spatially mapping the visual information to be consistent with kinesthetic information greatly enhanced the stability of anti-phase coordination.

Visual and kinesthetic information co-exist in bimanual coordination

When performing coordinated movements, both visual and kinesthetic information about relative phase are available for maintaining the coordination. Previous studies have shown that visual information presented in Lissajous figures (e.g., Kovacs, Buchanan, & Shea, 2009; Kovacs & Shea, 2011) or moving dots (e.g., Wilson, Snapp-Childs, & Bingham, 2010; Wilson, Snapp-Childs, Coats, & Bingham, 2010) enables people to learn a novel coordination pattern, and kinesthetic information provided by a mechanical manipulandum (e.g., Wilson, Bingham, & Craig, 2003) or a human coach (e.g., Ren et al., 2015; Zhu et al., 2017) is equally effective for learning the novel coordination pattern. However, as conjectured by Bingham, Snapp-Childs, and Zhu (2018), the information used to perform coordination movements could be different in the visual and kinesthetic modes (being modality specific), and when visual and kinesthetic information are both provided for learning, people seem to prefer using the salient visual information while neglecting the kinesthetic information (Zhu, Mirich, Huang, Snapp-Childs, & Bingham, 2017).

Recently, Huang, Dai, and Zhu (2019) attempted to direct people’s attention to both visual and kinesthetic information in learning a novel coordination pattern. Superior learning was observed when people were directed to focus on visual information before focusing on kinesthetic information. The researchers contended that visual information is more salient and easier to discriminate than kinesthetic information; therefore, the early focus on visual information helped to establish the coordination pattern and then cross-train kinesthesis. Conversely, the limitation of an early focus on kinesthetic information could be attributed to confusion in using either the muscular constraint or the egocentric frame of reference to interpret the perceived kinesthetic information.

Although learning a novel coordination pattern such as 90° relative phase is difficult, the information in the visual or kinesthetic mode is unambiguous, because the coordination movement can be interpreted either as moving in the same direction half of the time and in the opposite direction the other half (visually), or as flexing/extending the limbs simultaneously half of the time and alternately the other half (kinesthetically). However, this is not the case for performing the intrinsic coordination patterns. As demonstrated in Fig. 1, the visually and kinesthetically perceived information can be completely inconsistent, and thus ambiguous for the actor in maintaining the intrinsic coordination patterns. Which information is relied on to maintain the stability of coordination remained to be determined. The current study provides a possible answer: at low frequency, either visual or kinesthetic information can be used to maintain the intrinsic coordination patterns; at high frequency, however, kinesthetic information is predominantly used to maintain the coordination, which might be an energy-conserving solution.

Importance of information consistency and spatial mapping

Spatial mapping plays an important role in determining the consistency between visual and kinesthetic information in bimanual coordination. In the current study, the spatial mapping was deliberately manipulated to keep visual and kinesthetic information consistent or inconsistent. When the motion of the dots on screen was spatially mapped to the motion of the hands/joysticks, the use of allocentric visual information for the control of bimanual coordination was promoted. The results showed that when visual and kinesthetic information were kept consistent (as in the “Info + Spatial +” and “Info + Spatial −” groups), the superior stability of in-phase over anti-phase coordination was maintained. Increasing movement frequency impacted only the anti-phase coordination, making it harder to maintain. This finding supports both the HKB model (Haken et al., 1985) and the perceptually driven dynamical model of bimanual coordination (Bingham, 2004a, b), suggesting that bimanual coordination is a perception–action task that follows natural law. Nevertheless, some benefit of spatial mapping beyond keeping the information consistent was seen. The performance of anti-phase coordination was significantly better in the “Info + Spatial +” group than in the “Info + Spatial −” group, and its performance was not correlated with energy consumption at all, indicating that the spatial mapping helped to alleviate the cost of maintaining the kinesthetically perceived anti-phase coordination as the movement frequency increased.

The stability of coordination changed when the spatial mapping was kept but the information was made inconsistent (as in the “Info − Spatial +” group). Both intrinsic coordination patterns were performed equally well at the lower frequencies, but the kinesthetically perceived anti-phase coordination lost its stability at the high frequency even though it was visually perceived as in-phase, suggesting that spatial mapping can help to alleviate the cost of maintaining the kinesthetically perceived anti-phase coordination up to a point in the spectrum of movement frequency, beyond which the kinesthetic information takes over the control of the coordination. The switch between different sources of information for the control of bimanual coordination as movement frequency increases might be similar to the switch of gait pattern as locomotion speed increases (Hoyt & Taylor, 1981; Diedrich & Warren, 1995), in that the previously used information (or gait pattern) is no longer reliable for supporting the current task demand while keeping energy expenditure at a minimum. This is highly plausible when visual and kinesthetic information co-exist in performing the bimanual coordination task.

Dynamical use of information for control of bimanual coordination

Perceptual-motor learning and control of bimanual coordination entail the integration of information from different modalities (e.g., vision and kinesthesis). The perceptually driven dynamical model (Bingham, Zaal, Shull, & Collins, 2001; Bingham, 2004a, b; Snapp-Childs, Wilson, & Bingham, 2011) suggests that coordinated rhythmic movement is perceptually mediated by information about relative phase. This information is the relative direction of motion between the two oscillators, the detection of which is affected by relative speed. In-phase is distinctive and stable because it represents the phase relation in which the relative direction is always identical. With the relative speed of in-phase being consistently zero, the ability to resolve the relative direction between oscillators is preserved as movement frequency increases. Correspondingly, anti-phase represents the phase relation with opposite relative directions. Since the relative speed of anti-phase ranges from zero to maximally different, the relative direction of movement becomes impossible to distinguish at high frequency; thus, anti-phase loses its stability and eventually switches to the in-phase pattern.
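This frequency dependence can be made concrete with a simple worked example (our illustration, not taken from the model’s original formulation). For two oscillators of amplitude A and frequency ω,

$$x_1(t) = A\sin(\omega t), \qquad x_2(t) = A\sin(\omega t - \phi), \qquad
\dot{x}_1 - \dot{x}_2 =
\begin{cases}
0, & \phi = 0^\circ \ (\text{in-phase}) \\
2A\omega\cos(\omega t), & \phi = 180^\circ \ (\text{anti-phase}),
\end{cases}$$

so the relative speed is always zero at in-phase but oscillates between 0 and 2Aω at anti-phase, growing in proportion to movement frequency and progressively degrading the resolvability of relative direction.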

In light of the perceptually driven account of bimanual coordination, the visually perceived information should be salient and easy to use for the control of coordination. As long as the relative direction can be detected (at low frequency), the visually perceived in-phase coordination should remain stable even when it is kinesthetically perceived as anti-phase. However, this strategy does not persist at high frequency. The detection of relative direction becomes more difficult and eventually impossible when the two displayed dots oscillate at high speed. Therefore, the use of visual information for the control of coordination becomes unreliable and energy inefficient. To maintain the coordination pattern and save energy, it becomes imperative to switch to the alternative information, that is, the kinesthetically perceived relative phase. In fact, to the authors’ knowledge, this is the first study showing that more energy was consumed to maintain the visually perceived in-phase coordination when it was destabilized by high frequency.

Evidence for the control mechanism by which kinesthetic in-phase outperforms kinesthetic anti-phase has conventionally stemmed from the neural crosstalk theory of bimanual coordination (e.g., Marteniuk, MacKenzie, & Baba, 1984; Swinnen, 2002; Swinnen, Young, Walter, & Serrien, 1991). The control of bimanual coordination entails information exchange between the hemispheres through the corpus callosum, with each hemisphere controlling both contralateral and ipsilateral arm movements. Kinesthetic anti-phase coordination is less stable because it requires the co-activation of non-homologous muscle groups (e.g., one arm flexes while the other extends), whereby the limbs receive discordant motor commands that inhibit one another during movement planning and execution. Such a neuromuscular deficiency in the control of kinesthetic anti-phase coordination can be offset by salient visual information guiding the limb movements at low movement frequency, especially with spatial mapping. However, the control of coordination will restore the superior stability of kinesthetic in-phase at high frequency for two possible reasons that await confirmation by future studies: first, the reduced resolution of visual information makes it unreliable for control, and second, it might be neuromuscularly more efficient for the brain to issue concordant motor commands that co-activate homologous muscle groups for simultaneous flexion and extension.

In sum, the current study indicates that the use of visual or kinesthetic information for the control of bimanual coordination appears to be a function of movement frequency. At a relatively low movement frequency (below the threshold), either visually or kinesthetically perceived information can be used to control the coordination, and the visual–spatial information may be salient and relatively easy to access for learning and controlling anti-phase or other non-in-phase coordination patterns. When the movement frequency increases beyond the threshold, the visual–spatial information becomes unreliable, creating a condition in which it is more energy efficient to switch to the kinesthetic information for the control of coordination.