Introduction

Ideomotor apraxia (IMA) is a deficit in the execution of voluntary motor programs that cannot be attributed to damage to primary motor or sensory areas, poor comprehension of task instructions, impaired object recognition or frontal inertia [1]. It affects approximately one-third of left-hemisphere (LH) stroke patients, independently of stroke type, age and gender [2], and often co-occurs with other severe cognitive deficits such as aphasia. IMA affects the performance of both known and new gestures, typically on imitation, but also when gestures are elicited through other modalities (e.g., on verbal command or visual presentation of objects). It differs from ideational apraxia, which refers to a loss of the conceptual representation of a known gesture [3].

The imitation deficits are explained on the basis of a dual-route model (originally proposed by [4], and developed by [5–8]), which assumes two pathways for transforming the visual input (the gesture to be imitated, performed by the examiner) into a motor act (the gesture performed by the patient) (Fig. 1). If, after visual analysis, the gesture is recognized, i.e., it belongs to the motor repertoire of the individual, it is processed via the “semantic route” (which enables imitation of known gestures only). If the gesture is new, after visual processing it is decomposed into simpler components, which are held in working memory until they are physically reproduced (“direct route” [5, 7]).

Fig. 1 Modified version of the dual-route model for action imitation proposed by Rumiati and Tessari [7]. After early visual processing, shared by both routes, known gestures automatically activate the semantic route, using information stored in long-term memory (LTM). By contrast, new gestures are imitated via the direct route, which decomposes the seen gesture into smaller motor components that are stored in working memory (WM) until they are reproduced. The LTM–WM connections allow learning of new gestures.

Regardless of gesture type, lesions of inferior parietal cortex, subcortical structures, and premotor cortex in the LH are most frequently associated with IMA. Cortical lesions tend to be associated with sequence errors, body-part-as-a-tool errors or unrecognizable gestures, while subcortical lesions tend to be associated with postural or timing errors [9–14; see 13 for a review]. Right-handed individuals with LH damage show IMA of both upper limbs [3]. However, the right limb is often plegic, so IMA is usually tested only with the left limb. The anatomo-functional correlates of IMA have been analyzed in brain-damaged patients with selective deficits in imitating known or new gestures [8, 12, 14] and in neuroimaging research on healthy individuals performing both gesture types. The two routes are associated with separate brain areas: the semantic route mainly relies on LH areas (inferior temporal, parahippocampal, and angular gyri); the direct route includes a more extensive network of cortical areas (i.e., superior parietal cortex bilaterally, right parieto-occipital/occipito-temporal junctions and left superior temporal cortex [12, 15, 16]). Moreover, the composition of the list of actions to be imitated (new and known gestures intermixed in the same list vs. presented in separate lists) plays a role [5, 12, 17, 18]. With mixed lists, the direct route is used for imitating both types of action; with separate lists, the semantic route is selected for imitating known gestures and the direct route for new gestures [12, 18]. This strategy allows the participant to minimize the number of switches between the two routes, hence reducing cognitive load [18].

The most widespread tests for IMA [3, 19, 20] can detect severe ideomotor deficits. However, they were not standardized to identify selective or disproportionate damage to one of the two routes, which would be critical for tailoring the rehabilitation technique to each specific patient (see [21] for a review of rehabilitation approaches). Patients with direct-route damage are impaired at learning new gestures by imitation, even though in a domestic context they can properly use objects and tools. By contrast, patients with semantic-route damage can learn new motor skills, but are impaired in a domestic context, because they cannot retrieve motor information associated with known objects. Hence, identifying these two patient types would considerably improve the effectiveness of rehabilitation programs.

New IMA batteries have been proposed (e.g., [19, 20]) that evaluate gesture recognition, identification and production in detail. However, their administration time is usually so long that they are best used in a post-screening phase, after patients have received an IMA diagnosis. Some of the tests (e.g., [23]) require gesture production only on verbal command, thus providing ambiguous information (most LH patients have language comprehension deficits). Additionally, some of these tests do not analyze the known/new dissociation, nor the distinction between distal (fingers and hand) and proximal (arm) components of gesture production, which relate more to grasping and reaching, respectively [24]. Yet distal and proximal components show different vulnerability after brain damage [8, 24–31].

Aim of the study

We propose a new short IMA test for use in the screening phase, which can distinguish (1) direct-route from semantic-route deficits, and (2) deficits of the proximal vs. distal movement components. This would support fast and accurate IMA diagnosis and classification of patients, allowing for tailored rehabilitation. Longer, in-depth assessment might then be performed with ad hoc batteries (e.g., [19, 20, 32]).

Method

Participants

We recruited 111 participants (55 females; age = 60.2 ± 15.5 years, range 30–84; education = 9.8 ± 4.04 years, range 4–20; see Footnote 1). Inclusion criteria were: (1) age 30–90 years; (2) no anamnestic or clinical evidence of neurological disease, head trauma, psychiatric disorders requiring pharmacological intervention, alcoholism or drug addiction; (3) right-handedness according to the Edinburgh Test [33]. Each participant signed a statement of informed consent.

Procedure

Ten experts not directly involved in the research project selected 18 known (easily recognizable) gestures and 18 new (non-recognizable) gestures (see Appendix). Half the known gestures mainly involved the hand (e.g., OK sign), while the others mainly involved the arm (e.g., military salute). Known and new gestures were presented in separate blocks, known gestures first, to prevent the participant from selecting the direct route as a default strategy.

The examiner was previously trained by an investigator through a demonstration video (http://www.sissa.it/cns/Videos/Imitation%20test.avi); the video is for examiner training only. During the test, it is recommended that the examiner stand next to the patient, so that the proximal gestures can be performed easily, and perform the distal new gestures with his/her hand resting on a table. The examiner presented each stimulus up to two times, demonstrating each gesture with his/her right hand, and the participant imitated it in a mirror fashion with his/her left limb (Footnote 2). Participants were instructed to imitate the gesture in a mirror-like configuration and to pay attention to the exact position of both hand and arm, so as to reproduce their position correctly relative to other body parts and to each other.

Correct imitation on first presentation was granted 2 points. If a participant failed to reproduce the gesture correctly on first presentation, the examiner presented it a second time; correct imitation after second presentation was granted 1 point. A double failure was scored 0. The maximum test score was 72 (36 gestures × 2 points). Each participant’s performance was videotaped and later analyzed by a second, independent judge (A. Tessari, who watched the video-recorded performance of all participants). If the examiner and the second judge disagreed, the participant was excluded from the study (only 1 participant, out of an original sample of 112, was excluded).
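The scoring rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not software used in the study: the function names `score_gesture` and `total_score`, and the representation of each trial as a pair of booleans, are assumptions introduced here.

```python
from typing import Iterable, Optional, Tuple


def score_gesture(first_ok: bool, second_ok: Optional[bool] = None) -> int:
    """Score one gesture: 2 if correct on the first presentation,
    1 if correct only on the second presentation, 0 after a double failure.

    `second_ok` may be None when no second presentation was needed.
    """
    if first_ok:
        return 2
    return 1 if second_ok else 0


def total_score(trials: Iterable[Tuple[bool, Optional[bool]]]) -> int:
    """Sum the gesture scores over a list of (first_ok, second_ok) trials."""
    return sum(score_gesture(first, second) for first, second in trials)
```

With 36 gestures all imitated correctly on first presentation, `total_score` returns the maximum of 72.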

After the imitation task, each participant was asked to recognize the 18 known gestures. This will be critical for telling pre-semantic/semantic from post-semantic deficits in patients: impaired recognition with intact imitation of known gestures suggests a pre-semantic or semantic deficit along the semantic route; impaired imitation with intact recognition of known gestures would suggest post-semantic damage (Fig. 1).

The test normally takes 2–3 min for a non-apraxic person. It can take up to 4–5 min when administered to severely apraxic patients.

Statistical methods

Collinear predictors, distribution shapes and statistical models

Education showed the correlation profile expected from social change in Italy over the last decades: age and education were anticorrelated (Spearman’s ρ = −.473, p < .001), and women showed a slightly lower education level than men (Mann–Whitney z = 2.42, p = .016), an effect driven by the oldest individuals. Hence collinearity affected our demographic predictors. To disentangle their effects on imitation performance, we had to introduce them simultaneously in a single analysis. We used a generalized linear model (GzLM) with a Tweedie distribution (1.5) and a log link function. A Tweedie GzLM can accurately model markedly non-normal score distributions: on our test, most scores lay at or close to ceiling, with a long tail towards lower values (skewness ranged from −1.08 to −1.99 across subscales; kurtosis from 1.47 to 5.00). After detecting the significant predictors, we modeled their effects on the scores, hence providing correction equations and tables. Overall, the procedure was as follows.

  1. We computed MaxScore minus Score, i.e., the number of points lost (so that ceiling values became 0 and all values were non-negative, a necessary condition for the Tweedie model). We then applied the GzLM to identify critical predictors with a backward selection technique: in a first step, age, education and gender were entered in the analysis; variables surviving a p < .05 threshold (one-tailed in the expected direction for age and education, two-tailed for gender) entered a second step, and so on, until only predictors with p < .05 survived (Table 1).

    Table 1 Generalized linear model results for the effects of age, education, gender

  2. Scores were corrected for the predictors surviving step 1 (i.e., only age, in all cases). We fitted a two-parameter quadratic model, raw score R = i + q(Age − 30)², with i = intercept and q = coefficient of the quadratic component, and derived corrected scores C_age30 referred to the minimal age in the sample (30 years). The linear component was omitted because it was not significant (see “Results”).

  3. We tested whether the corrected scores C_age30 were really independent of the other predictors (education and gender); if so, the corrected scores were used as the final standardization outcome; if not, a further second-level correction was applied. In both cases, correction equations and tables were provided.

We repeated this procedure separately for the overall score (0–72), for the subscales known (0–36), new (0–36), proximal (0–36), distal (0–36) gestures, and for the four atomic subscales known proximal (0–18), known distal (0–18), new proximal (0–18), new distal (0–18) gestures.
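The relationship between the four atomic subscales and the composite scales listed above can be made explicit with a short Python sketch. It assumes, consistently with the stated ranges, that each composite scale is the sum of two atomic subscales and that the overall score is the sum of all four; the names `AtomicScores` and `composite_scores` are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class AtomicScores:
    """Raw scores on the four atomic subscales (each in the range 0-18)."""
    known_proximal: int
    known_distal: int
    new_proximal: int
    new_distal: int


def composite_scores(a: AtomicScores) -> Dict[str, int]:
    """Derive the four 0-36 subscales and the 0-72 overall score
    by summing the relevant atomic subscales (assumed additive)."""
    values = (a.known_proximal, a.known_distal, a.new_proximal, a.new_distal)
    if any(not 0 <= v <= 18 for v in values):
        raise ValueError("each atomic subscale score must lie in 0-18")
    return {
        "known": a.known_proximal + a.known_distal,
        "new": a.new_proximal + a.new_distal,
        "proximal": a.known_proximal + a.new_proximal,
        "distal": a.known_distal + a.new_distal,
        "overall": sum(values),
    }
```

Together with the four atomic subscales, these five derived values give the nine scales to which the standardization procedure was applied.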

Results

All participants recognized each and every “known” gesture (100 % accuracy). Imitation performance was analyzed as detailed in the following paragraphs (Footnote 3).

Meaning and body segment effects

Between-subscale differences showed close-to-normal distributions (skewness ranged from 0.04 to 0.90, kurtosis from −0.37 to 3.70), so paired-samples t tests were used. Main effects of meaning [known vs. new gestures, t(110) = 8.178, p < .001] and body segment [distal vs. proximal, t(110) = 4.836, p < .001] were found, with a significant interaction [t(110) = 4.702, p < .001]. Post hoc tests showed no body-segment effect within known gestures [t(110) = 1.205, p = .231], while such an effect appeared within new gestures [t(110) = 5.745, p < .001]. Meaning had a significant effect for both proximal [t(110) = 3.425, p < .001] and distal [t(110) = 8.528, p < .001] gestures, although it was markedly larger in the latter. The overall profile is visible in Fig. 2: proximal and distal gestures were imitated at a similar level when they were known (mean proximal = 17.04 vs. mean distal = 16.85); when gestures were new, distal gestures were imitated worse than proximal ones (mean proximal = 16.47 vs. mean distal = 15.15). The advantage of known over new gestures was clear for both proximal (17.04 vs. 16.47) and distal (16.85 vs. 15.15) gestures.

Fig. 2 Mean performance (SE) of 111 participants (0–18 scale) as a function of gesture meaning (known–new) and body segment (proximal–distal)

Overall score (0–72): model and correction table

Age was the only significant predictor of the overall score on GzLM analysis (see Table 1). The score drops with age (Fig. 3). When fitting a standard second-order polynomial, a significant quadratic component was detected [t(108) = 2.268, p = .025], without a significant linear component [t(108) = 1.503, p = .136]. This non-linear pattern was not due to ceiling (72/72, achieved by seven young individuals), as excluding an identical proportion of top-scoring individuals from the older age classes did not change the profile [quadratic: t(84) = 2.165, p = .033; linear: t(84) = 1.428, p = .157].

Fig. 3 Overall scores (range 0–72) by the 111 participants as a function of age. Curves show percentiles 3.1 (the boundary between equivalent scores 0 and 1), 10.7 (between 1 and 2), 26.8 (between 2 and 3), and median (between 3 and 4) according to the quadratic model detailed in the text

We implemented a model with only intercept i and quadratic q components: raw score R = i + q(Age − 30)². Given that variance increases with age [the four age classes 30–46, 47–62, 63–73, 74–84 yielded a significant Levene F(3, 107) = 3.126, p = .029], we included a linear link between intercept i (= performance at age 30) and quadratic decrement q, to account for this variance increase. The final equation providing an age-corrected score, standardized for age = 30, was:

$$C_{\text{age30}} = \left[ {{\text{Raw score}} + 0.02068\left( {{\text{Age}} - 30} \right)^{2} } \right]/\left[ {1 + \left( {{\text{Age}} - 30} \right)^{2} /3936} \right]$$
$${\text{If }}C_{{{\text{age}}30}} > 72,{\text{ make it}} = 72.$$
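As a worked illustration, the correction equation for the overall score can be written as a small Python function. Rearranging the equation gives Raw score = C + (C/3936 − 0.02068)(Age − 30)², i.e., the quadratic coefficient depends linearly on the age-30 level C, which is the linear link between i and q mentioned above. The function name `c_age30` and its argument names are assumptions introduced here; the constants are those of the printed equation and apply to the overall score only.

```python
def c_age30(raw_score: float, age: float, max_score: float = 72.0) -> float:
    """Age-corrected overall score, standardized at age 30.

    Implements C = [raw + 0.02068 (age - 30)^2] / [1 + (age - 30)^2 / 3936],
    capped at the maximum test score. The normative sample covered ages
    30-84, so values outside that range are extrapolations.
    """
    d2 = (age - 30.0) ** 2
    corrected = (raw_score + 0.02068 * d2) / (1.0 + d2 / 3936.0)
    return min(corrected, max_score)
```

For a 30-year-old the correction leaves the raw score unchanged; for a 60-year-old with a raw score of 60, the corrected score is approximately 63.98; a raw score of 72 at age 84 exceeds the ceiling after correction and is therefore set to 72.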

This age-corrected score correlated with neither education (ρ = .119, p = .215) nor gender (Mann–Whitney z = 1.42, p = .156). Hence, no further correction was needed. Table 2 gives the raw scores corresponding to the 5th, 10th, 25th, 50th and 75th percentiles, and to equivalent scores [34] 0–4, given the patient’s age.

Table 2 Overall score correction (range 0–72). Raw score and age are the entries, and percentiles and equivalent scores are the output
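The percentile boundaries between equivalent scores reported in the caption of Fig. 3 (3.1, 10.7, 26.8 and the median) can be turned into a simple lookup. The sketch below is illustrative: the function name `equivalent_score` is hypothetical, and assigning a percentile that falls exactly on a boundary to the higher equivalent score is an assumption, since the text does not specify how ties are handled.

```python
from bisect import bisect_right

# Percentile boundaries between equivalent scores 0|1, 1|2, 2|3 and 3|4,
# as given in the caption of Fig. 3 (the last one is the median).
ES_BOUNDARIES = (3.1, 10.7, 26.8, 50.0)


def equivalent_score(percentile: float) -> int:
    """Map a percentile rank (0-100) to an equivalent score 0-4.

    Assumption: a percentile exactly on a boundary is assigned to the
    higher equivalent score.
    """
    if not 0.0 <= percentile <= 100.0:
        raise ValueError("percentile must lie between 0 and 100")
    return bisect_right(ES_BOUNDARIES, percentile)
```

In practice the percentile itself would be read from the age-specific correction tables, which are not reproduced in this sketch.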

Subscales

For the sake of consistency, we applied the same general model used for the overall score to all subscale scores. Table 3 reports the fitted quadratic models. With one exception, the age-corrected C_age30 scores did not correlate with education or gender, so the models were taken as the final ones. The exception was C_age30 of the known distal subscale, which correlated with education. This effect was modeled by a simple linear regression, leading to a further correction: the final score for this subscale, C_age30/ed20, is corrected both for age (standardized at 30) and education (standardized at 20).

Table 3 Best quadratic models, residual correlations of corrected scores with education and gender, and further corrections are reported for all STIMA subscales

Correction tables with equivalent scores and percentiles for all eight subscales (known, new, proximal, distal, known proximal, known distal, new proximal, new distal gestures) are reported in the supplementary material. This makes all subscales ready to use in clinical practice without using the complex correction formulae reported in Table 3.

Discussion

The purpose of this study was to provide a new, short test for detecting IMA deficits that specifically affect the imitation of known or new gestures, or of gestures involving different body segments.

Results showed that known gestures are imitated more accurately than new gestures, and that gestures involving proximal segments are imitated better than those involving distal segments. These two sources of difficulty interact: new gestures involving distal segments are over-additively difficult (Fig. 2). Unlike gender, age has a significant impact on all subscales; education had a marginal effect on one subscale (distal known gestures).

We also estimated and subtracted the effects of age on all subscales, and provided tables for converting raw scores into equivalent scores [34] and percentiles. While equivalent scores have a well-known meaning in clinical practice, the fifth percentile is conventionally accepted as the cut-off for diagnosis in research. A patient whose score on imitation of known gestures is below the 5th percentile is likely to have a damaged semantic route, while a patient failing at imitating new gestures is likely to have an impaired direct route. We also provided equivalent scores and percentiles for distal and proximal movements, as a large literature has shown their sensitivity to different anatomical lesions [8, 25, 28, 30, 31, 35–37].

Since STIMA presents known and new gestures in separate blocks, it should generally be a more sensitive detector of dissociations between the two types of gestures than other tests presenting known and new actions in mixed lists (e.g., [3, 20]). With mixed lists, participants are likely to rely on the direct route only, as this can imitate both gesture types, thus avoiding the cognitive load of frequently switching between the two routes. However, this strategy would swamp any experimental difference between known and new gestures. By contrast, separate-blocks presentation minimizes the cognitive load (no switch is required within each block), hence prompting the use of one route in each condition: the semantic route for known gestures and the direct route for new gestures [5, 12, 16–18].

Further advantages of STIMA over other tests are that it is quick to administer (which makes it usable in the bedside screening phase) and that it includes a differential evaluation of body segments (distal vs. proximal).

Longitudinal studies (e.g., [22]) show that IMA rehabilitation is necessary, since the spontaneous recovery rate is only 50 %. An accurate diagnosis of the specific aspects underlying IMA is critical for choosing appropriate rehabilitation programs. The correct identification of the damaged imitation process provided by STIMA enables the different stakeholders (psychologists, physiotherapists, speech therapists, doctors) to tailor the rehabilitation procedure to the individual patient. For example, if damage mainly lies in the direct route, the patient cannot learn new gestures by imitation: the rehabilitator may exploit the (relatively intact) repertoire of gestures that are already known by the patient. Here, the “substitutive” method, in which spared capacities stand in for the compromised function through alternative compensation strategies (e.g., [38]), is appropriate. If, on the contrary, the semantic route is more damaged, the patient is unable to access, retrieve or implement semantic information about known gestures in an appropriate motor program. Here, the rehabilitator may take advantage of the ability to learn by imitation, through the direct route, and try to create a new trace in episodic memory [39] using the “substitutive” [38] or the “restorative” method, in which the lost function is trained to bring its effectiveness as close to pre-morbid levels as possible (e.g., [40]; see [21] for a review).

A previous version of this test has already been used in two large group studies, one with brain-damaged patients (Mengotti et al. [41]) and one with patients with Parkinson’s disease (Bonivento et al. [42]). Stimuli and procedure were exactly the same; the differences lay in the size of the control sample and, especially, in the grain of the statistical analysis, which only reported cut-offs for the total score and for the known and new gestures subscales. In those studies, the test proved sensitive in detecting either a general apraxic deficit or dissociations between known and new gestures. The present version will allow even subtler distinctions, given that it provides nine different scales and, for each of them, age- and education-corrected scores as well as equivalent scores; the correction for demographic variables will increase sensitivity, while the use of equivalent scores will allow an estimation of deficit severity (which single cut-offs do not provide).