Introduction

Treatment of Parkinson’s disease (PD) is gaining importance because of better and earlier diagnosis and rising life expectancy in the general population (Berg et al. 2013). Elective hospital admissions due to PD are rare worldwide. Some health care insurance systems, e.g. in Germany, reimburse treatment of PD patients in specialized units (Müller et al. 2004), provided that certain preconditions are met. Payers require expertise in PD and careful documentation of anti-parkinsonian drug titration. They also ask for additional so-called “activating” therapies for at least 7.5 h in total per week. In turn, health insurers pay a daily rate for this in-patient stay, which is uncommon in the case-based reimbursement of the German diagnosis-related groups system. Generally, two approaches exist for evaluating the effect of treatment in these PD units. On the one hand, one may use clinical rating scales, which predominantly estimate motor function and non-motor symptoms. Their execution varies from one examiner to another. Scoring is relatively insensitive to subtle changes and is biased by the rater’s subjective impression of the patient (Fahn et al. 1987; Chaudhuri and Martinez-Martin 2008; Müller and Woitalla 2010; Müller 2014). On the other hand, standardized quantitative instrumental procedures enable a more objective assessment (Maetzler et al. 2016). However, these techniques are sometimes complex to handle. They often reflect tremor, rigidity and akinesia only to a certain extent and neglect the influence of additional non-motor symptoms, such as apathy, on task performance. To date, data on the efficacy of these in-patient stays are rare, particularly concerning non-motor features of PD (Müller 2014).
Here, we describe outcomes of a standardized evaluation of motor and non-motor symptoms by clinical rating in conjunction with more objective instrumental and standardized clinical tests, performed initially and at the end of an elective hospital stay (Müller et al. 2000). This combination of subjective, investigator-biased scoring with more objective standardized examination techniques, which focus only on certain components of PD, is a rarely used approach for such an investigation (Müller et al. 2004; Müller and Woitalla 2010). Currently, rather complex techniques are being developed to determine, for instance, fluctuations of motor behavior in PD patients. However, the information that these still experimental techniques provide to the treating clinician is sometimes difficult to interpret, because the amount of presented data demands elaborate evaluation. Moreover, these new instrumental assessment tools predominantly focus on motor behavior (Maetzler et al. 2016). In contrast, one may ask whether simpler, easy-to-perform assessment tools would provide the same merit for clinicians (Maetzler et al. 2016). Our objectives were to discuss the suitability and clinical value of the applied rating scales in relation to the employed evaluation techniques and to demonstrate the efficacy of this kind of in-patient treatment in an open-label fashion.

Subjects and methods

Subjects

We included 126 consecutively referred in-patients [age: 68.02 ± 0.86 (mean ± SEM) years; male: 78, female: 48; Hoehn and Yahr stage 3.01 ± 1.06; duration of hospital stay: 21.22 ± 2.02 days; Table 1] in this trial. All fulfilled clinical diagnostic criteria for idiopathic PD.

Table 1 Comparison between both assessments

Design

All PD patients received a standardized setting with physiotherapy, massage, speech therapy, occupational therapy, etc. These supplemental therapies lasted 7.5 h per week at a minimum. Five hours had to be performed as individual therapy, in other words on a one-to-one basis between the patient and the therapist.

Clinical rating

Scoring was performed with the Unified Parkinson’s Disease Rating Scale (UPDRS), the non-motor symptom assessment scale for PD (NMSS) and the non-motor screening questionnaire (NMSQuest). The UPDRS was designed to provide a measure of signs and symptoms of PD in clinical practice and research. The scale assesses six domains of PD impairment using a combination of data gathered from interview (part I: mental behavior, part II: activities of daily living, part IV: complications of therapy) and direct examination of the patient (part III: motor examination) in combination with the Hoehn and Yahr Scale, which roughly subdivides severity of PD into various stages (Fahn et al. 1987).

Similar to the UPDRS, the non-motor symptom assessment scale for PD (NMSS) was executed by a board-certified neurologist (Chaudhuri et al. 2007). The NMSS is a standardized interview that consists of nine parts. They focus on cardiovascular symptoms including falls (domain 1), sleep/fatigue (domain 2), mood/cognition (domain 3), perceptual problems/hallucinations (domain 4), attention/memory (domain 5), functions of the gastrointestinal tract (domain 6) and the urinary tract (domain 7), sexual behavior (domain 8) and further miscellaneous items (domain 9).

The non-motor screening questionnaire (NMSQuest) was developed for patients as a self-rating tool, which focuses only on the presence of non-motor features within a yes or no paradigm (Siderowf and Werner 2001; Chaudhuri et al. 2006).

Standardized assessments

We used two instrumental procedures, peg insertion and tapping (Müller et al. 2000). They target the function of the upper limbs and demand a certain cognitive load, whereas the well-known Timed Up and Go test (TUG) focuses on balance and walking abilities (Zampieri et al. 2010). We present no data from matched controls in this within-subject comparison, because significant differences from normal controls have been described in previous studies (Müller et al. 2000; Zampieri et al. 2010). We allowed all participants to familiarize themselves with the tasks for 60 s. This approach was intended to minimize learning and training effects on the repeated performance of these tests (Müller et al. 2000).

Peg insertion

We asked subjects to transfer 25 pegs (diameter 2.5 mm, length 5 cm) individually and as quickly as possible from a rack into one of 25 holes (diameter 2.8 mm) in a computer-based contact board. The distance between rack and appropriate holes was 32 cm. The board was positioned in the center and the task was carried out on each side. When transferring each peg from rack to hole, elbows were allowed to be in contact with the table. We measured the time interval between insertion of the first and the last peg, initially with the right and then with the left hand. A computer recorded the time for this task with an accuracy of 100 ms. The peg insertion result represented the time of the task performance with the right and left hand in seconds (Müller et al. 2000).
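The timing rule described above can be illustrated with a minimal sketch. It assumes that the contact board delivers one timestamp in milliseconds per inserted peg; the function name `peg_insertion_time_s` and the sample timestamps are hypothetical and do not stem from the original software. How the results of both hands were combined is not specified in the text, so the sketch scores one hand at a time.

```python
def peg_insertion_time_s(insertion_times_ms, resolution_ms=100):
    """Time between the first and the last peg insertion, in seconds.

    insertion_times_ms: one timestamp (ms) per inserted peg (assumed input format).
    resolution_ms: timer accuracy; 100 ms is the value stated in the text.
    """
    if len(insertion_times_ms) < 2:
        raise ValueError("at least two insertion timestamps are required")
    elapsed_ms = max(insertion_times_ms) - min(insertion_times_ms)
    # Quantize to the timer resolution before converting to seconds.
    quantized_ms = resolution_ms * round(elapsed_ms / resolution_ms)
    return quantized_ms / 1000


# Hypothetical timestamps for three pegs of one hand (not study data).
right_hand_ms = [0, 1250, 47360]
print(peg_insertion_time_s(right_hand_ms))
```

The same function would be applied first to the right and then to the left hand, mirroring the order of testing.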

Tapping

Individuals tapped as quickly as possible on a contact board (3 cm × 3 cm) with a contact pencil for a period of 32 s after the initial flash of a yellow stimulus light. We did not control for peak height reached by the pencil. The board was positioned in the center while the task was carried out on each side. When performing the task, elbows were allowed to be in contact with the table. A computer recorded the number of contacts. We measured the frequency of tapping first with the right and then with the left hand. The tapping rate represented the computed sum of the tapping results of both hands (Müller et al. 2000).
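The scoring of the tapping task may be sketched as follows. The sketch assumes that contact times are available in seconds relative to a common clock; the function names `count_taps` and `tapping_rate` and the sample values are hypothetical and serve illustration only.

```python
TAPPING_WINDOW_S = 32.0  # task duration after the stimulus light, as stated in the text


def count_taps(contact_times_s, stimulus_time_s, window_s=TAPPING_WINDOW_S):
    """Count contacts that fall within the task window after the stimulus."""
    window_end = stimulus_time_s + window_s
    return sum(1 for t in contact_times_s if stimulus_time_s <= t <= window_end)


def tapping_rate(right_taps, left_taps):
    """Tapping rate as the sum of the results of both hands."""
    return right_taps + left_taps


# Hypothetical contact times in seconds (not study data).
right = count_taps([0.5, 1.0, 10.0, 31.9, 33.0], stimulus_time_s=0.0)
left = count_taps([0.7, 15.2, 30.0], stimulus_time_s=0.0)
print(tapping_rate(right, left))
```

Contacts outside the 32 s window, such as the one at 33.0 s in the example, do not contribute to the score.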

Timed up and go test (TUG)

This simple test is used to assess a person’s mobility. Test execution requires balance, both the static and the dynamic component. TUG measures the time that a person takes to rise from a chair, walk 3 m, turn around, walk back to the chair, and sit down (Zampieri et al. 2010).

Design

A board-certified neurologist rated severity of PD with the UPDRS and the NMSS, and technicians performed the standardized assessments on the first [initial] and last day [end] of the hospital stay under identical conditions. Fluctuating PD patients were only evaluated in their “ON”-state or in what they themselves defined as their “ON”-state.

Statistics

Data showed a normal distribution according to the Kolmogorov–Smirnov test. We used ANCOVA with a repeated-measures design, including sex, Hoehn and Yahr stage and age as covariates, for comparisons. p values below 0.05 were regarded as significant throughout this explorative analysis. Correlation analysis was performed with the Pearson product–moment correlation. Only correlation coefficients >0.25 were considered significant in view of the number of participants and performed correlations. Only total scores were included in the analysis.
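The correlation step may be illustrated with a short sketch of the Pearson product–moment coefficient combined with the 0.25 cut-off. The sketch is not the statistical software used in this study. The helper names `pearson_r` and `passes_threshold` are hypothetical, and applying the cut-off to the absolute value of the coefficient is an assumption, since the text only states ">0.25".

```python
import math

R_THRESHOLD = 0.25  # cut-off stated in the text


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient of two equally long series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length with at least two values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


def passes_threshold(r, threshold=R_THRESHOLD):
    """Assumption: the magnitude of r is compared, so inverse relations also count."""
    return abs(r) > threshold


# Hypothetical total scores of five patients (not study data).
updrs_total = [42, 35, 58, 27, 49]
nmss_total = [60, 41, 75, 30, 52]
r = pearson_r(updrs_total, nmss_total)
print(round(r, 2), passes_threshold(r))
```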

Ethics

All subjects gave written informed consent. The medical association was notified of this investigation according to §4 Abs. 23 Satz 3 AMG. It was classified as non-interventional and thus observational, because all performed evaluations are part of routine monitoring in the treatment of PD patients.

Results

Comparisons

Rating by neurologists demonstrated that all UPDRS scores improved, including the outcomes of the individual parts I–IV (Table 1). The NMSS scores decreased with the exception of domains 7 and 8 (Table 1). Self-rating of the non-motor symptoms by the patients showed a reduction of the NMSQuest outcomes (Table 1). The time needed for the peg insertion task decreased. The tapping score did not increase significantly. The time needed for the TUG decreased (Table 1).

Correlations

Table 2 shows significant correlations between the various rating scale outcomes at moments “initial” and “end”. One compelling result was the positive relation (R = 0.39) between the computed differences of both evaluation moments for the total UPDRS and NMSS scores (Fig. 1). Notably, the evaluations of non-motor symptoms by patients and by physicians were related to each other (Table 2, lines 6, 7, 12–15). Generally, the peg insertion results (Table 2, lines 16, 17, 20, 21, 25, 27–37) and the TUG outcomes (Table 2, lines 4, 5, 10, 11, 18, 19, 22, 23, 31–33, 35–37, 40, 41, 43, 44) showed more significant correlations to the other applied assessment tools than the tapping procedure (Table 2, lines 38–42). In summary, these significant associations also reflect the value of a three-dimensional evaluation by rating scales, instrumental procedures and standardized testing.
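The analysis of computed differences may be sketched as follows. The direction of the difference (end minus initial) is an assumption, as the text does not state it; the correlation is unaffected as long as both scales use the same convention. The function names and all values are hypothetical and do not represent study data.

```python
import math


def change_scores(initial, end):
    """Per-patient difference between both assessments (assumed: end minus initial)."""
    if len(initial) != len(end):
        raise ValueError("both assessments must cover the same patients")
    return [e - i for i, e in zip(initial, end)]


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient (compact helper)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical total scores of three patients at both moments.
updrs_delta = change_scores([40, 35, 50], [30, 36, 45])
nmss_delta = change_scores([60, 40, 55], [45, 42, 50])

# With this convention, a negative value means a lower (better) score at the end
# and a positive value means deterioration.
print(updrs_delta, nmss_delta)
print(pearson_r(updrs_delta, nmss_delta) > 0)
```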

Table 2 Correlation analysis
Fig. 1 Correlation analysis on computed differences of the UPDRS total score and of the NMSS total score. UPDRS Unified Parkinson’s Disease Rating Scale, NMSS non-motor symptom assessment scale for PD

Discussion

This investigation provides some insight into the efficacy of in-patient stays in a specialized unit for the treatment of PD patients. The employed combination of subjective and objective evaluation tools shows that the reduction of the various UPDRS, NMSS and NMSQuest scores, the decline of the TUG outcomes and, to a certain extent, the results of the two applied instrumental procedures may reflect the achieved benefit for PD patients. The standardized use of easy-to-handle, inexpensive objective instrumental tools supplements the evaluation with subjective rating scales (Müller et al. 2000). We suggest that this concept represents a suitable approach for the evaluation of the treatment benefit. These real-world data may further convince payers of the economic and therapeutic efficacy of in-patient stays in PD.

The reported improvement of PD symptoms, reflected by the decline of UPDRS, NMSS and NMSQuest scores, is distinctly superior to the clinical benefit observed in various short-term and long-term controlled trials with selected, at least partially optimally titrated patients in earlier and later stages of PD. Nowadays these phase IV trials only change one component of the applied therapy. Our outcomes indicate that a multifactorial treatment concept with implementation of supplemental non-pharmacological therapies will provide a more distinct amelioration of disease severity.

Interestingly, the employed tapping procedure, which shares some similarities with the item “finger tapping” within part III of the UPDRS, showed no significant change, in contrast to the peg insertion paradigm. Peg insertion in the applied form depends on various kinds of movements, requires a more complex sequence of movements and demands visuospatial cognition, self-elaboration of internal strategies, sorting and planning. All these efforts are influenced by the modulatory role of striatal dopamine levels on association areas of the prefrontal cortex (Müller et al. 2000). Peg insertion therefore additionally demands dopamine-dependent cognitive processes. We suggest that this is why peg insertion reflected the improvement of PD symptoms better than the tapping task (Müller et al. 2000). The employed tapping paradigm only asks for repetitive performance and programming of standardized movements. It imposes a low cognitive load, as the subject may develop a fixed habit that consistently saves attentional and cognitive resources once a certain sequence of movements has been learned, which is based on an automatic function of a cognitive set (Müller et al. 2000).

Figure 1 reveals that PD symptoms also deteriorated in some patients. Moreover, the significant correlation in Fig. 1 indicates that both the NMSS and the UPDRS may reflect treatment effects despite their different focus on certain aspects of PD. The reported correlations of Table 2 underline the value of TUG and peg insertion in relation to the rating scales. The reported associations with the NMSQuest demonstrate that self-rating by patients may also represent a valuable instrument to reflect disease severity.

One limitation of this trial is the lack of rater blinding, which was not feasible for obvious technical reasons. Here, a certain bias due to patient-to-investigator bonding is likely. We also did not perform a concomitant scoring by caregivers and/or spouses before and after a fixed time period in the domestic surroundings for further evaluation of the long-term benefit of the hospital stay. One may also assume certain placebo effects in view of the upcoming discharge from the hospital at the moment of the second assessment. Despite the standardized training session with the applied instrumental tasks, we cannot exclude an impact of the repetition of the instrumental tests, with putative learning effects as confounders. Therefore, we suggest that this kind of research warrants additional future studies, which compare the efficacy of drug titration performed in an out-patient and/or in-patient setting. However, the design and performance of such a trial appear to be rather complex, since careful titration in an out-patient setting is not well reimbursed in the German health care system.

In conclusion, this pilot investigation with its explorative descriptive statistical analysis demonstrates the benefit of in-patient treatment in a specialized PD unit for PD patients.