
1 Introduction

The clock drawing test (CDT) is a widely used cognitive screening tool that is simple and quick to administer and has been well accepted by both clinicians and patients [1–3]. Its origins can be traced to neurology textbooks, which reported the usefulness of this test as a measure of attention in hemineglect patients [4]. More recently, it has been used to screen for cognitive impairment, primarily in elderly patients [3], but also in a wide range of other neurological and psychiatric disorders, including Alzheimer’s disease [5], Parkinson’s disease [6, 7], Huntington’s disease [8], vascular disease [9, 10], schizophrenia [11–13], stroke [14], and traumatic brain injury [15].

The CDT is a valuable cognitive screening test for both quantitative and qualitative assessment of many cognitive functions, including selective and sustained attention, auditory comprehension, verbal working memory, numerical knowledge, visual memory and reconstruction, visuospatial abilities, on-demand motor execution (praxis), and executive function [2, 16, 17]. The specific abilities falling under the category “executive function” that are assessed by the CDT include abstraction, complex motor sequencing, response inhibition (i.e., resisting the frontal pull of the hands to the “10” in the instruction to set the time at “10 past 11”), and frustration tolerance [2]. Interpretation of the CDT necessitates consideration of the broad range of cognitive functions that are assessed by this test [18]. The ease of use and the wide range of cognitive abilities required to complete the CDT successfully have made this test an increasingly popular cognitive screening measure among researchers and clinicians. A review of recent literature on the CDT, conducted in the PubMed/MEDLINE database for the period December 2011 to February 2016, found 272 peer-reviewed publications containing the keywords “clock drawing test” and 41 articles with “clock drawing test” in the article title.

2 Popularity of CDT

The widespread use of the CDT among clinicians is also evidenced by a number of recent surveys that have investigated the frequency of use of currently available cognitive screening measures among practitioners across a variety of fields. In 2010, Iracleous and colleagues published a survey of the cognitive screening tools that are currently being used by Canadian family physicians [19]. Of the 249 surveys that were completed and returned by members of the College of Family Physicians of Canada (CFPC), the majority of respondents had been in practice for more than 5 years and devoted 40–60 % of their practice to the care of the elderly. Their findings indicated an overwhelming agreement among practitioners that screening is important within the primary care setting and should not be left to specialists. Furthermore, the most frequently used assessment tools were (i) the Mini-Mental State Examination (MMSE) and its variants (76 % of respondents reported using this measure “often” or “routinely”) (see Chaps. 3 and 4), (ii) the CDT (52 %), (iii) the delayed word recall test (52 %), (iv) alternating sequences (13 %), and (v) the Montreal Cognitive Assessment (MoCA; see Chap. 7) (5 %). Of note, however, is that the authors did not report the number of respondents who do not incorporate cognitive screening into their practice and, thus, do not use any of the above tools. As a result, the reported percentages reflect the sample of Canadian family physicians as a whole, rather than just those who conduct cognitive screening on a regular basis. Nevertheless, the findings provide strong support that the CDT is a commonly used, and a well-accepted, cognitive screening measure among Canadian family practitioners.

Milne et al. [20] conducted a survey of primary care practices in South East England to determine what, if any, instruments were being used by clinicians to screen for dementia. Each participating practice was asked to mark which measures they used from a list of common screening tools with space provided to report unlisted measures. Data were obtained from a total of 138 practices. Of those, 79 % reported that they routinely used at least one dementia screening instrument, with 21 % not using an instrument at all. Furthermore, of those who used an instrument, 70 % of practices used one, 26 % used two and only 4 % used more than two instruments. The breakdown of the screening instruments most commonly used was as follows: the MMSE and its variants (51 %), the abbreviated mental test (AMT) (11 %), MMSE and AMT (10 %), MMSE and CDT (8 %), MMSE and the 6-item cognitive impairment test (6-CIT; see Chap. 11) (6 %), and the CDT (5 %). Results from this survey suggest that the CDT is used less often by practitioners in the UK compared to usage rates of Canadian practitioners [19]. However, an earlier survey reported by Reilly, Challis, Burns, and Hughes [21] that sampled only practitioners who were working within old age psychiatry services in England and Northern Ireland found a much higher frequency of usage of the CDT. Their study found that an overwhelming majority (96 %) of the 331 respondents used standardized scales as part of the assessment process for older people with mental health problems in the community. Of the respondents that endorsed the use of standardized scales, the most frequently identified measures were the MMSE (95 %), the Geriatric Depression Scale (52 %), and the CDT (50 %). Thirty-one percent of the respondents used all three of these scales.

Shulman et al. [22] conducted an international survey of geriatric specialists on behalf of the International Psychogeriatric Association (IPA). With the goal of determining which screening tools were routinely used by clinicians with expertise in neuropsychiatric aspects of old age, the survey was mailed to all IPA members as well as members of the American and Canadian Associations of Geriatric Psychiatry. Of the 334 completed surveys, the majority of respondents were geriatric psychiatrists (58 %), followed by general psychiatrists (14 %) and geriatricians (9 %). Just over 50 % of the respondents were from North America, and 62 % indicated that they devoted more than 75 % of their professional practice to the care of the elderly population. The results revealed that only a small number of tests were used by the vast majority of specialists, including MMSE and its variants (100 %), CDT (72 %), delayed word recall (56 %), the verbal fluency test (35 %), similarities (27 %), and the trail-making test (25 %).

The rank order of instruments reported by Shulman et al. [22] overlaps with that found in the primary care setting [23] and suggests that the MMSE is the most frequently used cognitive screening instrument. However, a survey of 155 members of the Canadian Academy of Geriatric Psychiatry (CAGP) and attendees of its 2010 Annual Scientific Meeting suggests that the CDT has increased in popularity in recent years and may have surpassed the MMSE as the favored screening instrument among Canadian psychogeriatric clinicians [24]. Results show that the six screening tools most frequently used “often” or “routinely” by clinicians were the CDT (92.9 %), the MMSE and its variants (91.4 %), the MoCA (80.2 %), delayed word recall (74.6 %), the trail-making test (43.6 %), and verbal fluency (42.9 %). The results of these surveys clearly suggest that the CDT is an increasingly popular instrument among practitioners from a variety of clinical settings.

3 CDT Administration

The CDT provides a user-friendly visual representation of cognitive functioning that is appealing to busy clinicians. The test takes less than 1 min to conduct (compared to 10 min for the MMSE) and appears to have a high level of acceptability by patients [2]. The scoring systems described in this chapter are not all comparable because of differing emphasis placed on visuospatial, executive, quantitative, and especially qualitative issues [25, 26]. Although each scoring system uses slightly different methodologies and instructions for clock drawing, most studies use a pre-drawn circle of approximately 4 in. (10 cm) in diameter [26]. However, some authors feel that there is value in observing patients perform free-drawn circles as this can indicate some degree of impairment [27]. The disadvantage of this method is that if the patient begins by drawing a poor-quality circle, at times merely due to age-related issues such as tremor or visual impairment, the remainder of the test may be compromised [28].

Generally, the test instructions presented verbally to the patient are “This circle represents a clock face. Please put in the numbers so that it looks like a clock and then set the time to 10 min past 11.” The task requires the abstract step of denoting time symbolically using hands, and thus the tester should not use the word “hands” in the instructions [2]. While other times such as 3:00, 8:05, and 2:45 have been used, the 11:10 setting is particularly useful because it involves both visual fields and requires that the patient inhibit the “frontal pull” towards the number ten, an error that is common in even mildly impaired patients [26]. The inclusion of copying and time setting or reading tasks in addition to clock drawing by some authors [29] may help to improve the CDT’s predictive validity, but it also increases administration time and complexity, thereby undermining one of the CDT’s key advantages, its speed of completion [28].

4 CDT Scoring Systems

Table 5.1 presents the properties of the most common scoring methods as well as several measures that were reported in the studies by the authors that developed these scoring systems and in subsequent studies. Figures 5.1 and 5.2 provide examples of typical qualitative errors, and Fig. 5.3 indicates the clinical usefulness of clock drawing for demonstrating change in cognitive functioning. Characteristic errors on the CDT include perseveration; right-left confusion; concrete thinking, especially the tendency to “pull” the minute hand to “10”; and confusion about the concept of time [2].

Table 5.1 Characteristics of Clock Drawing Test scoring systems
Fig. 5.1

Severity scores from 5 to 0 (Reproduced from Shulman [2] with permission from John Wiley & Sons Ltd.)

Fig. 5.2

Errors in denoting 3 o’clock (Reproduced from Shulman [2] with permission from John Wiley & Sons Ltd.)

Fig. 5.3

Sensitivity to deterioration in dementia (Reproduced from Shulman [2] with permission from John Wiley & Sons Ltd.)

In perhaps its first systematic use, Goodglass et al. [30] included the CDT as part of the Boston aphasia battery. Their procedure involved clock setting: the subject was given four pre-drawn clock faces with short lines marked in the positions of the 12 numbers and was asked to denote four different times: 1:00, 3:00, 9:15, and 7:00. One point was awarded for each correctly placed hand and 1 point for correctly drawing the relative lengths of the minute and hour hands, so that a total of 3 points could be achieved for each clock, for a maximum of 12 points on the test. The authors reported that age and education appeared to be influential factors only for subjects who scored in the bottom range on the test.

Shulman et al. [31] compared the CDT to the MMSE [47] and the Short Mental Status Questionnaire (SMSQ) [48] in a sample of 75 older adults with a mean age of 75.5 years. Three groups were included in their study: those with dementia, those with depression, and normal controls. The authors developed a 5-point scale of severity of impairment, based on clinical experience. A score of 1 denoted very minimal error, while a score of 5 was assigned when the subject was unable to make any reasonable attempt to draw a clock. In a subsequent study, this scoring was reversed and 5 points were awarded to a perfectly drawn clock [43]. Shulman’s current practice (see Fig. 5.1) is to assign 5 points for a “perfect” clock, 4 points for a clock with minor visuospatial errors, 3 points for inaccurate representation of 10 past 11 when the visuospatial organization is done well, 2 points for moderate visuospatial disorganization of numbers such that accurate denotation of “ten past eleven” is not possible, 1 point for a severe level of visuospatial disorganization, and 0 points for inability to make any reasonable representation of a clock [2].
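For readers who want to apply this rubric programmatically, a minimal Python sketch is given below. The score descriptors are paraphrased from the scale above; the function name and interface are purely illustrative and are not part of Shulman’s published method.

```python
# Shulman's 0-5 severity rubric as summarized in the text above; this
# encoding is an illustrative sketch, not an official implementation.
SHULMAN_RUBRIC = {
    5: "perfect clock",
    4: "minor visuospatial errors",
    3: "inaccurate representation of 10 past 11, visuospatial organization intact",
    2: "moderate visuospatial disorganization; accurate time denotation not possible",
    1: "severe visuospatial disorganization",
    0: "no reasonable representation of a clock",
}

def describe_shulman_score(score: int) -> str:
    """Return the rubric descriptor for a rater-assigned Shulman score (0-5)."""
    return SHULMAN_RUBRIC[score]

print(describe_shulman_score(3))
```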

Sunderland et al. [33] used a priori criteria to develop a 10-point scoring system with 10 as the highest score and 1 as the lowest score. Five points were awarded for drawing a clock face with numbers correctly placed, while 6–10 points were given for accuracy of drawing hands to denote the time 2:45. An arbitrary cut-off score of 6/10 was considered within normal limits. The authors reported that three out of 83 controls (3.6 %) scored less than 6, whereas 15 out of 67 patients with Alzheimer’s disease (22.4 %) scored more than 6. They also found high inter-rater reliability between clinicians and non-clinicians and high correlation of the CDT with other measures of dementia severity, including the Dementia Rating Scale. A later study by Kirby et al. [49] used this same scoring system while incorporating a more heterogeneous sample of community-dwelling participants. They found that the sensitivity of the CDT in the detection of dementia in the general community was 76 %. The specificities of the CDT against normal elderly and depressed elderly were 81 and 77 %, respectively.

Wolf-Klein et al. [34] compared their clock drawing test to the MMSE [47], Hachinski’s scale [50], and the Dementia Rating Scale [51] in a sample of outpatients being screened for cognitive impairment. Their methods included a pre-drawn circle and ten hierarchical clock patterns that were predetermined by a previous pilot study involving over 300 patients. Their patient groups included healthy normals, those with Alzheimer’s dementia and multi-infarct dementia, and others. A cut-off score of 7/10 reflected normal performance, and a score of less than seven was considered “abnormal.” With a focus on temporoparietal function, they found that scores of 1–6 were specific for Alzheimer’s disease as opposed to multi-infarct dementia or mixed cases.

A simple 4-point scoring system was developed by the Consortium to Establish a Registry for Alzheimer’s Disease (CERAD) [32]. In this method, subjects were instructed to draw a clock by first drawing a circle, then adding numbers and then setting the time to show 8:20. The instructions could be repeated, and if necessary, the subject could be instructed to draw a larger circle. In this system, a score of “0” implied an intact clock, 2 = mild impairment, 3 = moderate impairment, 4 = severe impairment. Thus, any score greater than 0 was considered abnormal for the purposes of classification [52]. The CERAD scoring method was later used by Borson et al. [52], who incorporated the CDT into the “Mini-Cog” battery, which also contains a simple three-word delayed recall memory test. The authors found the sensitivity and specificity for probable dementia were 82 and 92 %, respectively, for the CDT, compared to 92 and 92 % for the MMSE and 93 and 97 % for the Cognitive Abilities Screening Instrument (CASI) [53]. However, the authors noted that in poorly educated non-English speakers, the CDT detected demented subjects with higher sensitivity than the two longer instruments (sensitivity and specificity 85 and 94 % for the CDT, 46 and 100 % for the MMSE, and 75 and 95 % for the CASI). Furthermore, less information was lost due to non-completion of the CDT than the MMSE or CASI (severe dementia or refusal: CDT 8 %, MMSE 12 % and CASI 16 %).

Tuokko et al. [35] developed a unique procedure involving three empirically derived tasks: clock drawing, clock setting, and clock reading. The clock drawing component involved a pre-drawn circle in which the subject was asked to denote “ten past eleven.” Clock setting involved setting five different times, and clock reading involved the same clocks as in clock setting, but in a different order. Errors on clock drawing were classified into the following categories: omissions, perseverations, rotations, misplacements, distortions, substitutions, and additions. Each of the five clock setting items was scored out of a maximum of 3 points, as was each clock reading item, giving a maximum of 15 points per task. Making more than two errors was considered a positive (abnormal) result for clock drawing, while the cut-off for the clock setting and reading tasks was a score of less than 13. Interestingly, errors from four categories (omissions, distortions, misplacements, and additions) were found to contribute significantly to the difference between normal elderly and Alzheimer’s disease patients.

Rouleau et al.’s [8] version of the CDT instructed subjects to “draw a clock, put in all the numbers, and set the hands for ten after eleven.” The participants were also asked to copy a pre-drawn clock. This version was designed to identify the quantitative and qualitative aspects of cognitive impairment in patients with Alzheimer’s disease. The test was scored using a 10-point scale, with lower scores indicating greater cognitive impairment.

Death et al. [36] focused on elderly inpatients seen consecutively in surgical and medical wards at three hospitals in Newcastle, UK. Their CDT protocol involved giving the patient a piece of paper with a 10 cm heavy black circle with a dot in the center printed on it. They were asked to “Imagine this is a clock face. Please fill in the numbers on the clock face.” If, while drawing, a patient spontaneously recognized an error and requested to correct it, he or she was allowed to do so. For scoring, clocks were classified as follows: bizarre (class 1), major spacing abnormality (class 2), minor spacing abnormality or single missing or extra number (class 3), and completely normal (class 4). Clocks in classes 1 and 2 indicated impairment, and those in classes 3 and 4 indicated no cognitive impairment. The authors found that normal clock drawing ability reasonably excluded cognitive impairment or other causes of an abnormal MMSE in elderly acute medical and surgical hospital admissions, a setting where cognitive impairment is often missed.

The clock completion test developed by Watson et al. [37] involved providing patients with a pre-drawn circle and asking them to draw in the numbers on a clock face. Interestingly, in this method, the patients were not asked to draw the hands on the clock, and scoring included only the positioning of the clock numbers. The scoring system divided the pre-drawn circle into four quadrants, assigning greatest weight to the fourth quadrant. An error made in quadrants one, two, or three received a score of 1, and any error in quadrant four (containing the numbers 9–12) received a score of 4. A total score of 0–3 was considered normal, and anything ≥4 was considered abnormal. In the original study, the authors examined a group of patients from a geriatric outpatient assessment clinic and found that the test showed excellent agreement with the Blessed Orientation-Memory-Concentration test [54].
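To make the quadrant weighting concrete, here is a minimal Python sketch of the scoring logic described above. Treating an “error” as any quadrant that does not contain exactly three of the twelve numbers is an assumption of this illustration, and the function name and input format are likewise illustrative rather than part of Watson et al.’s published procedure.

```python
def watson_score(digits_per_quadrant):
    """Clock Completion Test score following the weighting quoted above.

    digits_per_quadrant: four counts of how many of the twelve clock
    numbers were placed in quadrants 1-4 (quadrant 4 should hold 9-12).
    Counting a quadrant as erroneous when it does not contain exactly
    three numbers is an assumption made for this sketch.
    """
    weights = [1, 1, 1, 4]   # quadrants 1-3 weigh 1 point, quadrant 4 weighs 4
    score = sum(w for count, w in zip(digits_per_quadrant, weights) if count != 3)
    return score, ("abnormal" if score >= 4 else "normal")   # 0-3 normal, >=4 abnormal

# Example: numbers crowded away from the 9-12 quadrant
print(watson_score([4, 3, 4, 1]))   # -> (6, 'abnormal')
```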

Manos and Wu [38] developed a “10-point clock test” that included a scoring system utilizing a transparent circle divided into eighths that was applied to the clock drawn by the patient. A maximum of 10 points was awarded for numbers falling into their proper segments and for correctly drawn hands. A difficulty with this method is that some significant errors will not be scored, such as counterclockwise placement of numbers or numbers that are positioned outside the circle. The authors found that a cut-off score of 7 out of 10 identified 76 % of patients with dementia and 78 % of control patients. A later study using the same test attempted to identify mild AD patients (i.e., those with an MMSE score >23) among consecutive ambulatory patients. The author reported a sensitivity of 71 %, compared to 76 % for the original study, which included patients with a mean MMSE score of 20 [55].

A “simple scoring system” (SSS) was developed by Shua-Haim et al. [56]. The authors performed a retrospective chart analysis of a sample of elderly patients in an outpatient memory disorders clinic. Their scoring system was based largely on the visuospatial aspects of the task and the correct denotation of time by the hands, for a maximum of 6 points. A formula relating clock scores to the MMSE was developed using simple linear regression: MMSE = 2.4 × (the clock score) + 12.7. The authors reported that a clock score of zero predicts an MMSE score of <13, whereas a clock score of 6 predicts an MMSE score of ≥27.
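Substituting the extreme clock scores into the regression equation reproduces the reported predictions (a worked check of the arithmetic, not an addition to the published formula):

\[ \widehat{\mathrm{MMSE}} = 2.4 \times 0 + 12.7 = 12.7 < 13, \qquad \widehat{\mathrm{MMSE}} = 2.4 \times 6 + 12.7 = 27.1 \ge 27. \]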

Lin et al. [39] examined a comprehensive scoring system of the CDT in screening for Alzheimer’s disease in a Chinese population in order to derive a simplified scoring system. In this study, the clocks were first scored based on the systems described by Watson et al. [37], Wolf-Klein et al. [34], and Tuokko et al. [35], which involved first dividing the clocks into quadrants using two reference lines – one line through the center and the numeral 12, and then a second line perpendicular to the first one through the clock center. If a numeral was placed on the reference line, it was included in the quadrant clockwise to the line. Thirteen criteria were then scored as correct or incorrect for a maximum total score of 16 (item six received up to 4 points for correct placement of three numerals in each of the four quadrants). The authors then formulated a simple scoring system of only three items (hour hand, number 12, and difference between hands) using a stepwise discriminant analysis to select a minimal set of items from the comprehensive scoring system. The simplified 3-item scoring, with a cut-off score of 2/3, was found to have a sensitivity of 72.9 % and a specificity of 65.6 %. The authors suggest that this simple scoring method can be used as a quick test for AD screening.
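As an illustration of how compact the simplified rule is, the sketch below encodes the three retained items in Python. Interpreting the 2/3 cut-off as “two or fewer correct items constitutes a positive screen” is an assumption of this sketch, and the function name is illustrative rather than taken from Lin et al.

```python
def lin_three_item_score(hour_hand_ok, number_12_ok, hand_length_diff_ok):
    """Simplified 3-item CDT score derived by Lin et al.: one point each
    for a correct hour hand, a correct numeral 12, and a respected size
    difference between the hands.  Reading the published 2/3 cut-off as
    "2 or fewer points = positive screen" is an assumption of this sketch.
    """
    score = sum([bool(hour_hand_ok), bool(number_12_ok), bool(hand_length_diff_ok)])
    return score, ("positive screen" if score <= 2 else "negative screen")

print(lin_three_item_score(True, True, False))   # -> (2, 'positive screen')
```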

Lessig et al. [42] analyzed the scoring systems of Shulman et al. [43], Mendez et al. [16] and Wolf-Klein et al. [34], as well as the CDT system used in the Mini-Cog [52] in order to identify an optimal subset of clock errors for dementia screening. The clock drawings of 364 ethnolinguistically and educationally diverse subjects with ≥5 years of education were analyzed. An algorithm using the six most commonly made errors of inaccurate time setting, no hands, missing numbers, number substitutions or repetitions, and failure to attempt clock drawing detected dementia with 88 % specificity and 71 % sensitivity. A stepwise logistic regression found the simplified scoring system to be more strongly predictive of dementia than the three other CDT scoring systems. Also, substituting the new CDT algorithm for that used in the original version of the Mini-Cog improved the test’s specificity from 89 to 93 % with minimal change in sensitivity.
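A minimal sketch of such an error-based screen is given below. It assumes that the presence of any one of the listed errors triggers a positive screen, which is a simplification of Lessig et al.’s published algorithm; the identifiers are illustrative.

```python
# Error types quoted in the text ("number substitutions or repetitions"
# is kept as a single entry, as in the source).
LESSIG_ERRORS = {
    "inaccurate time setting",
    "no hands",
    "missing numbers",
    "number substitutions or repetitions",
    "failure to attempt clock drawing",
}

def lessig_screen(observed_errors):
    """Return a positive screen if any listed error is present.  Treating
    'any error present' as the decision rule is an assumption of this
    sketch, not a reproduction of the published algorithm."""
    flagged = sorted(LESSIG_ERRORS & set(observed_errors))
    return bool(flagged), flagged

print(lessig_screen({"missing numbers", "tremulous lines"}))
# -> (True, ['missing numbers'])
```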

Babins et al. [41] developed “the 18-point clock-drawing scoring system” based on clinical intuition as well as a literature review. The goal of their system was to enhance the utility of the CDT for recognition and prognostication in mild cognitive impairment (MCI). In this system, errors were grouped into the following major categories: stimulus-bound errors, conceptual deficits, perseverations, visuospatial organization, and planning deficits. Using this scoring system with a sample of 123 retrospectively assessed individuals from a memory clinic in Montreal, the authors found that there were three significant hand items that appeared to be possible early markers of progression to dementia. The items “clock has two hands,” “hour hand is towards correct number” and “size difference of hands is respected” all showed significant differences between progressors and non-progressors. The authors suggested that the 18-point clock drawing scoring system may have advantages in identifying MCI individuals who are more likely to progress to dementia.

In an interesting twist on the standard administration and scoring of the CDT, Royall and colleagues [17] developed a variant of the clock drawing test (CLOX) designed to detect executive impairment and differentiate it from nonexecutive visuospatial failure. This version of the test is divided into two parts to distinguish the executive control of clock drawing from constructional/visuospatial ability. For the first part of the test (CLOX 1), the subject is asked to “draw me a clock that says 1:45. Set the hands and numbers on the face so that a child could read them.” The notion underlying the method for CLOX 1 is that it reflects performance in a novel and ambiguous situation eliciting the executive skills of goal setting, planning, motor sequencing, selective attention, and self-monitoring of a subject’s current action plan. Some of the CLOX 1 instructions are deliberately designed to distract the subject. For example, use of the terms “hand” and “face” has the potential to elicit semantic intrusions because they are more commonly associated with body parts than with elements of a clock. The maximum score for the CLOX 1 test is 15. The second portion of the task (CLOX 2) involves a simple copying task of a pre-drawn clock already set at 1:45. Differences in scores on CLOX 1 and 2 are hypothesized to reflect the executive contribution to the clock drawing test versus visuospatial and constructional ability. The participant’s performance is rated on a 15-point scale (lower scores indicate impairment) on both CLOX 1 and 2. Cut points of 10/15 (CLOX 1) and 12/15 (CLOX 2) represent the fifth percentile for young adult controls. A later study by the same authors found the CLOX test explained more variance in executive control function than other clock drawing tests [57].
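The following minimal Python sketch applies the cut points quoted above to a pair of CLOX scores. Treating scores at or below each cut point as impaired, and the profile labels themselves, are assumptions made for illustration; they follow the rationale in the text but are not Royall et al.’s published decision rules.

```python
def interpret_clox(clox1, clox2, cut1=10, cut2=12):
    """Interpret CLOX scores against the fifth-percentile cut points quoted
    above.  Treating scores at or below the cut point as impaired, and a
    failed CLOX 1 with a preserved CLOX 2 as a predominantly executive
    profile, are assumptions of this sketch."""
    exec_task_failed = clox1 <= cut1   # drawing to command (executive demand)
    copy_failed = clox2 <= cut2        # copying a pre-drawn clock (visuospatial)
    if exec_task_failed and not copy_failed:
        return "suggests executive impairment with preserved visuoconstruction"
    if exec_task_failed and copy_failed:
        return "suggests combined executive and visuospatial impairment"
    if copy_failed:
        return "suggests visuospatial/constructional impairment"
    return "within normal limits on both subtests"

print(interpret_clox(8, 14))
```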

Very recently, Jørgensen et al. [58] attempted to develop a reliable, short, and practical version of the CDT for clinical use. A main goal of their study was to produce a scoring method with high inter-rater reliability, a psychometric characteristic of the CDT that has been found to decline as scoring system complexity increases. In a pilot study, the authors initially produced a 9-item scoring system based on Lin et al.’s [39] 13-item system. Four clinical neuropsychologists who were blind to diagnostic classification then scored clock drawings from 231 participants. The inter-rater agreement of individual scoring criteria was analyzed, and items with poor or moderate reliability were excluded. This produced a 6-item CDT, which was examined to determine its classification accuracy. The authors found that, at a cutoff value of 5/6, the 6-item CDT had a sensitivity of 0.65 and a specificity of 0.80. Furthermore, stepwise removal of up to three items reduced the sensitivity only slightly (i.e., from 0.65 to 0.59). Classification accuracy associated with a score of 4/6 or less was reportedly very high (sensitivity = 0.63, specificity = 0.80).

5 Comparing CDT Scoring Systems

Table 5.2 shows the psychometric properties of the CDT scoring systems as determined by some of the comparison studies discussed in this section. Scanlan et al. [62] examined 80 clock drawings by subjects with known dementia status from four categories (i.e., normal, mild, moderate, and severe abnormality) as defined by the Consortium to Establish a Registry for Alzheimer’s Disease (CERAD). In order to compare dementia detection across scoring systems, an expert rater scored all clocks using published criteria for seven systems: Shulman et al. [31], Morris et al. [32], Sunderland et al. [33], Wolf-Klein et al. [34], Mendez et al. [16], Manos and Wu [38], and Lam et al. [29]. Additionally, 20 naïve raters with no formal instruction judged each clock as either normal or abnormal. The authors found that, when the categorical cut-off points published for each CDT scoring system were used, the overall concordance between the naïve ratings and the different CDT systems was high (86–89 %), with the exception of the Sunderland (73 %) and Wolf-Klein (66 %) systems. When CDT classifications were compared against independent clinical dementia diagnoses, the Mendez system most accurately distinguished demented from non-demented individuals, followed closely by the CERAD system. Naïve raters did not differ from the Manos or Shulman systems but were significantly better than the Lam, Sunderland, and Wolf-Klein systems. The CERAD and Mendez systems were found to be most sensitive in detecting mild and moderate dementia; however, the Wolf-Klein system failed to detect some subjects presenting with severe dementia. Of note, the Wolf-Klein system requires no time setting and disregards mild to moderate number spacing errors, both factors that likely contributed to this system’s poor performance. Interestingly, the authors reported that detection of both MCI and mildly demented subjects was at least two to three times greater than physician recognition for all systems except the Sunderland and Wolf-Klein systems [62].

Table 5.2 Psychometric properties of the Clock Drawing Test

Van der Burg et al. [63] compared the dementia screening performance of two scoring systems, the CERAD system [32, 52] and the Shulman et al. [43] system, to determine whether a somewhat more complex system has clear advantages over a simpler and less time-consuming one. The authors selected the simple 4-point CERAD method because of its user-friendly qualities and the 6-point Shulman system because of its proven diagnostic qualities. A total of 473 drawings was selected from a larger sample of 1199 elderly subjects for whom the presence or absence of dementia was known. Results showed that both scoring systems had good inter-system and inter-rater reliabilities and both correlated equally well with the true diagnosis of dementia. These findings are similar to those of earlier studies by Scanlan et al. [62] and Lin et al. [39], which also concluded that simpler systems were as accurate as more complex ones. The authors concluded that primary care physicians and other health-care providers should be encouraged to use the simpler 4-point scoring checklist, as it is easier to administer and requires less time than the 6-point method [63].

Matsuoka et al. [67] identified brain regions associated with performance on various measures of the CDT using magnetic resonance imaging (MRI) in 36 patients with Alzheimer’s disease, eight with mild cognitive impairment, and four healthy controls. Multiple regression analyses were used to identify relationships between each CDT scoring system (Shulman [2], Rouleau [8], and CLOX 1 [17]) and regional gray matter volume. The authors reported that the CDT scores of the three scoring systems were positively correlated with gray matter volume in various regions in the brain. Furthermore, some brain regions overlapped across the three different scoring systems, whereas other regions showed differences between tests. All three CDT scoring systems were positively correlated with gray matter volume in the right parietal lobe. Furthermore, the Shulman system was positively correlated with gray matter volume in the bilateral posterior temporal lobes, leading the authors to speculate that the Shulman CDT might be useful in detecting the impairment of semantic knowledge and comprehension. The Rouleau CDT score was positively correlated with gray matter volume in the right parietal lobe, right posterior inferior temporal lobe, and right precuneus, suggesting that the Rouleau CDT may detect impairment of visuospatial ability and the retrieval of visual knowledge. Finally, the CLOX 1 score was positively correlated with gray matter volume in the right parietal lobe and right posterior superior temporal lobe, suggesting that the CLOX 1 system may detect impairment in visuospatial ability and sentence comprehension. The authors concluded that distinct brain regions might be associated with CDT performance under different scoring systems and that different scoring and administration systems require different cognitive functions. Thus, rather than using only one scoring system, a combination of CDT scoring systems may cover a wider range of brain functions in dementia screening [67].

Recently, Mainland et al. [68] conducted a literature review of studies published between 2000 and 2013 to synthesize the available evidence on the effectiveness of CDT scoring systems and to recommend which system is best suited for use at the clinical frontlines. The authors found that, despite significant variations in the degree to which the systems emphasize visuospatial and executive functions, the psychometric properties of most systems are remarkably similar. This finding is important when the CDT is used specifically as a dementia screening measure in clinical settings, considering the additional time required to score the more complex systems. The authors concluded that, based on their review of the literature, expert consensus appears to support the notion that “simpler is better” when selecting scoring systems for dementia screening, given the simpler systems’ strong psychometric properties and ease of use. In fact, Scanlan et al. [62] reported that simple judgments of “normal” versus “abnormal” clock drawings by naïve raters provide screening accuracy comparable with published scoring systems when distinguishing demented from non-demented individuals. Further support for the use of simpler scoring methods for the purpose of cognitive screening was provided by Kørner et al. [69], who examined five different scoring systems in a sample of Danish participants and found that, as the predictive values of the scoring systems were nearly identical, the shortest scoring system was preferred.

6 Predictive Validity of CDT

6.1 Normal Aging

Bozikas et al. [70] administered Freedman et al.’s [27] version of the CDT to 223 healthy community-dwelling adults in order to develop norms for the Greek population and to explore the influence of demographic factors (i.e., sex, age, and level of education) on the performance of healthy individuals. The authors found no sex differences in performance but did find that age and level of education contributed to CDT scores. More specifically, they found that more years of education were associated with better performance, while age had a negative contribution. Analysis revealed that the influence of age was due exclusively to the elderly group; for participants under the age of 60 years, age did not influence CDT performance. However, there was a marked decline after 60 and another decline after 70 years of age. The authors suggest that performance on the CDT is resistant to the aging process, at least in the non-elderly. However, they note that future research should establish more reliable norms for the elderly through more extensive sampling of elderly individuals with varying levels of education.

Hershkovitz et al. [71] assessed the relationship between the CDT and rehabilitation outcome in 142 elderly hip fracture patients who scored within the normal range on the MMSE (>23). This retrospective study was performed in a post-acute geriatric rehabilitation center, and patients were divided into two groups according to CDT performance (impaired versus intact) scored using the Watson method [37]. The two groups were compared with respect to age, gender, education level, living arrangement, pre-fracture functional level, and outcome measures. The patients’ functional status was assessed using the Functional Independence Measure (FIM) and the motor FIM [72]. The FIM comprises 18 items, each rated on a scale of 1–7 according to the degree of assistance the patient requires to perform a specific activity across three domains: basic activities of daily living, mobility, and cognitive functioning. Patients’ rate of in-hospital improvement was calculated by comparing admission and discharge FIM scores. Discharge FIM scores were significantly lower for the impaired CDT group (89 vs. 94.9, p = 0.007). Also, length of hospital stay was significantly longer (28.2 vs. 25.3 days, p = 0.033), and the rate of improvement in FIM was significantly slower (0.62 vs. 0.77, p = 0.036) for the impaired CDT group. The authors concluded that the CDT may assist the multidisciplinary team in identifying hip fracture patients whose MMSE scores are within the normal range but who require a longer training period in order to realize their rehabilitation potential.

6.2 Mild Cognitive Impairment

Research examining the CDT’s ability to differentiate between subjects with and without mild cognitive impairment (MCI) is inconsistent [9, 28, 73]. For example, Yamamoto et al. [74] found that the CDT had positive utility for MCI screening, whereas Lee et al. [75] did not recommend the use of the CDT as a screening instrument for MCI. Ehreke et al. [76] speculated that the inconsistent results might be due to the variety of versions of CDT administration and scoring, and thus they compared the utility of different CDT scoring systems for MCI screening using a sample of German subjects aged 75 years and older. Diagnosis of MCI was established according to the criteria proposed by the International Working Group on MCI [77]. These criteria include: (a) absence of dementia according to DSM-IV or ICD-10; (b) evidence of cognitive decline: subjective cognitive impairment (measured by self-rating or informant report) and impairment on objective cognitive tasks, and/or evidence of decline over time on objective cognitive tasks; and (c) preserved baseline activities of daily living or only minimal impairment in complex instrumental functions. The CDT scoring systems examined were those of Sunderland et al. [33], Shulman et al. [43], Mendez et al. [16], Rouleau et al. [8], Babins et al. [41], and Lin et al. [39]. The authors reported significant differences in CDT scores between participants with and without MCI for all scoring systems applied. Furthermore, receiver operating characteristic (ROC) analysis revealed a significant probability of correctly differentiating between subjects with and without MCI for all scoring systems (a 64–69 % probability of MCI subjects receiving a poorer CDT score than subjects without MCI). However, an examination of screening utility indicators (sensitivity and specificity) showed that none of the scoring systems was able to screen reliably for MCI: no cut-off point in any system produced sensitivity above 80 % together with specificity above 60 %, the minimum values recommended by Blake et al. [78]. The scoring system that came closest to these recommended values was that of Shulman et al., which produced 76 % sensitivity and 58 % specificity. The sensitivity and specificity values for the other systems were as follows: Sunderland et al. = 69 and 63 %; Rouleau et al. = 48 and 79 %; Babins et al. = 60 and 70 %; Mendez et al. = 64 and 70 %; Lin et al. = 76 and 49 %. The authors concluded that the CDT, as currently administered, is not a good screening instrument for MCI. However, they suggest that the CDT’s clinical utility in this population could be improved by making the scoring semi-quantitative, widening the score range, and focusing on the clock’s hands and numbers in more detail.
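For reference, the screening-utility indicators discussed here reduce to simple proportions; the sketch below computes them from the four cells of a screening table and checks them against the minimum values attributed to Blake et al. above. The cell counts in the example are hypothetical, chosen only to reproduce the Shulman system’s reported 76 %/58 % values.

```python
def screening_utility(tp, fn, tn, fp, min_sens=0.80, min_spec=0.60):
    """Sensitivity/specificity for one CDT cut-off, checked against the
    minimum values recommended by Blake et al. as cited above.  The counts
    (true/false positives and negatives) are hypothetical inputs."""
    sensitivity = tp / (tp + fn)      # impaired subjects correctly flagged
    specificity = tn / (tn + fp)      # unimpaired subjects correctly passed
    adequate = sensitivity >= min_sens and specificity >= min_spec
    return sensitivity, specificity, adequate

# e.g. counts chosen to match the Shulman system's reported 76 % / 58 %:
print(screening_utility(tp=76, fn=24, tn=58, fp=42))   # -> (0.76, 0.58, False)
```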

Similarly, Beinhoff et al. [79] employed the Shulman [2] scoring system to examine its usefulness in a sample of 232 patients with various degrees of dementia in an outpatient memory clinic in Germany. Using a cut-off point of >1, 86 % of AD patients and 40 % of MCI patients were detected. These authors also concluded that the CDT was useful for the detection of AD, but not for MCI.

Forti et al. [80] examined whether the CLOX [17], both alone and in combination with the MMSE, could be useful as a screening tool for MCI in a sample of 196 elderly individuals seeking medical help for cognitive complaints. The CLOX is a CDT protocol that has been reported to be more sensitive to executive functioning impairment than either the MMSE or several other CDT tasks [57]. Forti et al. employed an extensive screening process in order to subdivide their MCI participants into the following subtypes: amnestic MCI (aMCI), if there was impairment in memory alone; multiple-domain MCI with memory impairment (mMCI), if there was impairment in memory and at least one other cognitive domain; and non-amnestic MCI (naMCI), if there was impairment in one or more non-memory cognitive domains. The study found that, at standard cut-offs, both CLOX subtests had reasonable specificity (CLOX 1 = 72 %, CLOX 2 = 92 %) but unacceptably low sensitivity (CLOX 1 = 54 %, CLOX 2 = 28 %) and low positive likelihood ratios (CLOX 1 = 1.91, CLOX 2 = 3.59) for MCI. Furthermore, using different cut-off scores or combining the CLOX with the MMSE did not result in a statistically significant increase in diagnostic efficiency. Scores for both CLOX subtests were lower in subjects with MCI than in controls, but neither subtest was accurate enough to merit recommendation as a screening tool. As expected, the lowest CLOX scores were found for patients diagnosed with the mMCI subtype, which supports previous findings that, independent of the scoring system used, the greater the severity of cognitive impairment, the better the ability of a CDT task to detect it [28, 81]. The authors concluded that the CLOX, either alone or used in conjunction with the MMSE, is not a useful screening tool for MCI in a clinical setting.

A study by Parsey and Schmitter-Edgecombe [44] used both an established quantitative scoring system and a revised qualitative scoring method based on error criteria developed by Rouleau et al. [8] to examine the sensitivity of the CDT to MCI. For the qualitative component, the authors converted the qualitative errors examined by Rouleau et al. [8] into a quantitative system to increase the speed and practicality of its use while retaining the full scoring criteria. The authors hypothesized that, by retaining a greater number of qualitative errors and incorporating an efficient quantitative total score, the modified scoring system would be both sensitive to MCI and practical for use in clinical and research settings. The study found that MCI participants scored significantly differently from non-demented controls in overall total score using the Modified Rouleau method, but not the original 10-point Rouleau system. Furthermore, sensitivity and specificity analyses revealed that the Modified Rouleau scoring method demonstrated a moderate ability to detect early signs of cognitive impairment, although it still produced a substantial number of false negative identifications. When compared to the original Rouleau scoring system, the modified version was more sensitive to MCI, which supports previous studies demonstrating that more complex scoring systems are more sensitive to the earliest stages of dementia [41, 62, 75]. The authors concluded that qualitative observations of clock drawing errors can help increase the sensitivity of the CDT to MCI and that a more detailed scoring system is necessary to differentiate individuals with MCI from cognitively healthy older adults.

A more recent study by Rubínová et al. [82] further supported the use of more complex scoring systems when attempting to diagnose amnestic MCI. In their study involving 48 patients with amnestic MCI and 48 age- and education-matched healthy controls, clock drawings were scored by three blinded raters using one simple 6-point scale [43] and two complex 17- and 18-point scales [41, 83]. The study found that only the more complex scoring systems were significant predictors of the amnestic MCI diagnosis in logistic regression analysis. The 17-point scoring system of Cohen et al. [83] showed good sensitivity (87.5 %) that equaled that of the MMSE; however, the MMSE showed superior specificity (31.3 %) compared to the CDT (12.5 %). The authors found that combining the CDT and MMSE scores increased the area under the ROC curve (0.72; p < .001) and increased specificity (43.8 %), but not to an acceptable level (i.e., >60 %; [78]). The authors concluded that the simple 6-point scoring system for the CDT did not differentiate between healthy elderly and patients with amnestic MCI and that, although the more complex scoring systems were slightly more efficient, they were still characterized by high rates of false positive results.

7 CDT and Specific Neurologic Conditions

The value of the CDT has been assessed in a wide variety of neurologic conditions including dementia, delirium, Huntington’s disease, Parkinson’s disease, stroke, traumatic brain injury, and schizophrenia.

7.1 Vascular Dementia and Alzheimer’s Disease

An interesting observation on CDT strategy was reported by Meier [84], who noted that patients with vascular dementia commonly begin the task by dividing the circle into segments with radial lines. When the frequency of segmentation patterns was compared between the clock drawings of patients with Alzheimer’s disease and those of patients with vascular dementia, the vascular patients used the strategy at twice the rate: almost half of all impaired drawings of patients with vascular dementia showed segmentation, compared with only one-quarter of the impaired drawings of Alzheimer’s patients. Moreover, patients using segmentation had higher scores on the MMSE than patients using other strategies.

Kitabayashi et al. [85] used quantitative analyses of clock drawings to demonstrate differences in the neuropsychological profiles of Alzheimer’s disease compared to vascular dementia. Using Rouleau et al.’s [8] CDT protocol, the authors found that Alzheimer’s disease patients’ error patterns tended to be stable and independent of disease severity. Patients with vascular dementia, in contrast, showed an increased frequency of graphic difficulties and conceptual deficits with increasing severity of the disease, whereas the frequency of visuospatial or planning deficits decreased with dementia severity. In the mild dementia groups, the frequency of spatial and/or planning deficits was higher in vascular dementia. In the moderate dementia groups, the frequency of graphic difficulties was significantly higher in vascular dementia, and the difference in the frequency of spatial and/or planning deficits seen in mild dementia disappeared [85].

The finding of increased spatial and planning deficits in mild vascular dementia suggests that frontal-subcortical disturbances are operative. At the moderate stage, however, patients experience conceptual deficits and graphic difficulties more prominently, while the spatial and planning deficits decrease. This suggests that the impairment of memory and motor function masks the frontal executive dysfunction as dementia severity increases [85]. The authors concluded that the cognitive profiles of patients with Alzheimer’s disease and vascular dementia differ significantly at the mild and moderate levels and that it may be possible to discriminate between these profiles using qualitative analyses of clock drawings [85].

Wiechmann et al. [86] examined the sensitivity and specificity of Borson et al.’s [52] 4-point scoring system for the CDT in discriminating Alzheimer’s disease and vascular dementia. Receiver operating characteristic (ROC) analysis revealed that the CDT was able to distinguish between normal elderly control participants and those with a dementia diagnosis (Alzheimer’s disease and vascular dementia combined). The authors reported that the optimal cut-off score for normal controls was 4, which produced 100 % sensitivity and 70 % specificity. The cut-off score for differentiating Alzheimer’s disease from vascular dementia was 3, which produced a sensitivity of 55 % and a specificity of 22 %. Similarly, the cut-off score for discriminating vascular disease from vascular dementia was 3, which produced a sensitivity of 69 % and a specificity of 33 %. Thus, since the optimal cut-off scores for both Alzheimer’s disease and vascular dementia were the same, it was impossible to predict one diagnosis from the other solely based on the 4-point total score. Wiechmann et al. concluded that Borson et al.’s [52] 4-point system demonstrated good sensitivity and specificity for identifying cognitive dysfunction associated with dementia, but the system did not adequately discriminate between Alzheimer’s disease and vascular dementia [86].

Cacho et al. [5] examined the effect of presenting the CDT instructions as a verbal command versus asking participants to copy a clock model presented visually. Their sample included patients with early Alzheimer’s disease and a group of healthy control subjects. Patients in the early Alzheimer’s disease group obtained significantly higher scores on the copy version of the task than on the verbal command version (z = −7.129, p < 0.001), whereas no statistically significant difference was found for the healthy control group (z = −2.001, p < 0.080). In other words, early Alzheimer’s disease patients performed significantly better on the CDT when copying a clock model than when drawing the clock in response to a verbal command. The authors referred to this difference in performance as the “performance pattern.” This is similar to the pattern of response seen in the CLOX test for executive function [57]. Thus, the study found that patients with early Alzheimer’s disease showed an improvement in the execution of the CDT copy command relative to the CDT verbal command that is not seen in healthy controls. Such results may be associated with a greater deterioration of memory functions compared to visual-construction functions in patients with early Alzheimer’s disease [5].

Recently, Tan et al. [87] published a review of research examining the ability of the CDT to differentiate Alzheimer’s disease from other dementia types. The results of the review suggest that qualitative analyses of CDT performance may be useful in differentiating Alzheimer’s disease from other dementias, such as vascular dementia, Parkinson’s disease with dementia, dementia with Lewy bodies and frontotemporal dementia. Also, CDT cut scores were generally found to be helpful in differentiating Alzheimer’s disease from frontotemporal dementia; however, regardless of the scoring system used, quantitative scores in general were not useful for differentiating Alzheimer’s disease from all other forms of dementia. The authors speculated that this is due to the intrinsic nature of the CDT assessing several cognitive skills at the same time and, although a single overall score is able to demonstrate the presence of cognitive impairment, it is limited in delineating specific domains of cognitive impairment. The authors concluded that an examination of CDT error types may be useful in localizing the domain of cognitive dysfunction and assisting with differential diagnosis of dementia types.

7.2 Delirium

Fisher and Flowerdew [88] examined older patients who were undergoing elective orthopedic surgery to assess whether the CDT could predict postoperative delirium. The authors suggested that identifying high-risk patients for delirium may assist clinicians in decreasing the morbidity associated with delirium by providing timely interventions. In their study, patients undergoing elective hip and knee surgery were examined pre- and postoperatively, using a modified Confusion Assessment Method (CAM) questionnaire [89]. Using a stepwise multiple logistic regression, the authors identified two significant risk factors for postoperative delirium. The first risk factor was male gender, and the second was a CDT score of ≤6 based on the modified clock drawing scoring system of Sunderland et al. [33] and Wolf-Klein et al. [34]. Interestingly, abnormal MMSE scores did not predict delirium in the authors’ model. Thus, the authors speculated that the CDT measures non-dominant parietal functions better than the MMSE and therefore may be indirectly detecting an increased predisposition to the development of delirium.

Manos [90] reported a case of an 80-year-old man who underwent a decompression lumbar laminectomy and later developed a wound infection and other complications, necessitating a second surgery. He developed a delirium the night after his second operation. The CDT was used to document recovery from the delirium up to 14 days postoperatively. By postoperative day 10, the delirium had cleared from a clinical perspective, but cognitive impairment was still evident on the CDT, with minor impairment lasting until day 14. This case study provided further evidence of the usefulness of the CDT in the monitoring of delirium.

Recently, Bryson et al. [91] evaluated the accuracy of the CDT in a sample of patients undergoing surgery for aortic repair. Their study was a subcomponent of a trial whose primary purpose was to explore the relationships among delirium, postoperative cognitive dysfunction, and the apolipoprotein ε (epsilon) 4 genotype. Delirium was assessed using the Confusion Assessment Method [89] on postoperative days 2 and 4 and at discharge. Cognitive functioning was assessed with neuropsychometric tests before surgery and at discharge. Postoperative cognitive dysfunction was determined using the reliable change index method [92], and the CDT was administered at all time points. Delirium was noted in 36 % of patients during their hospital stay, while postoperative cognitive dysfunction was noted in 60 % of patients at discharge. Agreement between the CDT and the tests for delirium or postoperative cognitive dysfunction was assessed with Cohen’s kappa statistic. The authors found that agreement between the CDT and the Confusion Assessment Method was poor at 2 and 4 days postoperatively, as well as at discharge, with kappa consistently <0.3. For the purpose of their study, the authors assumed that the Confusion Assessment Method is diagnostic of delirium and reported that the sensitivity of the CDT in identifying delirium ranged from 0.33 at discharge to 0.59 at the day 4 assessment. Specificity ranged from 0.65 at 2 days postoperatively to 0.83 at discharge. The results of this study suggested that the sensitivity of the CDT for delirium and postoperative cognitive dysfunction was poor, and thus the CDT is not recommended for bedside screening of delirium or postoperative cognitive dysfunction. However, the authors acknowledged that their study was limited by the absence of an agreed standard of reference on which to base the diagnoses of delirium and postoperative cognitive dysfunction, as well as by a highly selected patient sample that does not reflect the variety of patients presenting for elective non-cardiac surgery [91].
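As a reminder of what the kappa statistic captures here, the sketch below computes Cohen’s kappa from the four cells of a 2 × 2 agreement table (CDT abnormal vs. CAM-positive delirium). The cell counts are hypothetical, chosen only to illustrate a kappa in the “poor agreement” range reported above.

```python
def cohens_kappa(both_pos, cdt_only, cam_only, both_neg):
    """Cohen's kappa for agreement between two binary ratings (e.g. CDT
    abnormal vs. CAM-positive delirium).  The four cell counts of the
    2x2 agreement table are hypothetical inputs for illustration."""
    n = both_pos + cdt_only + cam_only + both_neg
    observed = (both_pos + both_neg) / n
    # chance agreement from the marginal proportions of each rating
    p_cdt_pos = (both_pos + cdt_only) / n
    p_cam_pos = (both_pos + cam_only) / n
    expected = p_cdt_pos * p_cam_pos + (1 - p_cdt_pos) * (1 - p_cam_pos)
    return (observed - expected) / (1 - expected)

print(round(cohens_kappa(both_pos=10, cdt_only=15, cam_only=12, both_neg=63), 2))
# -> 0.25, i.e. poor agreement, consistent with the kappa < 0.3 reported above
```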

7.3 Huntington’s Disease

Rouleau et al. [8] applied both quantitative and qualitative analyses of the CDT to distinguish characteristics associated with Huntington’s disease and Alzheimer’s disease. The authors used a CDT protocol adapted from the Boston Parietal Lobe Battery [30] with an added qualitative analysis assessing: (a) graphic difficulties; (b) stimulus-bound responses (e.g., for 11:10, the minute hand pointing to “10” rather than “2”); (c) conceptual deficits; (d) spatial or planning deficits; and (e) perseveration. The study also included a copy task on which Alzheimer’s disease patients showed significant improvement compared to Huntington’s disease patients. The authors suggested that the primary cause of the Alzheimer’s patients’ drawing problems is not graphic, motor, or visual perceptual difficulty, but rather a loss of semantic associations with the word “clock.” Huntington’s patients, in contrast to Alzheimer’s patients, demonstrated moderate to severe graphic and planning deficits; such planning difficulties may be related to the frontostriatal dysfunction associated with Huntington’s disease. Moreover, since overall cognitive impairment was comparable between the Alzheimer’s and Huntington’s groups, the qualitative differences between groups appear to be due to the differential involvement of limbic cortical regions in Alzheimer’s disease compared with the basal ganglia and corticostriatal dysfunction associated with Huntington’s disease.

7.4 Parkinson’s Disease

Saka and Elibol [93] examined the utility of practical neuropsychological tests, including the CDT, in differentiating Parkinson’s disease with dementia (PD-D) from Alzheimer’s disease, and Parkinson’s disease with mild cognitive impairment (PD-MCI) from amnestic MCI (aMCI). The authors evaluated consecutive cases with mild to moderate Alzheimer’s disease (n = 32) and PD-D (n = 26), as well as aMCI (n = 34) and PD-MCI (n = 19). The study found that CDT performance was more impaired in patients with PD-D than in those with Alzheimer’s disease. For differentiating PD-D from Alzheimer’s disease, the CDT was found to be valuable, with moderately high sensitivity (85.7 %) and specificity (69.6 %). For differentiating aMCI from PD-MCI, the CDT was again found to be helpful, with a sensitivity of 75.0 % and a specificity of 62.5 %. By applying stepwise linear discriminant function analysis, the authors found that a combination of the CDT with an enhanced cued recall task correctly classified 70.7 % of the overall study population; specifically, 71.4 % of Alzheimer’s disease, 71.9 % of aMCI, 69.6 % of PD-D, and 68.8 % of PD-MCI patients were correctly identified. These results suggest that the CDT can supplement clinical diagnostic criteria in differentiating dementia or MCI associated with Parkinson’s disease from Alzheimer’s disease and aMCI. The authors note, however, that while the CDT measures visuospatial impairment, it also involves frontal lobe functions such as planning, which is more impaired in PD-D than in Alzheimer’s disease. Moreover, impairment of visuospatial function occurred more frequently in PD-MCI than in aMCI cases and thus may predict progression to PD-D.

7.5 Stroke

The utility of the CDT for localizing vascular brain lesions was explored by Suhr et al. [94] in a sample of 76 stroke patients and 71 normal controls. In addition to comparing six quantitative scoring systems, the study assessed the discriminative ability of a number of qualitative aspects of CDT performance using Rouleau et al.’s scoring protocol [8]. The authors hypothesized that the qualitative aspects of the CDT would be more useful than quantitative scores in discriminating among patients with respect to lesion location. Indeed, no significant differences emerged between the various lesion groups when quantitative scoring techniques were used to assess localization of function. However, qualitative features of the CDT did discriminate between lesion locations. Specifically, right-hemisphere stroke patients displayed more graphic errors and impaired spatial planning compared to left-hemisphere stroke patients, a pattern consistent with the visuospatial/visuoconstructional difficulties seen after right-hemisphere strokes. Also, subcortical patients showed more graphic errors compared to cortical patients, while cortical patients demonstrated more perseveration on qualitative assessments. This pattern of performance is similar to the findings of Rouleau et al. [8], who found graphic difficulties to be more common in the subcortical dementia associated with Huntington’s disease. The authors concluded that scoring the CDT qualitatively might provide useful additional information about the location of brain dysfunction, while adding little time and effort to the evaluation process.

Cooke et al. [95] explored the relationships between CDT performance following stroke and key clinical variables, including cognition, lateralization, and type of stroke. Their sample included 197 patients with stroke from 12 hospital and rehabilitation facilities. The results showed that MMSE [47] performance was strongly associated with performance on the CDT. The authors suggested that this relationship provides further corroboration of the validity and sensitivity of the CDT as a quick screening tool for cognitive impairment in the stroke population. As hypothesized by the authors, the location of the stroke (left or right cerebral hemisphere) demonstrated a significant relationship with the CDT: approximately half of the patients with a right-hemisphere stroke had impaired clock drawings (54 %), compared with roughly a third of those with a left-hemisphere stroke (35.6 %). The right hemisphere controls the majority of the cognitive and perceptual functions responsible for executing the CDT [96], and visuospatial and visuoconstructional skills are predominantly affected following lesions to the right hemisphere [26]. Thus, it is expected that those with right-hemisphere stroke would show impaired CDT performance [95].

Freedman et al. [27] describe how the CDT can be used to assess and diagnose perceptual and cognitive impairments post-stroke, based on the functional organization of the brain. For example, if all elements of the clock (circle, hands, and numbers) are present but distorted, the lesion is more likely to be found in the right hemisphere and may be further localized to the posterior area of the right hemisphere, which mediates spatial organization. In contrast, a lesion in the left hemisphere may be indicated by sequential errors, such as writing the numbers in the correct sequence but in the counterclockwise direction [27].

7.6 Traumatic Brain Injury

De Guise et al. [15] examined the neuroanatomical correlates of the CDT in patients with different types and sites of injury sustained after traumatic brain injury (TBI). Patients were assessed in the context of a level 1 trauma center, and different types of injuries (epidural hematoma, subdural hematoma, subarachnoid hemorrhage, intraparenchymal hematoma, and brain edema) in different sites (frontal, temporal, parietal, and occipital lobes; bilateral; and right or left hemisphere) were included. The authors anticipated that more impaired performance on the CDT would be associated with parietal injuries. The results showed that patients who sustained traumatic subarachnoid hemorrhage, brain edema, or bilateral injury showed more deficits on the CDT. Errors made by these patients included difficulty producing the clock face, placing the hands correctly, and numbering the clock accurately. The authors found that traumatic subarachnoid hemorrhage, brain edema, and bilateral injuries interfere with CDT performance, likely because they are more diffuse and involve a combination of cerebral areas. Further analyses based on the sites of lesions confirmed the involvement of the parietal lobe in CDT performance: a higher percentage of patients who sustained parietal lesions presented with deficits in drawing the clock and in accurately producing the numbers and hands. The authors concluded that the CDT can be used as a sensitive and reliable screening tool for detecting cognitive impairment in patients with TBI.

In response to the study by De Guise et al. [15], Frey and Arciniegas [18] noted that most (72.9 %) of the subjects in the De Guise study had frontal injuries. As a result, it is likely that performance problems in that sample are at least partially reflective of the effects of injury to the frontal and/or frontal white matter elements of CDT-relevant frontoparietal networks. Frey and Arciniegas suggested that, while parietal lesions might exert an additional adverse effect on the function of those networks, confirming the presence of such an effect requires controlling for the effects of frontal and/or white matter lesions on CDT performance. After reanalyzing the data presented by De Guise et al. using one-tailed hypothesis testing, Frey and Arciniegas demonstrated that significant effects on CDT performance are not limited to parietal injuries. Moreover, they stressed that any predictive model of CDT total score using neuroanatomical variables requires the inclusion of frontal, temporal, and parietal lesions [18]. Thus, while the CDT may be a viable tool for discriminating between lesion locations in TBI patients, there remains a need for additional research with greater refinement of the concepts and methods employed.

The executive clock drawing tasks (CLOX 1 and 2) were examined by Writer et al. [97] for their ability to predict functional impairment in a sample of patients with combat-related mild traumatic brain injury and comorbid post-traumatic stress disorder (PTSD). Functional impairment was assessed using the Structured Assessment of Independent Living Skills (SAILS), which assesses instrumental activities of daily living and measures both competency (performance ability and accuracy) and efficiency (time to completion) [98]. The authors’ pilot findings indicated that CLOX 1-defined executive functioning correlated well with SAILS-defined functional competency and efficiency. Moreover, CLOX 1 performance contributed variance independent of comorbid PTSD anxiety symptom burden and other potentially confounding subject and injury characteristics. These findings suggest that the CLOX can discriminate between high and low performance-based functional status in patients with mild TBI. However, the authors acknowledge that these results need to be interpreted with caution because of the small sample size (n = 15) [97].

7.7 Schizophrenia

Herrmann et al. [99] compared 24 patients with schizophrenia to 24 healthy, age-matched controls on clock drawing, copying, and reading. All patients met DSM-IV [100] criteria for schizophrenia, with diagnoses made by a psychiatrist. Participants’ cognition was assessed using the MMSE [47], and symptom severity was documented with the Brief Psychiatric Rating Scale (BPRS) [101]. Clock tasks were scored according to the method described by Freedman et al. [27]. The authors found that patients with schizophrenia performed worse than controls on clock drawing and copying but showed no differences on the reading task, even though both groups had similar scores on the MMSE. They speculated that the CDT may be more sensitive to cognitive impairment in schizophrenia than the MMSE, given the latter’s lack of sensitivity to frontal system dysfunction. Furthermore, since performance on the CDT was significantly affected by scores on the BPRS, the authors suggested that the clock tasks might be measuring state-associated impairment (related to symptom severity) rather than trait-associated changes (related to the inherent neurocognitive deficit of the illness per se) [99]. They also suggested that examination of the specific errors made on the CDT may shed some light on the deficits displayed. Specifically, compared with controls, the patients with schizophrenia made most errors in placing and spacing the numbers on the free-drawn and pre-drawn clocks. These errors may reflect impairment in frontal visuospatial function, as they appear related to attention and strategy formation rather than to vision and topography. The relatively normal clock reading in these patients may reflect sparing of the posterior regions that mediate reading [99]. The authors concluded that, while the role of clock drawing and copying in schizophrenia requires further study, the easily administered CDT may prove useful in monitoring changes in cognition, possibly associated with symptom severity. The CDT may also help to document positive or negative changes in cognition associated with the use of antipsychotic medications.

7.8 Metabolic Syndrome

Metabolic syndrome is a constellation of health risk factors that includes hypertension, atherogenic dyslipidemia, impaired glucose homeostasis, and abdominal obesity [102]. Metabolic syndrome is associated with a greater occurrence of subcortical white matter hyperintensities, which are in turn associated with cognitive decline, late-onset depression, and functional disability [103]. Viscogliosi et al. [104] sought to determine whether the presence of metabolic syndrome predicted longitudinal changes in cognitive functioning, as assessed by the CDT, over a 1-year period. Their sample included 104 stroke- and dementia-free older hypertensive participants. They found that the presence of metabolic syndrome predicted 1-year cognitive decline independent of participants’ age, neuroimaging findings, and initial cognitive performance. In this study, the authors used the Sunderland CDT scoring method [33] and found that participants who met criteria for metabolic syndrome (n = 31) scored significantly lower at follow-up, with an average score of 6.8 versus 8.3 in participants without a diagnosis of metabolic syndrome. Interestingly, in a follow-up study by the same research group [103], metabolic syndrome was found to be inversely associated with CDT scores but had no impact on measures of episodic memory. Moreover, when the individual risk factors comprising metabolic syndrome (e.g., hypertension, atherogenic dyslipidemia) were examined alone, none independently predicted poorer cognitive performance.

8 Longitudinal Monitoring Using the CDT

A cognitive screening instrument that can accurately and reliably discriminate between neurological conditions is certainly a useful tool in clinical and research settings, and the studies reviewed above suggest that the CDT can indeed assist clinicians in screening for a variety of disorders. Beyond discriminating between conditions, another potentially effective use of the CDT is the longitudinal monitoring of cognitive decline. Recently, Amodeo et al. [105] conducted a literature review examining the ability of the CDT to monitor longitudinal decline in cognitive function. Although the number of studies examining the predictive value of the CDT is limited, their preliminary results suggest that the test is useful for the longitudinal assessment of cognitive impairment, appears sensitive to the cognitive decline associated with progression to dementia, and may be helpful for predicting conversion to dementia.

Studies by Rouleau et al. [106] and Lee et al. [107] found that patients with Alzheimer’s disease demonstrated an increase in conceptual errors over time, suggesting that this type of error in particular may be most sensitive to the cognitive decline typical of Alzheimer’s disease. Conceptual errors are broadly defined as errors “reflecting a loss or a deficit in accessing knowledge of the attributes, features and meaning of a clock” and can manifest as a misrepresentation of time on the clock or a misrepresentation of the clock itself [107]. Interestingly, conditions requiring the patient to produce the clock on their own (as opposed to copying a clock) appear to be superior in detecting cognitive decline in dementia. Rouleau et al. suggest that this finding implies a decline in the mental representation of a clock, given that this representation is necessary in the drawing condition but less so in the copy condition [106]. Overall, this research suggests that the CDT is sensitive to the cognitive decline associated with dementia or its development, and that it is the subject’s mental representation, or meaning, of a clock that shows the most marked degradation.

In their review of the literature, Amodeo et al. [105] concluded that the CDT appears sensitive to cognitive decline over time and may be able to predict which cognitively intact older adults and MCI patients will eventually develop dementia. Although its discriminative accuracy is not sufficient to recommend the CDT alone as the best measure of cognitive decline over time, it has the advantage of quick and easy administration and may best be applied in combination with other instruments. The CDT has already found its way into well-known tests such as the Mini-Cog [108], the Montreal Cognitive Assessment (MoCA) [109] (see Chap. 7), and the Addenbrooke’s Cognitive Examinations (Chap. 6), as well as the Test Your Memory (TYM) test (Chap. 9) and the Quick Mild Cognitive Impairment screen (Qmci; Chap. 12). As demonstrated by the studies exploring predictive validity, an abnormal CDT may serve as a flag for further assessment, even if the patient appears intact. In addition to predicting cognitive decline, repeated administration of the CDT may be useful for monitoring that decline. Amodeo et al. [105] suggest that future research should focus on methods to improve the predictive validity of the CDT, including determining which aspects of clock drawing are most sensitive and specific and with which supplementary tests it should be administered.

9 Cultural, Ethnic, and Educational Considerations

As with any cognitive screening tool, the characteristics of the subject population (i.e., language, cultural background, level of education) can influence the validity of the CDT. Numerous studies have examined the effect of such variables, with particular attention being paid to the influence of level of education. To date, the results have been contradictory, with some studies finding a link between such variables and CDT performance and others finding no correlation.

Sugawara et al. [3] sought to develop normative data for the CDT in the Japanese community-dwelling population using Freedman’s scoring protocol [27]. The CDT and MMSE were administered to 873 volunteers aged 30–79 years (36.8 % male) who participated in the Iwaki Health Promotion Project in 2008. The authors found gender differences in the free-drawn condition in both nonparametric and multiple regression analyses; specifically, female CDT scores were higher than those of males. The authors noted, however, that previous research examining gender differences in CDT performance has been inconsistent, with some studies supporting an influence of gender [110, 111] and others finding no differences [70]. In all conditions tested in this study, subjects 60 years of age and older showed either significant decreases in CDT scores or a decreasing trend in performance. Interestingly, the authors found an influence of education on CDT scores only in females 60 years of age and older in the free-drawn condition. This finding contrasts with results published by Yamamoto et al. [74], who also studied CDT performance in the Japanese population but found CDT scores to be independent of years of education. The authors noted, however, that most participants included in their study (96.8 %) had received 9 or more years of education; thus, it is possible that the high level of literacy among their subjects precluded the study from detecting strong educational differences in CDT scores [3].

Kim and Chey [1] investigated the CDT performance of 240 non-demented elderly Korean individuals with a wide range of education levels and 28 patients with mild dementia of the Alzheimer’s type (DAT). They found that literacy and education significantly influenced CDT performance, in that older people with lower education had lower CDT scores and a wider range of performance. These effects were most dramatic in illiterate individuals. Moreover, illiterate and/or uneducated older persons made conceptual errors similar to those of the DAT patients. Conceptual deficits observed in DAT patients have been interpreted as stemming from the loss of the semantic association evoked by the word “clock” and the graphic representation of a clock [8]. However, Kim and Chey [1] found that misrepresentation of the clock was mostly observed in uneducated participants from both the normative groups and the DAT group. The authors speculated that the conceptual errors made by an uneducated but cognitively normal individual are likely to be due to poor development of the representation of a clock, or of time on a clock face, which is based on numeracy and abstract thinking. Thus, even when semantic association or representation is intact, the necessary constructional skills may be poorly developed in uneducated people. The authors concluded that CDT performance in older people who are illiterate or who have 6 or fewer years of education should be interpreted with caution [1].

The correlation between the MMSE and the CDT was explored by Fuzikawa et al. [112], using Shulman’s method [2], in a sample of elderly Brazilian adults with very low levels of formal education. Participants were recruited from Bambui, a town of 15,000 inhabitants in southeast Brazil; the median schooling level of the sample was 2 years. The authors found that the correlation between the MMSE and the CDT was moderate (ρ = 0.64) and did not differ according to gender, age, or schooling level. Higher CDT scores were associated with higher MMSE scores, whereas lower CDT scores corresponded to a wider range of MMSE scores. Thus, it appears that in this population with very low education, the majority of subjects who perform well on the CDT can be expected to obtain a high MMSE score. Therefore, if an individual is able to draw a good clock despite having a low level of education, this may indicate adequate cognitive function, reflected in a high MMSE score. In contrast, a low CDT score in this population would not allow suppositions about the MMSE score but would suggest the need for further assessment and/or investigation. The results of this study suggest that the CDT may be very practical in developing countries, where resources are limited and low education among the elderly is common.
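
For reference, the coefficient reported here is Spearman’s rank correlation, computed (in the absence of tied ranks) from the paired CDT and MMSE scores as

\[
\rho = 1 - \frac{6 \sum_{i} d_i^2}{n(n^2 - 1)},
\]

where \(d_i\) is the difference between the ranks of the i-th participant’s two scores and \(n\) is the sample size; a value of 0.64 is conventionally described as a moderate positive association.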

Borson et al. [108] proposed that telling time by the clock face is familiar across all major cultures and civilizations, whereas the more abstract figure copying required by the MMSE intersecting-pentagons task is a skill more familiar to those educated in developed countries. They argued that drawing a clock “from scratch” requires the use of multiple cognitive abilities drawing on a wide range of cerebral regions, a feature that is ideal for a cognitive screening instrument but not common across all screening and visuospatial copying tasks. The “diffuse” CDT task is thus well suited to cognitive screening, as it engages a number of cognitive abilities, including long-term memory and information retrieval, auditory comprehension, visuospatial representation, visual perceptive and visual motor skills, global and hemispheric attention, simultaneous processing, and executive functions [52].

In an earlier study, Silverstone et al. [113] described the usefulness of the CDT in a sample of 18 Russian immigrants who were unable to speak English. CDT screening identified abnormal scores in four of the participants, and follow-up with these patients’ families confirmed a diagnosis of progressive cognitive loss and dementia. The authors suggested that the CDT is a useful screening tool when language is a serious barrier to cognitive testing.

10 Conclusion

In this chapter, a wide range of CDT scoring and administration methods have been presented. It appears that, for most clinical settings, the simpler the scoring system the better, as the more complicated and lengthy scoring systems do not appear to add significant value to the clinical utility of the test when it is used as a cognitive screening measure. In terms of simplicity, the 4-point system used by the Consortium to Establish a Registry for Alzheimer’s Disease (CERAD) seems optimal [108]. However, when examining the utility of CDT scoring systems for screening for MCI, Ehreke et al. [76] found that, while significant differences were observed between MCI subjects and normal controls, no scoring method produced sensitivity and specificity values high enough to conclude that the CDT, as currently administered, is a good screening instrument for MCI. They suggested that its clinical utility could be improved by adopting a semi-quantitative system with a wider scoring range that places more focus on the clock’s hands and number placement. Thus, in some situations, an overly simplified scoring system may limit the utility of the CDT. With this in mind, it falls to the clinician to decide what level of detail they wish to extract when choosing which scoring protocol to apply.

The CDT appears to have achieved widespread clinical utilization, albeit with inconsistent approaches to scoring and interpretation. It is well accepted by clinicians and patients because of its ease of use and short administration time, and the recent literature reflects increasing interest in the test as a quick screening tool for cognitive impairment. Moreover, conclusions from studies examining its utility in various patient populations are predominantly positive. As a screening instrument, it can also provide an easy-to-administer and valuable baseline from which to monitor cognition over time. Available evidence suggests that the CDT, used in conjunction with other brief validated cognitive tests and informant reports, such as the MMSE [47], or as a component of a brief cognitive screening battery, such as the MoCA [109] or the Mini-Cog [108], should provide a significant advance in the early detection of dementia.