Assessing Pragmatic Language Competencies: Toward Evidence-Based Assessments

Until recently, even the most commonly used language assessment instruments (e.g., CELF-3, TOLD) did not include evaluations of each of the four most widely recognized language domains (syntax, semantics, phonology, and pragmatics). In fact, pragmatic language competencies (PLCs)—the ability to use language appropriately and effectively in social contexts—were not examined by any of the common language assessment instruments. Although this is no longer the case, PLC assessments are still omitted from many general measures of language competence. Even when included, the assessment of PLCs is typically less thorough than the evaluation of syntax, vocabulary, and semantics. Although the latter domains are crucial in children’s language development, the success and appropriateness of an utterance in context depends on far more than a sentence’s grammaticality, vocabulary, and meaning (Ninio and Snow 1999). The way in which a child’s language is used in the important contexts and encounters of his or her social environment (e.g., home, school, peer environments) may be more relevant to adjustment and social success than competence in the more traditionally assessed language areas.

For example, strong empirical evidence has linked deficits in PLCs with many developmental, communication, learning, and psychiatric disorders. It is well known that PLC deficits are symptomatic of children with autism spectrum disorders [PDD-NOS, Autism, and Asperger’s Disorder; ASD] (Lord 1993; Mawhood et al. 2000; Rapin 1996; Tager-Flusberg 1993; Tager-Flusberg and Caronna 2007), even when these children demonstrate normal or near-normal development of other language competencies, such as syntax (Barrett et al. 2004; Bishop 2000; Norbury et al. 2004). But the linkage of PLC deficits with ASDs is not unique (Botting and Conti-Ramsden 1999). Children with Attention-Deficit/Hyperactivity Disorder (ADHD) have substantial PLC deficits. In fact, the severity, if not the form, of PLC deficits in children with ADHD can approximate those of children with ASD (Bishop and Baird 2001) and children with language disorders (Russell et al. submitted). It is not surprising, then, that children with ADHD differ reliably from typically developing controls (Geurts et al. 2004) in the degree of their PLC deficits. In addition, PLC deficits have been identified in children with Conduct Disorder (CD) and Oppositional Defiant Disorder (ODD; Gilmour et al. 2004), with more than two-thirds of the children with CD also exhibiting PLC deficits.

The extensiveness of PLC deficits across these disruptive behavior disorders has suggested that the DSM-IV criteria for ADHD, ODD, and possibly CD may contain symptomatic descriptions that partially characterize comorbid PLC impairments (Camarata and Gibson 1999; Russell 2007; Tannock 2000; Westby and Cutler 1994). By “comorbid” we mean that the children’s deficits in PLCs are so pervasive as to seriously impair their functional adaptation across a variety of contexts, not merely that children with various psychiatric diagnoses have some pragmatic language issues. It is well established that structural language disorders and psychiatric difficulties are commonly comorbid, in the first sense. Twenty-six years ago, Baker and Cantwell (1982a, b) reported that over half (53%) of language- and speech-disordered children had a diagnosable psychiatric disorder. These substantial comorbidity rates have been corroborated in subsequent studies (Cohen et al. 1993, 1998) and are observed whether examining language-disordered children for unsuspected psychiatric disorders or vice versa. Comorbidity rates of 50% are twice as high as the estimated 25% comorbidity rate between childhood psychiatric disorders (Costello et al. 2003). In addition, Beitchman et al. (2001) found that 40% of children identified with early speech and/or language impairment had some type of psychiatric diagnosis in adulthood, with anxiety disorders being the most common. These and other studies (e.g., Fujiki et al. 2002; Jerome et al. 2002) provide strong evidence of the relationship between language functioning, psychiatric status, and emotional adjustment. However, these studies were based on language evaluations that did not formally assess children’s PLCs, raising strong suspicion that comorbidity rates between language disorders and psychiatric disorders would have been higher still had PLCs been assessed.

Because emerging research links childhood externalizing and internalizing disorders to PLC deficits (Im-Bolter and Cohen 2007), it is both timely and important to familiarize psychologists and educators with the domains of pragmatic competence and to evaluate the content and other forms of validity associated with PLC assessment instruments (Adams 2001; Adams et al. 2006; Farmer and Oliver 2005; Hyter et al. 2001; Penn 1999; Richardson and Klecan-Aker 2000; Weist et al. 1991a, b). Fortunately, there is an extensive research and theoretical literature on many domains comprising pragmatic competence dating back to the 1970s (e.g., Freedle 1977; Halliday 1973; Hymes 1974; Russell 1979a, b, c; Schiffrin 1987; Snow and Ferguson 1977). Moreover, the assessment of children’s PLCs has long been recognized as crucial among speech language pathologists, even if historically such assessments relied on structured participant observation rather than on validated questionnaires or tests. The importance afforded PLC assessments by speech language pathologists is plainly evident in Cantwell and Baker’s (1987) diagnostic decision tree model. In that model, PLCs occupy the top node of the diagnostic decision tree, just below the evaluation of sensory (e.g., hearing) factors that may contribute to language dysfunction.

The central role of PLCs in typical and atypical childhood development is also acknowledged by current efforts to devise and evaluate PLC assessment instruments (e.g., Adams 2002). As new and updated editions of language tests incorporate PLC subscales, psychologists and educators will need to become familiar with and knowledgeable about PLC deficits to carry out routine psychological and educational assessments, in addition to knowing when to refer their child patients for more specialized assessments conducted by speech language pathologists. In this regard, appraisals of the content, structural/dimensional, ecological, and diagnostic validity of PLC instruments are needed. To date, little has been written about what the PLC domains are or should be, how they are assessed in diagnostic tests, behavioral checklists/questionnaires, and structured participant observations, and with what comparative intensity.

Basing analyses on a broad sample of PLC (sub)tests and checklists/questionnaires (N = 24), the present review provides a critical appraisal of the development and utility of PLC assessment instruments in clinical use or in research. The review also includes recommendations of how some of the instruments can be employed to deepen clinical/developmental and educational evaluations. Four key questions, addressed using quantitative and qualitative analyses, orient the review: (1) Can the PLC domains targeted by individual assessment items be reliably identified? (2) What are the core PLC domains most commonly assessed by checklists/questionnaires and tests? (3) What is the relationship between the salience of PLC domains in tests/tasks versus checklists/questionnaires? and (4) What degree of content, structural, diagnostic, and ecological validity do the PLC assessment instruments currently have?

In addressing these questions, neutrality has been maintained with respect to whether authors of the tests and/or checklists/questionnaires advocate that the development of PLCs drive the development of important aspects of structural language competencies and social cognition (i.e., a view ascribed to functionalists) or, alternatively, that PLCs are simply another important feature of language such as syntax (i.e., a view ascribed to structuralists; Ninio and Snow 1999; see also Shirk and Russell 1996, pp. 226–258, for a functionalist application of PLCs in understanding change processes in child psychotherapy). Instead, the focus has been on the language domains featured in the specific items included in checklists/questionnaires or tests described as focusing on pragmatics. Hopefully, the current review will help to familiarize psychologists and educators with the types of domains that PLC instruments contain and will serve to focus research on developing tests/tasks and questionnaires that provide strong, evidence-based PLC evaluations and targets for treatment (Barlow 2005; Hunsley and Mash 2005; Weist et al. 1991a, b).

Method

Measures and Instruments

Questionnaires (with rating scales), checklists (with present/absent scoring), and tests that assess PLCs were identified through common search engines (e.g., MEDLINE, PsycINFO) with various combinations of search terms (e.g., social, communication, assessment, pragmatic, language, disorder, child), review of bibliographies in book chapters or journal articles, and through instruments and their bibliographies available from publishers. We included questionnaires/checklists and tests “under development,” if they had been used and published, even if norms were not available. We did not, however, include lists of PLCs provided in some articles and chapters as suggested foci for assessments, or instruments that include a few PLC items, if the instruments were clearly constructed for other purposes (e.g., Child Behavior Checklist; Achenbach 2001). This search resulted in the identification of 11 questionnaires, 2 checklists, and 11 tests, containing 1,082 items. Figure 1 provides the names of the questionnaires/checklists and tests, date of publication, and what kinds of reference distributions and age ranges are included for each test (Academic Communication Associates 1989; Adams and Bishop 1989; Adams et al. 2001; Bishop 2003, 2004; Bishop and Adams 1989; Bloom et al. 1999; Bowers et al. 1994; Brice 1992; Carrow-Woolfolk 1999; Kleiman 1994; Leonard et al. 2002; Penn 1988; Phelps-Terasaki and Phelps-Gunn 1992; Prutting and Kirchner 1987; Rinaldi 1996, 2001; Rubin 1985; Semel et al. 2003; Seymour et al. 2003; Smith et al. 2000; Stott et al. 2002; Wiig 1990; Wiig and Secord 1988).

Fig. 1

Age appropriateness for pragmatic language questionnaires and tests. ➨ = adults. PCA = Profile of communicative appropriateness (1983), CCSR = Communication competence self report (1985), PPR = Pragmatic protocol (1987), PCSP = Pragmatic communication skills protocol (1989), APSS = Adolescent pragmatics screening scale (1992), FCP = Functional communication profile (1994), VPRS = Verbal pragmatic rating scale (1999), TASCC = Teacher assessment of student communicative competence (2000), GLS = General language screen (2002), CCC = Children’s communication checklist-2 (2003), PP = Pragmatic profile (2003), ORS = Observational rating scale (2003), SULPR = Social use of language program-revised (2001), TOPL = Test of pragmatic language (1992), TOPS-R = Test of problem solving-elementary, revised (1994), CASL-PJ and NL = Comprehensive assessment of spoken language subtests—pragmatic judgment and nonliteral language (1999), DELV = Diagnostic evaluation of language variation—pragmatic domain (2003), ERRNI = Expression, reception and recall of narrative instrument (2004), CRIL = Criterion referenced inventory of language (1990), ALICC = Assessment of language impaired children’s conversations (1989), UA = Understanding ambiguity (1996), TLC = Test of language competence (1989), and ACE = Assessment of comprehension and expression (2001). Reference distribution: 1 = criterion referenced, 2 = norm referenced, 3 = local, 4 = none

Coding system development. Domains for the PLC coding system were drawn from the theoretical and research literatures on pragmatics (e.g., Bach and Harnish 1979; Britton and Pellegrini 1990; Brown and Levinson 1978; Grice 1967; Levinson 1983; Lyons 1977; Palmer 1986; Schenkein 1978; Searle 1969; Sperber and Wilson 2002). Fourteen PLC domains were identified and defined for coding purposes using this literature: Requests; Speech Acts (variety and appropriateness); Interlocutor Variety; Gricean Principles; Negotiations, Directions, and/or Instructions; Conversational Turn-taking; Topic Control and Maintenance; Nonliteral Language, Use of Indirection, and Presupposition; Rituals, Greetings, Goodbyes; Nonverbal Communication; Speech Characteristics (such as prosody) and Fluency; Theory of Mind and Emotion Language; Discourse Attentiveness and Empathy; and Narrative. Preliminary review of two clinical questionnaires (Bishop 2003; Semel et al. 2003) also suggested the need to include a Vocabulary and an Other category. When assessing coder agreement and consensual classification of all items, two further domains were deemed necessary: Syntax/Grammar and Comprehensibility of an utterance. See Table 1 for the 17 domains, definitions, and examples. In defining domains and in coding items, the intent was to be inclusive rather than to exclude domains or items because of the degree of inference that might be necessary to relate them to PLCs. For example, if an item probed children’s use of mental state verbs or emotion language, it was coded as Theory of Mind and Emotion Language, not because theories of mind are universally accepted as pragmatic phenomena but because the item was included in a pragmatic instrument. Similarly, most aspects of vocabulary and grammar would not be considered pragmatic phenomena per se, though they may relate to the latter in important ways.
Finally, the degree of sophistication involved in the use of PLCs was not assessed. For this reason, for example, simple and complex requests were both coded as Requests (see Russell 2007).

Table 1 Pragmatic modes of engagement

An index of the salience of PLC domains. To identify core PLC domains empirically, two indices of each domain’s salience were devised. The first index of salience was determined by noting how many of the checklists/questionnaires or tests contained at least one item for each of the 17 PLC content domains. We reasoned that a domain’s centrality would be reflected in the degree to which it was included as a targeted area in the set of examined instruments. Central PLCs would be assessed by most checklists/questionnaires and tests, whereas peripheral PLC domains would be assessed by only a few checklists/questionnaires and tests.

The second index of salience was determined by noting how many items were directed at each PLC content domain across all the questionnaires or tests. We reasoned that a domain’s centrality would be reflected in the number of probes contained in this set of instruments. A PLC domain collectively viewed as central would be probed by many items, whereas a PLC domain collectively viewed as relatively peripheral would be probed by fewer items.

To calculate the index of salience, each PLC content domain received a ranking in terms of what proportion of questionnaires/checklists or tests included probes relevant to it and in terms of how many total probes it received summing across the 24 assessment instruments. These ranks were summed separately for questionnaires/checklists and tests to discover the relative salience of each of the 17 PLC domains by type of assessment methodology.
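The two-part ranking described above can be sketched as follows. The counts shown are hypothetical illustrations (the actual tallies appear in Tables 2 and 3), and the `rank_desc` helper is simply one way to compute average ranks, not part of the original procedure:

```python
# Sketch of the salience index: each domain is ranked (1 = most salient)
# on two counts, and the two ranks are summed. Toy numbers only.

def rank_desc(values):
    """Average ranks with rank 1 for the largest value; ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in order[i:j + 1]:
            ranks[k] = mean_rank
        i = j + 1
    return ranks

# Hypothetical data for three domains:
domains = ["Requests", "Narrative", "Gricean Principles"]
n_instruments = [17, 12, 5]  # instruments containing at least one relevant item
n_items = [191, 60, 14]      # total probes summed across instruments

salience = [a + b for a, b in zip(rank_desc(n_instruments), rank_desc(n_items))]
# Lower summed rank = more salient; here Requests ranks first on both counts.
```

In the study, these summed ranks were computed separately for the questionnaire/checklist pool and the test pool before comparing the two orderings.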

Results

Can PLC Domains Probed by Questionnaire/Checklist and Test Items be Reliably Identified?

A random sample of 389 of the 1,082 items was selected and coded independently by two raters using the definitions contained in Table 1. Each questionnaire/checklist and test item was assigned to as many as three domains, with a primary (A), secondary (B), and tertiary (C) classification. We assessed inter-rater agreement at three levels: (1) percent agreement of primary classifications, (2) primary or secondary classifications, and (3) primary, secondary, or tertiary classifications. For agreements restricted to primary classifications, the two raters agreed on 81% of their domain classifications; for primary and/or secondary classifications, on 88%; and for primary, secondary, or tertiary classifications, on 91%. We further assessed agreement using Cohen’s Kappa, an index of agreement that adjusts for chance. An agreement was recorded if the lead rater’s A code for a particular item was matched by the reliability rater’s A or B code. If they did not match, the disagreement was entered as the lead and reliability raters’ A codes. Using this method, Cohen’s Kappa was .84, a value indicating substantial agreement (Kundel and Polansky 2003; Landis and Koch 1977). This value was reproduced when using the reliability rater’s A codes as the target and the lead rater’s A or B codes for registering agreements. Most of the observed disagreements arose because one rater used the “Other” category as the A code without providing a B or C code. Across the 17 domains, 453 of the 482 questionnaire/checklist items (roughly 94%) and 597 of the 600 test items (roughly 99%) could be classified consensually to a primary PLC domain.
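For readers unfamiliar with the statistic, Cohen’s Kappa can be computed as below. The two raters’ code lists are invented toy data, and this sketch scores only exact primary-code matches rather than the A-or-B matching rule used in the study:

```python
from collections import Counter

def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(codes_a)
    p_obs = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    # Chance agreement: probability both raters pick the same category at random,
    # given each rater's marginal category frequencies.
    p_chance = sum(
        (freq_a[c] / n) * (freq_b[c] / n) for c in set(freq_a) | set(freq_b)
    )
    return (p_obs - p_chance) / (1 - p_chance)

# Toy primary-domain codes from two raters (10 items, 8 exact agreements):
rater_1 = ["Req", "Req", "Req", "Req", "Nar", "Nar", "Nar", "Top", "Top", "Top"]
rater_2 = ["Req", "Req", "Req", "Nar", "Nar", "Nar", "Nar", "Top", "Top", "Req"]
```

On these toy codes, observed agreement is .80 and chance agreement .34, giving a kappa of about .70; the reported value of .84 reflects still stronger agreement.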

What are the Core PLC Domains Sampled in Questionnaires/Checklists and Tests?

For questionnaires/checklists alone. Each of the 17 domains received a rank in terms of how many of the questionnaires/checklists included at least one domain-relevant item and in terms of the number of domain-relevant items included across the 13 questionnaires/checklists (see Table 2). Summing these two rank orders provided a basis for estimating the relative salience of PLC domains, and presumably their importance. The six most salient domains were, in descending order: Requests; Speech Characteristics and Fluency; Nonverbal Communication; Topic Control and Maintenance; Conversational Turn-taking; and Negotiations, Directions, and/or Instructions. The six least salient areas were, in ascending order: Comprehensibility; Rituals, Greetings, or Goodbyes; Nonliteral Language, Use of Indirection or Presupposition; Syntax/Grammar; Speech Acts; and Narrative. In addition, the second-to-last row in Table 2 provides the total number of the 17 PLC domains that each questionnaire/checklist probed. As is evident, no questionnaire/checklist has items assessing all PLC domains, although two probe 15 of the 17. On the other hand, six questionnaires/checklists probe fewer than 10 of the PLC domains. Finally, only four questionnaires/checklists probe all six of the most salient areas: the ORS, TASCC, PP, and APSS.

Table 2 Domains of pragmatic skill probed by thirteen questionnaires: salience and content validity

For tests alone. Each of the 17 domains received a rank in terms of how many tests included at least one item relevant to its PLC domain and in terms of the number of domain-relevant items included across the 11 tests (see Table 3). As with the questionnaires/checklists, summing these two rank orders provided a basis for estimating the relative salience of PLC domains, and presumably their importance. The six most salient areas are, in descending order: Requests; Narrative; Nonliteral Language, Use of Indirection, or Presupposition; Nonverbal Communication; Theory of Mind and Emotion Language; and Rituals, Greetings, or Goodbyes. The six least salient areas are, in ascending order: Discourse Attentiveness and Empathy; Interlocutor Variety; Comprehensibility; Gricean Principles; Topic Control and Maintenance; and Speech Acts. The second-to-last row in Table 3 provides the total number of the 17 domains that each test probed. As is evident, no test probes more than 10 of the 17 domains. Finally, only one test probes all six of the most salient PLC domains: the TOPS-R.

Table 3 Domains of pragmatic skill probed by eleven (sub)tests: salience and content validity

For tests and questionnaires/checklists together. Table 3 also contains the results for tests and questionnaires/checklists combined in its last column. The six most salient PLC domains across tests and questionnaires are, in descending order: Requests; Nonverbal Communication; Negotiations, Directions, or Instructions; Speech Characteristics and Fluency; Theory of Mind and Emotion Language; and Narrative. The six least salient PLC domains across tests and questionnaires are, in ascending order: Comprehensibility; Discourse Attentiveness and Empathy; Interlocutor Variety; Gricean Principles; Speech Acts; and Syntax/Grammar.

What is the Relationship Between the Salience of PLC Domains Across Tests/Tasks and Questionnaires/Checklists?

Comparing Tables 2 and 3 reveals several descriptive differences between the tests and questionnaires/checklists. For example, six of the 13 questionnaires/checklists probe more PLC domains (>10) than the broadest test. Conversely, more (sub)tests than questionnaires/checklists focus intensively on one or two PLC domains. The rank orders of PLC domains in the pool of tests versus the pool of questionnaires/checklists were correlated. Spearman’s ρ was .14, p = .60, indicating no significant correspondence between the salience of featured PLC domains across tests and questionnaires/checklists. The correlation remained nonsignificant even when the comparison was restricted to instruments containing items that probed six or more PLC domains (ρ = .22, p < .4), thus eliminating narrow-band tests focusing on one or two PLC domains.
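Spearman’s ρ on two rank orders is simply Pearson’s correlation applied to the rank values. A minimal sketch follows; the two rank vectors are hypothetical stand-ins for the domain orderings derived from Tables 2 and 3:

```python
def spearman_rho(rank_x, rank_y):
    """Spearman's rho: Pearson's r computed on rank values (no ties assumed)."""
    n = len(rank_x)
    mx, my = sum(rank_x) / n, sum(rank_y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rank_x, rank_y))
    sx = sum((a - mx) ** 2 for a in rank_x) ** 0.5
    sy = sum((b - my) ** 2 for b in rank_y) ** 0.5
    return cov / (sx * sy)

# Hypothetical salience ranks for the same five domains in the two pools:
test_ranks = [1, 2, 3, 4, 5]
questionnaire_ranks = [2, 5, 1, 4, 3]
rho = spearman_rho(test_ranks, questionnaire_ranks)
# A rho near zero, like the reported rho = .14, indicates little
# correspondence between the two orderings.
```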

A second way to assess the salience of PLC domains in tests versus questionnaires/checklists is to parse the PLC domains in terms of their relative appearance developmentally, as depicted in Table 1. As Adams (2002) has indicated “assessment of language pragmatics is currently restrained by limitations in normative methodologies. Knowledge of developmental ‘norms’ is limited so that only very approximate age of emergence can be provided” (pp. 974–975) and, we would add, age of relative mastery and the form of the progression from emergence to mastery. Therefore, the groupings should be considered heuristic, even if based on the temporal order of development as suggested by preliminary empirical investigations and partly on logical considerations (Kaplan 1966; Russell 2007).

The first set of PLC domains we have labeled Precursors/Enablers; these are the basic emerging competencies required to participate in pragmatic language interaction: the ability to decode and encode nonverbal communications (Nonverbal Communication); the developing sensitivity to vocalizations and spoken language (Discourse Attentiveness and Empathy); the development of vocalizations, articulation, and prosodics (Speech Characteristics and Fluency); ritual greetings/goodbyes (Rituals, Greetings, Goodbyes); comprehensibility; and the emergence of vocabulary. The second set of PLC domains we have labeled Basic Exchanges/Rounds. These are the PLC domains that facilitate discourse exchanges and rounds across two or three turns at talk, as in question/answer sequences: fitting one’s conversational contributions into the flow of interactive speech (Conversational Turn-taking); the ability to seize and maintain topics of conversation (Topic Control and Maintenance); requests (e.g., question/answer sequences); use of a differentiated set of speech acts (Speech Acts); the use of syntax/grammar; and ease in communicating with a variety of listeners (Interlocutor Variety). The third set of PLC domains we have labeled Extended Literal and Nonliteral Discourse. These are the PLC domains that facilitate participation in the extended discourses that build a sense of identity and social belonging: the ability to participate in the give and take of discourse (Negotiations, Directions, or Instructions); the development and use of theory of mind and emotion language (e.g., as evidenced by use of internal state language; Theory of Mind and Emotion Language); the ability to form and comprehend extended narratives (Narrative); the ability to use language in a metaphorical or nonliteral way (Nonliteral Language, Use of Indirection, or Presupposition); and the conscious orientation to being relevant, succinct, and informative (Gricean Principles).

The number of items falling into each of the three developmental levels across tests and questionnaires/checklists was tabulated. Tests had 75, 165, and 357 items, while questionnaires/checklists had 158, 183, and 112 items, focused on Precursors/Enablers, Basic Exchanges/Rounds, and Extended Literal and Nonliteral Discourse, respectively. A chi-square analysis of the number of items in each developmental level across tests and questionnaires/checklists revealed substantial and statistically significant differences, χ2(df = 2) = 141.39, p < .0001. Questionnaires/checklists placed more emphasis on Precursors/Enablers than tests and, conversely, tests placed more emphasis on Extended Literal and Nonliteral Discourse than questionnaires/checklists. This pattern remained even when restricting analyses to instruments that probed at least six PLC domains, which effectively excluded narrow-band subtests focused on only one or two domains. In this analysis, tests had 56, 148, and 137 items, while questionnaires/checklists had 157, 182, and 108 items, focused on Precursors/Enablers, Basic Exchanges/Rounds, and Extended Literal and Nonliteral Discourse, respectively, with χ2(df = 2) = 41.32, p < .0001. As in the previous analysis, questionnaires/checklists placed more emphasis on early developing PLCs (Precursors/Enablers and Basic Exchanges/Rounds) and tests placed more emphasis on later developing skills (Extended Literal and Nonliteral Discourse).
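Both reported chi-square values can be reproduced directly from the item counts above. The following sketch computes the Pearson chi-square statistic from first principles, with expected counts derived from the row and column margins:

```python
# Reproducing the chi-square tests of independence from the item counts
# reported above (rows: tests vs. questionnaires/checklists; columns:
# Precursors/Enablers, Basic Exchanges/Rounds, Extended Discourse).

def chi_square(table):
    """Pearson chi-square statistic for a two-way contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    return sum(
        (obs - rt * ct / grand) ** 2 / (rt * ct / grand)
        for row, rt in zip(table, row_totals)
        for obs, ct in zip(row, col_totals)
    )

all_instruments = [[75, 165, 357], [158, 183, 112]]
broad_band_only = [[56, 148, 137], [157, 182, 108]]
# chi_square(all_instruments) ≈ 141.39 and chi_square(broad_band_only) ≈ 41.32,
# each with df = (2 - 1) * (3 - 1) = 2, matching the reported values.
```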

What is the Degree of Structural, Ecological, and Diagnostic Validity of PLC Instruments?

The relevance of test and questionnaire/checklist items to multiple PLC domains rather than to a single domain could adversely affect an instrument’s dimensional or factorial validity. However, about half of the instruments organize the PLC domains they sample as if they were indicators of a unidimensional construct of pragmatic competence. The presumed unidimensionality of these instruments has not been examined or confirmed empirically through factor analytic studies; in fact, only internal consistency statistics are reported. Conversely, the other half of the instruments present a priori subscale composites or groupings, suggesting either that pragmatic competence is conceived as multidimensional and componential or that the subscales can be combined into a single construct (Tomblin and Zhang 2006). An enumeration of a subset of composite scales contained in the tests and tasks reveals many different and possibly conflicting emphases and conceptualizations: Voice, Fluency, Audience, Stylistic Variations, Kinesics and Proxemics, Appropriateness of Communication, Rituals and Conversational Skills, and so on. The presumed multidimensionality has not been confirmed empirically in any of the tests or questionnaires that provide subscale composite scores, and only one instrument provides norms for the subscales and composites (i.e., the CCC).

Two instruments, the General Language Screen (GLS, Stott et al. 2002) and the Verbal Pragmatic Rating Scales (VPRS, Bloom et al. 1999), report exploratory factor analyses employed to discover their dimensionality. For the former, a two-factor solution accounting for only 44% of the total variance was deemed best, with factor 1 appearing to be a general language skills factor (articulation, expressive, receptive, and pragmatic) and factor 2 appearing to be a receptive language skills factor (Follows two-step instructions, understands “Where” questions, places objects when asked, and enjoys listening to stories). For the VPRS, which focuses six questions on aspects of Gricean relevancy theory, a three-factor solution was deemed best, with factors dubbed “Discourse Content” (items were Lexical selection, Quantity, and Specificity), “Parsimony” (Conciseness), and “Conceptual Unity” (Relevancy and Topic Maintenance). It should be noted that none of the exploratory factor analyses have been followed by confirmatory procedures and neither study reported factor reliabilities.

With content validity varying substantially across tests and questionnaires/checklists, and with structural/dimensional validity almost invariably left unexamined, it may also be instructive to characterize the diagnostic and ecological validity of these PLC instruments. If we parse ecological validity into two aspects, one focusing on verisimilitude (i.e., the degree to which test demands mirror those faced in one’s everyday environment) and one focusing on veridicality (i.e., the degree to which a test shares variance with measures of everyday functioning; Chaytor and Schmitter-Edgecombe 2003), it is clear that the tests and questionnaires/checklists differ substantially. Questionnaires/checklists and observational ratings appear to have greater verisimilitude than the tests reviewed, with the latter appearing to assess aspects of meta-pragmatic awareness or judgment rather than the degree to which a child actually uses his or her PLCs appropriately in everyday contexts of interaction. Knowing what one should or might say in response to a verbally or pictorially depicted social situation (meta-pragmatic awareness) is quite different from actually saying what is acceptable when confronted in everyday life with a real situation (pragmatic skill), as numerous studies in moral development and ethics have shown (e.g., McColgan et al. 1983; Thoma et al. 1991). On the other hand, many of the tests and questionnaires/checklists report empirical findings demonstrating that their scores share variance with measures of real-world functioning (such as symptom rating scales), and thus both appear to be amassing evidence for their veridicality.

Veridicality and verisimilitude aside, it would appear that none of the tests or questionnaires/checklists has demonstrated its diagnostic validity, if we mean by diagnostic validity the provision of replicable evidence of acceptable sensitivity and specificity across the age levels the instrument purports to assess. Since a “Pragmatic Communication Disorder” is not recognized in the current DSM or ICD nosologies, a lack of diagnostic validity can be viewed as unsurprising, or even as a moot point. However, as descriptors of PLC problems change from “deficits” to “impairments,” and perhaps to a more widely recognized developmental communication disorder (i.e., not one recognized only by speech language pathologists), the attainment of diagnostic validity will be crucial.

Discussion

A growing body of research suggests that PLC deficits may have a level of comorbidity with psychiatric disorders in children on par with the elevated levels found for structural language disorders. These PLC deficits are not confined to children falling along the autism spectrum, where such difficulties comprise one of three major classes of symptoms in autism and a major class of symptoms for the child with Asperger’s Disorder. Instead, PLC deficits and impairments have been shown to be strongly associated with ADHD, ODD, and CD and are associated with internalizing disorders as well (e.g., Baker and Cantwell 1982a, b; Beitchman et al. 1986, 1996; Cantwell et al. 1981; Cohen et al. 1998; Ginsburg et al. 1998; Tse and Bond 2004). The most conservative interpretation of the relationship between PLC deficits and childhood psychiatric disorders is that PLC deficits place a child at substantial risk for having or developing a psychiatric disorder, and vice versa. The least conservative interpretation is that there is a heretofore widely unrecognized communication disorder, Pragmatic Communication Disorder, distinct from Expressive and Expressive/Receptive Communication Disorders, whose symptoms have been included in and conflated with the diagnostic criteria of other DSM-recognized disorders. One need only accept the most conservative interpretation, however, to recognize that the development and improvement of evidence-based PLC assessments and instruments are crucial to better understanding the causes of comorbidity between each domain of language functioning, including PLCs, and psychiatric illness. With this understanding, treatment interventions could focus on validated treatment targets.

The PLC domains probed by the various tests and questionnaires/checklists included in this review could be coded with substantial inter-rater agreement. Devising two indices of the salience of the pragmatic domains across the tests and tasks revealed a set of core PLC domains. The six highest-ranked core areas, collapsing over tests and questionnaires/checklists, were: Requests; Nonverbal Communication; Negotiations, Directions, or Instructions; Speech Characteristics and Fluency; Theory of Mind and Emotion Language; and Narrative. These represent two early-developing domains (nonverbal communication and speech fluency) and three relatively late-developing domains (Narrative; Theory of Mind and Emotion Language; and Negotiations, Directions, and Instructions). Requests was the only PLC domain from the middle developmental phase, in which speakers learn how to engage in simple discourse exchanges and rounds. If the analysis is restricted to the broader-band instruments that probe at least six PLC domains, the top five PLC domains are the same.
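The ranking procedure described above can be sketched informally as follows: each domain is ranked once by the total number of probes directed at it across instruments and once by the number of instruments that target it, and the two ranks are combined. In this sketch, the Requests counts are those reported below (191 probes across 17 instruments); all other counts are hypothetical placeholders, and the averaging rule is an assumption for illustration only, not the procedure actually used in the review.

```python
# Two hypothetical salience indices for PLC domains:
#   (1) total probes directed at the domain across all instruments
#   (2) number of instruments that target the domain
# Domains are ranked on each index and the ranks are averaged.
# Only the Requests counts come from the text; the rest are placeholders.

counts = {
    "Requests": (191, 17),                           # reported in the review
    "Nonverbal Communication": (150, 15),            # hypothetical
    "Narrative": (120, 12),                          # hypothetical
    "Theory of Mind / Emotion Language": (110, 13),  # hypothetical
    "Speech Characteristics / Fluency": (100, 11),   # hypothetical
}

def salience_order(counts):
    """Return domains ordered by mean rank across the two indices."""
    domains = list(counts)
    # Rank 1 = most salient on each index (descending sort).
    by_probes = sorted(domains, key=lambda d: -counts[d][0])
    by_instruments = sorted(domains, key=lambda d: -counts[d][1])
    mean_rank = {
        d: (by_probes.index(d) + by_instruments.index(d)) / 2 + 1
        for d in domains
    }
    return sorted(domains, key=lambda d: mean_rank[d])

print(salience_order(counts)[0])  # Requests ranks first on both indices
```

With any plausible placeholder counts, Requests tops both indices, which is consistent with its first-place ranking reported below.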

Rationales for the inclusion of even these core areas in questionnaires/checklists and in tests, however, were not often explicitly developed. For example, the top-ranked PLC was Requests—it had the most probes directed at it (N = 191) and was a target in the greatest number of instruments (N = 17). There are numerous reasons why this place of distinction makes good sense, including: (1) The study of requests is one of the most prolific areas of research in developmental pragmatics, if not the most prolific, extending from the preverbal (proto)requests of infants to those made with subtle degrees of indirection and politeness by teenagers and adults (e.g., Brown and Levinson 1978; Ervin-Tripp 1976; Garvey 1975; Goody 1978); (2) Requests are quintessentially social, always involving two roles, the requester and requestee; (3) According to an influential theory of speech acts, the Request To Be Heard is represented in the highest node of the semantic representation of each and every utterance, and thus Requests are implicated in all acts of speech (Labov and Fanshel 1977; Ross 1970; Sadock 1974); and (4) Requests can vary systematically in linguistic form, degrees of directness, and prosodic features, making their repertoire incredibly diverse and flexible, with nearly unlimited social utility. Deficits or impairments in the ability to make requests can thus be seen to have ramifications across all aspects of language communication and social adaptation, justifying their central place in any assessment of PLC (Russell and Koch 1991). The more difficult question is: What is the smallest set of items, targeting precisely which aspects of requests, that must be included in an assessment instrument in order to optimize content, ecological, and diagnostic validity?
The complexity of this question grows as one appreciates the range of sophistication (e.g., in cognitive, emotional, and interpersonal spheres, not to mention in language per se) that spans from the simplest requests (“Milk, Mommy”) to the most complex indirect ones. Obviously, the same question needs to be addressed for each pragmatic domain.

For example, extensive rationales for a similarly central place in PLC assessment can be made for the ability to use and understand internal state language, intentionality, perspective-taking, and other aspects of Theory of Mind and Emotion Language. In fact, according to one theory, the full range and appropriate selection of polite, direct, and indirect requests presupposes advanced Theory of Mind and Emotion Language abilities for use in assessing the relative solidarity and status/power of the requester/requestee and the cost (social, monetary, time, etc.) to the requestee if the request should be granted (Brown and Levinson 1987). Recent research and theory also suggest that PLCs and Theory of Mind and Emotion Language are intricately interwoven and require intact frontal lobe and executive function development; there is thus considerable evidence that Theory of Mind and Emotion Language abilities should be probed by PLC assessment instruments as well (Abu-Akel 2003; Bishop and Norbury 2005; Carlson et al. 2004; Frith and Frith 2003; Kuperberg et al. 2000; Martin and McDonald 2003; Sperber and Wilson 2002; Stuss et al. 2001). But how many and which Theory of Mind and Emotion Language processes or markers must be included in a PLC assessment instrument? Is the use of internal state language and emotion terms a sufficient marker, or must others be included as well?

The question of which domains must be included in a test or questionnaire/checklist, and at what age bands, to achieve a valid estimate of a child’s PLC level is far from being answered. For example, narratologists and folklorists tell us there are a delimited number of story plots available to speakers, but nowhere in the tests and questionnaires was the extent of plot repertoires queried (Russell and Bryant 2003; Russell and van den Broek 1988; Russell and Wandrei 1996). Similarly, items probing joint attention were conspicuously lacking. Conversely, one can also question why Speech Fluency was such a central domain across these tests and tasks, especially when most of the items probing Speech Fluency did not concern prosody, a recognized indicator of speaker emotion and an impaired area of speech in children along the autism spectrum. Obviously no test can contain multiple items sampling each and every type of nuance within and across PLC domains and remain practicable. But, clearly, the content validity of most of the tests and tasks reviewed appears insufficient or lacking in empirical warrant, in the sense of not being based on explicit derivational or empirical procedures. The fact that PLCs have a rather dramatic course of development at least through adolescence adds a further level of complication for achieving content validity, especially as there seem to be disparate rates of development of PLCs across these years.

The lack of correlation between the salience of PLC domains in tests and in questionnaires/checklists is worrisome, as both types of instruments purport to assess aspects of the same underlying construct(s), and there is a dearth of research showing that outcomes of evaluations using PLC tests and questionnaires/checklists correlate significantly with each other at large effect size magnitudes. Further, it was demonstrated that the instruments differ in their developmental focus, with questionnaires/checklists focusing more on the precursors and enablers (PE) and less on the more advanced discourse skills (Extended Literal and Nonliteral Discourse) than tests. In addition, in describing the ecological validity of the tests versus the questionnaires/checklists, it appeared that they may in fact be assessing different constructs—pragmatic skill versus meta-pragmatic awareness. Although it is possible that measures of meta-pragmatic awareness and pragmatic skill are so highly correlated that their distinction is practically and diagnostically unnecessary, in the moral development domain, awareness turns out to be a less robust predictor of actual behavior than was hoped (Kurtines and Greif 1974; McColgan et al. 1983; Thoma et al. 1991). This analogy would seem to have more than passing relevance when it is recalled that assessment of PLCs is focused not simply on the acquisition of this or that decontextualized skill, but on pragmatic skills that can be used appropriately in social contexts of interaction. The stipulation that the skill must be used appropriately has ineradicable moral connotations.

Clearly, the development of PLC instruments for clinical use is improving, even if no test or questionnaire has established and confirmed its dimensionality and diagnostic validity. In addition, a growing number of developmental psychopathologists involved in educational or neurocognitive assessment have joined speech language pathologists in recognizing the need for a new diagnostic category, Pragmatic Communication Disorder. Such a diagnostic category is needed to aptly characterize the PLC impairments that beset children with and without a variety of concomitant psychiatric disorders. Especially relevant for these researchers/clinicians are corroborated studies of PLC questionnaires/checklists or tests that provide estimates of their satisfactory sensitivity and specificity (see Bishop 2003, 2006 for exemplary advances in this direction; Russell et al. submitted). Here it will be necessary to show how the symptoms of Pragmatic Communication Disorder differ from those that characterize the other already recognized childhood communication disorders, namely, Expressive and Expressive/Receptive Communication Disorders. There is much research to be accomplished before Pragmatic Communication Disorder is widely recognized and incorporated into standard psychiatric nosologies. Moreover, better characterizations of deficits in PLCs will hopefully better elucidate the extent and patterns of their comorbidities with psychiatric disorders. Suggestions as to how PLC impairments congeal into recognizable interaction styles and how these styles relate to psychopathology have already begun (Russell 2007).

Until then, however, the many children with PLC impairments must still be assessed and targets must be identified for intervention. How should the practicing clinician proceed? Our review suggests several reasonable answers. For a broad screening of PLC functioning, several questionnaires and one observation instrument currently have the best content validity: the CCC, the ORS, the TASCC, and the PP. Although it is unclear whether their a priori dimensional structures will be supported by exploratory and confirmatory factor analytic studies, their composite groupings of subscales not only make conceptual sense, but can also guide the characterization of a child’s strengths and weaknesses across subdomains. However, of these instruments, only two (the CCC and the PP) provide norms. As a consequence, it would seem reasonable to use the CCC, which has both UK and USA norms, and/or the PP as a minimal screen for PLC deficits, in addition to structured observations. It should be noted that these instruments can be completed by multiple informants to achieve a more circumspect multisource view of a patient’s PLCs in different contexts. To supplement their use, several instruments can provide intense examination of a single PLC domain or a few such domains. For example, if a child appeared to have difficulty with understanding and/or using nonliteral language, on the basis of observation and questionnaire item analysis, it would be reasonable to select a more narrowly targeted assessment instrument, such as the CASL’s Nonliteral Language subtest. Those PLC domains whose scores departed most from normative levels, on both the broad screenings and the narrowly targeted assessments, would be reasonable targets on which interventions could be focused. Clearly, qualitative impressions of a child’s PLCs are useful in the assessment process.
However, psychologists and educational personnel will need to become more adept at recognizing pragmatic deficits in the children they evaluate and more conversant with the research detailing their consequences for the children’s adjustment (see Russell 2007 for several vignettes illustrating pragmatic lapses). But there is currently little justification for basing an assessment of a child’s PLCs on these impressions alone, whether in the context of a speech/language, educational, or psychological evaluation.

Even if the mental health and educational workforce were collectively adept at differentiating and recognizing PLC impairments in the children that they evaluate, there is currently a dearth of empirically supported protocols for use in their treatment. Clearly, developments in diagnostic instrumentation must be matched with an intensive effort to devise and evaluate treatment protocols for empirically identified PLC targets. Such protocols might be fashioned to piggyback on evidence-based treatments for the disorders with which PLC impairments most often occur, first in efficacy and then in effectiveness trials. In a word, much needs to be done.

Our review has attempted both to characterize progress in PLC assessment and to illustrate abiding shortcomings. The field is advancing rapidly, with new instruments under development and cross-national forms and norms of existing tests. These developments are overwhelmingly positive and need to be sustained. However, our review suggests that the prospects for the development of evidence-based PLC assessment instruments will largely rest on the degree to which the questionnaire/checklist and test innovators redress the type of shortcomings we have described and illustrated. Although these assessment instruments cannot supplant diagnostic acumen and clinical experience, their potential to augment them and produce more sensitive and specific pragmatic language evaluations is obvious.