Elder abuse has been defined as “a single or repeated act or lack of appropriate action, occurring within any relationship where there is an expectation of trust which causes harm or distress to an older person” [162]. This consensus definition was adopted by the World Health Organization [168], and other key bodies such as the International Network for Prevention of Elder Abuse (INPEA). It highlights the complexity of this latent construct, which incorporates concepts of the multiple types of abuse, characteristics of the abuser and abuse victim, the nature of the relationship between abuser and person abused, whether suffering is experienced by the abuse victim, the intention of the behaviour, whether the abuse is an act of commission (abuse) or omission (neglect), and the context of the abuse [41].

Many have called for more comprehensive approaches to the measurement or screening of elder abuse and neglect that reflect this complexity [38, 35, 41], but many challenges remain to achieve this. This chapter undertakes a systematic review of research between 1995 and 2015 of screening measures for elder abuse or neglect that have been at least partially validated, and discusses methodological issues associated with the development of effective screening instruments in this challenging area. Others have proposed that there is inadequate research to support the effectiveness of screening [125, 126], and highlighted the need for more research to determine whether screening is indicated. Thus, the review will also examine the limited evidence that addresses the effectiveness of screening, and the research, policy and practice implications emerging from the review.

1 Screening for Elder Abuse

Screening for elder abuse is defined as a process of eliciting information about abusive experiences in a caring or family relationship from older or vulnerable adults who do not have obvious signs of abuse such as physical injuries [126]. The rationale for screening among non-symptomatic people is that it may identify abuse not otherwise known, prevent future abuse, and reduce the risk of future health impacts resulting from the elder abuse [126]. Screening is considered particularly important for problems with serious health implications, and where overall rates of identification are considered to be low. This is certainly the case for elder abuse and neglect [8, 144].

A cornerstone of effective screening is the development of valid and reliable screening measures with low measurement error. This has proved to be a challenging task, not only because of the methodological issues identified above, but because elder abuse, like other forms of family and interpersonal violence, is a largely hidden phenomenon, occurring in the home or institutions, usually without witnesses [93]. Victims are often reluctant to disclose the abuse [115, 137] because of shame or fear of being judged [22], failure to identify the behavior as abusive [132], dependence on the abuser, or feeling that the abuse is their fault [113]. Furthermore, elder abuse is poorly understood in the community and, as for domestic violence, there can be a reluctance to question or interfere with what goes on within families.

The many faces of elder abuse add further complexity. While physical, sexual and, to some extent, financial forms of abuse are more readily measured and verified, other forms such as psychological, emotional, verbal, and coercive abuse, and neglect and abandonment are much more difficult to verify, or even for the elder to understand. Yet, these are the most prevalent forms of elder abuse [148]. In addition to the difficulties of measurement and identification of abuse, a number of other barriers have been noted by health professionals that impact on their ability to screen elders in their care. These include lack of time, lack of knowledge, lack of confidence that there are adequate resources and systems to address potential elder abuse, gaining sufficient privacy to ask the sensitive questions about abuse, and lack of skills in eliciting reports of abusive acts or situations [144].

There is a clear need for better measures of these more hidden forms of abuse since research has demonstrated considerable health impact of abuse and neglect [56, 50, 94, 109, 148]. In response, there has been growing interest over the past three decades in the development of valid and reliable screening instruments designed to detect risk of elder abuse and neglect in different contexts. Screening measures have been categorised in different ways. For instance, Cohen [25] categorised screening instruments into three groups based on both method and intention of the screening instrument. The first group comprised direct questioning tools that ask about the elders’ experience, the second group were tools that assess signs of actual abuse, and the third group were measures designed to assess risk of abuse. Each of these has both strengths and limitations as a screening measure, and Cohen has argued for a comprehensive screening model that incorporates all three forms of screening. The majority of screening instruments incorporate the direct questioning method and assessment of risk of abuse. Those that assess signs of actual abuse tend to be used more in the phase of substantiating abuse [8] and are not the primary focus of this review on screening.

Direct questioning tools comprise a set of screening questions that are intended to be completed either by the elders themselves or by a health professional or researcher asking questions of the elder. Direct questioning tools tend to ask whether the elder has experienced a range of abusive behaviors by a caregiver or someone known to them, and they depend on the elders’ willingness and ability to disclose abusive behaviors. Examples of direct questioning items about abuse include questions such as “Has anyone tried to hurt or harm you recently?”, “Has anyone made you feel afraid?”, or “Has anyone taken any of your belongings without your permission?”. Direct questioning tools are useful to researchers who seek to establish prevalence and risk factors of abuse in certain populations, or who need to screen a larger population to identify those who have experienced potentially abusive actions. They are also used by clinicians to screen for abuse among elderly patients presenting in primary care or hospital settings to better understand the nature of care relationships and the possible contribution of abuse to the medical presentation.

Another common approach to screening is to ask questions that probe for risk indicators for abuse. This has been justified because it can be difficult to get reliable answers to questions about direct abuse, and because risk factors have been demonstrated to reliably discriminate between abuse and non-abuse cases [26, 139]. Examples of the risk factor approach include checklists of risk factors assessed by professionals such as the Indicators of Abuse Screen (IOA) [139], and self-complete questionnaires about risk factors such as the Vulnerability to Abuse Screening Scale (VASS) [146]. Risk factors may include both client and care-giver factors associated with abuse, as well as characteristics of the situation. Many measures combine types of questions, including direct questions about specific behaviors, actual harm caused, and risk indicators.

Another way of categorising screening instruments is to consider the setting and purpose of the screening tool. One set of screening instruments has been designed for mass screening of elders and/or their caregivers at the community or population level. Another set of instruments has been designed for more targeted screening among the elderly in health, welfare and institutional service settings. The following review of elder abuse and neglect screening instruments classifies them broadly into these two categories, primarily based on the intention when the scale was developed, and the context of the original validation studies. It is acknowledged, however, that some instruments can potentially be used both ways. The chapter also separately reviews scales that have been developed to screen specifically for elder neglect and self-neglect, since these constructs pose additional measurement challenges as they seek to assess the absence of caring behaviour, rather than the presence of abusive behaviour.

2 Systematic Literature Search for Screening Measures for Elder Abuse and Neglect

A systematic literature search was undertaken to identify and review validated screening measures for elder abuse and neglect. A search was made of English language articles on two key electronic databases, Medline and PsycINFO, from 1995 to 2015. The terms entered were: (exp Elder Abuse OR [“elder abuse” or “elder mistreatment” or “elder neglect” or “elder self neglect” or “elder financial abuse” or “elder financial mistreatment” or “elder sexual abuse”]) AND (exp Screening [or exp Mass screening] OR [measurement or screening or survey* or questionnaire* or observation*]). The PsycINFO search produced 312 records and the Medline search produced 392 records, making a total of 704 records. An additional 65 records were identified using manual searches and reference lists of reviewed articles. After excluding 155 duplicate or irrelevant records, there were 614 records remaining. Abstracts were reviewed first, and then the full articles for those deemed potentially relevant were assessed. After the review of abstracts, a further 274 were excluded: 39 not applicable, 50 books, and 116 non-empirical articles. The remaining 340 records were retrieved and reviewed. From this more detailed review, a total of 33 screening tools met the minimum criteria of having detailed the development of the scale, reported items and scale structure, and some evidence of psychometric evaluation. A total of 10 scales were designed for use in community-based screening, 16 for screening in healthcare settings, and a further 7 scales were designed to screen specifically for neglect and self-neglect. These are reviewed in the following three sections (Fig. 9.1).

Fig. 9.1 Flow diagram for systematic review
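The record counts reported for the search can be tallied step by step. The following is a minimal sketch that simply re-adds the figures quoted in the text; the variable names are illustrative and are not taken from the original review.

```python
# Record counts exactly as quoted in the description of the literature search.
psycinfo_records = 312
medline_records = 392
manual_records = 65            # manual searches and reference lists
duplicates_or_irrelevant = 155
excluded_at_abstract = 274
# Exclusion reasons that the text itemises at the abstract stage.
itemised_exclusions = {"not applicable": 39, "books": 50, "non-empirical": 116}
# Included tools by screening setting.
tools_by_setting = {"community": 10, "healthcare": 16, "neglect/self-neglect": 7}

database_total = psycinfo_records + medline_records
identified = database_total + manual_records
screened = identified - duplicates_or_irrelevant
full_text_reviewed = screened - excluded_at_abstract
included_tools = sum(tools_by_setting.values())
itemised_total = sum(itemised_exclusions.values())

print("Database records:", database_total)
print("Abstracts screened:", screened)
print("Full texts reviewed:", full_text_reviewed)
print("Tools included:", included_tools)
print("Itemised abstract-stage exclusions:", itemised_total)
```

The stage totals reproduce the 704, 614, 340 and 33 given in the text. The three itemised reasons account for 205 of the 274 abstract-stage exclusions; the text does not break down the remainder.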

3 Measures for Community-Based Screening for Elder Abuse and Neglect

Community-based screening measures aim to identify the prevalence and determinants of elder abuse in the community. These measures are suitable for use in research (e.g., to assess prevalence and risk factors), as well as for screening older people who access services to determine who is at risk of abuse or neglect. They can be self-completed or administered via interview. Ten scales were identified that met the review criteria of being designed primarily for use in community-based studies, and including some reliability and validity evaluation. A critical review of these measures is provided in the following subsections. Key characteristics of these measures are summarised in Table 9.1, with details related to the origin of the scale, the number of items and subscales, the intended respondent for the measures and the method of administration. In addition, details of the psychometric properties of each scale are provided along with the types of samples on which the validation was undertaken. A summary of strengths and limitations of each measure is also provided. It is noted that other large scale community studies were published in the 1995–2015 period, but these were not included due to inadequate description of the measure or inadequate reporting of psychometrics.

Table 9.1 Validated measures for screening for elder abuse and neglect at population level

3.1 Abuse of the Elderly in the European Region (ABUEL)

The Abuse of the Elderly in the European Region (ABUEL) study is one of the very few cross-cultural studies available that have used a common elder abuse scale [103, 108, 111, 153]. It examined elder abuse among representative community samples aged 60–84 years from cities in 7 European countries (N = 4,467, 57 % women). These were: Stuttgart, Germany; Athens, Greece; Ancona, Italy; Kaunas, Lithuania; Porto, Portugal; Granada, Spain; and Stockholm, Sweden. Unfortunately, sampling strategies (registry, random community, and cluster sampling) and modes of administration (interview and self-completed written questionnaires) varied across countries. The scale was translated using recommended forward and backward translation, and some items were modified for cultural fit, although details of this are not reported.

The elder abuse questionnaire included 52 items based on the CTS2 [156] and the UK study of abuse/neglect of older people [16], although only examples of items are reported. It measured both the severity and chronicity of abuse in six domains. Psychological Abuse was assessed using 11 items (6 severe abuse, 5 minor abuse); Physical Abuse with 17 items (10 severe, 7 minor); Injuries with 7 items (4 severe, 3 minor); Sexual Abuse with 8 items (5 severe, 3 minor); Financial Abuse with 9 items (5 severe, 4 minor); and Neglect, covering unmet needs (13). Chronicity was assessed by asking about the frequency of abuse on an 8-point scale. Good internal reliability statistics are reported across countries, with Cronbach α of 0.85 for Psychological Abuse, 0.80 for Physical Abuse, 0.72 for Injury, 0.76 for Sexual Abuse, and 0.71 for Financial Abuse [153]. Validity was supported by associations with some hypothesized constructs such as gender and mental health.
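Because Cronbach's α is the main reliability statistic quoted throughout this review, a brief sketch of how it is computed may help. For a scale of k items, α = k/(k − 1) × (1 − Σ item variances / variance of the total score). The function name and the response data below are hypothetical and purely illustrative; they are not drawn from any of the studies reviewed.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    if k < 2 or any(len(row) != k for row in responses):
        raise ValueError("need at least two items and equal-length rows")
    # Sample variance of each item (column) across respondents.
    item_variances = [variance(column) for column in zip(*responses)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses from five people to a four-item scale.
illustrative_responses = [
    [3, 4, 3, 4],
    [1, 2, 1, 1],
    [4, 4, 5, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
]
alpha = cronbach_alpha(illustrative_responses)
print(round(alpha, 2))
```

Values closer to 1 indicate that the items vary together. The total-score variance must be non-zero, so the function is undefined when every respondent has the same total.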

3.2 Abuse and Violence Against Older Women (AVOW)

The Abuse and Violence against Older Women (AVOW) study undertook a formative evaluation of a newly developed elder abuse scale [96]. The scale was based on six different types of abuse defined by the World Health Organisation [168] and adaptation of items in the Conflict Tactics Scale (CTS) [155]. The original 34 items comprised domains of neglect and emotional abuse (9 items each), and financial abuse, physical abuse, sexual abuse and violation of personal rights (4 items each). Responses are provided by elders for the last year on a 4-point frequency scale. Three self-report methods of administration have been used in a large study of 2880 home-dwelling older women aged 60 and over from 5 European countries: postal survey, face-to-face interview, or telephone interview [96]. Relatively limited psychometrics have been reported, with no data available on internal consistency reliability. Rather, analysis of items with high missing data and low prevalence was used to reduce the 34 items to 22 indicators that maximised reliability of prevalence estimates. Concurrent validity was tested by examining relationships between abuse and quality of life.

3.3 Conflict Tactics Scale (CTS)

The Conflict Tactics Scale (CTS) was one of the earlier measures developed for assessing conflict and abusive behavior in close personal relationships [155]. It was originally designed to measure the extent to which partners in an intimate relationship engage in psychological and physical attacks on each other and their use of negotiation to deal with conflicts. The original CTS had 19 items with three validated subscales: Violence (9 items), Verbal Aggression (6 items), and Reasoning (3 items). The CTS was designed for self-completion by the couple, and is suitable for use in elder abuse research for completion by the older person and their care-giver, or simply by the elderly. Parallel forms of the scale are designed to be answered both in relation to one’s own behavior towards a partner (carer) and one’s experience of the partner’s (carer) behavior towards the self (total 38 items).

The scale was developed based on theoretical principles derived from conflict theories [155]. It has face validity in terms of physical and psychological aggression but covers a very limited range of psychologically abusive behavior. It does not measure some important aspects of elder abuse such as coercive and intimidating behavior, sexual abuse and financial abuse. It has been found acceptable to family members, with reasonably good completion rates. There has been considerable research on the psychometric properties of the CTS, which has demonstrated moderate to high reliabilities for the total scale and the three subscales [155], and some construct and concurrent validity [155]. For instance, the internal consistency of the subscales ranged from Cronbach’s alphas of 0.75–0.98, with a total scale alpha of 0.94. However, much of the early validation work has been on couples or student samples rather than on the elderly [155]. Criticisms of the CTS have included the limited range of abuse behaviors, the skewed nature of scale distributions, and limited validation. Studies using this scale have often modified the scale or used different criteria to establish elder abuse, making it difficult to compare across studies. In response to these criticisms, a revised and expanded version, the CTS2, was produced.

3.4 Conflict Tactics Scale V2 (CTS2)

The CTS2 represents a considerable expansion of the CTS and addressed some identified limitations [156]. It comprises a total of 78 items (2 × 39 pairs of items designed to measure acts by partner towards self and acts by self towards partner). The modified original scales were renamed as Physical Assault (12 items), Psychological Aggression (8 items), and Negotiation (6 items), and two new scales were included: Injury (6 items), and Sexual Coercion (7 items). The validation sample comprised 371 undergraduate students from the US, with a mean age of 22 years, responding to the scale in relation to a dating, cohabiting or marital relationship. Reliabilities ranged from 0.79 to 0.95 for subscales and some construct and discriminant validity was demonstrated. None of the items were specific to the elderly or addressed unique forms of elder abuse. For instance, it does not address neglect, psychological coercion, or financial abuse. The CTS2, or particular subscales, have been used in a wide range of elder abuse studies, often supplemented by additional items or scales designed to tap into unique aspects of elder abuse and neglect (e.g., [14, 36, 37, 68, 106, 174]).

3.5 Modified Conflict Tactics Scale (MCTS)

The MCTS provides a further modification of the CTS designed more specifically for caring relationships, and with modified items suitable for elder abuse [14]. It comprises 10 items for the elder person and 10 parallel items for their caregiver: 5 items related to Psychological Mistreatment and 5 items related to Physical Abuse, so each person answers 20 items in total. The MCTS has been validated on a sample of 265 family carer and care recipient dyads from three US cities, with moderate internal reliability on the 10 item scales (α = 0.69 elder, 0.67 caregivers). Both convergent and discriminant validity have been established [38, 35]. The scale has been reported in further studies [36].

3.6 Modified Caregiver Strain Index (MCSI)

The MCSI [159] is a modification of the Caregiver Strain Index first developed by Robinson [140]. The MCSI is a 13 item questionnaire with items covering the domains of employment, financial, physical, social and time strains experienced by carers. It has been validated in a community sample of 158 family caregivers aged 53 and older, with a mean age of 61. High internal and 2-week test–retest reliabilities have been reported for the scale (α = 0.90, 0.88, respectively). It is easy and quick to administer to those providing long-term care to a family member. Limitations of the scale are that it relies solely on the caregiver perspective, so does not take into account the elders’ perspectives on possible abuse. As well, the scale does not target abuse behaviors specifically, but rather is a measure of general strain associated with the caregiving role. Its utility is most likely in identifying high strain caring dyads where more tailored assessment for elder abuse and neglect could be directed.

3.7 Native Elder Life Scale (NELS)

The Native Elder Life Scale (NELS) was developed by Native American community elders and experts to address unique cultural aspects of elder abuse in Native American communities [84]. In particular, it was designed to fill a perceived gap in financial exploitation and neglect items for this population. The scale is made up of two subscales: the NELS-Financial Exploitation (NELS-FE) subscale comprises 18 items and the NELS-Neglect subscale comprises 12 items. The scale is designed to be administered to elders by Native American research staff. Validation studies were undertaken in 2 small communities of Native Americans aged 60 and over (N = 100); a sample of 50 Native Americans was recruited from a tribal senior centre on a Northern Plains reservation, and another 50 from two Protestant churches in urban South Central USA. The scale demonstrated moderate reliability but with variation across the two communities for the NELS-FE. Internal consistency reliability of the NELS-FE was 0.65 total, with a range of 0.75–0.53 for the two communities. The small sample sizes may partially explain this less consistent reliability. The NELS-Neglect scale had a reliability of α = 0.78 total, ranging from 0.80 to 0.78 for the two communities, suggesting this is the more reliable subscale. The NELS provides a useful starting point for screening Native American communities for elder abuse, but the lack of validity tests suggests much more work is needed.

3.8 Unnamed Korean Elder Abuse Scale

A Korean elder abuse scale contains a checklist of 25 items, 5 for each of 5 categories of abuse: emotional, verbal, economic, neglect, and physical [127]. It was developed from literature reviews, then reviewed by gerontological nurses, pilot tested and refined. Respondents are asked the frequency of each type of abuse in the past month. Abuse is considered present if it has occurred 2 or more times in the last month, and total abuse is the sum of all incidents. Data were gathered by a structured interview in the home from a population-based sample of adults aged 65+ in one area of Seoul, Korea (N = 15,230, 52 % of the population of the area). The subscales have demonstrated internal consistency reliability with this Korean sample, with α = 0.70–0.92. Some validation of the scale was demonstrated by expected associations with variables such as disability, sick days, cognitive function, living circumstances, economic level and family relationships. As for most elder abuse scales, validity is conditional on participant willingness to disclose abusive acts.
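The scoring rule described for this checklist (abuse present at 2 or more occurrences in the past month, total abuse as the sum of all incidents) can be sketched as follows. This is an illustration only: the source does not state whether the threshold applies per item or per category, so the sketch assumes it applies to the frequency summed within each category, and the function and variable names are hypothetical.

```python
CATEGORIES = ("emotional", "verbal", "economic", "neglect", "physical")
ITEMS_PER_CATEGORY = 5
PRESENCE_THRESHOLD = 2  # occurrences in the past month, as stated in the text


def score_checklist(frequencies):
    """Score past-month item frequencies given as {category: [five counts]}.

    Assumption: the threshold is applied to the summed frequency per category.
    """
    per_category = {}
    for category in CATEGORIES:
        counts = frequencies[category]
        if len(counts) != ITEMS_PER_CATEGORY:
            raise ValueError(f"{category} needs {ITEMS_PER_CATEGORY} item counts")
        per_category[category] = sum(counts)
    present = {c: n >= PRESENCE_THRESHOLD for c, n in per_category.items()}
    total_abuse = sum(per_category.values())
    return present, total_abuse


# Hypothetical respondent: two emotional incidents and one verbal incident.
example = {
    "emotional": [1, 1, 0, 0, 0],
    "verbal": [1, 0, 0, 0, 0],
    "economic": [0, 0, 0, 0, 0],
    "neglect": [0, 0, 0, 0, 0],
    "physical": [0, 0, 0, 0, 0],
}
present, total_abuse = score_checklist(example)
print(present["emotional"], present["verbal"], total_abuse)
```

Under this reading, the example respondent meets the threshold for emotional abuse but not verbal abuse, with a total of three incidents.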

3.9 Vulnerability to Abuse Screening Scale (VASS)

The Vulnerability to Abuse Screening Scale (VASS) is a brief 12 item questionnaire designed to assess risk of elder abuse over the past 12 months [146, 149]. It has 4 subscales of three items each with yes/no response options, and is supported by psychometric evaluation. The subscales are Vulnerability, Dependence, Dejection, and Coercion. Ten items were adapted from the Hwalek-Sengstock Elder Abuse Screening Test (H-S/EAST) [81, 123], with two additional questions: “Has anyone close to you called you names or put you down or made you feel bad recently?” from the CTS [155], and “Are you afraid of anyone in your family?” [110]. The VASS is designed for self-completion by older adults and has been validated on a large population-based sample of 10,421 women aged 73–78 from the Australian Longitudinal Study on Women’s Health [146, 149].

The four subscales were derived through factor analysis and have face and content validity. The vulnerability and coercion scales have greatest face validity for abuse as items ask direct questions about experience of abusive behavior, and have demonstrated moderate to good construct validity [146]. The dejection factor resembles a measure of depression, whereas the dependence factor seems to measure vulnerability to abuse, since frailty is a recognised risk factor [148]. Reliability of subscales is adequate only for the dependence subscale (α = 0.74), whereas the other subscales have alphas ranging from 0.45 for vulnerability to 0.31 for coercion. The factor structure has been found to be stable over time by analysis of repeated waves of the longitudinal study [146, 147, 149, 148]. Construct, convergent and discriminant validity was largely supported by correlations with hypothesized variables in the predicted direction and lack of correlations with those deemed unrelated. For instance, the vulnerability and coercion factors were correlated with relationship conflict or breakdown, and to coercive life events such as “major conflict with children”, or “being pushed, grabbed, shoved, kicked or hit” in the last 12 months but unrelated to variables such as height and body mass index [146].

The VASS has shown predictive validity for poor health outcomes over 3 years [147], and for mortality and disability outcomes over 12 years [148]. The measure is very short, easy to complete, and low cost, and is suitable for screening in research and clinical settings. Like most screening measures, the scale is based on self-report and relies on recall and willingness to report abuse. It is also limited by having only one item measuring neglect, and none specifically measuring sexual abuse. The scale has been used and validated in a range of other studies. For instance, Dong and colleagues have used a 10-item modified version of the VASS in studies of Chinese populations in both the USA and Mainland China and demonstrated good reliability of the modified scale with these populations [21, 47, 55, 45]. It has also been incorporated into a national survey of US elderly [98], and another Chinese study [170].

3.10 Women’s Health and Relationship Survey (WHRS)

The Women’s Health and Relationship Survey (WHRS) measured five types of abuse via telephone interview among 842 community-dwelling women aged 60–90 years in the US [67]. Factor analysis of items yielded five subscales: psychological/emotional (3 items), control (3 items), threat (2 items), physical (4 items), and sexual abuse (3 items). The items were adapted from two sources: the Abusive Behavior Inventory [151] and the National Violence Against Women Survey [160]. For each item, women responded about whether they had experienced the behavior “since you turned 55”. Abuse was scored in four different ways: any abuse, repeated abuse (more than one type within a subscale), multiple abuse (at least two different types of abuse), and frequency of each type rated on a 5-point Likert scale from “never” to “very frequently”. Moderate reliability was reported for psychological/emotional abuse, physical abuse, and sexual abuse, with Cronbach alphas ranging from 0.64 to 0.71 for these 3 subscales. Alphas for the other two scales were less than adequate at 0.59 and 0.52 (see Table 9.1). Validity is partially supported by the factor analysis into five scales, the face validity, and relationships between abuse and other constructs such as the number of self-reported chronic health conditions and specific current health conditions, although limited psychometric evaluation has been reported. The self-report data are uncorroborated, and no predictive validity testing has been undertaken.
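The first three scoring approaches described for the WHRS can be illustrated with a short sketch. It assumes that "repeated abuse" means more than one item endorsed within the same subscale and that "multiple abuse" means items endorsed in at least two different subscales; the subscale keys and function name are hypothetical, and the 5-point frequency rating is not modelled.

```python
# Item counts per subscale as reported in the text; the keys are illustrative.
SUBSCALE_ITEMS = {
    "psychological_emotional": 3,
    "control": 3,
    "threat": 2,
    "physical": 4,
    "sexual": 3,
}


def summarise_abuse(endorsed):
    """Summarise yes/no item endorsements given as {subscale: [bool, ...]}."""
    counts = {}
    for name, n_items in SUBSCALE_ITEMS.items():
        answers = endorsed[name]
        if len(answers) != n_items:
            raise ValueError(f"{name} needs {n_items} answers")
        counts[name] = sum(bool(answer) for answer in answers)
    subscales_endorsed = sum(1 for count in counts.values() if count > 0)
    return {
        "any_abuse": subscales_endorsed >= 1,
        # Assumed reading: more than one item endorsed within one subscale.
        "repeated_abuse": any(count > 1 for count in counts.values()),
        # Assumed reading: at least two different subscales endorsed.
        "multiple_abuse": subscales_endorsed >= 2,
    }


# Hypothetical respondent endorsing two control items and nothing else.
respondent = {
    "psychological_emotional": [False, False, False],
    "control": [True, True, False],
    "threat": [False, False],
    "physical": [False, False, False, False],
    "sexual": [False, False, False],
}
summary = summarise_abuse(respondent)
print(summary)
```

For this hypothetical respondent the sketch flags any abuse and repeated abuse, but not multiple abuse, because only one subscale is endorsed.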

3.11 Summary of Community-Based Screening Measures

The CTS and later modifications provided an important impetus to the field of measuring abuse and conflict in close relationships and have been widely used, particularly for research on intimate relationships. However, the CTS is a relatively long screening measure that was not developed specifically for elder abuse and fails to measure some key aspects of elder abuse and neglect. Furthermore, although it is the most widely used scale, most studies have modified it, making comparisons across studies difficult. A useful screening measure for elder abuse and neglect must be brief, easily completed and targeted to the specific types of abuse most common in the phenomenon of elder abuse and neglect. While there has been a growing body of research attempting to measure elder abuse and neglect, the field has suffered from the lack of a definitive gold standard, and most scales seek to measure risk or vulnerability rather than abuse per se, or combine both direct and risk or vulnerability questions. Among the brief and partially validated scales designed for community-based screening, the VASS has a number of advantages, making it one of three screening measures recognised by the Centers for Medicare and Medicaid Services [19, 150]. It is very brief and easy to complete and score (sum or mean of responses), and has been shown to have moderate reliability, and construct, concurrent, discriminative, and predictive validity, although further validation work is required. In particular, further work is needed to measure sexual and financial abuse among the elderly, and to strengthen the internal reliability of some scales. It would be useful to the field to develop some consensus on an appropriate scale for community screening, so that studies could be compared more readily.

4 Screening for Elder Abuse and Neglect in Healthcare and Institutional Settings: Review of Measures

Compared with the relatively limited research described above on community-based screening, considerably more effort has been expended on identifying those who have experienced abuse in settings where signs and indicators of abuse may be more readily assessed. The greatest body of research on screening for elder abuse and neglect occurs in healthcare and institutional settings. A review of methods for substantiating abuse when potential victims are reported to adult protective services is beyond the scope of this chapter. The review identified 16 scales used in the period 1995–2015 that report some psychometric evaluation. These scales are shown in Table 9.2 and reviewed below.

Table 9.2 Validated measures for screening for elder abuse and neglect in health care and institutional settings

4.1 Brief Abuse Screen for the Elderly (BASE)

The Brief Abuse Screen for the Elderly (BASE) is a 5-item set of screening questions that trained health care professionals ask the elderly and their care-givers [138, 139]. The items address physical, psychosocial, and financial abuse or neglect by the caregiver. It takes only about 1 min, and is suitable for clinical settings. The health care professional requires some training in how to ask the questions and interpret the responses, as no scoring method is supplied. The internal consistency reliability was found to be 0.91, inter-rater reliability was 90 % (using a very small sample of raters), and predictive validity ranged from 0.89 to 0.91. However, the scale has essentially one item for each type of abuse, so it is not a scale in the full sense.

4.2 Caregiver Abuse Screen for the Elderly (CASE)

The Caregiver Abuse Screen for the Elderly (CASE) is an 8-item set of screening questions designed for trained health care professionals to ask care-givers to assess the risk of abuse [138, 139]. The items address physical, psychosocial, and financial abuse or neglect by the caregiver. It takes about 1 min, and is suitable for clinical settings. The health care professional requires some training in how to ask the questions and interpret the responses, as no scoring method is supplied. The internal consistency reliability was found to be 0.71.

4.3 Elder Abuse and Neglect Assessment Instrument (EAI)

The Elder Abuse and Neglect Assessment Instrument (EAI), a 41-item checklist, was designed for administration by health care providers to screen elderly patients in clinical settings [77, 72]. The measure assesses both objective signs and symptoms, and subjective complaints of abuse, neglect, exploitation, and abandonment across four domains: physical, level of independence, medical, and social. Sound psychometrics have been reported based on a random sample of 501 elder emergency department patients in the US. Psychometric results include an internal consistency reliability of 0.84, test–retest reliability of 0.83, content validity index of 0.83, and inter-rater agreement of 0.84. The scale has a sensitivity of 71 % and specificity of 93 %. The scale is considered easy to administer in clinical settings though it is longer than other screening options. There is no published scoring system so it relies on clinician judgement. The checklist has been used in a variety of studies (e.g., [71, 72, 73, 74]), and progressively refined. More recent studies have used either a 44-item scale designed to identify possible markers of elder mistreatment (EM) over several domains (general assessment (4 items), neglect assessment, usual lifestyle, social assessment, medical assessment, emotional and/or psychological neglect, and a summary assessment) [72], or a 51-item version, the EAI-R, which includes four summary assessment items: evidence of abuse, evidence of neglect, evidence of psychological abuse, and evidence of financial abuse [76]. These are rated on a four-point Likert scale: no evidence, probably no evidence, probably evidence, and evidence.

4.4 Elder Abuse Questionnaire (EAQ)

The Elder Abuse Questionnaire (EAQ) is a 25-item single factor questionnaire [92]. The questionnaire is completed by long-term care providers or family members, using Likert scale response options. Only moderate reliability has been established with a Cronbach’s alpha of 0.66 for internal consistency, although this involved a very small sample of 29 long-term care providers and family members in rural Kansas. A larger sample may provide greater reliability. The content validity was verified by experts. A limitation of the scale is that it elicits carers’ reports only, and not the subjective experience of the elders, thus the accuracy of abuse reports is questionable.
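Because Cronbach's alpha is the reliability statistic reported for most measures in this review, a minimal sketch of its computation may help readers interpret the coefficients. The function name and the ratings in the usage note are hypothetical and are not taken from any study cited here; the calculation assumes the standard formula, alpha = k/(k - 1) x (1 - sum of item variances / variance of total scores), with sample variances.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Estimate Cronbach's alpha.

    responses: list of respondents, each a list of numeric item scores
    (same number of items for every respondent). Illustrative helper only.
    """
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    if any(len(row) != n_items for row in responses):
        raise ValueError("every respondent must answer every item")

    # Sample variance of each item (column) across respondents.
    item_variances = [variance(column) for column in zip(*responses)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")

    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)
```

For example, `cronbach_alpha([[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6]])` describes four hypothetical respondents whose three item scores rise in lockstep, the limiting case in which the items are perfectly consistent. With very small samples, such as the 29 respondents in the EAQ study, the estimate is imprecise, which is one reason a larger sample is desirable.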

4.5 Elder Abuse Suspicion Index (EASI)

The Elder Abuse Suspicion Index (EASI) was developed over 2002–2003 from literature searches, existing scales and taxonomies for elder abuse, and drew on the World Health Organisation (WHO) definition of elder abuse and family violence [172]. It comprises 5 interview questions for clinicians to ask of patients, and 1 item for completion by the clinician in relation to observed indicators of abuse. Compared with a gold standard elder abuse evaluation undertaken by trained social workers, sensitivity was moderate at 0.47, and specificity was 0.75. This validation was undertaken with a sample of 663 patients recruited by physicians at two Montreal family medicine centers and a government community-based health and social services center. A key advantage of the instrument is that it is very short and quick to administer, taking <2 min. It is available in French as well as English, and has been rated as having content validity in at least seven diverse countries [169]. There is also a self-administrable version for patients, the EASI-sa [171].
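Sensitivity and specificity figures such as these are derived by cross-classifying each person's screen result against the reference ("gold standard") evaluation. The sketch below illustrates the arithmetic only; the function name and the six example cases in the usage note are hypothetical and do not reproduce data from any validation study.

```python
def sensitivity_specificity(screen_positive, reference_positive):
    """Return (sensitivity, specificity) for paired boolean results.

    screen_positive[i]    -- True if the screening tool flagged person i
    reference_positive[i] -- True if the reference evaluation found abuse
    Illustrative helper only.
    """
    if len(screen_positive) != len(reference_positive):
        raise ValueError("both lists must describe the same people")

    tp = fn = tn = fp = 0
    for screened, actual in zip(screen_positive, reference_positive):
        if actual and screened:
            tp += 1  # abuse present and flagged
        elif actual and not screened:
            fn += 1  # abuse present but missed (false negative)
        elif not actual and screened:
            fp += 1  # no abuse but flagged (false positive)
        else:
            tn += 1  # no abuse and not flagged

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one abused and one non-abused case")

    sensitivity = tp / (tp + fn)  # share of true cases the screen detects
    specificity = tn / (tn + fp)  # share of non-cases the screen clears
    return sensitivity, specificity
```

Calling `sensitivity_specificity([True, True, False, False, True, False], [True, True, True, False, False, False])` gives two thirds for both values: two of the three hypothetical abused cases are flagged, and two of the three non-abused cases are cleared. A sensitivity below 0.5, as reported for the EASI, means that more than half of the cases identified by the reference evaluation were not flagged by the screen.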

4.6 Detection Scales for the Risk of Domestic Abuse and Self-Negligent Behavior in Elderly Persons (EDMA)

The Detection Scales for the Risk of Domestic Abuse and Self-Negligent Behavior in Elderly Persons (EDMA) were developed in Spain by Touza et al. [161] for use by social services professionals to screen for elder abuse and self-neglect among social services clients. The EDMA comprises 2 separate scales: the Elder Scale (33 items), and Alleged Abuser Scale (21 items). The Elder Scale has 3 validated subscales: Abandonment, Neglect, and Self-Neglect (13 items); Domestic Abuse without Self-Neglect (17 items); Self-Neglect (3 items). The Alleged Abuser Scale has 3 validated subscales: Inflicted Inappropriate Treatment or Abuse (9 items); Restrictive Behaviors (6 items); and Inability to Offer Proper Treatment (6 items). All items are rated by the professional on a 5-point Likert scale. Good to excellent psychometric properties were demonstrated in this Spanish sample of 46 professionals who rated 278 elders. Alphas for total scale and subscale scores were high, ranging from 0.93 for Total Elder Scale to 0.73 for Self-Neglect. The Total Elder Scale discriminated abuse and neglect well, with good sensitivity and specificity (93 % and 88 %, respectively). An advantage of the scales is that they assess both the elder and caregiver risk indicators. However, the results are based on ratings made by a relatively small sample of social service professionals who had prior knowledge of the older people and their caregivers. The method of administration would not work so effectively for larger groups who were less well known.

4.7 Elders’ Psychological Abuse Scale (EPAS)

The Elders’ Psychological Abuse Scale (EPAS), originally called the Psychological Elder Abuse Scale (PEAS), is a 32 item scale with yes/no response format [164, 165]. The scale was developed in Taiwan from focus groups and expert input to ensure it was consistent with community understandings. It was originally constructed in Chinese and translated into English. Screening takes place through a structured interview by clinicians of the elders in their home or institution (7 items), active observation (6 items), and questions to the elder person’s caregiver (18 items). Positive responses are summed, with higher scores indicating higher risk of psychological abuse, and a cutoff of 10 was set to provide abuse versus non-abuse groups. The scale has demonstrated good reliability and validity in a sample of 195 elderly Taiwanese aged 60 years and older who lived in institutions (49 %) and private homes (51 %). Agreement between two raters was 100 % for 7 indicators, and 79 % for 1 indicator, and internal consistency reliability was 0.82. Construct validity was demonstrated by inverse relationships with SPMSQ [131], and Barthel’s Index of physical functioning [39].
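The summed-score rule described for the EPAS can be sketched as follows. This is an illustration under stated assumptions rather than the authors' published scoring procedure: the function name is hypothetical, and because the source does not say whether the cutoff of 10 is inclusive, the sketch assumes that a score at or above the cutoff places the respondent in the abuse group.

```python
def epas_screen(responses, cutoff=10):
    """Sum yes/no responses and compare the total with a cutoff.

    responses -- iterable of booleans, True for each positive (yes) item
    cutoff    -- threshold for the abuse group; 10 follows the text above,
                 and treating it as inclusive (>=) is an assumption
    Returns (score, flagged). Illustrative helper only.
    """
    score = sum(1 for answer in responses if answer)
    return score, score >= cutoff
```

For instance, a hypothetical respondent with 10 positive items out of 32 would be returned as `(10, True)`, while one with 9 positive items would be returned as `(9, False)`.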

4.8 Geriatric Mistreatment Scale (GMS)

The Geriatric Mistreatment Scale (GMS) was developed and validated as a Spanish language scale by Giraldo-Rodríguez and Rosas-Carrasco [79]. Items were developed from the literature, prior scales, a review of complaints made, and an expert consensus panel, and refined through systematic psychometric evaluation. This resulted in a 22-item scale with subscales of physical, psychological, neglect, economic mistreatment and sexual abuse, with yes/no responses to each item. A positive response to any item was considered evidence of mistreatment. The scale was designed to be administered through interviews with elders in their home by trained professionals who had previous experience in elder abuse. The scale has demonstrated good psychometrics with a probabilistic sample of 613 older Mexicans aged 60 years and older, with a mean age of 72, of whom 60 % were female. Internal consistency reliabilities were 0.83 for the 22-item GMS, and 0.82, 0.72, 0.55, 0.80, and 0.87 respectively for the psychological mistreatment, physical mistreatment, economic mistreatment, neglect, and sexual abuse subscales. Content validity was established by expert ratings. Construct validity was founded on demonstrating hypothesised relationships between elder abuse and depression, needing help with daily tasks, and cognitive impairment. Overall, the scale was developed using sound psychometrics. Advantages include having Spanish and English versions, although it needs to be noted that the validation tests have been conducted on the Spanish version only. Less than acceptable reliability was found for the economic mistreatment subscale, and no predictive validity is currently available.

4.9 Hwalek-Sengstock Elder Abuse Screening Test (H-S/EAST)

The Hwalek-Sengstock Elder Abuse Screening Test (H-S/EAST) is a 15-item questionnaire that measures three forms of elder abuse: violations of personal rights or direct abuse, characteristics of vulnerability, and potentially abusive situations [81, 123]. It was developed from a pool of over 100 items sourced from various elder abuse screening instruments and refined to the best 15 items [81]. It is designed to be administered by interview of the elder person by health care providers in clinical settings, and by review of case notes. Internal consistency reliability of the scale is poor, with a Cronbach’s alpha of 0.29. Nevertheless, discriminant function analysis correctly classified 74 % of abused cases in one validity study [123] and yielded 9- and 6-item versions that best discriminated between abused and non-abused elders. The scale relies on elder recall and willingness to report abuse, and is limited by rating only three types of abuse. There has been debate about whether the scale is uni- or multi-dimensional (e.g., [84]), and a high false negative rate has been reported [112]. Despite relatively low reliability, the scale has face validity and has been used in a number of studies and with different populations, either in whole, or in part [63, 76, 84, 112, 149, 154]. It has been found to discriminate abused and non-abused elders in a sample of 100 older people living in public housing [112], with evidence that a 9-item form was as valid as the 15-item form.

4.10 Indicators of Abuse (IOA)

The Indicators of Abuse (IOA) screen was validated as a 29-item checklist of abuse indicators related to both care-giver and care recipient problems [139]. It was developed from theory and research (e.g., [91]). The validated scale comprises 27 problem indicators and 2 demographic indicators, rated on a 5-point Likert scale from non-existent to yes/severe. Of the 27 problem indicators, 12 relate to the care-giver, and 15 to the care recipient. Risk of abuse is determined if any problem indicator is present. The IOA is administered by trained professionals via an interview of both the elder and their care-giver in the context of a lengthy global home assessment.

Psychometric evaluation was undertaken with a cohort sample of 341 adults aged 55 and older who were currently receiving health care and social service interventions as part of the PROJECT CARE Initiative in Canada. The evaluation found excellent internal consistency reliability of 0.92 and 0.91 in 2 studies; and the scale was found to have divergent and concurrent validity, discriminating likely abuse from likely non-abuse in 96 % of cases based on the referent BASE tool, with 16 % false negatives of abuse cases. The IOA discriminated 89 % of confirmed abuse versus non-abused cases, with 22 % false negatives. Construct validation identified 3 problem domains: caregiver intrapersonal issues (e.g., mental health, behavioral); caregiver interpersonal problems (e.g., relationships generally and with care receiver); and care receiver social support shortages and past abuse. A limitation of the scale as a screening instrument is that, although it takes the health professional only about 20 min to complete, it is part of a very long overall assessment of 2–3 h which helps to inform the clinician ratings of abuse. It is not clear how valid it is on its own.

4.11 Expanded-Indicators of Abuse (E-IOA)

The Expanded-Indicators of Abuse (E-IOA) screen is closely based on the IOA, but indicators are broken down into more specific indicators [26]. It comprises 46 indicators related to the care recipient and 44 indicators for care-givers. Screening occurs through semi-structured interviews of elders and carers by trained geriatric social workers, as part of an overall assessment that takes approximately 2 h, and ratings are completed by the professionals. The validation sample comprised 108 older hospital patients aged 65 and older in Israel, and their care-givers. Indicators cover a comprehensive range of behavior problems, social isolation, emotional/cognitive issues, developmental delay, alcohol abuse, drug abuse, financial dependence, unrealistic expectations, caregiver reluctance, marital/family conflicts, current relationships, blaming, emotional dependency, social support, and lacking understanding.

Psychometric evaluations report good reliability with an inter-rater agreement of 93 %, and inter-item reliabilities ranging from 0.78 to 0.91. Face and content validity were found acceptable by experts. The E-IOA was found to discriminate between confirmed abuse/non-abuse groups, and correctly classified 92 % of probably abused and 98 % of probably not abused, with low false positive and false negative rates. A cut-off score of 2.70 was deemed to indicate high risk of abuse. Confirmatory factor analysis largely confirmed the hypothesized factor structure, and the main indicators for risk included behavioral, emotional and family problems of both caregiver and elder.

4.12 Minimum Dataset Home Care Interview (MDS-HC)

The Minimum Dataset Home Care interview (MDS-HC) is a multidimensional in-home geriatric assessment tool designed as a community analog to the nationally mandated MDS for nursing homes [117]. It involves home-based interviews of elders by trained clinicians, and use of multiple sources of information to rate each item (client, care-givers, observation, medical records). The original validation study reported good 7-day reliability for elder abuse items of 0.79 [117]. The five brief elder abuse items related to fear of a family member, poor hygiene, unexplained injury, observed neglect or mistreatment, and signs of physical restraint; and assessment was enhanced by consideration of other indicators such as physical and cognitive functioning, environmental and caregiver assessment, and medical records.

The 5-item elder abuse screen was further validated by Shugarman, Fries, Wolf, and Morris [152]. Construct validity was supported by demonstrated associations with social functioning and support, memory problems, mental health, and behavioral problems. This was based on a study of 701 older people aged 60+ seeking long-term care services in Michigan, with at least one informal carer. A strength of the study is that the elder abuse assessment is embedded in a comprehensive geriatric assessment which allows for multiple sources of data to inform ratings; however, limited information is available about the reliability and validity of items, and there is a possible confounding of abuse and neglect. The MDS-HC has also been used in a large scale multisite, multinational European study of over 4000 people aged 65+ receiving health or social community services in 11 European countries [34]. It is worth noting that substantial variations in prevalence rates are reported across countries, as other cross-cultural studies have found, although it is difficult to determine to what extent this may be due to methodological issues such as question translation, sampling, interviewer training and judgement, or more purely cultural issues.

4.13 Older Adult Financial Exploitation Measure (OAFEM)

The Older Adult Financial Exploitation Measure (OAFEM) is the only measure identified that is specific to financial abuse [33, 32], although other scales tend to include one or more items which are scored as part of general elder abuse. The scale was developed in agreement with the Illinois Department on Aging and involved several phases. The first phase involved reviewing theories of financial exploitation from the literature, generating a large pool of items, and convening expert panels to develop theoretically valid items. Then a panel of 16 experts rated the severity of items generated for the concept mapping, resulting in 79 items in 6 domains in descending order of severity (number of items in parentheses): (1) theft and scams (22), (2) financial victimization (16), (3) financial entitlement (4), (4) coercion (13), (5) signs of possible financial exploitation (19), and (6) money management difficulties (5). The scale was developed with both client and staff versions [33], and gained consensus content validation of items and key constructs by experts.

The next phase involved evaluating the scale in a study of APS clients. The OAFEM was administered via interview by 22 APS staff to 227 clients who were substantiated for at least one type of elder abuse [32]. The APS staff completed a staff observation questionnaire on each of the clients they interviewed. The scale was found to meet stringent Rasch analysis criteria for item fit and uni-dimensionality, and showed very high internal consistency and item reliability for the 79-, 54- and 30-item versions. For instance, the Rasch person reliability ranged from 0.92 (alpha = 0.96) for 79 items, to 0.85 (alpha = 0.93) for 30 items. Similarly, Rasch item reliability was 0.95 for 79 items and 0.96 for 30 items. The analysis supported a conceptualisation of the scale as a single hierarchy of severity levels rather than distinct subdimensions, and the authors suggest that the 30-item form is the most promising and parsimonious as a screening measure, although there are specific items in the longer version that may be useful for certain purposes.

Strengths of the scale include a strong theoretical base and use of expert wisdom, modern measurement techniques, well-targeted clients, expert interviewers, and a relatively large database of substantiated clients of elder abuse to test validity and suggest cut-points for judgments of severity. A limitation is that the validation sample is limited to a small geographic area of Illinois, USA, and Conrad suggests that the groups and cut-points will require further replication and validation against external criteria [32]. A sensitivity–specificity analysis can be applied once a suitable cut-point is more reliably established. A 17-item modified version was used by Dong [42, 43].

4.14 Older Adult Psychological Abuse Measure (OAPAM)

The Older Adult Psychological Abuse Measure (OAPAM) has 31- and 18-item forms [31]. It is a unidimensional scale and includes 4 types of psychological abuse: isolation, threats and intimidation, insensitivity and disrespect, and shaming and blaming, as well as risk factors. It was developed in agreement with the Illinois Department on Aging from a literature review and expert panels. The OAPAM is designed to be administered via interview by professionals. Validation involved data from 226 clients with adequate cognitive capacity who were interviewed in their homes by staff of the Illinois Department on Aging. A high internal consistency reliability of 0.92 was reported for the scale and it met stringent Rasch analysis fit and unidimensionality criteria. Analysis suggested a cut-off score for abuse. This is one of the few scales identified which used Rasch analysis scale validation methods, but further validation is needed, and it has been validated in one area of Chicago only.

4.15 Resident-to-Resident Elder Mistreatment: Staff Version (R-REM-S)

The Resident-to-Resident Elder Mistreatment: Staff Version (R-REM-S) is an 11-item scale measuring staff recollection of resident-to-resident mistreatment over the past 2 weeks [158]. The items were modified from the Cohen-Mansfield Agitation Inventory (CMAI; [27]), plus items derived from staff focus groups. The CMAI is a tool that seeks to measure behavioral disruption in nursing home residents. The validation sample was based on ratings by certified nursing assistants on 1812 aged residents of 5 large long-term aged care facilities in New York. Psychometrics on the scale suggest that it has moderate reliability and validity. The 11-item scale had an internal consistency of α = 0.74. There was some support for verbal and physical subscales (4 items each) with Cronbach’s α = 0.73 and 0.65 respectively. Limitations to the robustness of the scale include the low prevalence of some items, which limits reliability estimates, and skewed distributions, with greater reliability at the “caseness” end of the distribution. However, it provides a third-party assessment of potential abuse in long-term care facilities, where self-report may not be possible.

4.16 Social Vulnerability Scale (SVS)

Social Vulnerability Scale (SVS22). The Social Vulnerability Scale (SVS22) is a 22 item scale that measures gullible behaviors and credulity (cognitive behaviors) which are thought to contribute to exploitation of older people [135]. Gullibility items relate largely to financial exploitation while the credulity items relate to social vulnerability more generally. Because it seeks to assess vulnerability among cognitively impaired elders, knowledgeable informants complete the written questionnaire rather than self-completion by the elder themselves. Responses are provided on a 5-point Likert scale that measures frequency of behavior, with higher scores indicating greater vulnerability.

Sound psychometrics have been reported in terms of reliability, face and content validity. The scale has demonstrated excellent reliability among a small sample of informants, in this case, 167 university undergraduate students who completed the scale in relation to a relative/friend aged 50 years or older, with memory, stroke, dementia or other neurological condition (clinical sample) or a healthy control. The internal consistency reliability of the scale was α = 0.92, and test–retest reliability was 0.87. The scale was also found to distinguish between clinical and non-clinical samples, indicating good discriminant validity. A strength of the scale is its suitability for assessing vulnerability among the cognitively impaired, a high risk group for elder abuse. Limitations include no assessment of the subjective experience of the elders, potential bias of informants, a limited validation sample and few validation tests. Further validation is needed with more diverse and larger samples. Despite these limitations, the SVS22 seems useful as a measure of susceptibility to financial exploitation in particular, and social vulnerability more generally among this population that poses measurement challenges.

Social Vulnerability Scale (SVS15). The Social Vulnerability Scale has been further refined to a 15-item short form (SVS15) [134]. As for the SVS22, the SVS15 measures informant report of older adults’ vulnerability to exploitation. Validation tests support both the unidimensional scale and a two factor solution: gullibility (8 items), and credulity (7 items), making it a relatively brief and easily administered scale. Validation was undertaken with a sample of 266 informants, 116 in the clinical group and 150 in the non-clinical group. Reliability of the total scale and two subscales was good, ranging from 0.90 for the total scale to 0.85 and 0.86 for the gullibility and credulity subscales respectively. The scale was also found to discriminate between clinical and non-clinical older adults on social vulnerability. A confirmatory factor analysis confirmed the two factor solution with Cronbach alphas ranging from 0.79 to 0.88. Further validation is needed in terms of convergent, discriminant and predictive validity, along with larger samples.

4.17 Summary of Elder Abuse Screening Measures for Use in Healthcare Settings

The measures used to screen for elder abuse and neglect in healthcare and institutional settings range from general measures of abuse and conflict in relationships, such as the CTS and its derivatives, to increasingly specific and differentiated measures of different forms of elder abuse and neglect. This trend has followed research and theory about different forms of abuse, especially in relation to psychological, emotional, and coercive abuse. This development in the understanding of forms of abuse also provides a challenge in terms of yielding measures that are brief enough to be useful in busy clinical settings, but specific enough to provide a reliable risk assessment. Further work is needed to meet this challenge and to improve the state of research on the validity of different screening instruments and types of abuse and neglect. In particular, little progress has been made in validating measures of sexual abuse, financial exploitation, or neglect by caregivers. As well, the complexity of caring relationship difficulties needs to be better captured. One aspect of elder abuse screening that has received some attention in recent years is the screening for elder neglect, and approaches to screening for both care-giver neglect and self-neglect are reviewed in the following section.

5 Screening for Elder Neglect and Self-Neglect

The construct of elder neglect is complex and challenging to operationalise for the purposes of screening older people. It includes two distinct categories: care-giver neglect and self-neglect. Care-giver neglect has been defined as “the failure of an elderly person to receive essential services from a responsible caregiver” [2]. By contrast, elder self-neglect has been defined as the behaviors of an elderly person that threaten his/her own health or safety, and can involve an older person’s refusal or failure to provide himself or herself with adequate food, water, clothing, shelter, safety, personal hygiene, and medication (when indicated) [120]. Self-neglect is often considered in the context of elder abuse as a form of self-abuse, although, in many respects, it is conceptually quite distinct from usual definitions of interpersonal abuse, which imply abuse by another.

It needs to be noted that much has been written about assessment and substantiation of elder neglect but there are very few studies that report on the psychometric evaluation of screening instruments. A summary of seven screening measures for elder neglect and self-neglect that are at least partially validated is shown in Table 9.3. Unlike the measurement of other forms of elder abuse, the measurement of care-giver neglect and self-neglect largely involves measuring the absence of appropriate caring behaviors rather than the commission of specific acts of abuse. This review revealed only one study on the development of a specific screening measure for care-giver neglect and a small number of validated scales measuring self-neglect in the time period from 1995 to 2015, perhaps reflecting this challenge. Yet, the development of valid and reliable measures of neglect is important since neglect is thought to account for a majority of elder mistreatment cases, both reported and not reported [72]. For instance, the National Elder Abuse Incidence Study reported that 49 % of 70,942 elder mistreatment cases substantiated by Adult Protective Services (APS) were categorized as neglect [120].

Table 9.3 Validated measures for screening for elder neglect

The measurement of neglect is often embedded within more comprehensive or generic elder abuse scales and these have been reported in Tables 9.1 and 9.2. Table 9.3 presents scales that were designed specifically to measure care-giver neglect (n = 1) and self-neglect of the elderly (n = 6). Measurement of elder neglect draws heavily on two primary theoretical constructs related to risk or vulnerability factors rather than specific acts of neglect. One is the construct of unmet needs, as measured in scales of Activities of Daily Living (ADL) and Instrumental Activities of Daily Living (IADL). Another is the construct of self-care agency or self-care skills, as the hypothesised inverse of self-neglect. A third influence on the measurement of self-neglect is the importance of environmental assessment involving indirect and observational measures of the level of self-care and care of the home environment. Only one study was identified in the 1995–2015 period which reported on the reliability of a caregiver neglect screening scale [43, 46]. By contrast, a growing number of studies have sought to validate measures of self-neglect.

5.1 Caregiver Neglect

The Caregiver Neglect scale was examined as a measure of unmet needs in the PINE Study of Chinese elders in Chicago [43]. The 20-item scale was modified from the Activities of Daily Living (ADL) scale (8 items) and the Instrumental Activities of Daily Living (IADL) scale (12 items) [86, 99]. Internal consistency reliability was 0.93 for the 20-item scale (personal communication, [21]), 0.92 for the eight ADL items, and 0.90 for the 12 IADL items [46]. The scale is administered by trained multicultural and multilingual interviewers in the elder’s home. Severity of impairment and unmet need is rated on a 4-point scale from none to severe. There are limited psychometrics reported, although ADL and IADL have been used and validated by others as a surrogate measure of neglect (e.g. [11, 133]).

5.2 Elder Self-Neglect (Unnamed Scale)

The 21 item Elder Self-Neglect Scale was developed for the Chicago Health and Aging Project, a community study of elders who had been reported to social service agencies for suspected elder self-neglect [43, 49, 56, 54, 58, 59, 51, 52, 57, 53]. Conceptually it is based on identifying unmet needs for care and addresses five domains of self-neglect: hoarding (4 items), personal hygiene (4 items), unsanitary conditions (5 items), house disrepair (5 items), and inadequate utilities (3 items). The measure is completed by trained interviewers, mainly by observation, during in-home assessments. Each item is rated on a 4-point scale with a higher score indicating more severe self-neglect. The scale has demonstrated good to excellent reliability and some construct validity. Internal consistency reliability of 5 domains was 0.97–0.98, and inter-rater reliability >0.70. Content validity was supported by an expert panel review. Construct validity was supported by significant, though low, correlations with cognitive impairment (0.10), physical function impairment (0.05), and depressive symptoms (0.12). Predictive validity has also been supported by prediction of mortality [56], and hospital re-admissions [49].

5.3 Elder Self-Neglect Assessment (ESNA)

The Elder Self-Neglect Assessment (ESNA) comprises a 77- or 62-item assessment developed as part of the Illinois Comprehensive Care Assessment questionnaire for assessing elders for home and community-based services [82]. The items were developed through a concept mapping framework involving 50 professionals who developed, sorted and rated a list of self-neglect indicators for “likeness” and for their importance to the concept of self-neglect. This resulted in seven conceptual areas: personal endangerment, environmental, financial, mental health, personal living conditions, physical health, and social network and culture. These seven areas were grouped into two broad conceptual areas: environmental aspects, and physical and psychosocial aspects of self-neglect. Professionals completed the assessment through observation in the home on a 5-point response scale. The ESNA was found to meet Rasch fit criteria with good internal consistency, item reliability, and construct validity. A 25-item shorter version was also found to meet Rasch fit criteria. The ESNA is designed for assessment of those suspected of self-neglect, and the validation sample had limited diversity. Its strengths lie in the Rasch analysis of items, the involvement of 13 agencies in the validation phase, and the finding that both environmental and behavioural aspects are important to assess. The findings also supported a hierarchy of items associated with severity of abuse by frequency of occurrence, aspects of the validation process that other studies should pay attention to.

5.4 Self-Care Agency Scale (ASA)

The Self-Care Agency Scale (ASA) is a 24 item scale derived from the theory of self-agency which seeks to explain the link between self-care needs and action [3, 83]. Self-care agency is understood to comprise both power components and self-care actions. The 24 items are made up of 15 positive-worded items, and 9 negatively worded items, all rated on a 5-point Likert scale. Responses are summed for a total score, with higher scores reflecting higher self-agency. The scale is administered by interview and has demonstrated internal consistency reliabilities ranging from 0.71 to 0.86 for different patient samples [3, 83], with a reliability of 0.72 on one elderly Dutch sample [66]. The scale discriminates between elderly patients receiving institutional rehabilitation care versus those living at home [105], and was associated with nutritional self-care, providing some support for concurrent validity.

5.5 Screening Scale for Elder Abuse: Self-Neglect Subscale (SSEA-SN)

The Screening Scale for Elder Abuse (SSEA) was used in a population based survey in South Korea (Kim, Kwon, Lim, and Lee [89]). The full scale is not reported in this review as it had not been published in English. However, the 5-item Self-Neglect subscale is published in English and its inclusion here adds to the global picture of screening instrument development [100]. The scale was administered in home-based interviews by trained interviewers. It was first validated with a sample of 481 Korean professionals who worked in social service agencies, then in a random stratified sample of 1023 adults in a metropolitan city of South Korea. The scale was found to have an internal consistency reliability of 0.76 in the population based sample and some concurrent validity was demonstrated through correlations with ADL and IADL scales [100]. The scale is available in Korean and English.

5.6 Self-Neglect Severity Scale (SSS)

The Self-Neglect Severity Scale (SSS) was developed by the Consortium for Research in Elder Self-Neglect of Texas (CREST) [87]. The item pool was developed through structured interviews with APS specialists and then reviewed and refined by an expert panel. It was further refined to comprise 30 items in three domains: Personal Hygiene (5 items), Impaired Function (6 items), and Environmental Status (19 items) [87]. The scale also includes social history questions and a physical examination, a global risk assessment item, and several summation scores. It is completed by observational assessment in the elder’s home. Response options and scoring methods differ across items, with some items using a 4-point response scale and others being dichotomous. Inter-rater reliability has been found to be high, and internal consistency adequate for the environment and composite scores. Validation of the scale is at an early stage, with data provided from a field test on 23 older adults. The SSS has been found to discriminate between self-neglect and non-self-neglect groups on the three domains and composite scores, but the clinical significance of these differences has not been established, nor has the validity of the composite scores. The sensitivity and specificity of the scale were reported as poor. Like many neglect measures, it does not screen for nutrition issues.

5.7 Neglect Scale (Unnamed)

Ayalon [11] reports a validation study of a 7-item measure of unmet need for services, with items developed from the ADL and IADL literature, expert panels and pilot testing [64]. The scale was designed to be conducted via interview in the elder’s home, though some elect to self-complete a written questionnaire. The scale was first validated in a study of over 1000 community-dwelling older adults taking part in an Israeli National Survey of Elder Mistreatment, and then in a convenience sample of 148 matched older adults in daycare centres, family members, and home care workers. The internal consistency reliabilities ranged from 0.57 to 0.93 for different samples. Concurrent validity was based on associations with the older adults’ reports of loneliness, feelings of safety, and overall sense of neglect, as well as with lower financial status, and lower satisfaction with the caring relationship [11]. A 5-item version of the scale was determined to be a more adequate measure of neglect, as it demonstrated better configural, metric and scalar invariance across the three groups of informants.
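Internal consistency figures such as those reported above are most often summarised with Cronbach's alpha, although the studies reviewed do not always name the coefficient used. The following is a minimal sketch, assuming alpha is the statistic intended; the function name and the toy responses are illustrative only and are not taken from any of the validation studies.

```python
from statistics import variance
from typing import List, Sequence


def cronbach_alpha(responses: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a respondents-by-items score matrix.

    responses[i][j] is respondent i's score on item j.
    Uses sample variances (n - 1 denominator).
    """
    n_items = len(responses[0])
    if n_items < 2:
        raise ValueError("alpha needs at least two items")
    if any(len(row) != n_items for row in responses):
        raise ValueError("every respondent must answer every item")

    # Variance of each item across respondents.
    item_variances = [variance(column) for column in zip(*responses)]
    # Variance of the summed scale score across respondents.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")

    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Invented toy data: three respondents, two perfectly consistent items.
toy: List[List[int]] = [[1, 1], [2, 2], [3, 3]]
print(round(cronbach_alpha(toy), 2))
```

Because alpha depends on the sample as well as on the items, the same scale can yield quite different values across informant groups, which is consistent with the range reported across samples above.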

5.8 Summary of Screening Measures for Elder Care-Giver Neglect and Self-Neglect

Screening for elder neglect and self-neglect requires further conceptual development. Only one study evaluated a screening tool for care-giver neglect, using ADL and IADL assessment of needs as the proxy measure [42, 46]. The six tools reviewed here for screening for elder self-neglect highlight the complexity and diversity of approaches in this area. There seems to be a growing consensus that effective screening needs to incorporate both behavioural/self-care skills as well as environmental assessments, although it could be argued that much of this investigation is perhaps more oriented towards substantiation of self-neglect rather than development of a simple screening measure. These two purposes could be better differentiated in future research, as there is room for both, given the potential expense of the environmental assessment aspects.

6 Development of Screening Tools for Elder Abuse and Neglect: Methodological Issues

A number of key methodological issues arise from the research reviewed and these are discussed in the following section to indicate some of the complex challenges associated with screening for elder abuse and neglect.

6.1 Conceptual Issues

Many have noted the complexity of the phenomenon of elder abuse (e.g., [16, 34, 41, 70]), which is generally understood to comprise one or more aspects of physical, psychological, emotional, verbal, financial, and sexual abuse, as well as neglect and (sometimes) self-neglect. Measurement of complex latent constructs such as elder abuse requires a formative measurement approach that seeks to identify indicators of abuse or vulnerability that are known to predict the phenomenon of interest [96]. Early measures tended to assess only limited aspects of these domains, and increasingly there has been recognition of the need for more differentiated measures. For instance, the original development of the CTS focused on physical assault and psychological aggression [155], whereas later developments have added in dimensions such as sexual coercion and injury in the CTS2 [156]. Others have expanded the forms of abuse to include financial exploitation [33, 32, 159], neglect [42, 43, 46] and self-neglect [56, 51, 52]. However, greater differentiation challenges the need for brevity and ease of completion in an effective screening instrument.

General measures of abuse include the EAQ [92], the EASI [172], and the IOA [139], but the constructs included vary and the relative weight of different constructs can influence any research that relies on such global measures of abuse. Measures differ in terms of whether they include one item per type of abuse or multiple items. The reliability and validity of measures for specific types of abuse have been at least partially demonstrated for the following: physical assault (e.g., Giraldo-Rodriguez and Rosas-Carrasco [79, 155, 156]), psychological abuse (e.g., [31, 79, 155, 156, 164, 165]), sexual abuse (e.g., [79, 156]), financial abuse (e.g., [33, 32, 79]), neglect (e.g., [42, 79]), and self-neglect (e.g., [56, 51, 52, 57, 53, 82]). There are some conflicting findings in terms of whether scales such as the H-S/EAST [123] are unidimensional or multidimensional (e.g., [84]).

There is ample evidence of conceptual fuzziness in many attempts to measure abuse. For instance, neglect and self-neglect are often assessed using ADL and IADL tools, which represent a very indirect way of assessing neglect. There may be many reasons why an older person’s functioning is impaired and their care needs not met that do not comfortably fit within the usual meaning of neglect [61]. Unfortunately, many studies use poorly defined measurement tools and provide no information about reliability and validity. Even those measures that have received some level of validation are frequently modified when taken up by other researchers, making it very difficult to build an evidence base for any particular measure [41].

Another aspect of conceptual fuzziness found throughout the research reviewed was the frequent conflating of the different purposes of an assessment instrument under the concept of screening for elder mistreatment. Some measures are designed to screen a large group of older people to identify those possibly at risk, who can then be referred for further evaluation. This purpose reflects the original definition of a screening measure [125, 128]. However, the term screening has also been applied to the purpose of investigating or substantiating abuse allegations, reflecting a more diagnostic evaluation. These are two different things, contributing to the very diverse approaches to measuring abuse.

Another conceptual issue is that the context of abusive behaviour is poorly understood and measured. For instance, most scales that assess caregiver characteristics seek to identify certain traits or behaviors associated with abuse, with little attention to contextualising the abuse. Family carers, for instance, may demonstrate no abusive behaviour over most of their caring experience, yet we know little about what specific circumstances or dyadic interactions may lead to abusive acts. Research on couple violence suggests that abusive behaviour can be interactive, reciprocal and retaliatory in specific circumstances, i.e., situational violence rather than an ongoing pattern of violence [88]. Future work needs to bring a greater understanding of relational dynamics, and context to better illuminate the phenomena. Such information can then inform more targeted interventions.

6.2 Item Generation

Most studies reviewed in this paper that set out to develop a new scale have used a methodologically sound approach to item generation, basing it on literature review, expert consensus, pilot testing and refinement through psychometric evaluation (e.g., [33, 32, 31, 79, 84, 123, 149, 164, 172]). However, a key issue affecting the reliability and validity of a measure is the number, breadth and appropriateness of items used to measure each construct. As elder abuse is a complex latent construct, reliable measurement requires more than a single item, and item generation needs to be guided by theory.

Much research has focused on the need for prevalence data, where the focus has been on generating a dichotomous variable of abuse: present or not. Thus, many studies, particularly in the medical literature, have relied on one item per type of abuse. This provides a very limited and potentially unreliable screen. Others have generated multiple items, and developed coherent, reliable conceptual factors or scales. While this initially leads to a large number of items, progressive psychometric evaluation can be used to develop robust longer and shorter forms of scales with documented reliability and validity (e.g., [32]).

The use of multiple items to assess each type of abuse also allows for an exploration of more nuanced concepts of severity and chronicity of abuse. Elder abuse may occur as a one-off incident or an ongoing pattern of abusive behaviour, in single or multiple domains, and measures need to distinguish between these important characteristics. Abusive acts can also vary in severity from minor to severe, and the context in which they occur can be an important component. Measures that define abuse on the basis of one incident, regardless of whether it is minor or severe, one-off or chronic, will have a large amount of variance in the data, and provide little guidance about what type of intervention may be required. As an example, the brief EASI measure has been criticised on face validity grounds, as the scoring method allows for a classification of abuse on the basis of any one item. This is problematic since four of the six items each reflect a different type of abuse, one item measures a clinician’s overall assessment of any sign, and one does not actually measure abuse, but rather functional impairment.

6.3 Response Options and Scoring

The construction of methodologically sound response options is a crucial aspect of scale construction (Schofield and Forrester-Knauss [145]). As already noted, questions on elder abuse often require a simple yes/no response, which provides limited and potentially inaccurate information: how does a person answer if they think neither option is accurate? The use of ordinal scaling methods allows for a more differentiated response, which can provide useful information about issues such as severity, chronicity and importance. Likert scales are a commonly used form of ordinal scaling in this type of research, but few studies are adequately informed by psychometric theory, although recent research using Rasch analyses represents an advance in this area [33, 32, 31, 82]. Such analysis may lead to more robust methods of establishing appropriate cut-points on scales for identifying clinically relevant levels of elder abuse, in conjunction with use of better external criteria.

The review indicates a lack of consensus in how to score instruments for elder abuse. Some studies define “caseness” by any positive response to a single item (e.g., [108]), whereas others assess it in terms of the number of positive items (e.g., >10/32 items; [164]), or the number of times abuse is experienced (e.g., 10 or more times; [16]). Some researchers select a cutpoint on an ordinal or interval scale depending on how conservative they want to be. For instance, Cooper et al. [38] defined caseness as a score of >2 on a 5-point scale (sometimes occurs) for any item. Rasch analysis has also been used to define possible cutpoints based on a combination of severity and frequency data [32, 31]. Such differences will clearly produce very different results for prevalence and risk factor estimates and make comparisons across studies impossible.
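The effect of the scoring rule on prevalence can be illustrated with a small sketch. The three rules below are loosely modelled on the approaches just described (any positive item, a count of positive items, and an ordinal cutpoint on any item); the item coding, thresholds, function names and the six invented respondents are assumptions made for illustration and do not reproduce any published instrument or dataset.

```python
from typing import Callable, List, Sequence

# Invented data: each row is one respondent's answers to four items,
# coded 1 (never) to 5 (always).
SAMPLE: List[List[int]] = [
    [1, 1, 1, 1],
    [2, 1, 1, 1],
    [2, 2, 2, 1],
    [3, 1, 1, 1],
    [1, 1, 1, 3],
    [5, 4, 3, 2],
]


def any_positive_item(items: Sequence[int]) -> bool:
    """Case if any item is endorsed at all (score above 'never')."""
    return any(score > 1 for score in items)


def more_than_n_positive(items: Sequence[int], n: int = 2) -> bool:
    """Case if more than n items are endorsed (a count-based rule)."""
    return sum(score > 1 for score in items) > n


def any_item_above_cutpoint(items: Sequence[int], cutpoint: int = 2) -> bool:
    """Case if any single item scores above an ordinal cutpoint."""
    return any(score > cutpoint for score in items)


def prevalence(sample: Sequence[Sequence[int]],
               rule: Callable[[Sequence[int]], bool]) -> float:
    """Proportion of respondents classified as cases under a given rule."""
    return sum(rule(row) for row in sample) / len(sample)


for rule in (any_positive_item, more_than_n_positive, any_item_above_cutpoint):
    print(f"{rule.__name__}: {prevalence(SAMPLE, rule):.2f}")
```

With these invented responses the same six people yield three different prevalence estimates, which is the comparability problem in miniature.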

Some researchers have evaluated the impact of using different combinations of elder abuse subtypes on prevalence rates. For instance, Dong [42] reports on variations to prevalence rates by applying criteria from least to most restrictive and demonstrates how this may account for some of the observed variation in prevalence rates across studies. Given the complex nature of elder abuse and the serious implications of identifying potential cases of abuse, this is an urgent challenge for the field. Most scales include both minor and severe forms of abuse, and research is needed to better encapsulate the severity and chronicity aspects of the phenomena in the scoring method.

6.4 Modes of Assessment

The most common modes of assessment are self-report questionnaires or interview by a health professional, either in the health setting or home environment. Self-report has the advantage of being more economical and better allowing for mass screening. It may also facilitate more honest answers when completed in private. Disadvantages include that it may be unsuitable for those who are cognitively impaired, do not have adequate language or reading ability, or who lack the time or motivation to complete. Another disadvantage of questionnaires administered to large samples such as national surveys is that there is often no strategy to follow up those who are identified as at risk of elder abuse.

Screening undertaken by health professionals has the advantage of optimising motivation, embedding the screen in a more comprehensive assessment, allowing for follow-up questions and elaborations if needed, and gathering more information about context. Compared with questionnaire screening, it is also more likely that the health professional can implement appropriate referral pathways or interventions for those identified as at risk. Disadvantages include the lack of time for many busy clinicians, the resource-intensive nature of this approach (especially for home interviews), potential lack of training and comfort in asking highly sensitive questions, potential bias in scoring and interpretation, and, in some cases, a lack of known referral and intervention options.

Observational screening tools have been promoted, particularly for the assessment of neglect and self-neglect, when the older person may be unable or unwilling to reliably report on these aspects. Observational measures are viewed as more objective in some ways, but once again, can be subject to unreliable conclusions. For instance, many have noted that poor nutrition is a key feature of self-neglect, but attempts to validate an observational measure by methods such as pantry or refrigerator inventories have been repeatedly found to be unreliable (e.g., [61]). The amount and quality of food identified may vary dramatically depending on when the last shop was conducted, or the older person may have already prepared food brought in by others on a regular basis. Different diets will also make it difficult to develop a reliable scoring system.

Modes of administration using innovative technology such as hand-held personal digital assistants (PDAs) for data collection are of emerging interest. PDAs have a number of advantages for use in research in healthcare settings, such as efficiency of collection and entry of data, ease of use, portability, quick access to databases and use of information for intervention, and lower cost for large-scale studies. However, it is not clear whether they are feasible and whether the data collected are comparable to those from more conventional methods. One research group has evaluated the use of PDAs in a multisite study of elder neglect in emergency department environments in the US [72, 80]. While thorough training of data collection staff led to competence in use of the devices to collect EAI data in the training environment, once collection began in busy emergency department settings, a number of difficulties were identified that made the devices difficult to use in situ, for both technical and relational reasons [80]. This may be particularly problematic in research on sensitive issues such as elder abuse and neglect, where gaining good quality data depends on maintaining a good interpersonal relationship with the older person being interviewed. Many staff found it difficult to enter data in the presence of the respondent, and many resorted to writing responses and entering them later. Other challenges encountered were problems with uploading data due to an unreliable server connection, equipment loss or failure, concerns about security, battery failure, and problems associated with obsolescence of both hardware and software over the course of a study. As technological innovations progress, some of these issues can be addressed. However, it is noted that technology will never be totally reliable, and a number of recommendations are made to ensure there are back-up methods for optimising data collection under all conditions [80].

6.5 Multidimensional Assessment

To address the challenges of conceptual confusion, and provide a more comprehensive and rigorous screen, there has been a move towards using a multi-dimensional approach to assessment. An early trend was to include assessments of both the care-recipient and care-provider (e.g. [10, 14, 138, 139, 155, 156]). More recently, Fulmer and colleagues investigated the value of dyadic vulnerability and risk profiling among older people and their carers to see whether the dyadic profile provided a more thorough means of identifying elder neglect [73]. Their model was derived from an earlier risk-and-vulnerability model applied to elder abuse [69]. This involved measuring various proxy indicators such as: unmet needs using six ADL and nine IADL items; personal resources; childhood trauma; personality; and the Caregiver Hassles Scale [90]. Interestingly, their findings supported the inclusion of risk factors related to both caregiver and elder vulnerability as a means of screening more comprehensively for elder neglect. However, a limitation of this model is that contextual and relational dynamics are not included. Others have developed a more complex package of assessment tools such as the Ohio Elder Abuse and Domestic Violence in Late Life Screening Tools and Referral Protocol [65], but psychometric validation did not meet the eligibility criteria for this review.

Another interesting approach has been to develop a more comprehensive risk index by aggregating across various measures [48]. Dong and Simon found that a 9-item vulnerability index demonstrated good accuracy for reported and confirmed elder abuse outcomes in a community sample. For instance, older adults with three or more risk factors were almost four times more likely to experience elder abuse than those with two or fewer risk factors. Those with five or more risk factors were 26 times more likely to experience elder abuse. The nine variables making up the index were: age, gender, race, income, medical conditions, cognitive function, physical function, depressive symptoms, and social network. To combine the variables into a risk index, cutpoints for each risk factor were examined for precision in detecting elder abuse, and the cutpoints with the strongest associations with elder abuse outcomes were applied in the combined index. Such studies address problems associated with using one risk indicator or one abusive behaviour as the basis for prevalence estimates and establishing ‘caseness’, and are potentially able to be applied in healthcare and statutory environments.
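The aggregation step described above amounts to counting how many of the nine dichotomised risk factors are present. The sketch below assumes each factor has already been flagged as present or absent using the study's own cutpoints, which are not reproduced here; the field names, the function names and the three reporting bands are illustrative assumptions rather than the published scoring scheme.

```python
from typing import Mapping

# The nine variables named in the text; the spelling of the keys is an assumption.
RISK_FACTORS = (
    "age",
    "gender",
    "race",
    "income",
    "medical_conditions",
    "cognitive_function",
    "physical_function",
    "depressive_symptoms",
    "social_network",
)


def risk_count(flags: Mapping[str, bool]) -> int:
    """Number of risk factors flagged as present for one person.

    Each flag is assumed to come from applying a cutpoint chosen elsewhere.
    """
    missing = [name for name in RISK_FACTORS if name not in flags]
    if missing:
        raise ValueError(f"missing risk factor flags: {missing}")
    return sum(bool(flags[name]) for name in RISK_FACTORS)


def risk_band(count: int) -> str:
    """Group a count into bands matching the comparisons quoted in the text."""
    if not 0 <= count <= len(RISK_FACTORS):
        raise ValueError("count must be between 0 and 9")
    if count >= 5:
        return "5 or more"
    if count >= 3:
        return "3 to 4"
    return "0 to 2"


# Invented example: a person flagged on income, physical function and social network.
person = {name: False for name in RISK_FACTORS}
person.update(income=True, physical_function=True, social_network=True)
print(risk_count(person), risk_band(risk_count(person)))
```

A simple count weights every factor equally, so an index of this kind conveys cumulative risk but not which factors matter most for a given person.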

Environmental assessments have long been considered an important aspect of a comprehensive geriatric assessment [117] and have been used in a variety of elder abuse measures, particularly of neglect. Home-based assessment also allows for a higher level of contextual information to inform judgments about the level of abuse or neglect. Greater context is also required for accurate interpretation of relational dynamics that may contribute to abusive behaviour. While there is recognition that multiple perspectives are useful, what is of most interest is the interplay between perspectives. What does it mean if the elder person and their carer have very different responses to a question?

6.6 Culture and Language

A limited number of studies suggest that scales translated into other languages using established methods of forward and backward translation have shown adequate reliability. For instance, translated versions of the CTS and CTS2 have been reported to perform reliably for Chinese populations [20, 46, 173, 174], and for Tamil speakers [23]. Scales such as the CASE have also been translated into Spanish and validated [130], while the H-S/EAST and VASS have been validated in Chinese-speaking populations [55, 170]. Others have used local expert input to develop or modify scales for specific cultural contexts, such as the development of the EDMA in Spain [161], the GMS in Mexico [79], the E-IOA in Israel [26], the PEAS/EPAS in Taiwan [164, 165], and a Korean language scale [127]. Nevertheless, there is a commonality across the content of major domains and items. Further work is needed to determine the effectiveness of translation and to better understand appropriate methods to measure elder abuse in different cultural contexts.

Few studies have administered screening measures in different cultural contexts or examined how they perform across cultures. The few multi-national studies identified in this review report different prevalence rates across countries (e.g., [40, 68, 108, 152]). The cause of these differences is not known, but could be partially explained by measurement error. Methodological issues could include translation problems, different meanings or values placed on certain behaviors or constructs in different cultures, different sampling strategies or modes of administration, differences in training and so on. However, there is some evidence that differences may also relate to community-level indicators of difference, as well as the individual characteristics of the elder or their caregiver. For instance, in the largest multi-national study identified, the ABUEL study [68, 111], a number of country and city level socioeconomic indicators helped to explain differences in elder abuse rates across seven European countries. More specifically, country-level indicators of economic inequality were significantly related to the prevalence of financial elder abuse, and the city mean tertiary education explained a significant amount of variation in the prevalence of psychological abuse among these elderly populations. This study provides valuable insight into the potential of additional community-based indicators to contribute to elder abuse, and provide a means of targeting higher risk groups for prevention, screening and intervention programs.

6.7 Cognitive Impairment

Most studies have not investigated the applicability of screening instruments for those with cognitive impairment, despite this being a risk factor for several forms of elder abuse and neglect [59]. This is partly due to the limited reliability of self-report data with this group, and potential bias arising from asking carers to report on their abusive behaviour towards the care-recipient. This is an area where alternative assessment methods need to be explored. One scale, available as the SVS22 and the SVS15 [134, 135], was designed specifically to assess vulnerability to exploitation among cognitively impaired older people. It has demonstrated good psychometrics for two subscales of gullibility and credulity, and discriminated well between clinical and non-clinical populations. However, it relies on care-givers as informants and does not gain the elders’ perspective. This is problematic for studying elder abuse by care-givers.

Another research group has explored the challenges of assessing cognitive capacity as an indicator of self-neglect. The Consortium for Research in Elder Self-Neglect of Texas (CREST) developed the COMP Screen to assess elders’ capacity to make decisions relevant to safe and independent living [119]. The researchers argued that conceptualisation of self-neglect needed to distinguish between three forms of lack of self-care: that resulting from disability, from an autonomous choice, or from a decline in decision-making capacity. They proposed that only this third group should be seen as suffering from self-neglect. Their COMP Screen was designed to test three cognitive domains (attention, expressive language, and delayed recall), and a fourth conceptual domain of awareness, which tests the ability to compare alternatives and articulate a choice between options. However, their findings failed to support this assessment as an adequate screening approach for self-neglect. One conclusion from this study is that a screening instrument for elder self-neglect needs to assess both the capacity to make decisions and the capacity to act on those decisions [119].

7 Effectiveness of Screening for Elder Abuse and Neglect

The USPSTF [163] criteria for the effectiveness of a screening test state that the test must be able to detect the target condition earlier than without screening and with sufficient accuracy to avoid producing large numbers of false-positive and false-negative results, and that the screening and follow-up treatment should improve health outcomes compared with treating patients when they present with signs or symptoms of the disease. Whether screening for elder abuse meets these criteria has been the subject of some debate [129]. Some authoritative health bodies have recommended routine screening of older patients (American College of Emergency Physicians [5]; American Medical Association [AMA] [6, 7]; National Gerontological Nursing Association [NGNA] [121]), while others suggest that research evidence fails to support this recommendation [126, 128, 166]. The most comprehensive systematic review of the effectiveness of screening for elder abuse and neglect has been undertaken by the US Preventive Services Task Force [126] for the years 2002–2012, extending their earlier review of studies up to 2002 [125]. This review examined the effect of screening asymptomatic elders or vulnerable adults in health care settings for elder abuse or neglect and found only a very small number of studies. Among these, the USPSTF found no adequate randomised controlled trials or controlled observational studies that examined whether screening reduced exposure to abuse and neglect, or reduced physical or mental health harm or mortality. The USPSTF concluded that there is insufficient evidence that screening or early detection reduces exposure of vulnerable elders to abuse or decreases any harm caused by the abuse [126].
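The accuracy criterion can be made concrete with a short worked calculation of how sensitivity, specificity and prevalence combine to produce false-positive and false-negative results. This is a minimal sketch: the function name is hypothetical and the input values are invented for illustration, not estimates for any instrument reviewed here.

```python
from typing import Dict


def screening_outcomes(n_screened: int, prevalence: float,
                       sensitivity: float, specificity: float) -> Dict[str, float]:
    """Expected screening results for a population of n_screened people.

    prevalence, sensitivity and specificity are proportions between 0 and 1.
    """
    for value in (prevalence, sensitivity, specificity):
        if not 0.0 <= value <= 1.0:
            raise ValueError("proportions must lie between 0 and 1")

    cases = n_screened * prevalence
    non_cases = n_screened - cases

    true_pos = cases * sensitivity          # abused and screen-positive
    false_neg = cases - true_pos            # abused but missed by the screen
    true_neg = non_cases * specificity      # not abused and screen-negative
    false_pos = non_cases - true_neg        # not abused but screen-positive

    positives = true_pos + false_pos
    ppv = true_pos / positives if positives else float("nan")

    return {
        "true_pos": true_pos,
        "false_neg": false_neg,
        "true_neg": true_neg,
        "false_pos": false_pos,
        "ppv": ppv,  # share of screen-positives who are true cases
    }


# Invented inputs: 1000 people screened, 5 % prevalence,
# 80 % sensitivity, 90 % specificity.
result = screening_outcomes(1000, 0.05, 0.80, 0.90)
print({key: round(value, 3) for key, value in result.items()})
```

Under these invented inputs roughly 40 true cases are detected alongside about 95 false positives, so fewer than a third of positive screens reflect actual abuse. When a condition is uncommon in the screened population, even a reasonably accurate test can therefore generate more false alarms than true detections.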

The relative lack of evidence is a result of complex factors such as the nature of elder abuse itself, the difficulty in accurately measuring it, and the relatively early stage of development of valid and reliable measures. Furthermore, there have been few longitudinal studies that have examined the predictive validity of screening. Of course, lack of evidence is not proof that screening is ineffective, and there is broad agreement that efforts should be made to both identify elder abuse early and put in place measures to prevent or reduce its impact. Research related to specific questions about the effectiveness of screening is addressed next.

7.1 Does Screening for Elder Abuse and Neglect Identify Those at Most Risk of Elder Abuse and Neglect?

Because of a lack of a gold standard measure of elder abuse and neglect, it has been difficult to address the question of whether screening accurately identifies those at most risk of elder abuse and neglect. The 6-item EASI measure [172] was rated as of fair quality by the USPSTF in its most recent review and identified as the best validated measure suitable for screening for elder abuse in healthcare settings available at that time [126]. However, the scoring system is not clearly specified and the first item does not actually measure elder abuse.

Some studies of ADL and IADL measures have reported associations with confirmed elder abuse status, though sensitivity and specificity analysis is not available. For instance, a matched case–control study of abused and non-abused recipients of Israeli social welfare services found that the abused group was significantly more impaired [104]. This suggests that such a brief, easily administered tool may be a useful measure of vulnerability to abuse, but the causal direction has not been established. Interestingly, while ADL measures are usually conceptualised as measures of neglect or self-neglect, functional impairment is related to elder abuse more broadly, not just neglect.

Some methodological issues that impact on this question include the representativeness of samples. Are those who take part in screening studies representative of all older people, or may those who are vulnerable to abuse be less likely to participate, leading to under-representation in studies? Item non-response is another methodological issue. Are older people willing to answer questions about abuse by a family member or carer? De Donder et al. [40] found that item non-response ranged from 1.8 % for neglect to 4.2 % for items such as excluding, ignoring, or destroying possessions. They found that different non-response patterns related to different types of abuse. For instance, response patterns for neglect and emotional abuse differed from those for more serious forms such as financial, physical, sexual abuse and violation of personal rights. Response patterns also differed by mode of administration (postal survey, face-to-face interview, or telephone interview). Compared with postal surveys, face-to-face interviews had less missing data. After taking into account individual characteristics and modes of administration, higher non-response was associated with higher age, lower income, lower education, and poorer mental health. Presence of another person (interviewer or family member/health care provider) was also associated with less missing data. Such issues have implications for the ability of screening measures to produce accurate identification of those at risk.

7.2 Does Screening for Elder Abuse and Neglect Predict Poor Health Outcomes and Mortality?

A very small number of studies have established that substantiated reports of elder abuse significantly predict shorter life spans after adjusting for other factors related to mortality in older adults. These include the New Haven Established Populations for Epidemiologic Studies in the Elderly [94], and the Chicago Health and Aging Project which found that reported and confirmed elder abuse and self-neglect predicted 1-year all-cause mortality [56]. Furthermore, mortality associated with reported and confirmed elder abuse was greatest for those with lowest levels of psychological and social well-being [50].

Of more relevance to this chapter are studies that have examined the long-term health outcomes for those identified by screening tools as at risk of abuse. These epidemiologic studies are based on self-report of abuse, not substantiated abuse. Two large community-based prospective studies have examined health outcomes associated with self-reported elder abuse. Analysis of data from the Australian Longitudinal Study on Women’s Health (ALSWH) indicated that one of the four VASS subscales, dejection, predicted 3-year physical and mental health outcomes [146, 147, 149]. A longer-term follow-up study of this cohort found that the VASS subscale, coercion, significantly predicted mortality 12 years after screening. Disability was predicted by the vulnerability and dejection subscales, after controlling for other confounders. In the US, analysis of Women’s Health Initiative data on women aged 50–79 found that self-reported physical and verbal abuse independently predicted mortality over 7–8 years, with physical abuse having the highest predictive value [12].

Limitations of current research on health outcomes of elder abuse include the small number of studies, short-term follow-up in prospective studies [56, 147], a reliance on cross-sectional research [24, 30, 67], and a focus on cases of substantiated abuse only [50, 94], or on current symptomatology rather than longer-term serious health outcomes such as mortality and disability [30, 67]. Only one study was found that examined long-term health outcomes over a decade or more [148], and few that examined disability as an outcome. Given the long-term costs of disability, this is an important outcome to investigate.

7.3 Does Screening Lead to Effective Referral and Treatment?

Screening practices are considered useful only if they lead to development, implementation and evaluation of effective preventive and early intervention programs. Very little research has addressed this question. Most studies of screening for elder abuse do not report on follow-up of those identified as at risk, except among studies of suspected cases reported to authorities such as Adult Protective Services (APS). One retrospective study of 575 elderly veterans (96 % male) aged 65–103 years in California, USA, identified those possibly exposed to abuse or neglect and referred them for social work interventions [114]. The actions taken to reduce exposure to abuse or neglect are reported, but actual health outcomes are not adequately measured.

7.4 Are There Effective Intervention Programs Available for Those at Risk?

A systematic review of the effectiveness of interventions for elder abuse found insufficient evidence to support any particular intervention [136]. Only eight studies met the inclusion criteria for that review, and many methodological weaknesses were identified in the studies. Of concern, intervention had no effect on abuse in most studies and may have even increased future abuse. The interventions were also found to have no positive effect on at-risk care-giver outcomes and behaviour, and mixed outcomes for professional knowledge and behaviour. This suggests that the pathway from identification of risk to successful improvement of outcomes is fraught with many difficulties, and will require more innovative approaches. The types of interventions available have also been reviewed by Alon and Berg-Warman [4]. Yet those who work at the coalface see a clear need for intervention programs, as a legal framework is insufficient on its own to address this complex social problem, and has potential to create harm as well as benefits, as has been found in the challenging area of child abuse.

7.5 Are There Adverse Outcomes from Screening?

Family violence research suggests that a major concern with screening is the potential for retaliation by the abuser should they find out about the screening, resulting in direct harm caused by the screening [107]. Given the added dependency between carers and elders, it may be that screening for elder abuse and neglect could put the older person at greater risk. For instance, postal surveys may be opened by abusive carers. The presence of carers may also make it difficult for screening to take place in the home or in the health care setting. Although no studies have specifically sought to examine this question, there is very little evidence available of actual harm caused by screening for elder abuse or neglect. The USPSTF systematic reviews suggested that the potential for harm is small, but may include shame, guilt, self-blame, fear of retaliation or abandonment by perpetrators, and distress caused by false-positive results [125, 126]. Further research is required to address this important question. A screening program needs to ensure that the questions are asked in a safe and private environment so that elders are not placed at further risk of harm.

8 Future Research

Despite a growing literature on screening tools for elder abuse, there is a need for further development and evaluation of brief, reliable and valid screening instruments that can be widely used to screen for abuse in different screening contexts, as previously noted [35, 42]. A better distinction is needed between the three purposes of screening tools identified in the current research: population screening, screening as part of routine healthcare, and ‘screening’ whose primary purpose is to investigate suspected or reported elder abuse. Different tools are likely needed for these different purposes and each needs to be validated within the context for which it is intended.

A need for greater specification and measurement of types of elder mistreatment has been identified [18, 42]. Some interesting research has been undertaken to explore the conceptual and theoretical framework underpinning elder mistreatment as a foundation for development of more differentiated screening instruments and this would seem to be a promising area for future research. For instance, Conrad and colleagues have undertaken concept mapping methodologies to identify expanded components of psychological abuse and financial exploitation, to inform the development of more robust screening measures [33, 32, 31]. In the area of elder self-neglect, Burnett and colleagues have used an analysis of APS data to define four unique subtypes of elder self-neglect: physical and medical neglect (50 %), environmental neglect (22 %), global neglect (21 %), and financial neglect (9 %) [18], paving the way for more differentiated methods of measurement. Others have identified potentially new constructs that have not been incorporated into most measures of elder abuse to date and these warrant further investigation. For instance, an observational study of staff-patient interactions in aged care settings identified infantilization of elders by their carers as a form of abuse in that it demeaned the older person and caused them distress [143], yet this behaviour is not adequately tapped by most existing measures. The field needs a clear consensus among researchers about a brief general measure of abuse suitable for widescale screening, as well as consensus about more differentiated measures of specific forms of abuse.

Clearly, there is a need for more robust psychometric evaluation of screening measures to progressively define conceptually clear, short versions that have adequate reliability and validity [35, 41, 96]. There is an ongoing tension between the thoroughness of longer forms, and the practical utility of more succinct but robust shorter forms of screening measures. Some examples are provided in this review of the progressive refinement of measures and ongoing validation (e.g., [33, 32, 31]). The Centers for Medicare and Medicaid Services [18, 150] used an expert technical panel to derive a consensus recommendation about screening tools that met a sufficient range of quality criteria to recommend for use for routine screening in healthcare settings. Three screening measures were commended: the H-S/EAST [124], the VASS [149], and the EASI [172]. Nevertheless, there is a clear need for more research into the reliability and validity of methods of data collection in this sensitive research area [41, 126].

Further research on the effect of modes of administration is needed to examine differences between the most commonly used modes of written questionnaire, face-to-face interview, and telephone interview. Observational measures also need further evaluation to determine optimal, reliable, valid and cost-effective methods. This is particularly so for assessing nutritional self-care, as no studies were identified that had successfully validated a method to assess this aspect of self-neglect. A relatively recent development in observational methods, particularly in nursing homes, is the use of electronic surveillance to monitor the wellbeing of residents. This offers interesting potential to assess staff-to-resident abuse, resident-to-resident abuse, or even resident-to-staff abuse, but is fraught with ethical challenges [15].

Even less is known about the effectiveness of screening—what actions are taken with those who demonstrate higher risk on screening measures and how effective are those programs? What barriers are there to implementation of screening and interventions, and how can these barriers be overcome? These are fertile areas for multidisciplinary collaboration and further work could be undertaken to determine optimal team approaches to the detection and intervention for elder abuse risk. Research is also needed on the best ways of having screening embedded in routine care.

The majority of research on screening has been undertaken in either primary care or institutional care settings, or among higher risk individuals referred to adult protective services. Research on population-based screening is sparse [96, 108, 148], and conclusions are difficult to draw given the wide range of measures and methods used. In addition to the need for further validation of screening instruments suitable for population studies, longitudinal studies are needed to determine the predictive power of the screening instruments for future abuse and for health and mortality outcomes, and the effectiveness of appropriate intervention. Research is also needed on the impact of screening on care-giver and care-recipient dynamics.

Given the importance of familial care relationships to vulnerable elders, a valuable approach would be to examine the effectiveness of therapeutic interventions to ameliorate the triggers for abusive behavior and foster healthy caregiving dynamics. For instance, research evaluating interventions for other forms of family violence may be useful in developing appropriate therapeutic and supportive interventions for elder abuse and neglect. This would require development of appropriate measures to assess the impact of such interventions. The use of well-designed qualitative research could also provide a complement to the psychometric approach, as well as measurement of dyadic interactions.

Key recommendations for future research include:

  • Further development of elder mistreatment theory and alignment of screening measures to better assess more clearly differentiated theoretical constructs and types of abuse.

  • Greater clarity of screening purpose and alignment of measures to the purpose.

  • More rigorous psychometric evaluation of instruments over different populations and cultures.

  • Examination of modes of administration and adaptation to different cultural contexts.

  • Refinement of existing instruments to produce shorter reliable and valid versions.

  • Well-designed longitudinal research to examine the predictive validity of elder abuse screening measures for a range of outcomes including future abuse, substantiated abuse, mortality, and health outcomes.

9 Policy Implications

The value of screening is predicated historically on having reliable and valid measures of the phenomenon of interest, appropriate referral pathways, evidence of successful interventions for those found to be at risk on the screening measure, and evidence of improved outcomes [163]. Dong [44] notes, however, that screening for elder abuse is complex and cannot be evaluated by the same criteria as for medical issues. Rather, there is a need to consider the specific challenges associated with screening for elder abuse and determine whether the intended benefits of screening outweigh the potential harms. This review has highlighted some gaps between these criteria and available evidence, and raises a number of policy implications.

First, screening is necessary to raise awareness of the scope and nature of the elder abuse problem in society. It is only through accurate measurement, determination of prevalence and types of abuse, as well as risk factors, that appropriate policy planning can take place. Without a more thorough approach to screening, we will remain in a catch-22 situation where it is argued that there is insufficient evidence to justify screening. As for all forms of family violence, accurate measurement of the phenomena will remain challenging, but it is only through such efforts that awareness of the scope of the problem and potential solutions can be articulated.

Second, a key policy debate relates to whether screening for elder abuse and neglect should be part of routine healthcare screening practice, or implemented for only higher risk groups, or where signs of abuse or suspected abuse reports are available. Some have argued that screening should occur routinely in healthcare settings [6, 7, 44, 121], while others have argued that there is insufficient evidence to support such a recommendation [125, 126, 128]. Dong [44] has made a strong case that elder abuse should not be subjected to the standard medical model criteria for implementation of screening programs, but rather that the contextual and often hidden nature of elder abuse requires a more proactive approach to identification, in order to prevent the significant negative health outcomes. This is more than a medical issue; it is also one of societal justice.

Third, to address these debates, coordinated multi-sector initiatives are needed to bring advances to the screening dilemmas currently facing the field. There are many examples of state-based systems working together to develop more comprehensive approaches to integrating screening, referral and intervention approaches (e.g., [13, 65]), but arguably there is an urgent need for better coordinated high-level national and international approaches. As an example, the CMS in the US convened a national symposium to examine the role of elder abuse screening as a Physician Quality Reporting System (PQRS) measure in Medicare beneficiaries [44]. The purpose was to review available evidence and make consensus recommendations about a comprehensive approach to screening, identifying and implementing appropriate follow-up pathways for those at risk of elder abuse [19].

The CMS Report of the Technical Expert Panel on Elder Maltreatment Screening documents agreement that screening should cover differentiated forms of abuse: physical abuse, emotional or psychological abuse, neglect (active or passive), sexual abuse, abandonment, financial or material exploitation, and unwarranted control [19]. The Panel recommended two scales as among the best available currently: the H-S/EAST and the VASS. Both measures have been tested in diverse populations, have shown adequate reliability and validity across cultures, and are relatively brief and suitable for both population screening and use by healthcare professionals. Self-neglect, while clearly important, was not included as a form of elder abuse per se as it is not perpetrated by a carer, but it is considered an important part of any geriatric assessment.

A fourth policy consideration is the need to further develop more differentiated pathways for referral and intervention when older people screen positive for elder abuse. Currently, the legal framework dominates, with reporting to authorities such as the APS in the US being legally mandated for suspected elder abuse. However, with advances in our knowledge of the range of types of abuse, and widely varying severity and chronicity, there is a need for more differentiated pathways following positive screens. The evidence base is accumulating about possible prevention and intervention strategies that may assist in reducing the risk of elder abuse and poor long-term outcomes. Policymakers need to work with researchers to define appropriate criteria to recognise a more differentiated understanding of suspected abuse, and develop more accessible and effective strategies. What non-legal options are there for follow-up screening and investigation, with potential implementation of effective intervention programs available for those identified as at risk of abuse?

10 Practice Implications

Research in the areas of child abuse and interpersonal violence suggests that it is possible to develop and implement effective interventions for complex societal issues such as family violence [157]. These include the use of home visits to potentially at-risk families using a partnership model, carer training programs, school-based life skills training, public policy initiatives, and therapeutic approaches [118, 157]. While reporting of suspected abuse is mandatory in most jurisdictions, it may be necessary to find more appropriately tailored supportive interventions for all but the more severe end of the abuse continuum. An interesting alternative model for assessment and intervention that may be of value in this field comes from the family dispute/mediation area, where the traditional legal/family court approach to divorce disputes has increasingly been replaced by a model that puts the focus on a more supportive community-based family mediation/dispute resolution model [167]. Under this model, couples in dispute are required to attend family mediation to assess the extent of the problems and develop mediated solutions, prior to court attendance. Winkworth and McArthur outline a practice framework to guide screening and assessment throughout the period of contact, recognising that needs and safety concerns are likely to change over time. The framework proposes a more responsive and less reactive model than in many protection services. It may be that all but the more extreme end of elder abuse activity could be better addressed in a more comprehensive community-based assessment and intervention centre, as is increasingly available for divorcing families and partner violence. Demonstration projects to assess the effectiveness of such coordinated community-based approaches are needed.

Given that much elder abuse involves important familial relationships, and the majority of abuse involves psychological abuse and neglect, there is a need for greater development of psychological and therapeutic approaches to change potentially harmful relational dynamics. The burdens of caring can exacerbate long-standing relationship dynamics, as well as foster new problems, and there is a need to distinguish between these situations. While there has been considerable development of therapeutic approaches for intimate partner violence, there is little development of approaches to elder abuse that may help to preserve positive aspects of the carer relationship with victims. Approaches to date have largely focused on structural interventions such as placing the abused person in institutional care, or provision of nursing and home care services [118]. Community-level integration of elder services could potentially provide a more holistic approach to addressing the underlying issues and improve quality of life for older people, with less stigmatisation associated with APS referral.

Moracco and Cole [116] identified three types of interventions for intimate partner violence that may help reduce risk and improve outcomes, and it is worth considering how these approaches could be applied to elder abuse. The first and most common intervention is referral to community services such as counseling, legal services, alternative accommodation, and social welfare services; empowerment strategies such as support groups, education, and volunteer advocates may also be useful. The equivalent of dedicated domestic violence services is not readily available for those experiencing elder abuse, and their greater dependency makes it difficult to seek out support services. Yet, this is clearly an area that would benefit from a more supportive case management approach, especially for less severe forms of abuse, without the stigma and associated problems arising from official reporting to APS-type bodies.

Second, home visits by professional staff could be expanded to provide more preventive and supportive interventions to assist those in the home. Social support is a critical element of any supportive care and needs to be provided in an ongoing way to be effective. A third approach to intervention is to offer programs to address risk factors of the abusive care-giver. These may include counselling, group programs, provision of respite care, substance abuse therapeutic programs, and helpline support services. Individual supportive counseling may be useful to reduce anxiety, stress, and depression in the carer, and cognitive-behavioral methods can be used to educate the carer about the reasons for a dependent person’s behavior, their needs, and developmental limitations. Such programs have been successful with abusive parents and could be adapted to elder abuse caregivers.

Drawing on evidence of successful parenting interventions for child abuse [85], it is suggested that supportive psychoeducational and therapeutic interventions that aim to support caregivers to undertake their demanding role would be useful for this population [157]. Both child and elder abuse occur in the context of a dependency relationship, where the needs of the dependent person (child or adult) need to take priority. These practices from the wider family violence field point to areas of potentially useful intervention and evaluation priorities.

11 Summary

This chapter has undertaken a systematic review of elder abuse screening tools published in the 20-year period from 1995 to 2015. The review identified 33 measures that reported any psychometric evaluation in the period: 10 used primarily for population-based screening purposes, 16 for screening in health care settings, and a further 7 instruments developed to screen for elder neglect and self-neglect. Key issues related to screening for elder abuse are critically reviewed, and some future directions identified.