Introduction

Many jurisdictions routinely conduct problem gambling prevalence studies to monitor the aggregate impact of gambling on the community, with the Problem Gambling Severity Index (PGSI; Ferris and Wynne 2001) most commonly used. Although the PGSI contains some items that probe negative consequences, it is primarily based on an addiction model rather than a public health approach to assessing gambling-related harm. Theoretical conceptualisations of problem gambling distinguish between excessive gambling behaviour and the consequences of that behaviour (Walker 1992; Walker et al. 2012; Griffiths 2005; Blaszczynski and Nower 2002). There is now a widely accepted view that measurement of gambling problems (as opposed to identifying problem gamblers) should distinguish between addiction-like symptoms and the negative consequences of gambling, which may occur to a greater or lesser degree, and with or without the symptoms of behavioural addiction (Dickerson et al. 1997; Svetieva and Walker 2008). This perspective is aligned with a public health approach to problem gambling, which recognises that the largest component of gambling-related harm in the community may derive not from severely addicted gamblers, but rather from the much larger group experiencing less severe problems (Shaffer 2003; Shaffer and Korn 2002; Svetieva and Walker 2008).

If gambling problems are construed as the extent of “…negative consequences for the gambler, others in his/her social network, and for the community” (Ferris et al. 1998, p. 58), then harms arising from gambling are the key index of interest; as conceptualised in a problem-centred, public health model (Korn and Shaffer 1999; Shaffer and Korn 2002). The PGSI comprises a mix of dependence symptoms, and negative consequences from gambling. In a research context, the PGSI total score has properties more consistent with an ordinal, rather than scalar interpretation (Currie et al. 2013). The PGSI includes low- and moderate-risk, as well as problem gambler (PG) interpretative categories. However, Currie et al. (2013) note that delineation of the intermediate PGSI categories is relatively poorly supported. This may be because PGSI items are overrepresented by symptoms associated with disordered gamblers rather than people with lesser problems (McCready and Adlaf 2006).

In government-sponsored population monitoring programs, the most common use of the PGSI is in tracking the prevalence of problem gamblers (PGs) in a population; with this prevalence estimate often being treated as a surrogate for aggregate negative impact by industry and government. Given the conceptual mismatch between problem gambler prevalence and aggregate population impact, forceful arguments have been made that this represents a misuse of this tool (Svetieva and Walker 2008). It has even been suggested that problem gambling prevalence studies—which consistently find low levels of problem gambling prevalence—might serve to obfuscate the true impact of gambling in the community (Young 2013, 2007); and some researchers question the value of continuing to conduct problem gambling prevalence studies at all (Markham and Young 2016).

Whilst the PGSI is a valuable measure of per-individual problematic gambling in its severest form, tracking the public-health impact of gambling requires a specific instrument to capture population-level gambling harm. Such an instrument should ideally be able to: be aggregated over individuals within a population, assess progress towards reducing gambling-related harm, identify sub-populations suffering disproportionate harm, and assist with the evaluation of policy alternatives. To address this goal in Victoria, Australia, Langham et al. (2016) recently created a comprehensive 72-item checklist of specific gambling-related harms and reported psychometric characteristics based on a sample of over 3000 individuals (Li et al. 2016). By applying public-health elicitation methodologies, the researchers calculated WHO Burden of Disease disability weights (DW), a zero–one bounded metric that captures the detriment to an individual’s health and wellbeing caused by gambling (Browne et al., in press b). DWs were estimated for each category of the PGSI. This information was subsequently combined with weighted PGSI prevalence data to create an aggregate measure of the population-level impact of gambling harm (Browne et al., in press a). However, an acknowledged limitation was that, in the absence of a recognised population screen for gambling harm, the method relied on the PGSI as an indirect measure of the distribution of harm in the population. Furthermore, the 72-item checklist was designed for comprehensive coverage of all potential harms, and is too long to function as a practical population screen.

Direct measurement of population-level gambling-related harm demands administration of a short-form instrument specifically designed to assess the presence and extent of harm. Therefore, the current study aims to validate a brief harms scale for use as a population screen. We employ the 72-item set of harm probes from the original study as an initial item pool. The dataset originally gathered by Browne et al. (2016) is not suitable for psychometric validation, because it was targeted at those already experiencing harm rather than the population exposed to potential harm. Accordingly, we perform validation on a new sample that appropriately reflects the population at risk, i.e. those who have gambled in the last year. Our psychometric analysis is intended to test whether the brief screen provides a meaningful per-person metric of gambling-related harm that can ultimately be aggregated in population-representative weighted samples, in order to obtain a valid and direct measure of population-level harm.

Methods

Participants

Adult gamblers (N = 1524) who had gambled in the last 6 months were recruited for the study via an online survey panel recruitment service (Research Now). The sample comprised 49.4% males, and ages ranged from 18 to 101 years (M = 44.99, SD = 15.57). The majority of participants were born in Australia (78.1%), with the remainder born in England (5.6%), New Zealand (2.4%), India (2.2%) and other countries (11.7%). See Table 1 for a summary of other demographic characteristics of the sample.

Table 1 Education, employment, marital and income status of the sample

Procedure

Eligible members of the online panel were invited via email to participate in an online survey regarding their gambling activity and outcomes. The survey took approximately 10 min to complete, and participants were compensated with credit points from the panel provider, which they could accumulate and exchange for cash and prizes. The study underwent ethical review and was approved by the institutional Human Research Ethical Review (HREC) board, with participants providing informed consent before participation.

Measures

Comparisons were made between the full 72-item checklist and the derived shorter scale, along with a number of related measures described below. All measures were worded using a 12-month time-frame.

Gambling Consumption

Respondents completed the Consumption Screen for Problem Gambling (CSPG), which is designed to measure the consumption of gambling products. Two items measure frequency and duration of gambling activities (e.g., “How much time did you spend gambling on a typical day in which you gambled in the past 12 months?”), with one item measuring time spent gambling during a typical day (“How often did you spend more than 2 h gambling (on a single occasion) in the past 12 months?”). The CSPG has been shown to have high predictive validity (100% sensitivity, 92.7% specificity) when compared with the established Problem Gambling Severity Index (Rockloff 2011). Cronbach’s alpha in the current sample was α = .73.

Gambling Problems

Participants completed the Problem Gambling Severity Index (PGSI), designed to measure problem gambling in the general population (Ferris and Wynne 2001). The scale consists of nine questions such as “Have you bet more than you could really afford to lose?”. Items are answered on a 4-point Likert scale, and PGSI scores are summed to determine one’s level of gambling risk (Holtgraves 2008). Cronbach’s alpha for the current study was high (α = .95). In common with our experience with other online panels (Browne et al. 2016), we found that the Research Now internet panel included a disproportionately high number of non-recreational gamblers: 39.5% RG, 22.3% low-risk, 17.4% medium-risk, and 20.6% PG. It is unclear precisely why this is the case, but a reasonable explanation is that gamblers experiencing problems are more likely than others to respond to the incentives provided by online providers.
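The summed scoring described above can be illustrated with a minimal sketch. It assumes each of the nine items is coded 0 to 3 and uses the commonly cited PGSI bands (0, 1–2, 3–7, 8+); the function name and category labels are illustrative rather than taken from the study materials.

```python
def pgsi_category(item_scores):
    """Sum nine PGSI items (assumed coded 0-3) and map the total to a risk band.

    The cut-offs (0, 1-2, 3-7, 8+) are the commonly cited bands and are an
    assumption here; the labels mirror the abbreviations used in the text.
    """
    if len(item_scores) != 9 or any(s not in (0, 1, 2, 3) for s in item_scores):
        raise ValueError("expected nine item scores, each coded 0-3")
    total = sum(item_scores)
    if total == 0:
        label = "RG"             # recreational (non-problem) gambler
    elif total <= 2:
        label = "low-risk"
    elif total <= 7:
        label = "moderate-risk"
    else:
        label = "PG"             # problem gambler (PGSI 8+)
    return total, label
```

For example, `pgsi_category([1, 1, 0, 0, 0, 0, 0, 0, 0])` gives a total of 2 in the low-risk band.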

Well-Being

Eight items forming the Personal Wellbeing Index (PWI) from the Comprehensive Quality of Life Scale (Cummins 1997) were used to measure satisfaction across seven life domains (standard of living, health, achievements, relationships, safety, community belonging, and future security), together with “life as a whole”. Questions are answered on an 11-point scale, whereby 0 indicates no satisfaction and 10 indicates complete satisfaction. Items are averaged and multiplied by 10 to produce an overall well-being score, with an Australian norm of 75. The PWI is a reliable and valid measurement tool regularly applied to Australian samples (Cummins et al. 2003; Lau et al. 2005). Internal scale reliability in the current study was α = .94.
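As a small illustration of the scoring rule just described (average the 0–10 ratings, then multiply by 10), the following sketch may help. The function name is hypothetical, and the sketch simply averages whatever ratings it is given, since the text does not specify any further item handling.

```python
def pwi_score(item_ratings):
    """Average a respondent's 0-10 satisfaction ratings and rescale to 0-100."""
    if not item_ratings or any(not 0 <= r <= 10 for r in item_ratings):
        raise ValueError("expected at least one rating, each between 0 and 10")
    return 10 * sum(item_ratings) / len(item_ratings)
```

A respondent who rates every item 7.5 therefore scores 75, the Australian norm mentioned above.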

Gambling Harms

Gambling harm was measured using a recently developed 72-item checklist, designed to be a comprehensive set of harms typically reported by gamblers. The checklist was originally constructed for the purpose of generating condition descriptions for elicitation of health-related quality of life (HRQoL) decrements associated with gambling (Browne et al., in press b). The item set was determined from a literature review, conceptual framework, and qualitative data (Langham et al. 2016), and item-response theoretic (IRT) properties of the items have been reported elsewhere (Li et al. 2016). However, it was not intended for, and has not been evaluated with respect to, practical use as a summative measure of gambling harm. The checklist was implemented by asking which harms occurred to gamblers during a 12-month time frame. The binary-scored items are organised into six domains: financial (e.g., “Reduction of my available spending money”), health (e.g., “Neglected my hygiene and self-care”), relationship (e.g., “Spent less time with people I care about”), emotions (e.g., “Felt worthless”), work/study (e.g., “Was late for work or study”), and other forms of social dysfunction (e.g., “Reduced my contribution to religious or cultural practices”). The full set of items is provided in Online Resource 1.

Addictive Gambling Symptoms

The PGSI contains a mixture of addictive symptoms and consequences of gambling. To our knowledge, no ‘pure’ measure of gambling addiction currently exists. Accordingly, items were selected from the South Oaks Gambling Screen (four items; Lesieur and Blume 1987), the lie/bet scale (two items; Götestam et al. 2004), the National Opinion Research Center DSM-IV Screen (six items; Wickwire et al. 2008), and the Brief Biosocial Gambling Screen (two items; Gebauer et al. 2010), using the criterion that each item purely measures cognitive or behavioural symptoms of addiction, as distinct from harm or simple excessive consumption. These 14 items were combined into an ad-hoc composite scale, provided in Online Resource 2. It included items such as “When you were betting, have you ever told others you were winning money when you really weren’t winning” and “Have you ever felt the need to bet more and more money?”. The common header for the group of probes was “In the last 12 months…”. All items were answered on a binary yes/no scale. Polychoric alpha and hierarchical coefficient omega reliability (technical details discussed below) were particularly high (.97 and .93 respectively). Combined with the strong content validity derived from the source instruments, this suggested no serious psychometric issues in employing this ad-hoc composite scale.

Short Harm Screen Item Selection

Exploration of the item performance of the original 72-item scale showed that it has very high unidimensionality and reliability, with most items being quite strong indicators of the latent factor of ‘gambling harm’. In line with this observation, preliminary investigation revealed that a large number of different item combinations could potentially be selected, all resulting in a shorter scale with good to excellent psychometric properties. Also, each harm probe is specific and concrete, which is an attractive property of the checklist, in terms of yielding probes with little room for subjective interpretation. However, an associated disadvantage of specific, concrete symptomology is that there is a potential to ‘miss’ the harm experienced by particular individuals who may not possess the specific symptomology included within a shorter scale. Therefore, in item selection, the principal criterion to optimise appeared to be minimising false negatives (i.e. maximising sensitivity). This amounts to minimising the number of cases that score positive/high on the full harms measure, but zero/low on the short harms measure.

Given these considerations, we implemented a customised item selection algorithm as follows. The first item was selected based on maximising simple prevalence. The second and following items were chosen based on the maximum prevalence amongst cases who have not answered positively on the previously selected items. The effect of this algorithm is two-fold. First, it selects items so as to minimise false negatives of the complete scale. Second, it has a tendency to select items that are relatively less highly correlated, since the criterion is based on the maximum number of positive responses, amongst cases for which previously selected items are negative. Whilst this is potentially prejudicial to reliability, this criterion acts to promote construct coverage—i.e. promoting the inclusion of probes indicating semi-independent sub-constructs of harm, or probes more likely to be positively answered by different groups. This may be assessed by considering the correlation of the shorter harms scale with the full scale.
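The selection rule described above is a greedy procedure, and a minimal sketch of one way to implement it follows. It assumes the responses are held as a list of rows of 0/1 values (one row per respondent, one column per item); the function name and the tie-breaking rule (lowest column index wins) are assumptions for illustration, not details reported in the study.

```python
def select_items(responses, n_items):
    """Greedily pick items that catch the most respondents not yet 'caught'.

    responses: list of rows, one per respondent, each a list of 0/1 values.
    Returns the chosen column indices in order of selection.
    """
    n_cols = len(responses[0])
    uncaught = list(range(len(responses)))  # rows with no positive on selected items
    selected = []
    for _ in range(n_items):
        candidates = [j for j in range(n_cols) if j not in selected]
        if not candidates:
            break
        # Prevalence of each remaining item among the still-uncaught respondents.
        counts = {j: sum(responses[i][j] for i in uncaught) for j in candidates}
        best = max(candidates, key=lambda j: counts[j])  # ties: lowest index (assumed)
        selected.append(best)
        uncaught = [i for i in uncaught if responses[i][best] == 0]
    return selected


# Hypothetical toy data: 8 respondents, 4 items.
toy = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 1, 1],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [1, 0, 0, 0],
]
print(select_items(toy, 3))
```

In the toy data, item 1 is more prevalent overall than item 2, yet item 2 is chosen second because it catches more of the respondents that item 0 missed. This is the mechanism by which the rule favours less highly correlated items.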

Table 2 summarises the cumulative performance of the items selected, in order of selection (top to bottom). Also shown is the percentage of the total sample responding positively to each item. The progressive proportion of false negatives is given, which is the proportion of cases with a non-zero score on the full harms checklist that nonetheless score zero on the current subset. The running Spearman correlation of the subset sum with the full harms sum is also given. It can be seen that saturation with respect to both construct coverage/subset-total correlation (.935) and percentage of false negatives (4.8%) is achieved at around 10 items.
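The two running statistics in Table 2 can be computed as in the following sketch, which is an illustration under stated assumptions rather than the authors' code. It assumes 0/1 response rows as above, treats a false negative as a respondent with a non-zero full-checklist sum but a zero subset sum, and computes Spearman's correlation as a Pearson correlation on average ranks. All function names and the toy data are hypothetical.

```python
def false_negative_rate(responses, subset):
    """Share of respondents with any harm on the full list but none on the subset."""
    harmed = [row for row in responses if sum(row) > 0]
    missed = [row for row in harmed if sum(row[j] for j in subset) == 0]
    return len(missed) / len(harmed)


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman correlation; undefined (division by zero) if either input is constant."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5


def running_performance(responses, selected):
    """For each prefix of the selected items: (k, false-negative rate, Spearman rho)."""
    full = [sum(row) for row in responses]
    out = []
    for k in range(1, len(selected) + 1):
        subset = selected[:k]
        sub = [sum(row[j] for j in subset) for row in responses]
        out.append((k, false_negative_rate(responses, subset), spearman(sub, full)))
    return out


# Hypothetical toy data: 8 respondents, 4 items.
toy = [[1, 1, 0, 0], [1, 0, 0, 0], [1, 1, 0, 0], [0, 0, 1, 0],
       [0, 0, 1, 1], [0, 1, 0, 0], [0, 0, 0, 0], [1, 0, 0, 0]]
for k, fn, rho in running_performance(toy, [0, 2, 1]):
    print(k, round(fn, 3), round(rho, 3))
```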

Table 2 Progressive properties of the 1–10 selected harm probes with respect to the full checklist

The goal of subsequent analysis was to determine the psychometric properties of the shortened 10-item candidate scale. We applied a series of modern psychometric approaches, based on either polychoric correlations or logistic link/item-response functions, appropriate to the binary data. Details of each procedure will be provided in the following section.

Results

Validity

As described in Table 2, the 10-item short gambling harms screen (hereafter SGHS) was strongly correlated with the sum of the full harms list at .94, which suggests that the SGHS has good coverage of the primary construct captured by the full harms checklist. Table 3 summarises correlations between the PGSI, wellbeing, our ad-hoc composite measure of ‘pure’ addiction (Addict), gambling consumption (CSPG), the sum of symptoms on the full harms checklist (Harm72) and the SGHS. Quality of life was more strongly correlated with the full harms checklist (−.32) and the SGHS (−.29) than with any other measure. The ‘pure’ measure of gambling addiction was more strongly correlated with the SGHS (.63) than with gambling consumption (.46). The SGHS was also more strongly correlated than the CSPG with the PGSI (.68 vs. .48).

Table 3 Spearman correlations of the SGHS with gambling consumption, addiction, and quality of life

Participants who scored greater than zero on the SGHS (41%) had an average PGSI score of 7.4, compared to 1.17 for those scoring zero on the SGHS, t = 23.74, p < .001. Participants scoring positively on the SGHS had an average wellbeing (PWI) score of 69.9, compared to 80.2 for those scoring zero, t = −10.24, p < .001. Figure 1 plots average wellbeing scores by SGHS and PGSI, showing a generally linear negative relationship between SGHS and wellbeing. For each extra harm nominated, expected PWI decreased by 2.37 (SE = .18, t = −12.73, p < .001). The PGSI also displayed an approximately linear relationship with wellbeing for participants with PGSI ≤ 7 (N = 1209). For this subset, expected PWI decreased by 1.77 (SE = .22, t = −7.87, p < .001) for each unit increase in the PGSI.

Fig. 1 Average wellbeing by SGHS (a) and PGSI (b) scores

Reliability and Internal Structure

The SGHS involves binary-scored items, which require special handling when calculating reliability metrics that involve correlations between individual items. A polychoric or latent correlation matrix involves inferring the continuous correlation between latent (unobserved) scores, based on observed ordered category (or binary) data (see Uebersax 2015 for a brief introduction). This is recommended for calculation of reliability coefficients on ordinal or binary data (Gadermann et al. 2012). Alpha reliability of the SGHS was .93, suggesting an excellent degree of classical reliability. Coefficient omega (hierarchical) represents an extended approach to assessing internal consistency (Zinbarg et al. 2005). By estimating the proportion of variance in a correlation matrix that is accounted for by a general factor relative to sub-factors, it assesses unidimensionality as well as classical reliability. Coefficient omega for the SGHS was high at .83. An average of 78% of the (latent) variance of each SGHS item is shared with the general factor. The eigenvalue of the general factor was 5.25, as compared to eigenvalues of 0.66, 0.48, and 0.33 for three potential (undesirable) subfactors. Similarly, a unidimensional confirmatory factor analysis (CFA) with logistic links to measured variables showed good properties (GFI = .996, RMSEA = .027, SRMR = .047). The only pair of items with a residual correlation that would increase the fit significantly was “Reduction of my available spending money” and “Reduction of my savings”, which is very likely due to obvious content similarity between this pair of probes. Together, this analysis suggests both strong reliability and unidimensionality of the SGHS. However, comparison of the base model with a tau-equivalent CFA model, in which item factor loadings are set to unity, resulted in a significant decrease in fit, χ²(6) = 6.26, p < .001. This confirms that items varied somewhat in the strength of their relationship with the common factor or, equivalently, in their ability to discriminate between those with higher and lower degrees of harm.
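To make the alpha calculation concrete, the sketch below applies the standardised alpha formula, α = k·r̄ / (1 + (k − 1)·r̄), to an inter-item correlation matrix, where k is the number of items and r̄ the mean off-diagonal correlation. When the matrix holds polychoric correlations this gives an ordinal alpha of the kind described above. Estimating the polychoric correlations themselves is not shown, and the matrix below is hypothetical rather than taken from the study.

```python
def alpha_from_correlations(corr):
    """Standardised alpha from a square inter-item correlation matrix."""
    k = len(corr)
    if k < 2 or any(len(row) != k for row in corr):
        raise ValueError("expected a square matrix with at least two items")
    off_diagonal = [corr[i][j] for i in range(k) for j in range(k) if i != j]
    mean_r = sum(off_diagonal) / len(off_diagonal)
    return k * mean_r / (1 + (k - 1) * mean_r)


# Hypothetical 10-item matrix in which every pair of items correlates at .50.
k = 10
example = [[1.0 if i == j else 0.5 for j in range(k)] for i in range(k)]
print(round(alpha_from_correlations(example), 3))
```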

Table 4 presents both CFA and item-response theoretic (IRT) properties of each item in the SGHS. The IRT difficulty parameter and CFA threshold parameter describe the intensity or extremity of the probe with respect to the latent harm construct. The IRT discrimination parameter and the CFA loading parameter indicate the reliability of the item in indicating harm. It can be seen that whilst “Sold personal items” had the weakest relationship with the common factor, a positive response on this item indicated the highest degree of gambling harm. “Feeling distressed about my gambling” was the most reliable indicator of gambling harm, whilst reduction of available spending money/savings was a reliable “early indicator” of low to moderate levels of harm.

Table 4 Item response theory parameter estimates for the SGHS

Measurement Invariance

Measurement invariance or measurement equivalence entails that the same construct is being measured across some specified groups. CFA permits testing for invariance by comparing models with parameters constrained and unconstrained between groups. We considered measurement invariance of the SGHS between genders and age categories (18–29, 30–49, and 50+). Comparison between constrained and unconstrained models was undertaken simultaneously for item thresholds, item loadings, and residual variances. That is, the constrained model specified all these parameters to be fixed across groups, whilst the unconstrained model allowed them to vary across groups. Model comparison for measurement invariance is traditionally done via chi-square difference testing. More recently, researchers have recommended also comparing change in fit indices, predominantly the comparative fit index (CFI), with guidelines for acceptable fit difference thresholds determined by exhaustive simulation studies (Cheung and Rensvold 2002). However, determination of these thresholds is a complex and ongoing field of research, with thresholds dependent on sample size and several other factors (Chen 2007). We found a non-significant difference for gender, χ²(12.03) = 13.56, p = .332, ΔCFI = −0.001. We also found a non-significant difference for age, χ²(12.40) = 13.65, p = .351, ΔCFI = −0.004. Our observed ΔCFI values were well within the thresholds suggested by recent research. Given the non-significant chi-square change and the small decrease in comparative fit, these results suggest that the SGHS is satisfactorily invariant across these demographic categories.
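For readers unfamiliar with the ΔCFI comparison, the following sketch shows how the index and its change are typically computed from model chi-square statistics. It uses the standard CFI definition relative to a baseline (null) model. All numeric inputs are hypothetical, since the underlying model chi-squares are not reported here, and the function names are illustrative.

```python
def cfi(chi2_model, df_model, chi2_baseline, df_baseline):
    """Comparative fit index of a model relative to a baseline (null) model."""
    d_model = max(chi2_model - df_model, 0.0)
    d_baseline = max(chi2_baseline - df_baseline, d_model, 0.0)
    if d_baseline == 0.0:
        return 1.0  # neither model shows misfit beyond its degrees of freedom
    return 1.0 - d_model / d_baseline


def delta_cfi(constrained, unconstrained, baseline):
    """Change in CFI when parameters are constrained to equality across groups.

    Each argument is a (chi2, df) pair. Negative values mean the constrained
    model fits worse; a commonly cited rule of thumb (an assumption here)
    treats a drop of no more than .01 as acceptable.
    """
    return cfi(*constrained, *baseline) - cfi(*unconstrained, *baseline)


# Hypothetical fit statistics, for illustration only.
print(round(delta_cfi((70.0, 52), (50.0, 40), (1040.0, 45)), 4))
```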

Discussion

Our goal was to develop a short screen for gambling harm, i.e. the negative consequences of excessive gambling. Our initial item pool was comprehensive, and derived via a systematic process of qualitative research (Langham et al. 2016). Gambling harms are heterogeneous, and often their presence or absence is highly dependent on an individual’s life circumstance. Therefore, we selected items based on sensitivity, yielding a shorter scale that addressed this issue: 95% of participants scoring >0 on the full 72-item checklist also scored >0 on the SGHS. The SGHS possessed a high (.94) correlation with the full checklist; this, along with other analyses, suggested good construct coverage. Other psychometric properties of the SGHS (unidimensionality, external validity, and measurement invariance) were found to be strong.

Rather strikingly, we found that the 42% of participants who scored >0 on the SGHS had an average wellbeing 10 points lower than other gamblers in our sample. For comparison, the average difference in wellbeing between households earning $15–30 k AUD and those earning $101–150 k is less than 5 points (Australian Unity 2015). Thus, scoring >0 on the SGHS has strong implications for an individual’s quality of life. Further, we found that wellbeing decreases linearly with the number of harms indicated on the SGHS. This suggests that the SGHS may be used not only to classify individuals as being harmed (>0) or not (score 0), but also scored as a count of harms, to yield a scalar measure of the degree of individual harm being experienced. Likewise, at the population level, as well as describing the prevalence of harmed individuals, it may be aggregated as (population-weighted) raw scores, thus capturing both prevalence and severity in a single metric. Notably, as illustrated by Fig. 1, PGSI raw scores cannot be aggregated in this manner. By referencing SGHS raw scores to other normed or cardinal scales such as wellbeing (PWI) or health-related DW (as was done by Browne et al. 2016 for the PGSI), this figure can then be converted to other public health metrics.
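The population-level aggregation described above can be sketched as follows, assuming each respondent has an SGHS raw score (0 to 10) and a population survey weight. The function name, scores, and weights are hypothetical; the sketch only illustrates how weighted prevalence and a weighted mean raw score would be obtained from the same data.

```python
def weighted_harm_summary(scores, weights):
    """Return (weighted prevalence of SGHS > 0, weighted mean SGHS raw score)."""
    if len(scores) != len(weights) or not scores:
        raise ValueError("scores and weights must be non-empty and of equal length")
    total_weight = sum(weights)
    prevalence = sum(w for s, w in zip(scores, weights) if s > 0) / total_weight
    mean_score = sum(s * w for s, w in zip(scores, weights)) / total_weight
    return prevalence, mean_score


# Hypothetical respondents: SGHS raw scores and survey weights.
prevalence, mean_score = weighted_harm_summary([0, 0, 3, 1], [2.0, 1.0, 1.0, 1.0])
print(prevalence, mean_score)
```

The weighted mean reflects both how many people report harm and how many harms each reports, which is the sense in which a single metric captures prevalence and severity together.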

Since early work by Korn and Shaffer (1999), the importance of shifting to a public health approach to gambling problems has been repeatedly stressed in the literature (Adams et al. 2009; Korn et al. 2003). Likewise, there is an acknowledged conceptual ambivalence regarding whether problem gamblers actually possess a mental or physical disorder comparable to substance addictions, and whether non-problem gamblers are necessarily free of gambling problems (Walker 1989; Korn and Shaffer 1999). A public health approach acknowledges that gambling can create both benefits and harms, and predicts that significant harms can occur to sub-clinical gamblers and those around them (Shaffer and Kidman 2004). For agencies tasked with minimising harm from gambling, aggregate harm to individuals is arguably a more relevant measure than the estimated prevalence of problem gamblers. In our sample, a similar mean wellbeing score was associated with PG status (PGSI 8+; 67.5) and those experiencing harm (SGHS > 0; 69.9). However, the prevalence of those experiencing harm was twice that of PG (41.5 vs. 20.6%). Conceptually, this relates to the observation that many non-PG individuals are experiencing significant harm from gambling. Moreover, it implies that a focus on PGs tends to underestimate the true impact of gambling harm. Shaffer and Korn (2002) suggest that members of the larger group of sub-clinical gamblers experiencing some degree of problems, may collectively have the greatest impact at the population level. This has been supported by prior work to determine aggregate population-level harm in Victoria (Browne et al. 2016) and the present analysis. Using the SGHS rather than the PGSI as a population-level measure of impact would significantly broaden the segment of the gambling population that is considered “of concern”.

Our ‘pure’ measure of the behavioural symptoms of pathological gambling showed good internal consistency, and was quite highly correlated with the full checklist (.68) and short screen (.63) for gambling harm. The observation of approximately 42% of shared variance is understandable, given the strong couplings between addiction, excessive consumption, and harm. Nevertheless, cross-correlations between harms and dependence items are far lower than correlations within each construct, which supports calls for a better conceptual distinction between addiction and harm (Svetieva and Walker 2008).

An extensive body of literature exists on the impact of gambling on families and other concerned significant others (CSOs; Dowling et al. 2016). One limitation of the present instrument is that it only measures impact with respect to the gambler themselves. Browne et al. (2016) measured harms to CSOs as well as gamblers using 72 matched harm items. However, in an unpublished analysis by the authors, it was demonstrated that the specific profile of harms was markedly different for CSOs as compared to gamblers. For example, in terms of negative affect, gamblers were much more likely to report feelings of shame and guilt, whilst CSOs tended to report helplessness and anger. Accordingly, our view is that measuring harm to CSOs demands the creation of a separate instrument based on an appropriate sampling of the at-risk population. An associated limitation is that the SGHS does not measure social and economic harm in the broader community, an acknowledged component of the negative consequences of gambling (Korn et al. 2003). For these reasons, a population-aggregated SGHS metric should not be treated as a complete measure of the ‘cost’ of gambling to a community.

Researchers have been aware for some time that the negative consequences of gambling extend well beyond the set of individuals who meet clinical diagnostic criteria. Despite this, population-level measurement of the negative impact of gambling has incongruously continued to rely on clinical screens. Our presentation of the SGHS as an alternative population screen for harm is motivated by a recognition of the need for a consequence-focused measure of problematic gambling in the community. The SGHS is very brief, consists of concrete probes measuring commonly reported harms, and has strong psychometric properties, including external validity comparisons with the PGSI. Accordingly, we recommend the SGHS—or metrics such as DW or HRQoL that can be derived from it—to jurisdictions aiming to achieve more sensitive and more valid monitoring of the population-level impact of gambling.