Introduction

Self-reported service use data are used extensively in health services research because they provide comprehensive information on a variety of services and yet are relatively inexpensive to obtain.1,2 Self-reported data are particularly useful in behavioral health services research, where no single source of provider records can describe all of the services received because of the number and wide variety of provider types available for treatment. Persons with behavioral health problems use a variety of services not only in conventional clinical settings, but also in conjunction with welfare programs and, often, the criminal justice system.3 Typically, administrative records can only provide information about services received within one agency or organization, whereas a person's self-report ideally would include services received in all settings, from all providers. In an economic evaluation study taking a societal perspective, self-reported service use data could be more comprehensive and valid than information available from provider or insurer records, which may only represent service use from the perspective of the administrative party that maintains the data.4

The usefulness of self-reported service use data depends on the validity and reliability of the measurements.5 Much of the attention given to the psychometric properties of self-reported health service use has focused on validity. Validity studies generally compare self-reported service use against administrative records and show inconsistent findings: some report a favorable level of congruence between the two data sources,6,7 but others do not.8–10 Disagreements are often ascribed to the different ranges of services represented in the two data sources, errors in the administrative data such as incomplete recording, and recall bias in self-reporting.7–10 To compensate for problems with both sources and to produce comprehensive and valid utilization data, a hybrid method can be used: self-reported data are collected with a brief measure of provider contact, such as the Brief Health Services Questionnaire, and provider records are then retrieved for detailed health care use information.7,12 Even in this method, however, obtaining reliable answers from self-reporting is a prerequisite for the further collection of information from provider records.

Our study focuses on the reliability of self-reported service use. Assessment of the reliability (or consistency) of self-reported health service use requires repeated measures for each survey item (i.e., test–retest data), preferably by the same rater, within a reasonably narrow time window. Partly because of the lack of appropriate data, there is a paucity of evidence on the reliability of self-reported service use. Test–retest is a commonly used method in psychology and education research to assess the reliability of survey items, and in the health services literature it has frequently been applied to examine the reliability of self-reported health status.11,13

We identified eight studies that examined the reliability of self-reported health service use.14–21 All but one study21 were conducted on small samples of focused populations in particular geographic areas: six studies tested similar instruments designed to elicit responses from parents and children about which services the children used,14–20 and one study examined the use of typical medical services (inpatient, outpatient, and emergency room) among persons with schizophrenia.19 All but two of the previous studies used samples drawn from either medical or mental health service settings to ensure enough service use to make test–retest reliability results valid.14–18 The two studies that used community samples rather than clinical samples used limited measures: one examined any use of health services in a high school sample,20 and the other examined any use of preventive screening services in a small subsample from a national survey.21

Taken together, the existing studies on the reliability of self-reported service use consistently report substantial agreement for any use (yes/no) of specific service types, and fair to moderate agreement for the quantity of service within specific service types. By service type, reliability of reporting was higher for inpatient care than for outpatient-based services, and higher for aggregate service categories than for more specific ones. Self-reporting of more specific information, such as the frequency of outpatient services provided by mental health professionals, tended to be less reliable.

Regarding determinants of reliability, studies document that question factors, such as sentence complexity, recall period (the time between events and reporting), and service type, are more important than individual characteristics, such as age, gender, and ethnicity.16,18,20 None of the previous studies has examined the reliability of reporting on the specific content of services received or evaluated the impact of inconsistent reporting on evaluation study outcomes, both of which are explored in the present study.

The present study examines the reliability of self-reported service use among women with behavioral health problems and extends the population studied and the scope of analyses beyond those of previous studies. First, the study participants are from a population not studied before, recruited from nine sites nationwide representing diverse geographic areas. Second, this study examines instruments measuring diverse dimensions of service use, including any use as a binary variable, quantity of use among service users, and content of service in terms of the focus of treatment for each service type. Third, the survey instrument captures a comprehensive range of services, including typical inpatient and outpatient care, residential treatment, and jail or shelter use, sectors from which a substantial fraction of participants with behavioral health problems received services.3,22 The reliability of self-reporting for services received from these atypical sectors is unknown. Fourth, this study is the first to examine factors influencing consistent reporting and to explore the impact of inconsistent reporting on the robustness of overall cost estimates.

The specific research questions this study seeks to answer are: (1) What is the test–retest reliability of self-reporting on quantity and content of service use in a variety of settings where health care services are provided? (2) What are the determinants of the consistency of self-reported service use considering such factors as service type, level of service use, severity of psychiatric symptoms, study site, and demographic characteristics? and (3) How sensitive are cost estimates to inconsistency in the quantity of self-reported service use with repeated measures?

Methods

Data

The test and retest data used in this study come from the baseline survey of the Women, Co-occurring Disorders, and Violence Study (WCDVS), conducted in nine sites nationwide from 2001 to 2003. The study participants were women with psychiatric and substance abuse disorders and histories of interpersonal violence. The WCDVS is a quasi-experimental study with an intervention arm that provided comprehensive, integrated, trauma-informed, and consumer/survivor/recovering person-involved care, and a comparison arm that provided usual care at each of the nine sites. Other details of the WCDVS study design have been reported previously.23 Because the retest data were collected at baseline, before the intervention began, no intervention effect is expected in the present study.

The retest sample included 8% (n = 186) of all study participants at baseline (n = 2,729), with approximately equal numbers of participants from each study site and from the intervention and comparison arms. The retest participants were randomly selected and hence display characteristics similar to those of the other WCDVS participants (Table 1). The retest interview (retest hereafter) was conducted an average of 7 days (s.d. = 4.2; range = 2–35 days) after the initial interview (test hereafter), by the same interviewer and with the same set of survey items. The only exception was that the service use questions in the retest used the identical recall period as the test (i.e., the 3 months preceding the original baseline interview date). All survey questions were read aloud and answers recorded by interviewers during in-person interviews.

Table 1 Sample characteristicsa

The characteristics of the study sample, described in Table 1, are based on the test data. The sample consists of women aged 19–59 (mean age 37) and represents diverse racial groups, education levels, marital statuses, and insurance statuses. General psychological distress was measured using the Global Severity Index (GSI), the average score of the Brief Symptom Inventory (BSI),24 a 53-item self-report scale measuring nine psychiatric symptom dimensions (items range from 0 to 4, with higher scores indicating greater severity). The average GSI of 1.4 was quite high, reflecting the fact that all the women in our study sample had complex behavioral health problems.

Variables

Study participants were asked to report the frequency and content of services they received during the last 3 months in a variety of categories (Table 2). The survey instruments capture all services received and are not limited to services received at the participating study site. For each service type, respondents were asked whether they received any service; if so, they were asked about the frequency of service use. The frequency of emergency room visits and the number of days in inpatient (or overnight stay) facilities were elicited with open-ended questions. For counseling sessions and outpatient visits, frequency categories were used instead of open-ended numeric responses. Each frequency category was converted into a total number of visits or sessions during the 3 months for the analysis: "daily" was converted into five times a week, or 65 times over the 3 months; "a few or two to four times a week" into three times a week, or 39 times; and "two to three times a month" into 2.5 times a month, or 7.5 times. We also ran the analyses on the original categorical scale and found similar results, and thus present results based on the continuous scale throughout this paper. For all service types, respondents were also asked about the content of services received during each stay, visit, or session and could choose one or more of the relevant categories. Figure 1 demonstrates how quantity and content of services were measured and coded.
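To make the conversion rule concrete, the following sketch (ours, for illustration only; the category labels are hypothetical shorthand, not the actual WCDVS item codes) maps the categorical frequency responses onto counts over the 13-week (3-month) recall window:

# Convert categorical frequency responses into counts over the 3-month
# (13-week) recall period, following the conversion rules described above.
# Category labels are hypothetical shorthand, not WCDVS item codes.
FREQ_TO_COUNT = {
    "daily": 5 * 13,                  # treated as 5 times/week -> 65
    "2_to_4_times_per_week": 3 * 13,  # midpoint 3 times/week -> 39
    "2_to_3_times_per_month": 2.5 * 3,  # midpoint 2.5 times/month -> 7.5
    "none": 0,
}

def sessions_in_3_months(category):
    """Return the imputed number of sessions over the 3-month window."""
    return FREQ_TO_COUNT[category]

assert sessions_in_3_months("daily") == 65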

Figure 1 Examples of responses to WCDVS interview questions on service use

Table 2 Test–retest reliability of self-reported service on any use and quantity of service use

For the reliability of the quantity of service use, the ten service types examined are hospital, emergency room, detoxification, individual and group counseling at residential and outpatient facilities, outpatient medical visits, homeless or domestic violence shelters, and jail. The proportion reporting any use was generally high, ranging from 19% for hospital stays to 61% for outpatient medical visits during the last 3 months. The average number of hospital days was 1.9, and the average number of individual counseling sessions in a residential facility was 2.2. See Table 2 for the frequency and intensity of the other types of service use.

For the reliability of service content, the four content areas examined are physical health, mental health, substance abuse, and trauma, within each of five service types: individual and group counseling during residential stays, individual and group counseling during outpatient visits, and outpatient medical visits.

Analysis

Agreement between test and retest data on service use is indexed by Cohen's kappa statistic (k)25 for dichotomously coded services and by the intraclass correlation coefficient (ICC)26 for continuous-scaled measures of service use. Following the method proposed by Shrout,27 we interpret kappa and ICC values below 0.1 as no agreement; 0.1 to 0.39 as slight agreement; 0.4 to 0.59 as fair agreement; 0.6 to 0.79 as moderate agreement; and 0.8 and above as substantial agreement. The reliability of self-reporting is assessed by the magnitude rather than the statistical significance of these indices. Although k and ICC are by far the most widely used indices, they are not free from limitations: both are influenced by the variability of event frequency and could be upwardly biased for seldom-used services.28,29 However, this potential bias might not cause serious problems here, given that event frequencies in our sample were relatively high even for less frequently used services such as hospitalization (19%) and jail use (20%).
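For illustration only (this is our sketch, not the authors' code), the two indices can be computed as follows: kappa via scikit-learn, and the ICC hand-rolled as the one-way random-effects variant ICC(1,1), one common choice when each subject contributes two paired measurements (the paper does not specify which ICC form was used). The toy data are hypothetical.

import numpy as np
from sklearn.metrics import cohen_kappa_score

def icc_oneway(test, retest):
    """One-way random-effects ICC(1,1) for two measurements per subject:
    (MSB - MSW) / (MSB + (k - 1) * MSW), with k = 2 ratings."""
    x = np.column_stack([test, retest]).astype(float)
    n, k = x.shape
    subj_means = x.mean(axis=1)
    msb = k * np.sum((subj_means - x.mean()) ** 2) / (n - 1)     # between-subject
    msw = np.sum((x - subj_means[:, None]) ** 2) / (n * (k - 1))  # within-subject
    return (msb - msw) / (msb + (k - 1) * msw)

# Hypothetical any-use flags and visit counts at test and retest.
any_test, any_retest = [1, 0, 1, 1, 0, 1], [1, 0, 1, 0, 0, 1]
n_test, n_retest = [3, 0, 12, 2, 0, 5], [4, 0, 10, 0, 0, 5]

print(cohen_kappa_score(any_test, any_retest))  # kappa for any use
print(icc_oneway(n_test, n_retest))             # ICC for quantity of use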

Multivariate regressions are used in the analysis of determinants of consistent reporting. A logit model is used for the dependent variable indicating consistent reporting of any use of services in the test and retest data (1 = agreement), and a linear regression model is used for the agreement rate in quantity of use among users. Following a method similar to that used in the literature,6,30 this study defines the agreement rate as 1 − |(N_T − N_R) / (N_T + N_R)|, where N_T and N_R are the total numbers of visits or stays reported in the test and retest, respectively. The agreement rate ranges from 0 (no agreement) to 1 (perfect agreement). The number of observations for the any-use model (n = 1,820) is the number of respondents with valid answers for all covariates (n = 182) multiplied by the number of service types (n = 10). Respondents who reported a non-zero response in either the test or the retest for a given service type are used for the analysis of quantity of use (n = 725). Factors examined are service type, the time interval between test and retest, study site, the total number of visits or days across the 10 service types as a proxy for utilization level, GSI at the test interview, age, race, marital status, education level, and insurance type. Service type is coded with dummy variables, with hospital days as the reference category. Huber-White cluster-adjusted robust standard errors are used to correct for clustering within individuals across service types.
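As a concrete check on the definition, a minimal sketch of the agreement rate (our illustration): identical non-zero counts give 1, and a positive count paired with a zero gives 0. The logit with respondent-clustered Huber-White standard errors could be fit with, for example, statsmodels using cov_type='cluster'; that pairing is our reading, not code from the study.

def agreement_rate(n_test, n_retest):
    """Agreement rate = 1 - |(N_T - N_R) / (N_T + N_R)|.

    Defined for respondents who reported a non-zero count in at least
    one of the two interviews, so the denominator is positive."""
    n_t, n_r = float(n_test), float(n_retest)
    return 1.0 - abs(n_t - n_r) / (n_t + n_r)

print(agreement_rate(5, 5))  # 1.0: identical reports
print(agreement_rate(6, 4))  # 0.8: moderate disagreement
print(agreement_rate(3, 0))  # 0.0: use reported in only one interview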

Finally, we estimate the cost of each service type and the overall service cost using the test and retest data to examine whether cost estimates drawn from the two datasets differ because of inconsistency in reporting. The unit-cost estimate for each type of service comes from diverse sources and approximates the societal perspective, as described elsewhere.31 To draw statistical inferences about the difference between estimates from the test and retest data, standard errors are calculated by bootstrapping with 500 replications, sampling with replacement.
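A minimal sketch of such a bootstrap, under our assumption (not stated in the paper) that respondents are resampled as test–retest pairs:

import numpy as np

rng = np.random.default_rng(seed=0)

def bootstrap_se_mean_diff(cost_test, cost_retest, reps=500):
    """Bootstrap standard error of the mean test-retest cost difference.

    Respondents are resampled with replacement as paired observations;
    500 replications, matching the paper. Pairing is our assumption."""
    diffs = np.asarray(cost_test, dtype=float) - np.asarray(cost_retest, dtype=float)
    n = diffs.size
    boot_means = np.array(
        [rng.choice(diffs, size=n, replace=True).mean() for _ in range(reps)]
    )
    return boot_means.std(ddof=1)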

Results

Test–retest reliability of quantity of service use

The test–retest reliability of self-reporting on any use of each category of service is generally good. The levels of agreement are moderate to substantial across all service categories (k = 0.65–0.94), highest for jail days and lowest for outpatient medical visits (Table 2). Agreement on the quantity of service use is lower than agreement on any use, ranging from slight (ICC = 0.12) for outpatient medical visits to substantial (ICC = 0.93) for shelter days.

The reliability of the total number of days in inpatient facilities is substantial for all services except detoxification days (ICC = 0.64). Individuals may not distinguish services received through detoxification from those received in residential facilities. When these two categories are combined, the reliability of service quantity improves (ICC = 0.79; result not shown in the table) but remains at a moderate level.

The numbers of residential and outpatient counseling sessions and outpatient medical visits show slight to substantial agreement (ICC = 0.12–0.82). The reliability of reporting is higher for counseling services received in outpatient settings than for counseling services received during residential treatment, and is lowest for outpatient medical visits. When aggregated, the reliability for any outpatient visit and any residential counseling is moderate (ICC = 0.61 and 0.74, respectively). We repeated all the analyses for the two subgroups above and below the median GSI of 0.76 and found no noticeable difference between the groups.

Test–retest reliability of content of service use

The reliability of self-reported service content during residential or outpatient counseling and outpatient medical visits ranges from none to moderate (k = −0.06 to 0.79) (Table 3). Generally, the reliability of reporting on the content of services received during counseling sessions is higher for mental health (k = 0.56–0.77) or substance abuse (k = 0.52–0.75) than for trauma (k = 0.45–0.60) or physical health (k = −0.06 to 0.40). An exception is that for outpatient medical visits, physical health is more consistently reported (k = 0.63) than the other content areas (k = 0.12–0.59). The reliability of reporting on service content is higher for services received during outpatient visits (k = 0.13–0.77) than for those received during residential treatment (k = −0.06 to 0.75). Again, the reliability increases only slightly when aggregate categories (i.e., any residential counseling, any outpatient visit) are used.

Table 3 Patients' perceptions of service content: mental health/substance abuse/trauma

Determinants of the consistency of self-reporting

We find few observable factors that are associated with consistent reporting. For any use, counseling services and outpatient medical visits are less likely to be consistently reported than hospital use, after controlling for other relevant factors (Table 4). For quantity of use, only the number of outpatient medical visits is less consistently reported than hospital days. Consistency of reporting also varies across study sites, which may reflect differences in the research staff who conducted the interviews. White race (vs. other race) and some college education (vs. less than high school) are associated with more consistent reporting in quantity of use. None of the following factors affects consistency of reporting: level of mental distress (GSI), level of service use in aggregate, and time interval between test and retest.

Table 4 Predictors of consistency in reporting: any use and frequency of service usea,b

Robustness of cost estimates from test and retest data

The average total cost estimates from the test and retest data are $9,168 (s.d. = $10,128) and $8,883 (s.d. = $10,243), respectively (Table 5). The variances in costs are quite large, which is typical of cost data, particularly among high-end users of health services. Rather than excluding extremely high-cost users at an arbitrary cut-point, we used bootstrapped standard errors to address the relatively skewed distribution and small sample size. The mean difference in total cost ($285) is only 3.2% of the average total cost and is not statistically different from zero. By service type, hospital costs ($2,772; 30%) and residential treatment costs ($1,800; 20%) constitute the majority of total costs in the test data.

Table 5 Cost estimates in test and retest data

Discussion

This study adds to the literature on the reliability of self-reported service use by extending the population studied and the scope of analyses. Studies of children in the community14–18,20 and of persons with schizophrenia19 have shown substantial agreement in reporting any use of services and fair-to-moderate agreement in reporting the quantity of services. Consistent with those findings, this study shows moderate-to-substantial agreement for any use and slight-to-substantial agreement for quantity of services among women with behavioral health problems.

The wide variation in reliability by service type is notable. Quantity of service is more consistently reported for inpatient days than for outpatient visits, perhaps because an inpatient stay is a more salient episode and thus easier to remember than an outpatient visit. On the other hand, the quantity of counseling services is more consistently reported for services received during outpatient visits than for services received during residential treatment. The treatments received during a residential stay are complex enough that service recipients may find it difficult to discern specific treatment elements. It is also possible that the frequency of specific services received while staying in a residential facility is harder to remember than the frequency of visits to outpatient facilities, which require more effort and time to attend. We also found that reliability improves with aggregation of service categories, which suggests that a lower level of detail is easier to remember and report consistently.

A confounding factor that might have influenced the reliability of counseling services and medical visits is the wording of the question. Frequencies of these services were elicited with fixed categories describing the average frequency of use per week or per month during the previous 3 months, whereas open-ended questions about the total frequency during the previous 3 months were used for the other service types (see Fig. 1 for an example of each). The difference in reliability may therefore partly reflect the difference in question format (i.e., categorical vs. open-ended). Furthermore, because counseling and outpatient visits are high-frequency events, an inconsistency in the answer for one category may result in a large difference in the total frequency over the 3-month period. With this survey design, the variation in reliability attributable to different question formats could not be teased apart from the variation attributable to different service types.

This study provides novel evidence on the reliability of self-reported content of care received during counseling services and medical visits. The reliability of service content is generally lower than the reliability of service quantity and falls below the acceptable level (k < 0.4) for some categories. Of particular concern is the low reliability of reporting on service content during medical visits, the most common type of service for which self-reporting is relied upon in health services research. People with behavioral health problems receive a variety of services and therefore may have difficulty differentiating services focusing on behavioral health from those addressing comorbid physical health problems during medical visits.

We find no evidence of an association between the severity of psychiatric symptoms, measured by the GSI, and the consistency of reporting. This is consistent with the findings of other studies,6,33,34 which reported that the validity or reliability of self-reporting was not influenced by the severity of psychiatric conditions. On the other hand, the type of illness or symptomatology may influence reliable reporting because of cognitive deficits associated with some psychiatric conditions. We were not able to investigate variation across symptomatologies because of the limited sample and diagnostic information. Previous studies have shown that self-reported health behavior and service use among persons with severe mental illness or substance abuse problems are also reliable and valid,19,35–39 which suggests that cognitive deficits associated with psychiatric conditions have little influence on the reliability of reporting.

For both any use and quantity of service, outpatient medical visits are significantly less likely to be consistently reported than hospital days, which cautions against the wide use of self-reporting to measure the frequency of outpatient medical visits. These results are consistent with the literature on the determinants of the validity of self-reporting, which indicates that salient and well-defined (vs. ambiguous) events are reported more accurately.5

The overall findings on the determinants of consistent reporting suggest that factors associated with survey administration are more important than subject characteristics. Similar findings were reported in previous studies of health service use among children.18,20 They are also consistent with the literature indicating that task factors, such as question form, wording, and mode of administration, account for more of the variance in response accuracy than any other class of variables.32 On the other hand, it is noteworthy that a large proportion of the variation across repeated measures was not explained by the variables in the model, as indicated by the relatively low R-square (0.16). More detailed information on individuals and services, and the availability of different question forms, would help increase our understanding of the determinants of reliable measures of service use.

One important application of self-reported service data is in economic evaluation research. Our results show that although reliability varies across service types, the aggregated cost estimate for overall service use is robust across repeated measures. This robustness arises partly because the reliability of reported quantity of use is higher for the more intensive and costly services, such as hospital use, whereas the less consistently reported services, such as outpatient medical visits, constitute a small proportion of the total cost for this study population.

In interpreting our results, one should be careful in generalizing the findings to populations or settings different from ours. The reliability of self-reporting among our study participants (women with behavioral health problems) could differ from the reliability among other groups of behavioral health service users. Furthermore, kappa and ICC indices measured for different populations might not be directly comparable.

Based on the findings and limitations of this study, we suggest several areas for further research to better understand the reliability of self-reported health service use. First, future research should explore the relationship between question phrasing (e.g., open-ended vs. closed-ended) and the reliability of reported service quantity, and between service type specification (e.g., residential treatment vs. specific types of services during residential treatment) and the reliability of reported treatment content. Such research would help in developing survey instruments that elicit more reliable data on service quantity and content. Second, future studies might also consider aided recall to stimulate memory for specific events. For example, providing a motivation to remember, or contextual cues, may considerably improve the reliability and validity of recall, particularly among clients with complex treatments or with cognitive deficits, because the vicissitudes of memory are common to both the test and retest data. Similarly, a recall window shorter than the 3 months used in this study may improve the reliability of reporting. Third, examining the reliability of self-reported service use in other populations with different ranges or levels of services, such as clients with physical health problems or primary care clients, would help in assessing the generalizability of our findings. Finally, further study should examine the validity of self-reported service use data in populations similar to ours. A review of provider records and other objective and unobtrusive measures would be valuable for checking the validity of clients' self-reporting. Such evidence is essential for understanding the psychometric properties of self-reported data in populations similar to ours and would allow comparison of the validity of self-reported service use across diverse populations.

Implications for Behavioral Health

Although self-reported data are widely used in assessing health service use, evidence on the quality of the data, particularly on the reliability of reporting, is very limited. Findings of our study suggest that among individuals with behavioral health problems, self-reported health service use data are reliable in capturing the quantity of services received across a variety of service areas. However, self-reporting of treatment content in highly specified service categories (e.g., individual counseling during residential treatment) may not be reliable. Similarly, the low reliability of reported quantity and content of service during outpatient medical visits, the most common type of medical event, needs attention. To determine the quantity and content of service use during general medical visits, physician records may be a better alternative to participant responses.

Despite some lack of agreement in reports of the quantity and content of services, cost estimates did not vary across repeated measures. Self-reported service use data produce robust cost estimates in aggregate and have the unique advantage of encompassing comprehensive types of service use. Therefore, self-reported service use data can serve as a useful source of information for the economic evaluation of behavioral health service programs.

Our findings on determinants of consistent reporting suggest that the reliability of reporting varies widely by service type and may be improved with better measurement or administration methods, but may not be sensitive to respondent characteristics such as demographics and disease severity. However, more evidence from different survey instruments, study populations, and study settings is needed to generalize our findings to a broader behavioral health service context.