Introduction

In the past ten years there have been several proposals regarding potential disparities in cancer risk between sexual minority and heterosexual individuals. By “sexual minority” individuals we mean homosexual, bisexual, transgendered, transsexual, and intersex individuals. An Institute of Medicine report [1] suggested that sexual minority women might experience two to three times the breast cancer risk of heterosexual women, and that risk levels for other cancers were unknown. The limited research to date has focused on differences in breast cancer risk factors and the combined effects of these differences on breast cancer incidence [2]. For men, differences in HIV-related cancers have dominated the discussion [3–6], although other cancers have been studied, such as anal cancer [7, 8]. In all these examples, cancer rates are potentially higher in sexual minority individuals. The main methodological issue that prevents a clear identification and quantification of this potential disparity is the lack of population-based data on cancer incidence and prevalence by sexual orientation status. This paper will discuss the reasons for the absence of these data and the efforts to use other methodologies for the study of this problem. We will then propose options for addressing this problem that could provide evidence for or against disparities in cancer incidence by sexual orientation.

Data on risk factors provide clues to differences in incidence

Measures of sexual orientation have already been included in a number of large population-based surveys that assess the nation’s health and risk behaviors. These data are just now becoming available and should help to provide answers to potential risk differences among sexual orientation categories. For example, measures of sexual orientation have been included in the National Survey of Family Growth, Cycle 6, Data Year 2002 (see http://www.cdc.gov/nchs/nsfg.htm). This survey focuses on reproductive issues among people 15–44 years of age. In 2002, the survey included more than 7,000 women and about 5,000 men. Public data are available from April 2005 at this website. A recent paper [9] reported differences in women’s obesity, a risk factor for several cancers, by sexual orientation category. There is a new website, www.gaydata.org, listing all government databases that collect data on sexual orientation. The national probability-sample studies included on this website are the National Health Interview Survey, which contains only indirect measures (living with a same-sex partner and a composite HIV-risk measure that includes sex between men), and the National Health and Nutrition Examination Survey, which contains sexual orientation, sexual identity, and numbers of male and female sex partners (lifetime and past 12 months) for participants up to age 59. The availability of these risk factor surveys, together with sexual orientation measures, should allow for future research into risks for cancer in people of diverse sexual orientations.

In the absence of publications on definitive national data, the available large nationally funded studies (e.g., Women’s Health Initiative, Nurses’ Health Study II) as well as the small population-based studies indicate that sexual minority women have more cancer risk factors than heterosexual women [10, 11]. Thus the existing knowledge about cancer in sexual minorities is uneven: the available information on sexual minority men points to an increase in the incidence of AIDS-related cancers as well as anal cancer, which is caused by the human papillomavirus transmitted through sexual contact [4, 12], whereas sexual minority women’s disparities in cancer incidence are hypothesized based on the higher prevalence of cancer risk factors.

Large samples of older women [10, 11] provide information about risk factor differences that have been related to cancer in other studies. These include differences in reproductive and demographic risk factors, as well as behavioral risk factors such as obesity, diet, and physical activity. The prevalence of cancer increases with age, and therefore the older samples available in these large studies can be very important to understanding risk factor patterns in relevant populations. A survey of Kaiser Permanente members that included a sexual orientation question found that older people made up a higher proportion of non-responders to that question than younger people did [13]. Therefore, older sexual minority individuals may be underrepresented in large cohort studies of risk factors.

Tobacco use levels provide a glimpse of current or future lung cancer rates in a population. Sexual minorities are more likely than heterosexuals to use tobacco products, as documented in reviews and recent population-based studies [14–25]. Recent data collected using strong methodology have documented approximately double the rates of smoking for sexual minority women, compared to heterosexual women in California [17, 19]. This single risk factor difference could account for up to one-third of disparity-related deaths, given national estimates on the impact of smoking on health. It is not clear when this trend emerged, and therefore lung cancer rates may already differ among sexual orientation categories or may begin to differ only after a period of 5–40 years. Either way, this disease disparity will almost certainly occur unless disparities in tobacco use are rapidly reduced. Moreover, studies of cancer incidence in HIV-infected persons detected a higher incidence of lung cancer compared to the HIV-negative population, possibly suggesting that the excess in lung cancer may be explained by the higher smoking rates in sexual minority men [26, 27].

Obesity is another area of potential risk factor difference. It is a risk factor for many cancers (e.g., breast, colorectal) [28], and therefore the study of obesity could provide information on possible cancer rate disparities by sexual orientation category. A recent review [29] found that sexual minority women were more likely to report being overweight and obese, compared to heterosexual women.

Current state of cancer surveillance

The NCI’s cancer registry as the benchmark

Cancer surveillance in the United States is recognized as a crucial tool in monitoring the health of the nation. In 1973, the Surveillance, Epidemiology, and End Results (SEER) Program of the National Cancer Institute began collecting data on cancer cases, and it is currently an important source of information on cancer incidence in the US [30]. To date, the SEER Program provides data on cancer incidence and survival from 14 population-based cancer registries [8]. SEER routinely collects information on primary tumor site, morphology, stage at diagnosis, first course of treatment, and follow-up for vital status, as well as age, sex, race/ethnicity, and address. Trained coders collect data from medical records in clinical facilities within the SEER catchment areas; these data are assembled into a database available to researchers.

Two recent key examples of using SEER data to identify social- or demographic-based disparities in cancer incidence and mortality are differences in breast cancer mortality by stage between African American women and Caucasian women [31, 32], and geographic differences in cervical cancer incidence [33]. Using SEER data collected between 1995 and 2003, investigators determined that incidence of breast cancer was not different between African American and Caucasian women, but that mortality from breast cancer showed consistent disparities. African American women were more likely to die from a diagnosis of breast cancer and to be diagnosed at later stages [34]. These data helped to identify inadequate screening levels as a potential behavioral disparity that could be targeted for intervention. Similarly, the connection between rural residence and cervical cancer incidence and mortality identified HPV infection and lack of Pap smear screening as two key variables to improve in low-income neighborhoods and regions [33]. Only through the use of SEER and other registry databases, such as state cancer registries, were investigators able to identify these disparities by racial/ethnic group and geographic location and so enable targeted intervention.

Sexual orientation via cancer registries

In an ideal world, a data system such as SEER would provide a key starting point for identification of any important disparities in cancer incidence. However, this rigorously conducted and well-designed data collection system cannot directly be used for identifying and monitoring cancer rates by sexual orientation, because SEER does not collect any sexual orientation data. The most comprehensive work with respect to measuring and defining sexual orientation is Laumann et al.’s work on sexual practices in the United States [35]. Laumann et al.’s contribution is to distinguish between three different dimensions of sexual orientation: behavior, desire, and identity. Behavior refers to the gender of sexual partners, as well as the sexual practices and the timeframe within which sexual relationships or activities take place. Desire refers to the appeal of a person of the same or opposite gender, and identity to one’s identity or label as, for example, homosexual or bisexual [35, 36]. Research into HIV risk has clearly presented an example of sexual behavior as more important than sexual identity in determining risk for disease, supporting the importance of these distinctions [37]. These measurement issues must be considered in any future efforts in this area. To date, SEER data do not allow for any identification of disparities in cancer incidence based on sexual orientation, regardless of the dimension of sexual orientation (identity, behavior, or desire).

SEER staff obtain data for the SEER database through medical chart review by trained abstractors. The data collected on patient demographics do not include sexual orientation, because sexual orientation is rarely and inconsistently recorded in the medical record. Other demographic data, such as age, race, and sex, are routinely recorded in the medical record and therefore are consistently abstracted for the SEER database.

It is worth noting that among the demographics SEER collects is detailed information on patients’ sex, in that patients’ sex is recorded as male, female, hermaphrodite, or transsexual [38]. While this appears to allow for the determination of cancer incidence in the transgender population, data on the transgender population are neither reported in SEER documents nor released on request, for reasons of confidentiality (B.F. Hankey, personal communication, June 10, 2005). Since it is not known how often patients’ sex is reported as hermaphrodite or transsexual in medical records, it is likely that SEER data undercount transgender populations.

The lack of available data on sexual orientation, or more importantly, the lack of data collection on sexual orientation by SEER, means that the sexual minority population’s cancer incidence, treatment, and survival are not assessed. This essentially neglects this population’s health, in that programs targeted at prevention or control of cancer cannot be informed by strong evidence. How likely is it, then, that this gap in the medical record will be overcome, so that sexual orientation data can be abstracted into SEER? The answer is, not very likely. The prevalence of sexual orientation disclosure to physicians is unknown and there are conflicting results from previous studies. Some survey studies found that the proportion of disclosure to health care providers ranges from 28 to 84% among lesbians and bisexuals [16, 39]. It is unknown if and how often patients’ disclosure of their sexual orientation enters into their medical records. Collection of sexual orientation information by health care providers is not likely to become routinized and standardized in the near future.

Efforts to supplement needed data on sexual orientation

The lack of SEER data has driven researchers to identify alternative methods of gathering cancer incidence data by sexual orientation. These efforts can be categorized into the use of alternative data repositories, collection through other ongoing studies, and other eclectic methods.

Use of alternative repositories

The AIDS epidemic resulted in data registries that include measures of sexual orientation or, at a minimum, the HIV transmission category of men who have sex with men as defined by the Centers for Disease Control and Prevention. Like cancer registries, AIDS registries are population-based. A number of studies linked AIDS and cancer registries to obtain population-based data on HIV/AIDS-related cancer incidence [6, 26, 27, 40, 41]. The results indicate that people with AIDS, who include many sexual minority men, have significantly increased incidence of Hodgkin’s disease, multiple myeloma, leukemia, lip cancer, and lung cancer, but these results do not provide detailed information by sexual orientation or HIV transmission category [27, 41]. Two studies co-authored by the AIDS Cancer Match Registry Study Group report more detailed information by transmission category. One study concluded that all persons with AIDS have an increased incidence of adult T-cell leukemia/lymphoma compared to the general population, yet the incidence tends to be higher in people with unknown or heterosexual HIV transmission [40]. Similarly, among men with AIDS the relative risk of Kaposi’s sarcoma is highest among men who have sex with men, who also have the highest incidence of testicular seminoma [26]. While the SEER registries lack sexual orientation information for men and women alike, the ongoing surveillance of AIDS has meant that more is known about cancer incidence in sexual minority men than in sexual minority women. Comparable data sources that include sexual orientation information on women do not exist.

In many countries (e.g., Germany, Spain, Canada, Sweden) as well as specific geographic areas within the US (Massachusetts, Vermont, San Francisco), same-sex partnerships are recorded as marriages or as civil unions. These registries offer the opportunity to gather a possibly population-based sample of sexual minorities [42, 43].

For example, a group from Denmark used a registry in which same-sex marriage-like relationships are registered alongside heterosexual marriages to assess disparities in cancer incidence [44]. They reported similar rates of cancer between sexual minority and heterosexual women. However, the median age of the female registry participants at registration was 37, leaving a relatively small sample at older ages, where cancer is more prevalent. Other repositories that contain basic demographic information, clinical medical records, and other needed data have been used where possible to compare cancer diagnoses between sexual minority and heterosexual samples. For example, a small-scale study using medical records at an existing community clinic facility found higher rates of breast biopsies among sexual minority women, but because the cohort was small and the research methodology was not population-based, these findings have been questioned [45].

Use of ongoing studies

Another possible solution to the problem of the lack of registry data on sexual orientation is to use ongoing or previously conducted population-based research projects as the basis for asking about sexual orientation, in addition to collecting data on exposures and cancer diagnosis. One study found higher risk of breast cancer among lesbians, using data from a previously conducted case–control study of breast cancer in women [46]. Sexual orientation was not directly collected in this study, but the investigators defined “sexual minority” in multiple ways, using a marital status variable. This method provides glimpses into possible disparities, but it does not give us strong evidence of differences, given the lack of clear measures of the sexual orientation variable.

Once again due to the AIDS epidemic, data on same-sex sexual behavior in men are more readily available and have been used to detect AIDS-related cancers [4, 12]. Kaposi’s sarcoma, a rare cancer before AIDS, surged among men with AIDS [3, 5], in that about one in four men who had sex with men developed Kaposi’s sarcoma, a rate that has since dropped due to more effective treatments for HIV infection [47]. In addition to Kaposi’s sarcoma, non-Hodgkin lymphoma has been recognized as the second most common AIDS-related cancer; it surged in geographic areas with a high incidence of AIDS cases and among never-married men in San Francisco [48]. However, non-Hodgkin lymphoma occurs in 4–10% of people with AIDS, regardless of how they were infected with HIV, whereas Kaposi’s sarcoma was specifically linked to men who became infected with HIV through sex with men [47]. Ultimately, the routes of HIV transmission prompted the collection of data on same-sex sexual behavior among men, which led to more detailed knowledge about cancer incidence in HIV-infected gay and bisexual men. Knowledge about cancer incidence in HIV-negative sexual minorities remains incomplete, with the exception of the excess of anal cancer [49].

Possible solutions for the gap in data on sexual orientation and cancer incidence

It seems clear from this discussion that we need cancer surveillance on sexual minority women and men. Here we present possible solutions to the issue.

Adding sexual orientation to existing registry databases, such as SEER and state cancer registries, appears at first glance to be the most logical idea. Scientifically, these databases represent the most rigorous population-based case identification methods in North America. However, it will not be simple to implement this recommendation. The registries can only collect data from existing medical facilities. Hospitals and medical facilities do not regularly collect sexual orientation data, and therefore including sexual orientation as a standard demographic variable would require the original clinical facilities to change their policies and practice. This might happen, but it will take years, as many medical facilities have taken the position that collecting sexual orientation is not only unnecessary but also potentially invasive or inappropriate. In addition, sexual minority individuals might be reluctant to provide information on sexual orientation, although data from research projects (e.g., [10, 11, 35]) do not support this concern.

In terms of immediate research activities, a study that uses SEER data and approaches a large group of cases through the registries to measure sexual orientation would be a huge step forward. Each year a call for targeted studies using the SEER database is released. Such a study could be conducted as one of these targeted studies, calling for participation from multiple sites, collection of sexual orientation using multiple measures, and collection of relevant risk factors for the cancer sites specified. It is likely that breast cancer, lung cancer, colon cancer, and anal cancer would be on the list of cancer sites targeted. This special studies model would be adaptable to the question of cancer risk among diverse sexual orientations, because it has supported other important questions regarding cancer risk in special populations in the past.

Identification of sampling methods that approximate probability sampling but are appropriate for rare or hard-to-find populations might be a strategy for surveys to estimate risk factor prevalence. Recent work by Magnani and colleagues [50] and others [51, 52] has identified possible sampling methods, such as respondent-driven sampling, that could be applied to sexual minority populations. It is not clear, however, that a method like respondent-driven sampling, which relies on social connections among the members of a group to be sampled, would have a sufficient yield among sexual minority cancer patients. Studies need to be conducted in this area.

Existing organizations that regularly collect demographic data, such as insurance companies and health maintenance organizations, might be another source of data on sexual orientation and cancer diagnosis. Many large medical facilities collect demographic data on new patients and/or annually from members. Sexual orientation could be included on such a form or data collection effort. One example of a large health maintenance organization that has collected sexual orientation data, with few or no negative effects, is Kaiser Permanente in Oakland, CA [13]. The de-identified data can then be matched with diagnosis information and risk factor information to conduct research. Another idea is to check for same-sex couples by querying the database for a primary insurance holder and an adult dependent of the same sex. These data are often not publicly available, but can sometimes be analyzed collaboratively with the insurance company. These databases are often not population-based, but they are often relatively well defined and represent a good source of data for other research questions.
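The query for same-sex couples described above can be sketched as follows. This is a minimal illustration only: the record layout, the field names (policy_id, role, sex, age, relationship), and the rule of counting only dependents listed as a spouse or partner are all assumptions made for the example and do not describe any actual insurer's database. Restricting to spouse or partner is one way to avoid flagging an adult child of the same sex as the policyholder; a database without a relationship field would need a different rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    """One de-identified enrollment record (hypothetical layout)."""
    policy_id: str
    role: str          # "holder" or "dependent"
    sex: str           # "F" or "M", as recorded by the insurer
    age: int
    relationship: str  # "self", "spouse", "partner", "child", ...


def same_sex_couple_policies(members, adult_age=18):
    """Return policy IDs with a holder and an adult spouse/partner of the same sex.

    This is a proxy for sexual minority status, not a measure of it:
    it misses uncoupled individuals and couples insured separately.
    """
    holders = {m.policy_id: m for m in members if m.role == "holder"}
    flagged = set()
    for m in members:
        if m.role != "dependent" or m.age < adult_age:
            continue
        if m.relationship not in ("spouse", "partner"):
            continue  # assumption: exclude adult children and other dependents
        holder = holders.get(m.policy_id)
        if holder is not None and holder.sex == m.sex:
            flagged.add(m.policy_id)
    return sorted(flagged)


# Invented example records, for illustration only.
records = [
    Member("P1", "holder", "F", 52, "self"),
    Member("P1", "dependent", "F", 50, "partner"),
    Member("P2", "holder", "M", 45, "self"),
    Member("P2", "dependent", "F", 44, "spouse"),
    Member("P3", "holder", "F", 48, "self"),
    Member("P3", "dependent", "F", 20, "child"),
]

print(same_sex_couple_policies(records))  # ['P1']
```

The flagged policies could then be matched to de-identified diagnosis records; as noted in the text, coupled individuals identified this way are unlikely to be representative of all sexual minority members.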

Using datasets that do not directly measure sexual orientation but allow sexual minority status to be inferred might be a useful strategy. One idea is to obtain insurance data from domestic partners or same-sex married couples, as a stand-in for an assessment of sexual orientation. The changes in state laws that have produced gay marriages, domestic partnership laws, and other forms of civil unions could allow for the formation of population-based registries that could be followed for appropriate lengths of time to collect cancer outcomes. In Massachusetts, for example, a marriage license contains the sex of both parties, allowing for potential contact with same-sex couples. This method has been used in previous research [44], when sexual orientation for both coupled and non-coupled individuals was not easily available. Conducting studies on coupled individuals only will introduce bias, in that coupled individuals likely have different lifestyles, risk factors, and diagnoses than uncoupled individuals. Data from existing epidemiological studies might also be used to estimate sexual minority status (e.g., [46]), even though the studies were not conducted for this purpose.

Finally, a recent change in research policy points the way to a possible solution to this issue. In 1990 the National Institutes of Health mandated that federally funded studies include women and minorities or justify their exclusion, and in 1994 began to require reporting of proportions of participants by sex and race/ethnicity [53]. These requirements have given attention and publicity to the inclusion of these demographic groups in federally funded research. If the federal government mandated that studies report the sexual orientation of their human samples, this would both raise the visibility of this demographic group and provide valuable data on sexual orientation and disease risk.

Conclusions

The existing published data provide some indication that differences in cancer incidence might exist by sexual orientation categories, but they do not provide definitive evidence. Several opportunities exist to add to the existing published literature on this topic. These opportunities represent our best chance of addressing this important disparity topic. Long-term strategies, such as treating sexual orientation as another demographic variable in clinical and research settings, should be pursued as well.