Introduction

Several villages in Southwest Guizhou Bouyei and Hmong Ethnic Autonomous Prefecture, China, represent a unique case of endemic arseniasis related to the indoor combustion of high arsenic-content coal (Jin et al. 2003; Zheng et al. 2005; Liu et al. 2002). Exposure in the endemic villages occurs via multiple routes: inhalation (Smith et al. 2009; Pal et al. 2007) of As-polluted indoor air, ingestion of As-contaminated food and, possibly, direct skin penetration (Wester et al. 2004; Lowney et al. 2007). However, since the 1980s, when endemic arseniasis was first identified in the prefecture, the total As concentration in the tested drinking water sources of the area has never been reported to exceed the limit set by the Chinese National Standards (GB).

Since the early 1960s, as local woods and bushes were gradually depleted, farmers in the area have had to burn local high As-containing coal in poorly ventilated or unventilated stoves (without chimney) for cooking, heating and drying crops and food. The highest As concentration detected in local coal was 3.2–3.5% (Ding et al. 2001). Since the early 1970s, hundreds of arseniasis cases have emerged. All the cases are concentrated in three small isolated areas in the prefecture. The area where the target village of the present investigation is located was the first one reported (877 cases, 1976; Zhou et al. 1997). Most of the cases diagnosed and confirmed so far (1,386 out of 2,241 cases) are clustered in this township (Jin et al. 2003).

The target village is a multiethnic mosaic. Ethnic Hmong people have been living together with ethnic Han people (the ethnic majority in China) in the same village for generations. The house architecture and many aspects of daily life in ethnic Hmong families have been largely Hanized (Sinified). Endogamy (marrying only a spouse of the same ethnicity, or even of the same ethnic branch) is still strictly followed by the Hmong people.

Most of the work published so far on this endemic population has focused on environmental causes and has confirmed the causality between indoor burning of high arsenic-content coal and the excess prevalence of arseniasis in the rural population (Jin et al. 2003; Zheng et al. 2005; Liu et al. 2002; Zhou et al. 1993). In a preliminary investigation conducted in this village in 2002, family aggregation of arseniasis cases was observed in several ethnic clans that had been shown to have lived in the same village for generations (Lin et al. 2003).

Expanded knowledge of the various risk factors, whether related or unrelated to exposure, and of their combination is urgently required to pave the way for a quantitative understanding of all the factors that might affect the excess risk of arseniasis. Since a few dental fluorosis cases were observed in the village during our preliminary field work in 2000 (not yet reported), the association of fluorosis and arseniasis was also set as one of the objectives of the present investigation.

Subjects and methods

Subjects

All members (n = 702, of whom 369 were male) of three patrilineal ethnic clans (two of Han ethnicity and one of Hmong ethnicity) were enrolled in the present investigation. All the patrilineal ethnic clans concerned have been living together in the same village for generations. Two ethnic Bouyei women who had joined an ethnic Han clan by marriage were excluded from the analysis. The two major ethnic Han clans (clans G and B) have kept intact clan genealogy records, which show that their forefathers settled in the village during the first half of the 19th century (in the Qing dynasty). G1, G2, and G3 represent different subclans (lineages) of clan G, the largest patrilineal clan in the village. All G1 members are the consanguineous offspring of the first settler couple (settled in this village in 1822). Subclan G2 is made up of the descendants of a boy adopted by the clan in 1900. G3 members are the descendants of two brothers who joined the clan with their remarried widowed mother in the early 1950s. The Hmong people (including clan P and some sporadic families) in this village belong to a distinct Hmong branch called Wan-Shu-Miao (Bent-Comb Hmong).

The present study, including its ethical aspects, was approved by the local public health authority and met all the legal requirements of Chinese laws and regulations.

Epidemiologic study

A cross-sectional epidemiologic field study was conducted among all the members of three local patrilineal clans in the target village in April 2004. The investigation was conducted at two levels: family level and individual level. The field work included a door-to-door questionnaire survey and physical examination. Arseniasis cases among the members of all three ethnic clans were diagnosed by dermal lesion symptoms (hyperkeratosis of palms and soles, hyper- or hypo-pigmentation of the body trunk) according to the “Diagnosis guideline for arseniasis, WS/T 211-01” issued by the Chinese State Ministry of Health. The main points of the guideline have been described in English elsewhere (Lin et al. 2006; Chen et al. 2009). All the dental fluorosis cases were diagnosed according to the “Clinical diagnosis guideline for dental fluorosis, WS/T 208-01”, which was formulated based on the methods recommended by WHO (1997).

The information collected at the individual level included the subject’s name, gender, ethnicity, education, year of birth, smoking and tea drinking habits, arseniasis or fluorosis status, history of other chronic diseases, and history of cancers. The information collected at the family level included place of residence, family annual income, the area (in square meters) of each family’s kitchen, family consumption of vegetables, of meat, fish and other food of animal origin, and family history of cancers.

The questionnaire-based survey was conducted on a door-to-door basis by preventive medicine personnel of the local Center of Disease Prevention and Control (CDC) and of the township hospital, who had taken a 2-day training course, including discussion and clarification of misunderstandings, before the fieldwork started.

Statistical analyses

Grouping and coding of variables

The risk factors were grouped into non-exposure-related factors (demographic factors such as age, gender, ethnicity, annual income, clan, and education) and exposure-related factors (such as drinking water source, smoking habits, tea drinking habits, alcohol consumption, vegetable consumption, kitchen area, stove types in the family, the period during which the family used local high As-content coal, the period during which the family used the poorly ventilated or unventilated traditional stove, and the point in time at which the family switched to a well-ventilated stove). Variable grouping and coding of all factors are listed in Table 1. Among all the variables, drinking water source, smoking habits, tea drinking habits, stove types, and clan consanguinity were coded as dummy variables. Reference groups were coded as “null”; the other variables were incorporated into the model as linear grouping variables. The individual’s arseniasis status was set as the response variable (Y).
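As a minimal illustration of the dummy coding described above, the following Python sketch expands one categorical factor into 0/1 indicator variables, with the reference group represented by all indicators being zero. The category labels (`never`, `former`, `current`) and the helper `dummy_code` are hypothetical placeholders and are not the codes listed in Table 1.

```python
# Illustrative dummy (indicator) coding with a reference group.
# The category labels are hypothetical and NOT the study's actual codes.

def dummy_code(value, categories, reference):
    """Return 0/1 indicators for every non-reference category."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return {f"is_{c}": int(value == c) for c in categories if c != reference}

# Hypothetical smoking-habit categories; "never" is the reference group.
SMOKING_LEVELS = ["never", "former", "current"]

rows = ["never", "current", "former"]
coded = [dummy_code(v, SMOKING_LEVELS, reference="never") for v in rows]

for original, indicators in zip(rows, coded):
    print(original, indicators)
```

With this coding, each regression coefficient of an indicator compares its category with the reference group.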

Table 1 Variate coding and univariate logistic regression analysis of arseniasis cases in target village

Logistic regression analyses

A two-level logistic regression model was employed. The data of the present study have a two-level structure: level 1, the individual level, and level 2, the family level. The lower (individual) level is nested within the higher (family) level. Such a data structure inevitably carries a “background effect” or “group effect”, which means that the individual risk of arseniasis might be associated with the family environment or the family life style. Statistical analysis at the individual level would be misleading if the background effect were not properly taken into account. In particular, the type I error (false positive) rate would be inflated, since the effects observed with conventional procedures could reflect the interaction between the effect at the individual level and the background effect. The multilevel model approach is well suited to data with such background effects.

The individual demographic and socioeconomic characteristics and the parameters concerning indoor exposure to arsenic were analyzed. Possible family clustering of arseniasis cases was taken into account. To evaluate the errors originating from the two-level data structure and to adjust for confounding factors, a multilevel multivariate logistic regression model was combined with a non-conditional univariate logistic regression model to analyze the risk factors and their impact on the excess arseniasis prevalence.

The multilevel regression model

To prevent an ecological fallacy when the data have a multilevel structure or are aggregated, multilevel logistic modeling was applied. With the probability of arseniasis treated as the dependent variable and the individual ID and family ID treated as independent variables, the multilevel regression model can be expressed as:

$$ \text{logit}(p) = \beta_{0j} + \sum \beta_{i} x_{ij} + e_{0ij} \tag{1} $$
$$ \begin{aligned} \beta_{0j} &= \beta_{0} + u_{0j}, \\ u_{0j} &\sim N(0, \sigma_{u_{0}}^{2}), \quad \text{var}(p_{ij}) = \delta \pi_{ij} (1 - \pi_{ij})/n_{ij}. \end{aligned} \tag{2} $$

Here, logit(p) = log [p/(1 − p)] is the logit-transformed probability of arseniasis; \( \beta_{0j} \) is the level 1 random intercept; \( \beta_{i} \) is treated as the effect of the level 2 explanatory variable \( x_{ij} \) in linear functions; \( e_{0ij} \) is the residual of level 1 (individual), with i = 1, 2, … for level 1; and \( u_{0j} = \beta_{0j} - \beta_{0} \) is the difference between the logit of a level 2 unit and the overall logit. The random variance is divided into two components \( \left( {\sigma_{{u_{0} }}^{2} + \sigma_{eij}^{2} } \right) \), where \( \sigma_{eij}^{2} \) is the variance from the individuals and \( \sigma_{{u_{0} }}^{2} \) is the variance of level 2. A higher \( \sigma_{{u_{0} }}^{2} \) indicates stronger aggregation at level 2.

Similarly, other possible risk factors can be added to the model to evaluate their effect on arseniasis with or without controlling for individual or family variables.

All statistical analyses were performed with the SAS 9.1 software package. The multilevel logistic regression was fitted with the NLMIXED procedure (PROC NLMIXED) of the package. Whether a variable was incorporated into the model for risk assessment was decided by comparing the goodness of fit of different models, i.e. by testing the change in the −2 log-likelihood (significance level: α = 0.05).
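The model comparison described above can be sketched as a likelihood-ratio test. The Python snippet below is a minimal illustration and not the SAS code used in the study: it assumes two nested models that differ by exactly one parameter, so that the drop in −2 log-likelihood is compared with a chi-square distribution with 1 degree of freedom. The function name `lr_test_df1` and the two −2 log-likelihood values are invented for illustration.

```python
import math

def lr_test_df1(neg2ll_reduced, neg2ll_full, alpha=0.05):
    """Likelihood-ratio test for ONE added parameter (1 degree of freedom).

    Returns (statistic, p_value, keep_variable).
    """
    stat = neg2ll_reduced - neg2ll_full  # drop in -2 log-likelihood
    if stat < 0:
        raise ValueError("the full model should not fit worse than the nested model")
    # For a chi-square variable with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value, p_value < alpha

# Hypothetical -2 log-likelihood values, invented purely for illustration.
stat, p, keep = lr_test_df1(neg2ll_reduced=700.0, neg2ll_full=694.0)
print(f"LR statistic = {stat:.2f}, p = {p:.4f}, keep variable: {keep}")
```

For variables that add more than one parameter (for example a factor with several dummy variables), the reference distribution would need correspondingly more degrees of freedom, which this sketch does not cover.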

Results

General description of exposed population in the village

A total of 702 subjects (369 of them male) in 178 families were included in the present study. The local government record listed 731 registered permanent residents at that time (382 males and 349 females). Twenty-nine subjects (13 males and 16 females) were absent from the village during the investigation. The loss rate was 4.0% (males 3.4%, females 4.6%). Among all the subjects investigated, 60.8% were adults; the remainder were infants, primary school pupils or high school students. In total, 58.1% of the 702 villagers were of ethnic Han origin and 41.9% were of Hmong origin. Families with 3 or more members accounted for 82.6% (147/178) of all the families. Overall, 157 villagers were diagnosed and registered as arseniasis patients before or during the investigation. The crude prevalence of arseniasis in the village was 22.4%.
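The percentages in this paragraph follow directly from the reported counts. As a quick check, the following Python sketch recomputes them; the variable names are illustrative only, and all numbers are taken from the text above.

```python
# Counts as reported in the text.
registered = {"male": 382, "female": 349}   # registered permanent residents
absent = {"male": 13, "female": 16}         # absent during the investigation

total_registered = sum(registered.values())      # 731
total_absent = sum(absent.values())              # 29
examined = total_registered - total_absent       # subjects actually included

def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)

loss_rate = pct(total_absent, total_registered)
loss_by_sex = {sex: pct(absent[sex], registered[sex]) for sex in registered}
crude_prevalence = pct(157, examined)    # 157 diagnosed arseniasis patients
large_families = pct(147, 178)           # families with 3 or more members

print(examined, loss_rate, loss_by_sex, crude_prevalence, large_families)
```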

The subjects enrolled in the present investigation covered almost all the members of three major patrilineal clans (two of ethnic Han origin and one of Hmong origin). In addition, a few sporadic families (mostly of Hmong origin) in the target village were also included.

Notable differences in arseniasis prevalence were observed not only between the clans, but also between different subclans of clan G. A significantly lower prevalence was recorded in subclan G2 than among G1 or G3 members, although the whole G clan has been living as one big family for decades. Some families of different subclans even share the same kitchen or the same living room.

Univariate logistic regression model of arseniasis cases

Table 1 displays the results of the univariate logistic regression, i.e. the association of socioeconomic status and parameters of indoor exposure to arsenic with the arseniasis prevalence among the residents of the target village. Among the demographic parameters, ethnicity (the prevalence in ethnic Han residents was significantly higher than that in Hmong residents), gender (men suffered from arseniasis more often than women), age (older villagers showed a higher prevalence), and clan consanguinity (the prevalence among the members of ethnic Han clans B, G1, and G3 was significantly higher than that among the members of the ethnic Hmong clan P) were significantly associated with arseniasis risk. In contrast, per capita annual income, with the exception of the highest income group (per capita annual income >1,000 Chinese Yuan, equivalent to about 145 US$/person/year at that time), and education level, with the only exception of the senior high school group, showed mostly no impact on the arseniasis prevalence.

Some parameters related to As exposure were found to affect the arseniasis prevalence significantly. Smokers had a markedly higher arseniasis risk than non-smokers. The longer a family had used the local high As coal, or the longer traditional poorly ventilated or unventilated stoves had been used, the higher was the chance of the family members suffering from arseniasis. At the same time, two other parameters, i.e. the type of drinking water source and the family kitchen area, failed to show a significant association with the arseniasis risk.

The univariate logistic regression also found that tea drinking habits, alcohol consumption, and vegetable consumption had no impact on the excess prevalence of arseniasis among the villagers (data not listed in Table 1).

The high degree of overlap between fluorosis and arseniasis is worth mentioning. About 70% of the diagnosed fluorosis cases were also arseniasis patients. Individuals diagnosed as arseniasis patients faced a nearly 10 times higher risk of also suffering from fluorosis. All the fluorosis cases were diagnosed by dental symptoms only; no skeletal fluorosis case has ever been diagnosed in the village.

Multivariate analysis of arseniasis prevalence

Fitting the data of all 702 subjects (178 families) in the two-level logistic regression model gives the following expression for the probability of arseniasis:

$$ \text{logit}(p) = -1.5191 + 1.2210 \times \text{family}. $$

The parameter of level 2 (family) was 1.2210 with a standard error of 0.4596. The fixed effect of level 2 (family) on the occurrence of arseniasis was highly significant (p = 0.0086). This suggests a significant aggregating effect at level 2 (family). The estimated OR of families for arseniasis was 3.39 (95% CI 1.38–8.34).
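The odds ratio and its confidence interval are obtained by exponentiating the coefficient and its interval limits. The Python sketch below reproduces the calculation from the reported estimate and standard error; it assumes a Wald interval with the normal quantile 1.96, which may differ slightly from the quantile used by the fitting software, so the last digit of the upper limit need not match the published value exactly.

```python
import math

beta = 1.2210   # level 2 (family) estimate, from the text
se = 0.4596     # its standard error, from the text
z = 1.96        # ASSUMPTION: normal-approximation quantile for a 95% interval

odds_ratio = math.exp(beta)
ci_low = math.exp(beta - z * se)
ci_high = math.exp(beta + z * se)

print(f"OR = {odds_ratio:.2f}, 95% CI {ci_low:.2f} to {ci_high:.2f}")
```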

Similarly, the variables that the univariate logistic regression had shown to be significantly associated with arseniasis prevalence were entered into the further multivariate analysis.

Table 2 displays the multivariate analysis of arseniasis prevalence in all exposed villagers, in ethnic Han and in Hmong, respectively. After adjusting for confounding factors, the two-level multivariate logistic regression analysis confirmed the family aggregation of arseniasis cases not only in all exposed families together (with both ethnicities combined), but also in ethnic Han families.

Table 2 Multilevel logistic regression model analysis on the risk of arseniasis

The data also revealed the significant association of individuals’ ethnicity with arseniasis (p = 0.001). The ethnic Han farmers suffered much more frequently from arseniasis, compared with their Hmong neighbors (OR: 15.18, 95% CI 3.45–67.35). However, no association of gender or per capita annual income with arseniasis prevalence could be confirmed.

The arseniasis prevalence was shown to be related to the exposure duration, i.e. both to the period during which the family burnt local high As coal indoors (p = 0.0001) and to the duration of using a local poorly ventilated or even unventilated traditional stove in the house (p = 0.0001). Every additional 10 years of usage was associated with a 1.85-fold increase in the odds of arseniasis for high As coal burning (OR 1.85, 95% CI 1.29–2.66) and a 2.38-fold increase for traditional stove use (OR 2.38, 95% CI 1.65–3.49). Smoking was also among the factors that significantly increased the risk of arseniasis. Smokers, regardless of ethnicity, faced a markedly increased risk compared with their non-smoking fellow villagers (p = 0.0001, OR 5.42, 95% CI 2.25–12.93).
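Because the odds ratios above are expressed per 10 years of use, they can be rescaled to other durations under the model's assumption that the effect is linear on the logit scale. The following Python sketch shows this rescaling; the function name is illustrative, and values for durations outside the range observed in the study are extrapolations shown for illustration only.

```python
def rescale_odds_ratio(or_per_decade, years):
    """Odds ratio implied for `years` of use, assuming a log-linear effect."""
    return or_per_decade ** (years / 10.0)

OR_COAL_PER_DECADE = 1.85    # high As coal burning, from the text
OR_STOVE_PER_DECADE = 2.38   # traditional stove use, from the text

for years in (1, 10, 20):
    coal = rescale_odds_ratio(OR_COAL_PER_DECADE, years)
    stove = rescale_odds_ratio(OR_STOVE_PER_DECADE, years)
    print(f"{years:>2} years: coal OR = {coal:.2f}, stove OR = {stove:.2f}")
```

For example, 20 additional years correspond to the per-decade odds ratio squared, not doubled.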

In the multivariate analysis, no statistically significant association with arseniasis risk was found for the family kitchen area, the type of drinking water source, the stove type currently used by the family, or the point in time at which the family switched to a well-ventilated stove.

When variance analysis with multilevel model was performed on the higher level (family level), statistical significance was reached (p < 0.05).

Discussion

The causal link between indoor burning of high arsenic-content coal and the typical symptoms of arseniasis (namely, skin lesions) has been shown by several studies (Jin et al. 2003; Zheng et al. 2005; Liu et al. 2002). Variation in individual susceptibility to chronic arsenic poisoning has been suggested by Vahter (2000). Hopenhayn-Rich et al. (1996b) reported on a population chronically exposed to high levels of arsenic in drinking water in northern Chile, in which ethnicity- and gender-dependent variations were found in individuals’ methylating potential. The present study supports the view that both the indoor exposure and the individual’s hereditary background (ethnicity or clan consanguinity) significantly influenced the arseniasis prevalence in the exposed population. The data from our previous work suggested that people of Hmong ethnicity were less susceptible to arseniasis, both in a comparison of two neighboring ethnic clans, one ethnic Han and one Hmong (Lin et al. 2006), and as the least susceptible among all four local ethnic groups in the endemic township (a total of 11,153 residents, covering all four local ethnicities: Han, Bouyei, Hui, Hmong), of which the target village of the present investigation is a part (Chen et al. 2009). A study conducted in parallel found that ethnic Hmong clan members inhaled more As from polluted indoor air and ingested more arsenic via daily food than their Han neighbors. Hair and urine samples from Hmong individuals also showed a higher As body burden. The exposure duration of both ethnic clans is quite similar (Lin et al. 2006).

Family aggregation was confirmed in a two-level multivariate logistic regression analysis of all the exposed villagers. However, within the individual ethnic groups, the aggregation could be validated in the exposed ethnic Han subjects only. It may therefore actually reflect the ethnic aggregation of diagnosed arseniasis cases in the village.

Interestingly, the data analysis also indicated that smoking is positively associated with an elevated risk of arseniasis in the exposed rural population. This is plausible, since inhalation is one of the major exposure routes of inorganic As and smoking considerably changes breathing behavior. The arsenic content of cigarettes, which is reported to be between 500 and 900 ng per gram of processed tobacco (Hoffmann and Hoffmann 1997), may also have contributed to the excess prevalence of arseniasis among the villagers.

Various authors have reported dose–response relationships between cancer risk and the As concentration of the drinking water supply (Chen et al. 1985, 1986; Chiou et al. 1995; Hopenhayn-Rich et al. 1996a, b, 1998; Tondel et al. 1999; Rahman et al. 2006). In this unique exposure scenario, the multiple exposure routes might be much more complex and may vary significantly from time to time and from case to case. The pollution situation has been improving over the last decade, as a series of administrative and technical countermeasures against As pollution has been pushed forward (Zhou et al. 1993; Lin et al. 2006; An et al. 2007). Because no proper historical exposure parameter could be traced back for any ethnic clan, subclan, family or individual at the moment of case diagnosis or in the year when the skin lesion symptoms first appeared, it was not possible to include the indoor As exposure level or the internal As load of exposed individuals or families at those times in the current logistic regression analysis.

At first sight, it may seem surprising that the kitchen area in the exposed farmers’ houses failed to display an association with excess arseniasis prevalence. This might be due to the fact that the indoor air concentration of As at that time was very high. In 1991, when the only comprehensive field investigation ever carried out in this endemic village took place, the total As in indoor air samples from the kitchens was 0.455 ± 0.304 mg/m3 (range: 0.046–0.840 mg/m3; Zhou et al. 1993). The indoor air concentration of total As thus far exceeded the limit set by the Chinese National Criteria (GB) (<0.03 mg/m3). Such an extremely high exposure level might saturate the detoxification capacities and other defense mechanisms of exposed individuals against inorganic As.

It may also seem surprising that the per capita annual income of the exposed individuals was not associated with arseniasis prevalence. Although the disparity in annual income among local farmers has widened in recent years, the overall level is still very low. The data collected with our questionnaire show that Hmong residents in the village had an annual income of about 722 ± 389 Chinese Yuan/person/year, while their Han neighbors had a slightly higher income of 990 ± 854 Yuan (Lin et al. 2006). At the time of the investigation, this was equivalent to less than $0.5/day/person. The observed income differences in the study area therefore did not translate into meaningful differences in living standard among the villagers.
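The daily income figure can be checked with a short calculation. The Python sketch below assumes the conversion given earlier in the article for the univariate analysis (1,000 Chinese Yuan equivalent to about 145 US$ at that time) and a 365-day year; the names are illustrative only.

```python
# ASSUMPTION: conversion taken from the article's own statement that
# 1,000 Chinese Yuan were equivalent to about 145 US$ at the time.
USD_PER_1000_YUAN = 145.0

def usd_per_day(yuan_per_year):
    """Convert an annual per capita income in Yuan to US$ per day."""
    return yuan_per_year * USD_PER_1000_YUAN / 1000.0 / 365.0

hmong_usd_day = usd_per_day(722)   # mean annual income of Hmong residents
han_usd_day = usd_per_day(990)     # mean annual income of Han residents

print(f"Hmong: {hmong_usd_day:.2f} US$/day, Han: {han_usd_day:.2f} US$/day")
```

Under this conversion, both group means lie below 0.5 US$ per person per day, in line with the statement above.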

It is plausible that the drinking water source showed no impact on the excess prevalence of arseniasis in the target village. For decades, the total As concentration in the drinking water sources of this village has never been found to exceed the limit set by the Chinese National Criteria (<0.05 mg/l; Liu et al. 2002; An et al. 2007; Lin et al. 2006, 2007). Water samples collected from all 4 drinking water sources in this village during this study showed As concentrations within the range of 0.014–0.025 mg/l (0.0181 ± 0.0049 mg/l; Lin et al. 2006). However, the mean value exceeded the WHO provisional guideline for drinking water (0.01 mg/l; WHO 2004).

It is also worthwhile to draw attention to the observed high co-morbidity of arseniasis and dental fluorosis in the investigated villagers. Coal from Guizhou Province is rich in trace elements such as arsenic and fluorine (Finkelman et al. 1999). Most probably, the unusually high level of co-morbidity of arseniasis and dental fluorosis arises because both hazardous elements share the same exposure routes, namely inhalation and/or ingestion of contaminated food.

The present analysis was conducted in a multiethnic, hyperendemic village where three major patrilineal clans of different ethnic origins live together and have been shown to be exposed to indoor burning of high As-content coal at similar levels and for similar durations (Lin et al. 2006). The intact diagnosis and medical surveillance records of the village residents since the 1990s provided a firm database for a rational analysis. However, because of the relatively small sample size, further work with a larger exposed population in the same endemic area is needed.