Intended Use and Clinical Application

Humans are constantly exposed to metals in the environment. High-dose exposures to some of these metals, such as lead, mercury, arsenic, and cadmium, can cause clinical toxicity [1–3]. Assessing human exposure to metals is essential to characterizing the hazard; assessment can be accomplished by evaluating the environmental source of the exposure, such as air, water, or food, or by evaluating the amount of the metal absorbed by the body. The quantification of chemicals or their metabolites in clinical specimens, such as blood and urine, by instrumental analysis in the laboratory is known as biomonitoring [4].

Biomonitoring can help diagnose, treat, and monitor patients and populations for diseases and disorders caused by exposure to harmful chemicals. In the clinical setting, laboratory tests are used in conjunction with the patient’s history, circumstances surrounding the health concern, and the physical examination to diagnose diseases or disorders and make recommendations for the patient. In a population, laboratory tests can be used in the management of diseases and disorders by monitoring changes in trends after an intervention, such as pharmacotherapy or cessation of further exposure to a chemical. In addition, biomonitoring can characterize the prevalence of specific chemical exposure in a population, identify groups in the population with varying levels of exposure to a chemical, and assist with prioritizing research activities when resources are limited.

The Environmental Health Laboratory at the Centers for Disease Control and Prevention (CDC) supports a National Biomonitoring Program that develops analytical methods to measure metals such as mercury (total and speciated), arsenic (total and speciated), cadmium, lead, cobalt, tungsten, uranium, molybdenum, antimony, and other trace, toxic, and essential metals. For each of these metals, CDC estimates their concentrations in blood or urine in the general U.S. population by age, sex, and race or ethnicity. This information is available in the National Report on Human Exposure to Environmental Chemicals [5]. Two examples of the use of these laboratory methods to monitor populations or assess patients for exposure to these metals are shown below.

Population Monitoring

Blood-lead concentrations in young children (aged 1 to 5 years) in the general U.S. population have been monitored in the National Health and Nutrition Examination Survey (NHANES) since 1976 [6] because lead is a known neurotoxicant with no “safe blood level” in young children. Over the years, the NHANES data have demonstrated the following: lead in gasoline was the major source of exposure to lead among children, removal of lead from gasoline significantly reduced childhood exposure to lead, and risk factors associated with higher blood-lead levels in young children included race (non-Hispanic black children) and family income (a poverty income ratio <1.3) [7]. The percentage of young children with a blood-lead level ≥10 μg/dL progressively declined from 88 % in 1976–1980 to 0.8 % in 2007–2010 [7]. The estimated geometric mean blood-lead level in young children surveyed during 2007–2010 was 1.3 μg/dL (95 % CI = 1.3–1.4). The blood-lead level that indicates higher-than-average lead exposure in young children is the upper reference value, defined as the 97.5th percentile of the distribution for young children (aged 1 to 5 years) from two consecutive NHANES survey cycles. Continued efforts to prevent harmful lead exposure among young children are necessary and should ensure that homes are lead safe, reduce lead content in environmental sources, and increase awareness of lead hazards and of nutritional interventions that can decrease lead absorption from the gut.

Patient Assessment

In 2010, a health investigation documented severe lead poisoning in young children from artisanal gold mining regions in northwest Nigeria [2]. From May 2009 to May 2010, 118 of 463 children younger than 5 years of age died of presumed lead poisoning in two villages that were surveyed. Seizures prior to death were observed in 97 of the 118 children who died. Venous blood-lead test results confirmed lead poisoning in symptomatic children and determined which children required chelation therapy. The median blood-lead concentrations for children from the two villages were 144 and 86 μg/dL, respectively. The blood-lead concentrations varied from 37 to 445 μg/dL in the combined test results from the two villages. Mortality before and after the health and environmental interventions was estimated at 43 % and <1 %, respectively.

Laboratory systems for the performance of clinical testing are complicated because many steps, such as specimen collection, laboratory testing, and reporting of results, are involved in the process to produce a test result. This process includes three phases: pre-analytical, analytical, and post-analytical. To achieve the desired outcome from the laboratory test result, it is essential for the physician or public health officer to communicate with the laboratory representative to clarify the requirements and expectations at these phases because they can affect the value of the test result or its interpretation. The purpose of this article is to provide the physician with additional information on the considerations for clinical laboratory testing for metals.

Pre-analytical Factors in Clinical Laboratory Testing

Several conditions or factors affect the eventual interpretation of laboratory test results, including patient characteristics, collection equipment, and other items outside of the laboratory’s control. Some of these items, such as the use of a chelator prior to the collection of the clinical specimen for a urine mobilization or challenge test, were addressed in other papers published in the December 2013 issue of the Journal of Medical Toxicology. Other pre-analytical factors include patient selection (healthy vs. active disease), patient preparation for sampling (fasting, time of day, exercise, tobacco and alcohol use, medications, pregnancy, body mass index, altitude, posture), specimen collection (type of specimen, time since exposure to the chemical, steady state, anticoagulants, preservatives, tourniquet time, container used, order of draw), specimen processing (initial separation and centrifugation), specimen storage (freeze/thaw cycles, light and temperature sensitivity), and specimen transport [8]. Careful selection and preparation of patients can minimize biological variation. Biological variation or individuality, including age (metabolism, renal elimination based on glomerular filtration rate, blood volume), sex (skeletal muscle mass, volume of distribution, hormones), temporal variation (daily or monthly), and genetic polymorphism (e.g., for arsenic methylation), can affect test results or reference values [9]. A “best practice” is to obtain the laboratory representative’s preference for these factors before specimens are collected.

At CDC, the following procedures are recommended to avoid potential contamination of blood or urine during collection. When multiple tubes of blood are being collected for other tests, the first sample of blood should be used to measure metals. Also, pre-screened or certified trace-metal-free phlebotomy equipment and specimen vials should be used to collect blood and urine because the presence of metal contaminants in collection vials and medical devices, such as indwelling urinary collection devices, can cause falsely elevated measurements.

After collection, blood specimens should be stored and shipped at refrigerated temperatures (∼4 °C). Urine specimens that need to be shipped overnight should be frozen as soon as possible and shipped frozen (using dry ice) to minimize interconversion of chemical species. For example, urine specimens that will be used to measure arsenic should be flash-frozen. Urine specimens that will be used to measure mercury are added to a collection vial containing a preservative that prevents the reduction of inorganic mercury to elemental mercury, which can volatilize into the atmosphere.

Clinical Specimen

Blood and urine are commonly used to measure metals in humans. Testing hair for metal exposure is not recommended because of the potential for external contamination and other limitations that are discussed elsewhere [10]. The preference for either blood or urine depends on the physicochemical properties and toxicokinetics of the metal, the time between exposure and specimen collection, and the availability of an analytical method to quantify the metal in the specimen.

Blood is preferred to assess an acute exposure (hours) and urine is preferred to assess a subacute (days) or chronic (weeks) exposure because metal salts tend to be cleared quickly from the central compartment and eliminated by the kidneys. For metals with a long biological half-life, such as lead, cadmium, and methyl mercury, blood is the preferred specimen to assess the level of exposure in the body because of the decreased variability of the concentration in blood compared to urine. For example, methyl mercury is minimally metabolized by the body to inorganic mercury; thus, it is primarily eliminated in the feces and not in the urine [11, 12]. In the gut, bacteria convert methyl mercury to inorganic mercury [13]. The availability of reference values or decision limits for the test result is another consideration in selecting the type of clinical specimen to use for testing.

A urine specimen can be a sample collected over 24 h, a timed spot sample, or a randomly collected spot sample. Each of these approaches has benefits and limitations that partially determine the sampling strategy used for specimen collection [14]. For example, urine spot samples can yield variable results owing to the diurnal variation in the rate of elimination by the kidneys. A urinary creatinine-corrected concentration of a metal can be used to adjust for urinary dilution (water content) when only a spot sample of urine is available because of logistical considerations. The urinary creatinine-corrected measurement is best used to compare test results for properly collected urine specimens from the same patient or among patients of similar body size because muscle mass independently contributes to the urinary creatinine concentration and can bias these results.
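
As a minimal illustration of the creatinine adjustment (the function name and example values below are hypothetical and are not part of any standardized protocol), the metal concentration in the spot sample is simply divided by the urinary creatinine concentration from the same sample:

    def creatinine_corrected(analyte_ug_per_l, creatinine_g_per_l):
        """Express a urinary metal concentration per gram of creatinine.

        analyte_ug_per_l   -- metal concentration in the spot urine sample (ug/L)
        creatinine_g_per_l -- urinary creatinine in the same sample (g/L)
        Returns the creatinine-corrected concentration in ug/g creatinine.
        """
        return analyte_ug_per_l / creatinine_g_per_l

    # Hypothetical spot sample: 12 ug/L total arsenic with 1.5 g/L urinary creatinine
    print(creatinine_corrected(12.0, 1.5))  # 8.0 ug/g creatinine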

Special Concerns with Specific Metals

Mercury exists in three different forms: inorganic, organic, and elemental. This is significant because each form has a different type of clinical toxicity and a preferred laboratory test to measure it in the body (Table 1). For example, organic mercury, such as methyl mercury, is a neurotoxicant, and whole blood is the preferred specimen to measure it. Inorganic mercury, such as mercuric chloride, is a nephrotoxicant, and urine is the preferred specimen to measure it. Elemental mercury is absorbed by the body in the elemental form, oxidized to inorganic mercury, and excreted in the urine. Exposure to toxic levels of elemental mercury vapor in humans can lead to neurotoxicity from the distribution of elemental mercury into the central nervous system compartment. Because of these differences among the various types of mercury, measuring specific species of mercury, such as inorganic mercury and methyl mercury, rather than total mercury is preferred.

Table 1 Clinical specimens and analytical methods for laboratory testing of mercury and arsenic species at the Centers for Disease Control and Prevention

Arsenic is commonly categorized as inorganic or organic because of the differences in health outcomes resulting from the exposure to these chemicals. For example, inorganic arsenic is a known human carcinogen and organic arsenic is not. Seafood is a common source of human exposure to organic arsenic, such as arsenobetaine, arsenocholine, trimethylarsine oxide, and arsenosugars, in the diet. The body metabolizes inorganic arsenic in succession as follows: arsenate (+5) to arsenite (+3) to monomethylarsonic acid (MMA) to dimethylarsinic acid (DMA); thus, MMA and DMA are considered metabolites of inorganic arsenic. Because of these differences among the various types of arsenic, measuring the specific species of arsenic instead of total arsenic is preferred.

Analytical Factors in Clinical Laboratory Testing

General Analytical Method

Biomonitoring requires sensitive (ultra-trace or trace) analytical methods because the concentrations of environmental chemicals in the body tend to be very low. In addition, a method must be precise, accurate, specific, able to detect multiple analytes, high throughput, and rugged to support a biomonitoring program. Mass spectrometry is the detection system of choice because of its high specificity and sensitivity and ability to selectively detect several analytes during a single analytical run. Special precautions to avoid contaminating specimens and samples are necessary when measuring metals because metals are ubiquitous in the environment.

At CDC, inductively coupled plasma mass spectrometry (ICP-MS) is used to measure trace concentrations of multiple elemental metals in a liquid sample preparation of a clinical specimen, such as urine or whole blood [15]. Briefly, the instrumental analysis proceeds as follows: a diluted liquid sample containing an internal standard is aerosolized using argon gas and converted to ionized atoms using plasma energy (6,000 to 8,000 K); the ionized atoms pass through ion optics, a Dynamic Reaction Cell™ (DRC), and a quadrupole mass filter; and the ions are selectively counted at the detector. The DRC™ can minimize or eliminate polyatomic interferences resulting from the argon carrier gas when measuring cadmium, manganese, selenium, and arsenic. The intensity of the detected ions correlates with the concentration of the elemental metal in the liquid sample; the concentration is determined by comparing the signal ratio of the analyte ions to the internal standard with a concentration response curve based on calibration standards. Internal standards are added to the specimens, blanks, and calibrators. Quality-control materials at high and low cutoff values are included with each run to determine whether the analytical system performed within accepted parameters for accuracy and precision and to identify potential time-related trends in these analytical metrics.
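
The quantification step can be sketched as follows (hypothetical calibration data; in practice the instrument software performs this fit and applies additional corrections): the analyte-to-internal-standard signal ratio is regressed against the calibrator concentrations, and the concentration of an unknown is read from the resulting curve.

    import numpy as np

    # Hypothetical calibrators (ug/L) and observed analyte/internal-standard count ratios
    cal_conc  = np.array([0.0, 0.5, 1.0, 5.0, 10.0])
    cal_ratio = np.array([0.002, 0.051, 0.101, 0.499, 1.003])

    # Least-squares fit of a linear calibration curve: ratio = slope * concentration + intercept
    slope, intercept = np.polyfit(cal_conc, cal_ratio, 1)

    def concentration(sample_ratio):
        """Convert a sample's analyte/internal-standard signal ratio to a concentration (ug/L)."""
        return (sample_ratio - intercept) / slope

    print(round(concentration(0.250), 2))  # ~2.49 ug/L for a hypothetical sample ratio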

A chromatography separation step is introduced before the ICP-MS step when speciation of metals, such as mercury or arsenic, is desired. High-pressure liquid chromatography can be used to separate selected arsenic species (arsenic acid, arsenous acid, MMA, DMA, arsenobetaine, arsenocholine, and trimethylarsine oxide) [16]. Triple spike isotopic dilution with the use of solid-phase micro extraction to deliver a sample to gas chromatography coupled to ICP-DRC™-MS can be used to measure selected inorganic and organic mercury species [17].

Quality Assurance and Quality Control

Analytical methods are validated by defining the method’s accuracy, precision, specificity, sensitivity, linearity and range, limit of detection (LOD), stability of the analyte in the tested specimen, and ruggedness. In addition to these performance metrics, quality assurance (QA) and quality control (QC) are necessary to detect systematic failures during an analytical run and to ensure that the desired performance requirements are met over time and among laboratories. Control measures to ensure the accuracy and reliability of the test result include both internal and external QC, such as proficiency testing. An example of an internal control measure is repeated measurement of samples with known values to confirm the validity of the analytical run and to measure analytical precision. QA includes promoting, monitoring, and evaluating the overall laboratory testing process. QA is typically accomplished by auditing test management (e.g., collecting and identifying specimens and reporting test results), quality control, and personnel performance and training [18].
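
As a sketch of such an internal control measure (a simple acceptance rule with hypothetical QC target values; real laboratories use their own characterized limits and multirule criteria):

    import statistics

    # Characterized target values for a hypothetical low-concentration QC pool (ug/L)
    QC_MEAN = 2.00
    QC_SD = 0.10

    def qc_run_acceptable(qc_results, mean=QC_MEAN, sd=QC_SD, k=2.0):
        """Reject the run if any QC measurement falls outside mean +/- k*SD."""
        return all(abs(x - mean) <= k * sd for x in qc_results)

    print(qc_run_acceptable([1.95, 2.08]))              # True -> run accepted under this simple rule
    print(statistics.stdev([1.95, 2.08, 2.02, 1.99]))   # SD of repeat QC measurements as a precision estimate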

Post-analytical Factors in Clinical Laboratory Testing

LOD

When conducting ultra-trace or trace analysis, it is important to consider how test results below the analytical method’s LOD are reported and handled mathematically. The LOD defines the lower limit of the analytical method. It is the lowest concentration that the analytical method can reliably detect and is determined mathematically by a defined probability that the concentration is different from zero. This approach to estimating the LOD is subject to type 1 (alpha, or false-positive) and type 2 (beta, or false-negative) errors. Although laboratories desire to minimize these types of errors in estimating LODs, a large number of measurements can be required to accomplish this task. For example, CDC uses the Taylor method (LOD = 3S₀, where S₀ is the extrapolated standard deviation [SD] as the concentration of the analyte approaches zero, based on a series of measurements at low concentrations plotted against their calculated SDs) when the need to estimate the false-negative rate is remote or when the assay is used infrequently [18].
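
A minimal sketch of that calculation with hypothetical data (the procedure CDC actually follows is described in reference [18]): the SDs of replicate measurements at several low concentrations are regressed against concentration, the intercept of the fit is taken as S₀, and the LOD is 3S₀.

    import numpy as np

    # Hypothetical low-concentration pools (ug/L) and the SD of replicate measurements at each level
    conc = np.array([0.05, 0.10, 0.20, 0.40])
    sd   = np.array([0.012, 0.014, 0.018, 0.026])

    # Linear fit of SD versus concentration; the intercept extrapolates the SD to zero concentration (S0)
    slope, s0 = np.polyfit(conc, sd, 1)

    lod = 3 * s0
    print(round(lod, 3))  # 0.03 ug/L for these hypothetical data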

Several approaches exist for estimating laboratory test results below the LOD (<LOD) so that they can be used to derive population estimates, such as a reference interval. When comparing reference values from two different sources, it is helpful to clarify which approach was used because the choice can affect the estimate. Fixed and multiple imputation are two approaches to impute results <LOD, and their benefits and limitations are discussed elsewhere [19].
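
As a sketch of fixed imputation, one common convention (used, for example, in the data tables of CDC’s National Exposure Report) substitutes the LOD divided by the square root of 2 for each result reported as <LOD; the values below are hypothetical:

    import math

    LOD = 0.20  # hypothetical limit of detection, ug/L

    def impute_below_lod(results, lod=LOD):
        """Replace non-detects (coded here as None) with LOD / sqrt(2) before computing summary statistics."""
        fill = lod / math.sqrt(2)
        return [fill if r is None else r for r in results]

    measurements = [0.45, None, 0.31, None, 0.52]   # None marks results reported as <LOD
    print(impute_below_lod(measurements))           # non-detects become ~0.14 ug/L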

Reference Values and Intervals

Reference values and intervals are comparison data used to interpret a patient’s laboratory test result to conduct a health screening and to diagnose, treat, and monitor a disease. Although a common term for reference value is “normal value,” the latter is no longer preferred because it can be ambiguous. For example, does normal mean physiologic (healthy) or background levels of exposure to a chemical? What about test results that vary with age? Thus, the term reference value is more appropriate than normal value [20]. In the medical setting, a laboratory reference value and interval help the clinician to create a comprehensive perspective for diagnosing and managing a disease. The laboratory test result is used in conjunction with the history and physical examination to arrive at a diagnosis. The order of increasing objective information is as follows: history, physical examination, and laboratory test result. Comparing the laboratory test result to a “healthy population” contributes to only one aspect of the medical evaluation. In the public health setting, reference values and intervals are used as a health screening tool and have a more important role than in the medical setting. In public health, the purpose of laboratory testing is to identify persons with the potential for a disease by demonstrating that their laboratory test results are outside the reference limit (value or interval) or are otherwise unusual. An unusual laboratory test result raises suspicion for, but does not establish, a diagnosis of disease. As mentioned earlier, laboratory tests are used in conjunction with a history and physical examination to arrive at a diagnosis.

Thus, the primary purpose of a reference value and interval is to identify a person with an unusual laboratory test result. A clinician uses a reference value to help diagnose and manage (treat and monitor) a disease. An epidemiologist can use a reference value to identify the prevalence of disease (case finding), a group at risk for a disease (health screening), or a group in the population with an unusual laboratory test result, such as a blood or urinary concentration of an environmental chemical. The most important consideration when using reference values is to select the relevant reference population based on the intended application. For a reference value and interval to be of maximum use, the intended use must be known in advance.

Types of Reference Values

Several types of reference values exist, each with its advantages and disadvantages; some of these are discussed here. The population-based reference value is the most common tool used to help interpret a laboratory test result and is considered the convention for reference values and intervals. For a test that has high individuality or inter-individual variability, a population-based reference value is generally not a good basis for comparison to identify people with subclinical disease; a test result can lie outside of the person’s usual interval yet remain within the interval for the population. Thus, a population-based reference value is not the best basis for comparison to determine whether a change in a patient’s laboratory test result has occurred (different from making a diagnosis). In this instance, the reference change value can be used (see “Appendix”). This is another reason that a laboratory test is used in conjunction with the history and physical examination to make decisions about the medical condition of a patient.

A conventional or population-derived reference interval encompasses the central 95 % of the distribution of values for the reference population (typically, the healthy population); it is a two-sided estimate bounded by the 2.5th and 97.5th percentiles [21]. Thus, 5 % of the healthy population will fall outside the interval (2.5 % below and 2.5 % above) and can be considered to have an “unusual value.” A single estimate, such as the 95th percentile of the distribution for the population, is an example of a one-sided estimate. Values outside the reference interval do not by themselves indicate the presence of disease because these values still belong to the distribution of the “healthy” population. In general, reference individuals are selected to form a reference population; a reference sample group drawn from that population defines a reference distribution, from which a reference interval and its reference limits are derived.
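
A minimal sketch of the nonparametric calculation with hypothetical data (guideline-based procedures additionally specify minimum numbers of reference individuals and confidence intervals around the limits):

    import numpy as np

    # Hypothetical concentrations (ug/L) from a reference sample group of healthy persons
    rng = np.random.default_rng(0)
    reference_values = rng.lognormal(mean=0.0, sigma=0.5, size=240)

    # Two-sided reference interval: the central 95 % bounded by the 2.5th and 97.5th percentiles
    lower, upper = np.percentile(reference_values, [2.5, 97.5])

    # One-sided upper reference value, e.g., the 95th percentile of the distribution
    p95 = np.percentile(reference_values, 95)

    print(round(lower, 2), round(upper, 2), round(p95, 2))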

For example, NHANES is a national survey designed to represent the general U.S. population [22]. It is the only nationally representative survey in which blood and urine specimens are collected from participants. Data are presented for the total population and for groups characterized by age, sex, and race or ethnicity. The survey uses a cross-sectional, complex, multi-stage probability sampling design to generate data that are representative of the general U.S. population. Survey participants are selected based on U.S. Census data for the year of interest by dividing the USA into primary sampling units (PSUs), which are typically at the county level. PSUs are further divided into segments or city blocks. Households in each segment and persons in each household are selected randomly within designated age, sex, and race/ethnicity screening subdomains. Five thousand persons are selected and examined each year at 15 locations (2 years per survey cycle). Certain groups in the population are oversampled depending on the survey design; oversampling increases the reliability and precision of estimates of health status indicators for these groups. Each participant in the survey is assigned a sample weight, which represents the number of persons in the population represented by that participant. The sample weight also accounts for the sampling design, non-response, and under-represented groups in the population. Thus, sample weights are used when characterizing the distribution of the data for groups in the population to yield precise and unbiased national estimates.
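
A minimal sketch of how the sample weights enter a point estimate (hypothetical concentrations and weights; actual NHANES analyses must also use the survey’s strata and PSUs when estimating variances):

    import numpy as np

    # Hypothetical measured concentrations (ug/L) and the participants' survey sample weights
    conc    = np.array([0.8, 1.2, 2.5, 0.6, 1.9])
    weights = np.array([12000, 35000, 8000, 52000, 21000])

    # Weighted geometric mean: exponentiate the weighted mean of the log-transformed concentrations
    weighted_gm = np.exp(np.average(np.log(conc), weights=weights))
    print(round(weighted_gm, 2))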

CDC’s National Biomonitoring Program develops analytical methods to improve the laboratory diagnosis and detection of unsafe exposures to environmental chemicals and of nutritionally related diseases and disorders [23]. The National Biomonitoring Program measures environmental chemicals and biochemical indicators of diet and nutrition in blood and urine of NHANES participants and uses these results to derive reference values and intervals for these measurements. Data are presented for the total population and for groups characterized by age, sex, and race or ethnicity. For environmental chemicals, the data tables report geometric means, estimates of selected percentiles (including a one-sided estimate at the 95th percentile of the distribution for the population), the sample sizes used for the calculations, and the 2-year survey period. Nutritional indicators are also reported in the data tables using a two-sided estimate, such as the 10th and 90th percentiles of the distribution for the population. A two-sided estimate is important because a too low or a too high concentration of a nutritional indicator, such as iodine, can cause adverse health outcomes. For additional information on these surveys, please refer to the National Report on Human Exposure to Environmental Chemicals [5] and the National Report on Biochemical Indicators of Diet and Nutrition [24].

Clinically fixed values are commonly known as cutoffs and are typically based on clinical outcomes or biological findings that allow for a meaningful interpretation. In the latter sense, these values are known as decision limits and direct the clinician to intervene because they are associated with a known risk for disease (e.g., hemoglobin A1C and risk for diabetic retinopathy).

Laboratories are encouraged to establish their own reference values and intervals for their tests [8]. They can derive their own reference values or verify transferred reference values by comparing results from a small group of reference individuals with the original values. For difficult-to-sample groups in the population, such as the elderly, neonates, and children, the laboratory can rely on well-defined or well-characterized, peer-reviewed information about the source of the data, such as the selection of participants and the analytical method.

Factors to Consider When Interpreting Laboratory Test Results

Factors that should be considered when interpreting laboratory test results are endogenous (age, sex, metabolism dependent on genetic polymorphisms), exogenous (diet, exercise, smoking, alcohol, medication), laboratory (specimen collection, storage, transportation, and analytical method), and post-analytical (units, imputing <LOD [19]) [8]. For example, the use of nutritional folate supplements can decrease blood MMA and increase urinary DMA concentrations in persons with low plasma folate concentration (<9 nmol/L) and chronic exposure to inorganic arsenic from drinking water [25]. Analytical bias (because of the efficiency of extraction, the sensitivity of the method, or the choice of the biological matrix) can shift the distribution of the population’s test results to the right (positive) or left (negative) of the actual values. For example, when a fixed cutoff value for a disease is used, a shift to the right will increase the false-positive rate and decrease the false-negative rate.
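
The effect of such a bias can be illustrated with a minimal sketch (hypothetical distribution and cutoff): adding a constant positive bias to every measurement pushes more results from the unaffected group above a fixed cutoff, which raises the false-positive rate.

    import numpy as np

    rng = np.random.default_rng(1)

    # Hypothetical true concentrations for a healthy group and a fixed decision cutoff (ug/L)
    true_values = rng.lognormal(mean=0.0, sigma=0.4, size=10000)
    cutoff = 2.5

    for bias in (0.0, 0.5):                    # no bias vs. a positive (rightward) analytical bias
        measured = true_values + bias
        false_positive_rate = np.mean(measured > cutoff)
        print(bias, round(false_positive_rate, 3))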

The next interpretive step is to compare the patient’s laboratory test result to the same or a similar group in the population. Intra-individual and inter-individual variations, or biological variability, should also be considered when interpreting laboratory test results. Biological variability determines the approach to comparing values because it can affect the performance of a laboratory test to screen, diagnose, or identify cases in a population [26]. Laboratory test results tend to vary both between and within persons. For a laboratory test with a high degree of individuality (CVi < CVg, where CVi is the intra-individual coefficient of variation and CVg is the inter-individual, or group, coefficient of variation), the values for a given person span only a small portion of the range of values for the population. Thus, reference individuals can have values at the limits or at the middle of the reference interval for the reference population. In contrast, for a laboratory test with a low degree of individuality (CVi > CVg), the values for a given person span most of the range of values for the population.

For example, persons with values at the limit for a laboratory test that has a high degree of individuality can have values that vary from below the limit to above the limit because of different sources of variation. A value above the limit can be interpreted as unusual or as a new disorder, depending on the definition of the limit; however, the value may actually lie within that specific person’s usual variation. Thus, when evaluating for unusual individual test results, reference values based on laboratory tests with a low degree of individuality can be more useful than those based on tests with a high degree of individuality. The degree of individuality for selected laboratory tests [27–30] has been characterized using the index of individuality [26, 31] (see “Appendix”). If a reference value for a laboratory test is not sensitive enough to monitor for change in a specific person, a clinically based fixed criterion or a reference change value can be used to evaluate for change in the test result for that person [32].
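
As a hedged sketch of the two quantities mentioned above, using one common formulation from the clinical chemistry literature (CVa, CVi, and CVg denote the analytical, intra-individual, and inter-individual coefficients of variation; the numeric inputs are hypothetical):

    import math

    def index_of_individuality(cv_i, cv_g):
        """II = CVi / CVg; values well below 1 indicate high individuality, which makes
        population-based reference intervals less useful for detecting change in one person."""
        return cv_i / cv_g

    def reference_change_value(cv_a, cv_i, z=1.96):
        """RCV (%) = sqrt(2) * z * sqrt(CVa**2 + CVi**2); the smallest difference between two
        serial results that exceeds combined analytical and intra-individual variation."""
        return math.sqrt(2) * z * math.sqrt(cv_a ** 2 + cv_i ** 2)

    print(round(index_of_individuality(cv_i=7.0, cv_g=25.0), 2))   # 0.28 -> high individuality
    print(round(reference_change_value(cv_a=5.0, cv_i=7.0), 1))    # ~23.8 % change needed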

Conclusion

Measuring the metal content in clinical specimens is the preferred approach to assess for exposure to these chemicals in humans. However, pre-analytical, analytical, and post-analytical factors must be considered in the clinical laboratory assessment of metals because these items can affect the test result and its interpretation. These factors include those endogenous and exogenous (e.g., lifestyle) to the host, those related to the analytical method and specimen collection, and those related to the mathematical analysis used to report results. The primary purpose of a reference value and interval is to identify persons with unusual laboratory test results. The most important consideration when using reference values is to select the relevant reference population based on the intended use of the test result.