Introduction

The advent of molecular, 16S rRNA gene-based methodologies has greatly facilitated investigations related to the ecology of complex microbial communities. Among them, denaturing gradient gel electrophoresis (DGGE) has been used extensively to profile microbial ecosystems present in a wide range of environments, including the human and animal gastrointestinal tract [13, 43, 46]. A literature search via ISI Web of Knowledge using “DGGE” as a keyword for the title and defining the time span between 1993, when the technique was introduced in the field of environmental microbiology [24], and 2009 generated about 720 publications. The popularity of DGGE is partly because, like any other molecular approach, it does not suffer from culturability-imposed restrictions. Although it cannot provide phylogenetic information directly, it generates an instantaneous “snapshot” of a microbial population at a particular time and in response to a specific treatment. If carefully standardized to allow comparisons across multiple gels, DGGE offers a relatively rapid, inexpensive, and accurate alternative for large-scale comparative investigations [1, 5, 7, 25, 37].

Despite its widespread application, DGGE, like any other PCR-based fingerprinting methodology, has some well-documented limitations. Insufficient DNA extraction and preferential PCR amplification [44], co-migration of DNA fragments [14] with different nucleotide composition, and formation of multiple bands from a single species [29] can all mask the true diversity of bacterial communities. Since there is no straightforward remedy to eliminate these inherent difficulties, the best way to compensate for their occurrence is to process all samples via a highly reproducible and standardized methodology. In this way, it can be assumed that any such effects occur homogeneously, and thus, comparisons between DGGE patterns can yield statistically valid information [8]. However, a standardized methodology cannot entirely eliminate the variability arising from technical aspects of DGGE. Gel-to-gel variation in migration patterns, caused by even subtle differences during preparation of the denaturing gradients, is a widely acknowledged problem [36, 25, 32, 41]. In particular, it poses a major obstacle in experiments where the number of samples requires simultaneous comparisons across many gels [15, 31, 32]. Without rigorous alignment between all gels generated within a trial, what appears to be the method’s strength—that is, the analysis of a large number of samples—could be heavily compromised.

Commercially available software packages such as GelCompar II or BioNumerics (Applied Maths), TotalLab (TL) 120 (Phoretix 1D Advanced, Non-Linear Dynamics), and Quantity One (Bio-Rad), are invaluable in studies where DGGE or other similar molecular fingerprinting techniques are used to monitor shifts in microbial communities [11, 16, 17, 42]. They are particularly suited for within-gel comparisons as they can correct for distortions in a relatively easy and user-friendly way. Problems arise when comparisons of complex DGGE profiles collected across a large number of gels are required. For this scenario, different commercial programs take different approaches with variable results [2]. Standardized methodology is still not available and generally demands significant user intervention and supervision according to the software in use [3, 34]. Moreover, most commercially available software packages offer only limited means of statistical analysis in large datasets where dendrogram-based clustering or Dice similarity coefficient comparisons may not be the best way to analyze the data [2, 36].

In this study, we present a two-step methodology to address the problem of gel-to-gel variation during DGGE analysis. We then demonstrate how this methodology can be applied to large-scale DGGE analysis to generate accurately aligned, good quality densitometric profile data that can be statistically analyzed via powerful multivariate approaches, such as principal component analysis (PCA). Our findings suggest that similar approaches should be applied to exploit the full potential of DGGE as a tool for high-throughput fingerprinting of complex microbial communities.

Materials and Methods

Samples and Collection

Fecal samples from approximately 120 pullets aged 7 to 8 weeks were collected on two occasions as part of a much larger animal trial (held in the Department of Clinical Veterinary Science, University of Bristol) and used for the purposes of this study. Samples were produced by free defecation and stored at −20°C until further analysis.

DNA Extraction

Genomic DNA was extracted from the avian fecal samples using the QIAamp Stool Mini Kit (Qiagen, West Sussex, UK) following the general guidelines of the manufacturer with one major modification. To start the extraction, a larger amount of fecal material (500 mg instead of the 200 mg recommended) was mixed with a proportionally larger volume of lysis buffer (4 mL instead of the 1.2 mL recommended). The purity and concentration of the extracted DNA were measured via NanoDrop ND-1000 (Nanodrop Technologies, Wilmington, USA), and DNA quality was checked by agarose gel electrophoresis and staining with SYBR Green I (Sigma-Aldrich, Gillingham, UK).

PCR Amplification

The highly variable V3 region of the 16S rRNA gene (positions 341–534 on the Escherichia coli gene) was amplified using primers 341F (5′-CCT ACG GGA GGC AGC AG-3′), with a GC clamp (5′-CGC CCG CCG CGC GCG GCG GGC GGG GCG GGG GGC ACG GGG GG-3′) incorporated at the 5′ end, and 534R (5′-ATT ACC GCG GCT GCT GG-3′) [21]. This set of primers is specific to bacterial 16S rDNA and yields amplicons approximately 193 nucleotides long. PCR amplification was performed using the HotMaster Taq Polymerase kit (5 Prime, Nottingham, UK). Each PCR reaction contained 2 U Taq Polymerase, 2× HotMaster buffer, 400 µM of each dNTP, 20 pmol of each primer, 0.2 µg/µL bovine serum albumin, and 200 ng of DNA template. The final volume of the reaction mixture was adjusted to 100 µL with ultrapure water. Reactions were amplified in a thermocycler (Thermo Electron Co., Basingstoke, UK) using the following program: 94°C for 2 min; 30 cycles of 94°C for 20 s, 58°C for 10 s, and 65°C for 20 s; and a final extension at 65°C for 10 min. PCR products were analyzed by electrophoresis on 1.5% agarose gel and visualized using SYBR Green I. Finally, they were purified using SureClean (Bioline, London, UK), and DNA concentration was measured via NanoDrop ND-1000.

Denaturing Gradient Gel Electrophoresis

16S rRNA gene amplicons of the variable V3 region from different bacteria originally present in the fecal samples were separated via DGGE using the D-Code system (Bio-Rad, Hemel Hempstead, UK). Gel solutions were prepared using acrylamide–bisacrylamide (37.5:1) and combining appropriate volumes of 0%, 40%, and 60% of denaturants (Severn Biotech Ltd, Kidderminster, UK). Separation of PCR products (200 ng) was achieved via 8% polyacrylamide gel containing an increasing linear gradient of denaturants of 40–58% (100% denaturant corresponds to 7 M urea and 40% deionized formamide). Electrophoresis was carried out in 0.5× TAE buffer at 55 V at a constant temperature of 59°C, and each run was terminated at 775 volt-hours (approximately 14 h). Gels were stained with SYBR Green I (1:30,000 in 0.5× TAE buffer) for 40 min and destained in distilled water for a further 40 min. Gels were scanned via Pharos FX Molecular Imager (Bio-Rad) at the highest photomultiplier tube voltage that did not saturate the DGGE bands.

Standards

To optimize and evaluate the accuracy of comparisons within and across gels, three types of standards, designated “synthetic,” “reference,” and “clone ladder,” were introduced in all gels. The synthetic standard was produced by V3 16S rRNA gene PCR amplification of DNA extracted from four different pig fecal samples, and this was loaded in four lanes spaced regularly across each gel. The reference standard was produced by V3 16S rRNA gene PCR amplification of DNA extracted from a single pig fecal sample, different from those used for the synthetic standard, and this was loaded in a single, roughly central lane on each gel. Finally, the “clone ladder” standard was constructed after mixing equal volumes of V3 16S rRNA gene PCR products from clones representing 14 predominant intestinal bacterial strains. It was also loaded once toward the middle of each gel. The choice of standards originating from pig fecal samples was made on the basis of profile complexity. Their DGGE profiles contained a large number of discrete bands spanning the entire gradient, rendering them ideal for within- and between-gel alignment. Moreover, the use of these standards had already been tested in our laboratory for smaller scale DGGE analysis (unpublished data).

Analysis of DGGE Gel Images

Step 1

TL120 v2006 (Phoretix 1D Advanced Software, NonLinear Dynamics, Newcastle, UK) was used to convert individual DGGE lanes to densitometric profiles. These profiles were then subjected to a series of standard functions provided in the software, namely background subtraction, band detection, and Rf calibration. Rf is a measurement of position along the lane, relative to its length. To correct for within-gel distortion, which usually appears in the form of gel curvature, the software automatically generates a series of smooth curves which represent points of equal Rf values across the gel. User input is required to attribute specific Rf values to these curves (using the synthetic sample lanes, as described below). This function is the backbone of accurate within- and between-gel alignments as discussed later in this study.

Step 2

Output from TL120 comprised the Rf values of all detected bands, their intensities, and their corresponding positions (measured in “pixels”) in the original profiles (uncorrected for distortion). The software does not provide access to complete corrected profiles, but the raw profile data were extracted. Together with the information on band positions measured on the Rf and pixel scales, these complete profiles can be aligned onto common axes and hence collated to form a single data matrix. This was carried out using Matlab R2008a (The Mathworks, Inc., Cambridge, MA), a programming language that is ideally suited to manipulation of large data matrices. The procedure is described in more detail below, along with some of the multivariate analyses that were applied to the collated data matrix.

Results

The methodology described in this study was developed as a prerequisite to extract information from a large DGGE dataset. Two hundred and forty-five DGGE profiles were generated in total, which were accommodated across 13 gels. The raw image of one of these gels is shown in Fig. 1a, while images from all other gels have been included in Electronic Supplementary Material, Fig. S1. All gels had the same structure with regard to lane occupancy; specifically, 19 samples and six standards were profiled on each gel. The use of appropriate standards was vital for developing the methodology. The synthetic standard generated a complex DGGE pattern with bands spanning the entire length of the gel. Its role was to guarantee efficient correction of within-gel distortion and, equally importantly, to facilitate precise between-gel alignment. The reference standard yielded a similarly complex DGGE pattern, and provided a set of independent “test” data, from which the effectiveness of the alignment procedure could be evaluated.

Figure 1

a Image of a representative DGGE gel showing the loading regime of V3 PCR products from DNA extracted from chicken fecal samples. Lanes “S,” “R,” and “CL” correspond to synthetic, reference, and clone ladder standards, while lanes 1–19 correspond to avian fecal samples. b Image of the same gel showing how the Rf calibration was performed in the TL120 software. Rf lines have been “anchored” on 21 bands (white squares) across all four synthetic standard profiles. Rf values shown on the left are the mean values from all 13 gels

The first step in the alignment procedure was the analysis of profiles using standard functions in TL120. Of these, the Rf calibration was the most sensitive and labor-intensive process. For each gel, the software generated a series of broadly horizontal curves. One of these was manually “locked” to span the four instances of each of 21 distinct bands present in the synthetic standard lanes (Fig. 1b). An Rf value was assigned manually to each curve; this was the mean of the Rf values calculated across all gels from an initial pass through the alignment function. The upper and the lower curves, defining the gel area of interest, were by default given the Rf values of 0 and 1, respectively, while all other curves were assigned values between 0 and 1. Manually setting the Rf labels ensured that equivalent bands from different gels would all have the same Rf values after correction, a key step toward meaningful comparison across gels. Once the 21 “anchor” lines were set, effectively defining a common Rf scale for all gels, the software performed Rf position calculation for all remaining standard and sample lanes.

Although the synthetic standard lanes would typically contain more than 50 bands, we demonstrate the method’s performance by using 21 bands for the alignment process. However, the large number of bands present on the synthetic standard lanes enabled us to test the effect of using different numbers of bands as “anchors” on the alignment outcome. As discussed later in the study, we tested the accuracy of the method by using no bands at all (just the extreme positions of the data), two (the upper and the lower bands only), five (in 0.25 increments), nine (in 0.1 increments), 21 (in 0.05 increments), and 47 (in 0.025 increments) bands.

The second step in the alignment procedure, carried out within the Matlab environment, comprised “piecewise” linear interpolation of the raw data profiles. Each “piece” comprised a region of profile delimited by the exported band positions as measured in pixels. This was interpolated (a Matlab script is provided as Electronic Supplementary Material, Fig. S2) to new abscissae delimited by the exported, corrected Rf values and spaced at intervals of 0.001 Rf units. Figure 2 illustrates the procedure. The effect is to stretch and shrink regions of the profiles such that after interpolation, all profiles contain the same number of data points (1001) and can be plotted with respect to common abscissae (chosen for convenience to correspond to an Rf range of 0 to 1 at intervals of 0.001).
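The piecewise interpolation described above can be illustrated with a minimal sketch. This is not the Matlab script supplied as Fig. S2; it is a plain Python stand-in written under stated assumptions: the function name `align_profile` and its arguments are hypothetical, the raw profile is assumed to hold one intensity value per pixel, and the anchor lists are assumed to include the two extreme positions so that the output spans the full Rf range.

```python
from bisect import bisect_right

def align_profile(intensity, anchor_px, anchor_rf, step=0.001):
    """Resample one raw lane profile onto a common Rf axis (illustrative only).

    intensity : raw densitometric values, one per pixel (list index = pixel)
    anchor_px : increasing pixel positions of the anchor bands in this lane
    anchor_rf : increasing corrected Rf values assigned to the same bands
    """
    n_points = int(round((anchor_rf[-1] - anchor_rf[0]) / step)) + 1
    aligned = []
    for k in range(n_points):
        # Target Rf value, clipped so rounding never overshoots the last anchor.
        rf = min(anchor_rf[0] + k * step, anchor_rf[-1])
        # Find the "piece" (pair of neighbouring anchors) that contains rf.
        i = min(max(bisect_right(anchor_rf, rf) - 1, 0), len(anchor_rf) - 2)
        frac = (rf - anchor_rf[i]) / (anchor_rf[i + 1] - anchor_rf[i])
        # Map rf back to a fractional pixel position within that piece.
        px = anchor_px[i] + frac * (anchor_px[i + 1] - anchor_px[i])
        # Linearly interpolate the intensity between the two nearest pixels.
        lo = min(int(px), len(intensity) - 2)
        t = px - lo
        aligned.append((1 - t) * intensity[lo] + t * intensity[lo + 1])
    return aligned
```

With anchors at Rf 0, 0.5, and 1, a band that sits at pixel 30 in one 101-pixel lane and at pixel 60 in another is placed at Rf 0.5 (index 500 of 1001 points) in both aligned profiles, which is the stretching and shrinking effect shown in Fig. 2.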

Figure 2

Magnified part of a typical DGGE profile, raw (a) and interpolated (b). Filled circles indicate identified peak maxima. Note how interpolation stretches or shrinks different pieces of the profile, effectively changing the relative location of peaks along the position scale. Grayscale images of the same profile (before and after interpolation) are shown in c

This process effectively performs correction for distortion and alignment onto a common Rf scale such that profiles can be collated across gels into a single data matrix, which can then be readily analyzed via any statistical approach. However, another source of unwanted between-gel variance is related to overall intensity, which can vary systematically with gel due to the subjective control of the photomultiplier voltage at the imaging stage. A straightforward means of mitigating this effect was to scale each profile to fixed intensity limits, which were chosen to be 0 (no peak) and 1 (the highest peak).
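A minimal sketch of this intensity scaling is given below. It assumes that the limits are taken as the minimum and maximum of each background-subtracted profile (ordinary min-max scaling); the function name `scale_profile` is hypothetical and the code is written in Python rather than the Matlab used in the study.

```python
def scale_profile(profile):
    """Rescale one aligned profile so its lowest value is 0 and its highest peak is 1."""
    lo, hi = min(profile), max(profile)
    if hi == lo:
        # A completely flat profile carries no peaks; return zeros to avoid dividing by 0.
        return [0.0 for _ in profile]
    return [(v - lo) / (hi - lo) for v in profile]
```

Because every profile is scaled independently, a gel scanned at a higher photomultiplier voltage no longer contributes systematically larger values to the collated data matrix, although relative peak heights within each lane are preserved.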

The success of the two-step methodology can be assessed by comparing the DGGE profiles of the reference standard from all gels before and after performing the alignment procedure (Fig. 3a, b). Before alignment, there is substantial variability caused by the “gel signature” [5], such that the profiles could even be perceived as originating from different samples, were their true provenance not known (Fig. 3a). However, interpolation using multiple bands during Rf calibration as anchors resulted in excellent alignment between all reference samples (Fig. 3b). The alignment process was then repeated using different numbers of anchors on the synthetic standard in an effort to estimate the minimum number required for reliable comparisons across gels. Calculation of Pearson’s correlation coefficient between all pairs of reference standards was employed to compare the accuracy of alignment (Fig. 3c). Best results were obtained with nine or more bands; however, the similarity between profiles was greater than 90% even using as few as five bands regularly spaced along the gradient. The efficacy of the methodology is also apparent in the aligned DGGE profiles originating from the avian fecal samples (Fig. 4). Visual inspection of the major band clusters (which we can assume are present in a substantial proportion of the data) shows that they appear to follow nearly straight lines.

Figure 3

Comparison between reference standard profiles from all gels before interpolation (a) and after piecewise interpolation using multiple anchors (b). c Box plots showing median values of Pearson’s correlation coefficient between profiles of the reference standard after using different number of bands as anchors for alignment. Error bars and dots represent 95% confidence intervals and outliers, respectively

Figure 4

All DGGE data after alignment shown in grayscale image format. Lanes “S,” “R,” and “CL” correspond to synthetic, reference, and clone ladder standards. Arrows indicate major band clusters present across the majority of the experimental data

We also compared the alignment accuracy of this two-step approach with that obtained after using an image analysis software program to perform both steps (normalization and alignment). Employing exactly the same standards (synthetic and reference standards), we used GelCompar II (the most popular commercial software program) to align and compare across a smaller set of data (six gels). The alignment outcomes obtained after using both approaches were comparable (data not shown), suggesting that provided that the particular set of inter-gel standards is used, both approaches deliver similar results.

However, the very nature of the two-step approach (aligning and retaining complete densitometric profiles for subsequent statistical analysis) offers some distinct advantages over the traditional approach employed by image analysis software programs, that of using “band patterns” for data interrogation. First, proprietary software programs, such as GelCompar II or BioNumerics and TL120, initially perform a “band detection” step during which specific peaks of the densitometric profile are detected as discrete bands according to certain parameters set by the user. This is a key function for generating band presence/absence or intensity information data matrices, which, in turn, will be used for constructing similarity dendrograms and applying dimension-reducing techniques, such as PCA, partial least squares analysis, and non-metric multidimensional scaling analysis. This process is not without problems, as illustrated in Fig. 5b. Bands were detected in all reference profiles following an approach very similar to that used by image analysis software programs. No matter what parameters were chosen, it was impossible to prevent some bands of low intensity and/or indistinct peak resolution from being differentially detected between profiles despite all the profiles originating from the same single standard. When the data are viewed as vectors (Fig. 5a), it becomes evident that these small bands represent small peaks, and it is minor variations in their shapes/intensities that cause them to go undetected by the software in some profiles. It is actually common practice for the software operator to visually inspect the outcome of this step and manually add or delete bands if necessary. Apart from introducing an extra source of variation in the analysis, visual inspection of large DGGE datasets can be a very labor-intensive and time-consuming process. In contrast, the two-step approach presented here does not rely on band detection since subsequent statistical analysis is performed on the aligned, full densitometric profiles.
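The differential detection problem can be reproduced with a toy example. The detector below is a deliberately simple stand-in (local maxima above a height threshold) and is not the algorithm used by any of the commercial packages named above; the function name `detect_bands`, the threshold, and the two replicate profiles are all hypothetical.

```python
def detect_bands(profile, min_height):
    """Return the indices of local maxima whose height is at least min_height."""
    return [
        i
        for i in range(1, len(profile) - 1)
        if profile[i] >= min_height
        and profile[i] > profile[i - 1]
        and profile[i] >= profile[i + 1]
    ]

# Two replicates of the same standard: the minor peak differs only slightly in height.
replicate_a = [0.0, 0.2, 1.0, 0.2, 0.0, 0.11, 0.0, 0.0]
replicate_b = [0.0, 0.2, 1.0, 0.2, 0.0, 0.09, 0.0, 0.0]

bands_a = detect_bands(replicate_a, min_height=0.10)  # major and minor peak
bands_b = detect_bands(replicate_b, min_height=0.10)  # major peak only
```

The two profiles are almost identical as vectors, yet the resulting band lists differ by one band. Lowering the threshold would recover the minor peak in both replicates but would also admit more noise elsewhere, which is why no single parameter setting removes the problem.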

Figure 5

Aligned DGGE profiles from all reference standards presented as whole vectors (a) and band patterns (b). Brackets have been used to define areas where bands have been differentially detected

The second problem when using banding patterns to generate presence/absence or intensity information matrices is associated with “position determination” or “band matching.” During this step, common bands between different lanes/profiles are identified. Image analysis software packages allow the user to set a “tolerance” parameter on the peak positions, which essentially represents the maximum distance between two bands at which a match is declared; the smaller the value, the more stringent the matching process. However, the tolerance itself then becomes a critical parameter in the analysis protocol, one that is generally set somewhat subjectively by the user and is also highly dataset-specific. In contrast, using whole vectors to perform multivariate analysis of large datasets effectively accommodates the issue of tolerance on peak position, provided that any residual misalignment is less than a typical bandwidth.
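The sensitivity of band matching to the tolerance setting can be illustrated with a small sketch. The greedy nearest-neighbour rule, the function names `match_bands` and `dice`, and the band positions below are assumptions made for illustration; commercial packages may implement matching differently.

```python
def match_bands(bands_a, bands_b, tolerance):
    """Greedily pair band positions (Rf units) that lie within `tolerance` of each other.

    Each band in bands_b can be matched at most once.
    """
    matches = []
    unused = sorted(bands_b)
    for a in sorted(bands_a):
        candidates = [b for b in unused if abs(a - b) <= tolerance]
        if candidates:
            best = min(candidates, key=lambda b: abs(a - b))
            matches.append((a, best))
            unused.remove(best)
    return matches

def dice(bands_a, bands_b, tolerance):
    """Dice similarity coefficient: 2 x shared bands / total bands in both lanes."""
    shared = len(match_bands(bands_a, bands_b, tolerance))
    return 2 * shared / (len(bands_a) + len(bands_b))

# Hypothetical band positions for the same sample run on two different gels.
lane_a = [0.100, 0.250, 0.400]
lane_b = [0.104, 0.262, 0.400]
```

With a tolerance of 0.005 Rf units only two of the three bands match (Dice of about 0.67), whereas a tolerance of 0.015 matches all three (Dice of 1.0), so the apparent similarity of the two lanes depends directly on a user-chosen parameter.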

The ability of the methodology to compare across multiple gels and to use whole vectors for multivariate analysis was examined by PCA on the whole dataset when aligned using either two, five, or 21 bands of the synthetic standard (Fig. 6). The spread of the points corresponding to the reference/clone ladder standards and the separation between the different groups of profiles give a visual indication of the intrinsic experimental error. In the case of two bands, the outcome is clearly suboptimal, as it is difficult to separate the samples from the standards at all, despite the very obvious differences in their profiles (Fig. 6a). In contrast, using five or 21 bands resulted in clear clustering between the four different types of profiles (Fig. 6b, c). However, the alignment achieved by 21 bands (nine and 47 bands produce very similar PCA plots and data are not shown) appears to be more accurate since clone ladder profiles form a more coherent group and synthetic profiles are more readily distinguished from their reference counterparts. These admittedly small differences should be evaluated considering the nature of the analysis; they may have a significant effect when whole profiles are used for multivariate analysis.
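To make the idea of PCA on whole profiles concrete, the sketch below computes scores on the first principal component only, using power iteration on the mean-centred data matrix. It is a simplified Python illustration, not the Matlab analysis used in the study; the function name `first_pc_scores` and the iteration count are hypothetical, and in practice a library routine based on singular value decomposition would be preferred for speed and for extracting several components.

```python
from math import sqrt

def first_pc_scores(profiles, n_iter=100):
    """Score of each profile (row) on the first principal component."""
    n, p = len(profiles), len(profiles[0])
    # Mean-centre every variable (every Rf position) across profiles.
    means = [sum(row[j] for row in profiles) / n for j in range(p)]
    centred = [[row[j] - means[j] for j in range(p)] for row in profiles]

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    # Start from the centred profile with the largest norm.
    start = max(centred, key=lambda row: dot(row, row))
    norm = sqrt(dot(start, start))
    if norm == 0:
        return [0.0] * n  # all profiles identical: no variance to explain
    v = [x / norm for x in start]

    for _ in range(n_iter):
        # One power-iteration step: v <- X^T (X v), then renormalize.
        scores = [dot(row, v) for row in centred]
        w = [sum(scores[i] * centred[i][j] for i in range(n)) for j in range(p)]
        norm = sqrt(dot(w, w))
        if norm == 0:
            break
        v = [x / norm for x in w]

    return [dot(row, v) for row in centred]
```

Every point of each aligned profile enters the calculation as a variable, so no band detection or band matching is involved. The sign of a principal component is arbitrary; only the relative positions of the scores carry meaning.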

Figure 6

PCA scores plot of all DGGE data: a after piecewise interpolation using the first (Rf = 0) and the last (Rf = 1) bands of the synthetic standard as anchors, b after piecewise interpolation using five bands of the synthetic standard as anchors (spaced regularly in 0.25 Rf increments), and c after piecewise interpolation using 21 bands of the synthetic standard as anchors (spaced regularly in 0.05 Rf increments). Dashed circles denote differences in the coherence of groups between b and c. Key to symbols: Circles for avian fecal samples, filled squares for synthetic standards, triangles for reference standards, and filled triangles for clone ladder standards. Note that PCA was applied to DGGE data in the form of whole vectors rather than band presence/absence or intensity information

Discussion

Our aim was to develop a robust methodology that would guarantee reliable alignment across multiple gels and large datasets obtained after molecular profiling of complex microbial communities via DGGE analysis. At the same time, it should present the data in a “flexible” format, allowing for statistical analysis via a wide range of multivariate approaches. This was achieved by following a two-step approach. DGGE fingerprints were first calibrated by TL120 using appropriate standards, and the data obtained were then interpolated via Matlab to generate a new, aligned dataset. While TL120 is widely used to analyze DGGE profiles [10, 12, 19, 33, 40], this is the first time, to our knowledge, that these two software packages have been used in conjunction to optimize alignment across a large DGGE dataset. However, we are convinced that using these particular proprietary software programs is not the only way to extract good quality information from large DGGE (or any 1D electrophoretic method) datasets. By carefully selecting the appropriate standards, this two-step approach could be adapted to any image analysis software package that allows extraction of the raw or, even better, the normalized data profiles, such as GelCompar II, and any statistical program suitable for analyzing highly dimensional datasets, such as “R” (R Development Core Team, 2004).

Generating identical profiles on different gels is technically very difficult. Hand-casting gradient gels will always introduce variability [4, 25, 26, 28] even when the whole process, from sample collection to DGGE analysis, is highly standardized and reproducible. Correcting for gel-to-gel variation is a critical step if comparisons across large datasets are to be achieved with a high degree of confidence [8, 32, 38]. Our study demonstrates that the same PCR product (reference standard) can produce dissimilar profiles when screened in different gels. This problem is further highlighted by the acknowledged need to statistically analyze data obtained from different DGGE runs by treating each gel as a statistical block [15] or by introducing a “gel” factor in regression analysis [30]. In the same study [30], DGGE and terminal restriction fragment length polymorphism (T-RFLP), another method for characterizing complex microbial communities [18, 22, 39], were compared in their efficiency to profile bacterial populations present in grassland soils. The major advantage of T-RFLP was the ability to reliably profile large numbers of samples, while DGGE analysis was hampered by gel-to-gel variation.

Our work suggests that DGGE can be used to profile large numbers of samples in an effective way, provided that appropriate inter-gel standards have been carefully selected. Since such standards are not commercially available [26, 38], researchers have to produce their own; mixtures of PCR products from 16S rRNA gene fragments from clones [26], individual bacterial strains [38], or even excised DGGE bands [20] have all been used to facilitate comparisons between gels. In one of the few studies dealing with this issue in more detail, Neufeld and Mohn [26] tested the use of fluorophore-labeled standards included in every single lane of the gel. It was concluded that this approach facilitated normalization of profiles obtained from different gels. However, it appears that the use of conventional standards remains the most popular choice for aligning within- and between-DGGE gels.

But what should be the characteristics of a good standard? Surprisingly, this issue is under-investigated. It has been suggested that a good standard should generate not only bands that span the entire gradient but a sufficient number of bands too, since within-gel distortion is not uncommon in DGGE analysis [25]. We constructed a standard according to the criteria described above by mixing four different “real” samples; however, an “artificial” standard consisting of a sufficient number of clones with different migration patterns could also be used. Any such standard would be very versatile; irrespective of its origin and provided that the denaturing gradient would not have to be dramatically different, it could be used in studies targeting any region of the 16S rRNA gene or even a different gene.

Considering the level of complexity in fecal DGGE profiles, it is reasonable to assume that involving more bands in the analysis offers better control of the alignment process and, ultimately, higher confidence for comparisons between DGGE profiles acquired from different gels. On the other hand, gel-to-gel normalization involving a large number of bands, at least in TL120 and GelCompar II, can be labor-intensive and thus prone to human error. Therefore, we tested the efficacy of the alignment process by using different numbers of bands common between gels. We concluded that as long as they are evenly distributed along the gradient, using between 9 and 21 bands results in excellent alignment across gels without rendering the process prohibitively labor-intensive. However, it should be emphasized that the number of bands needed to serve as anchors for efficient gel-to-gel alignment is highly dataset-specific. In this study, using as few as five bands was shown to perform well; such a finding could be explained by the fact that none of the 13 gels suffered from severe distortion or massive differences in DGGE migration patterns (as shown in Electronic Supplementary Material, Fig. S1).

After correcting for gel-to-gel variation, the next logical question is how to evaluate the technical reproducibility between gels. Very few studies have employed PCR-DGGE to compare microbial composition between samples profiled on different gels; unfortunately, they do not provide information on how gel-to-gel alignment was achieved or how its precision was evaluated. The vital role of image analysis software packages in the analysis of DGGE data should always be accompanied by critical supervision and user intervention, especially in experiments that generate large datasets [3, 34]. For example, it is well known that comparisons between DGGE profiles yield different results upon changing the tolerance parameter in any image analysis software program. Furthermore, comparing the similarities of the inter-gel standards (synthetic standards) used to align within and between gels is not a reliable indicator of precision since it is the bands of these very same standards that have already served as anchors for across-gel alignment. The utilization of a second, independent standard in each gel (reference standard) addresses the issue of not only evaluating the success of the alignment procedure but also of establishing the best tolerance parameters for use during band matching. Although including such a standard in large-scale molecular fingerprinting-based experiments has been recommended [9, 21], the potential benefits of this approach have been seriously underestimated.

The diagram obtained after plotting PC scores from the first three factors is indicative of an excellent alignment. However, a small amount of variation is still present since replicates from all types of standard, albeit very close to each other, are not completely superimposed. We suspect that this difference is caused mainly by intensity variation, though normalized, rather than misalignment. Nonetheless, because the new, aligned set of data is generated by a powerful statistical package, rather than an image analysis program, we can apply a wide variety of statistical approaches directly. This is not the case for the majority of commercial image analysis software packages that are somewhat restrictive in the statistical analysis they offer [2]. Moreover, our approach enables us to use whole DGGE profiles for applying dimension-reducing methods, such as PCA, rather than band presence/absence data and/or a corresponding database of intensity information. Band detection algorithms may detect small or insufficiently sharp bands inconsistently between profiles, or neglect them altogether as “noise.” Although “whole vector” analysis is common in other disciplines involving high dimensional data, such as analytical chemistry [35, 45], there are only a handful of instances [17, 23, 27] where this approach has been used to interrogate DGGE data.

In conclusion, we report the development of a two-step methodology to efficiently align across a large dataset produced after DGGE analysis of V3 PCR products originating from avian feces. Preparatory analysis of DGGE profiles by a commercially available image analysis software package (TL120) followed by a simple interpolation step in Matlab generated accurately aligned data, appropriately pretreated for multivariate analysis. We also demonstrate that successful application of this methodology relies on careful selection of appropriate standards. Our work not only improves the analytical procedures to extract good quality information from DGGE datasets but also demonstrates the so far underestimated potential of DGGE as a high-throughput profiling method. We finally suggest that published studies should always provide information on the steps taken to ensure that profiles from different gels can be meaningfully compared.