Introduction

With the advent of new techniques such as next-generation sequencing and microarray-based diagnostics, it is natural for clinicians to look for safe, cost-effective methods that appeal to all strata of clients and improve business. Customers increasingly prefer non-invasive methods of screening, diagnosis and hospital treatment, especially for infants and elderly people. In most hospital settings, however, drawing blood remains a common procedure, and many people shy away from routine tests because of the pain of the needle prick.

Several methods of buccal cell collection are in use, such as mouthwash, cytobrush and spit-type cards, but most are carried out by the subject under minimal or no supervision [14]. This is the first source of variability: some subjects scrape hard enough and others hardly at all, so the number of cells collected varies.

The second issue with buccal cells is DNA contamination [5]. Collection procedures strictly require that the subject rinse his or her mouth thoroughly before buccal cells are collected and refrain from eating for at least 2 hours before swabbing. An overwhelming number of people do not follow these procedures exactly, which leads to contamination of buccal samples with non-human DNA and other noise-producing artifacts. Food particles often remain in the mouth, contaminate the buccal cells and lead to erroneous results [6]. Quantitative PCR using human-specific primers can detect this problem, but using such DNA for large-scale genotyping is time consuming and challenging [7].

Research scientists are tempted to use buccal cell DNA for large-scale genotyping studies because it is easier to handle. For instance, an advantage of buccal cells or saliva over blood is that they can be dried on cards and sent by mail, which is especially useful when samples are collected for a multicentre study [3].

However, the results have to be evaluated carefully, locus by locus, to be accurate. This is often problematic because DNA derived from buccal cells is frequently degraded, which leads to poorer yields and inefficient primer-specific amplification and, in turn, to erroneous genotype calls or no calls at all [8, 9].

Very few reports are available on the quality of buccal DNA versus blood DNA in microarray experiments [10, 11]. One study examined how the lag time between collection and extraction affects the quality of genomic DNA [6].

Our study examines these issues from a practical standpoint, and we offer our opinion on the use of buccal cell DNA for important biological experiments.

Materials and Methods

Subjects

The subjects were enrolled from a larger study group on SNP association with a lipid disorder. Informed consent was obtained from all participants. The study complied with the Declaration of Helsinki, and the protocol was approved by the ethics committee of the University Malaya Medical Centre (UMMC) [Ref: 546.16].

Blood samples were collected from all participants in the larger study, but matched buccal and blood samples were collected from only 16 subjects for this experiment. Blood was drawn by a hospital nurse into a 10 ml purple-top EDTA Vacutainer tube. Each participant was asked to give a buccal swab from each cheek and was given two cotton swabs and two test tubes labeled with the patient ID. Participants were asked to scrape the inside of the cheek firmly six times with the swab. A standard operating procedure (SOP) for buccal cell collection, prepared by us, was given to each subject to read and follow. The samples were brought back to the laboratory and air dried for 20 min before processing.

DNA Extraction Method

Buccal

The buccal swab was placed in a 2 ml centrifuge tube and 400 μl of phosphate-buffered saline (pH 7.1) was added. The Qiagen protocol (QIAamp DNA Blood Mini Kit, Cat. no. 51106) was followed with minor modifications: the incubation time at 56°C was increased from 10 min to 20 min, and for elution 150 μl of buffer AE was added, incubated at room temperature for 10 min and then centrifuged at 8,000 rpm for 1 min. The eluted DNA was stored at 4°C until use.

Blood

Blood DNA was extracted following the manufacturer's protocol (QIAamp DNA Blood Mini Kit, Cat. no. 51106).

Agarose Gel Electrophoresis

A 6 μl aliquot of each DNA sample was electrophoresed on a 0.8% agarose gel containing GelRed at 60 V and constant current for 40 min in TBE buffer (0.089 M Tris base; 0.089 M boric acid; 0.002 M EDTA, disodium salt, dihydrate; final pH 8.3). A UVP imaging system (Model: ChemiDoc-It 410) coupled to an ultraviolet transilluminator was used to take a digital picture of the gel, and the quality of the DNA was evaluated (Fig. 1). DNA integrity was also assessed by visual inspection for degradation. Degradation appeared as fragmentation of the buccal cell DNA samples when compared against a known molecular weight marker (Gene Ruler 1 kb DNA ladder, Fermentas Life Sciences, Inc.) with visible bands of 10,000, 8,000, 6,000, 5,000, 4,000, 3,500, 3,000, 2,500, 2,000, 1,500, 1,000, 750, 500 and 250 bp. Degradation was observed in 3–4 buccal samples, evident from the long smear, in contrast to the sharp bands near the well for the blood samples. Samples 6 and 10 showed the most degradation. The gel indicated less DNA in the buccal samples than the high spectrophotometric readings suggested.

Fig. 1

Electrophoretic analysis of genomic DNA from blood and buccal cell samples. 6 μl of DNA was loaded on a 1% agarose gel stained with GelRed® and visualized using a GelDoc system. Top panel: blood DNA samples of the 16 subjects, identified by the numbers above the lanes. Bottom panel: buccal samples of the same 16 subjects. The DNA of subjects 6 and 10 is badly degraded. The DNA of the other subjects is of fairly good quality, but the amount is very low

DNA Yield and Integrity

The yield of the DNA was estimated using a BioRad spectrophotometer and the results have been tabulated (Table 1).

Table 1 Comparison of buccal and blood DNA yields and quality

Microarray Based Genotyping

Genotyping was performed on the Illumina platform using the GoldenGate Genotyping (GGGT) Assay, which can multiplex up to 1,536 SNPs in a single reaction. All assays were performed on a 32-array Universal BeadChip according to the manufacturer’s protocol and in compliance with MIAME guidelines [12, 13].

All raw intensity data from our custom GGGT microarray assays were loaded into Illumina GenomeStudio, which performs automated genotype clustering and calling and allows the data to be visualized for further analysis. The overall genotype call rate was 70% or above, with allelic data successfully generated for 84% of the SNP loci (1,292 of 1,536). All genotyped SNPs were assessed for deviation from Hardy–Weinberg equilibrium using GenomeStudio. Any sample with a call rate below 70% was discarded.
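The Hardy–Weinberg assessment in this study was performed inside GenomeStudio, and its internal procedure is not described here. As an illustration only, the following minimal Python sketch shows the standard one-degree-of-freedom chi-square test for deviation from Hardy–Weinberg equilibrium at a single biallelic SNP. The function name and the genotype counts are hypothetical, and the test is a textbook substitute rather than the software's own method.

```python
from math import erfc, sqrt

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 d.f.) for deviation from Hardy-Weinberg equilibrium.

    n_aa, n_ab, n_bb are the observed counts of the three genotypes at one
    biallelic SNP. Returns (chi-square statistic, p-value).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype call is required")
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p                       # frequency of allele B
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    # Skip classes with zero expectation (monomorphic SNPs).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 d.f.
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical genotype counts, for illustration only.
chi2, p_value = hwe_chi_square(25, 50, 25)
print(chi2, p_value)
```

With only 16 subjects, a test of this kind has little power, so it serves mainly as a coarse quality filter.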

The SNPs were first evaluated by cluster separation score and then inspected visually for call integrity. Genotyping success was judged by the GC score, which ranges from 0 to 1 and reflects how close a genotype's intensities lie, within the cluster plot, to the centroid of the nearest cluster. All genotypes with a GC score below 0.25 were considered failures.
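The two quality thresholds described above (a GC score cutoff of 0.25 per genotype and a call rate cutoff of 70% per sample) can be sketched in a few lines of Python. This is only an illustration of the filtering logic, assuming that GC scores have been exported per sample as a list in which a missing call is represented by None. The function names, sample identifiers and scores are hypothetical and are not taken from the study data.

```python
GC_SCORE_CUTOFF = 0.25   # genotypes below this GC score count as failures
CALL_RATE_CUTOFF = 0.70  # samples below this call rate are discarded

def call_rate(gc_scores, cutoff=GC_SCORE_CUTOFF):
    """Fraction of SNPs in one sample whose GC score reaches the cutoff.

    A value of None represents a SNP with no call at all.
    """
    if not gc_scores:
        raise ValueError("no SNPs supplied")
    called = sum(1 for s in gc_scores if s is not None and s >= cutoff)
    return called / len(gc_scores)

def passing_samples(samples, min_rate=CALL_RATE_CUTOFF):
    """Return {sample_id: call_rate} for samples meeting the minimum call rate."""
    kept = {}
    for sample_id, scores in samples.items():
        rate = call_rate(scores)
        if rate >= min_rate:
            kept[sample_id] = rate
    return kept

# Hypothetical GC scores for one blood and one buccal sample.
samples = {
    "blood_01": [0.80, 0.65, 0.10, 0.90],
    "buccal_01": [0.30, 0.20, None, 0.15],
}
print(passing_samples(samples))
```

In this toy input the blood sample has three of four genotypes at or above the cutoff and is retained, whereas the buccal sample has only one of four and is discarded.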

The objective of the study was to compare the yield and quality of DNA obtained from matched buccal swab and blood samples. In addition, the performance of the samples on the microarray was assessed.

Statistical Analysis

Pairwise statistical analysis was performed for buccal and blood samples. Since our ultimate objective was to compare genotype calls as a function of DNA quality, each result was scored as a call or a no call. SNPs that gave no call for blood were excluded before the statistical analysis, since blood is considered the reference sample type here.

Concordance of genotype calls between blood and buccal samples from the same individual was evaluated using percent concordance and the Kappa statistic, which measures the agreement between methods beyond that expected by chance. Percent concordance and Kappa statistics were calculated only among genotypes called in both samples being compared, excluding missing data, using an online tool (VassarStats, http://faculty.vassar.edu/lowry/VassarStats.html).
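The calculations in this study were done with the online tool cited above. For readers who wish to reproduce the same two metrics, the following is a minimal Python sketch of percent concordance and unweighted Cohen's kappa for one matched pair of samples. The function name and the example genotypes are hypothetical, and None stands for a no call. Only SNPs called in both samples enter the calculation.

```python
from collections import Counter

def concordance_and_kappa(blood_calls, buccal_calls):
    """Observed agreement and unweighted Cohen's kappa for paired genotype calls.

    The two lists hold the genotype of each SNP in the same order, with None
    for a no call. SNPs missing in either sample are excluded.
    """
    if len(blood_calls) != len(buccal_calls):
        raise ValueError("both samples must cover the same SNPs")
    pairs = [(a, b) for a, b in zip(blood_calls, buccal_calls)
             if a is not None and b is not None]
    n = len(pairs)
    if n == 0:
        raise ValueError("no SNP was called in both samples")
    observed = sum(1 for a, b in pairs if a == b) / n
    blood_freq = Counter(a for a, _ in pairs)
    buccal_freq = Counter(b for _, b in pairs)
    # Agreement expected by chance from the marginal genotype frequencies.
    expected = sum(blood_freq[g] * buccal_freq[g] for g in blood_freq) / n ** 2
    if expected == 1.0:
        kappa = 1.0  # both samples show a single identical genotype throughout
    else:
        kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa

# Hypothetical calls for six SNPs in one subject.
blood = ["AA", "AB", "BB", "AB", None, "AA"]
buccal = ["AA", "AB", "AB", "AB", "BB", None]
observed, kappa = concordance_and_kappa(blood, buccal)
print(f"concordance = {observed:.2%}, kappa = {kappa:.3f}")
```

Because missing calls are dropped before the comparison, a buccal sample with many no calls can still show high concordance. The call rate therefore has to be reported alongside these metrics.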

Results

We sought to assess the suitability of amplified buccal swab DNA for high-throughput genotyping through a blood-buccal comparison in the same subjects. DNA was successfully extracted from all blood and buccal samples provided by the volunteers. A total of 200 μl of blood from each subject and two buccal swabs were used for DNA extraction. The yield of DNA in each sample was estimated spectrophotometrically, which also gave the purity of the nucleic acid (Table 1).

The yield of blood DNA ranged from 30.36 to 100 μg/μl, as expected from the Qiagen kit manufacturer's claims. The buccal DNA yields, however, varied widely, from as low as 18 ng/μl to as high as 433 ng/μl. High DNA yields in buccal samples could be due to the presence of exogenous DNA. It is therefore advisable to use more starting material for buccal samples to avoid inferior-quality microarray results.

The DNA was loaded on a 0.8% agarose gel to check integrity, degradation and RNA contamination (Fig. 1).

The purity of blood DNA was also good compared with that of buccal DNA. The average purity of buccal DNA was only 1.3 (95% CI 0.67–1.95) (Table 1).

Of the 16 buccal DNA samples, all except four were genotyped successfully (75% genotypes called). In the blood samples, an average of 71.22% of SNPs could be genotyped, whereas in the buccal samples the average was only 56.89%.

A statistical score, the GenTrain score, is derived from the shapes of the clusters and their relative distances from one another, combined with several penalty terms (for example, for low intensity or for a mismatch between existing and predicted clusters). The GenTrain score, along with the cluster positions and shapes for each SNP, is saved for use by the calling algorithm. The theta value, θ = (2/π) tan⁻¹(Cy5/Cy3), indicates the allelic angle. Theta values near 0 (left side of the graph) are homozygotes for allele A and theta values near 1 (right side of the graph) are homozygotes for allele B; heterozygotes fall between these two groups. R is the signal intensity. Genoplots of four SNPs from a matched buccal and blood sample with a p50 GC ratio of 0.50:0.53 are shown to demonstrate a failed SNP call (Fig. 2; Table 2).
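To make the polar coordinates of a genoplot concrete, the following short Python sketch computes theta from the two channel intensities using the formula given above. The text defines R only as the signal intensity, so the sketch assumes, for illustration, that R is the sum of the two channel intensities. The function name and the intensity values are hypothetical.

```python
from math import atan2, pi

def theta_and_r(cy3, cy5):
    """Polar genoplot coordinates from two non-negative channel intensities.

    theta = (2/pi) * arctan(Cy5/Cy3): 0 for pure Cy3 signal (allele A
    homozygote), 1 for pure Cy5 signal (allele B homozygote).
    R is taken here as Cy3 + Cy5, which is an assumption of this sketch.
    """
    if cy3 < 0 or cy5 < 0 or (cy3 == 0 and cy5 == 0):
        raise ValueError("intensities must be non-negative and not both zero")
    # atan2 equals arctan(cy5 / cy3) for non-negative inputs and also
    # handles cy3 == 0 without a division by zero.
    theta = (2 / pi) * atan2(cy5, cy3)
    r = cy3 + cy5
    return theta, r

# Hypothetical intensities: two homozygotes, a heterozygote and a weak signal.
for cy3, cy5 in [(1000, 0), (0, 1000), (500, 500), (30, 25)]:
    theta, r = theta_and_r(cy3, cy5)
    print(f"Cy3={cy3:5d} Cy5={cy5:5d} theta={theta:.2f} R={r}")
```

The last input illustrates the situation described for the failed buccal calls: theta can still be computed, but R is so low that the point lies far from every cluster and the calling algorithm cannot assign a genotype with confidence.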

Fig. 2

Genoplots of four SNPs from a matched buccal and blood sample. For the buccal calls, the R values are very low, so the GenomeStudio software could not determine a valid call

Table 2 Quality metrics of four SNPs from buccal and blood samples from the same subject

To assess the reliability of the genotyping experiments, the percentage of agreement (i.e., genotype concordance) and unweighted Cohen’s kappa statistic (i.e., percentage of agreement above and beyond chance alone) were calculated. Genotypes of paired buccal and blood samples were compared with one another. Only called genotypes were used for the comparisons; all missing data were excluded. The values of the calculated metrics are listed in Table 3.

Table 3 Kappa statistics for the matched blood and buccal DNA

Discussion

This study showed that the quality of DNA derived from buccal cells is not consistent between individuals. Although a member of the laboratory staff supervised the buccal swabbing, it was hard to establish at that moment whether the swabbing pressure and technique were adequate. In some samples DNA degradation was more pronounced than in others. An adequate quality check should therefore be done before microarray hybridization. To avoid such failures, it would be more prudent to use blood DNA rather than unpredictable buccal DNA for expensive methodologies like microarrays.

Buccal cell collection was initially put forward as an efficient and cost-effective means of DNA collection. Apart from being non-invasive, the cells can easily be collected on FTA cards and sent by mail [2, 4].

Rinsing the mouth thoroughly before collecting buccal cells is very important, since the chance of contaminant DNA being present is very high if the mouth is inadequately rinsed. The quality of DNA from the mouth varies from person to person, and in some cases the DNA is more prone to early degradation. At least 10% of saline mouthwash samples were degraded in a high myopia study [14]. The cells recovered from the mouth are often superficial ones undergoing apoptosis: about 30% of the cells collected from healthy subjects with non-inflammatory mucosa are apoptotic [15]. Diet also plays an important role in defining an individual's oral flora, and lifestyle habits lead to differences in desquamation of the oral mucosa [16]. The oral flora is extremely diverse and subject specific, depending on individual habits such as diet, eating, brushing and smoking [17, 18]. Smoking affects the oral mucosa in various ways. Cancer of the oral cavity usually begins with irritation caused by smoked tobacco products; these irritants cause white lesions. Smoking can also cause abnormalities of the tongue, gums, oral mucosa, teeth and palate, in the form of nicotine stomatitis and fungal infections [19, 20]. All of these factors can cause DNA damage [21, 22].

DNA should be checked for contaminating non-human DNA by a suitable method. For buccal DNA, we recommend increasing the total amount of DNA used as starting material to a much larger amount, at least greater than 250 ng, in line with Illumina's recommendations. More buccal swabs can be taken from each individual for this purpose. Blood has the advantage that the DNA yield is good and, if an experiment fails, enough reserve sample is available to repeat it. This is not possible with buccal samples, which matters especially for cost-intensive and sensitive methods like microarrays. Our conclusion is that buccal cell DNA is not a suitable alternative to blood for expensive genotyping experiments, and that it is not worth compromising the results to save a few dollars and obtain a larger sample size.