Ultra-Low Coverage Sequencing as the Most Accurate Library Quantification Method Prior to Target Sequencing

Krasnenko, A. Yu.; Stetsenko, I. F.; Klimchuk, O. I.; Demkin, V. V.; Rakitko, A. S.; Surkova, E. I.; Druzhilovskaya, O. S.

doi:10.3103/S089141681902006X

EXPERIMENTAL WORKS
Published: 14 October 2019

Volume 34, pages 118–123, (2019)

Molecular Genetics, Microbiology and Virology

A. Yu. Krasnenko¹,²,³,
I. F. Stetsenko¹,³,
O. I. Klimchuk¹,
V. V. Demkin⁴,
A. S. Rakitko¹,³,⁵,
E. I. Surkova¹ &
O. S. Druzhilovskaya³

Abstract

Accurate library quantification is essential for post-pooling capture target sequencing. A number of methods are available to quantify libraries prior to sequencing, but no gold standard for library quantification exists. In this study, we compared common library quantification methods (LabChip, Qubit 3.0, qPCR with three primer sets) with ultra-low coverage sequencing (MiSeq with and without insert size correction). Cost, time and quantification accuracy were considered. We found that Qubit and MiSeq were better than qPCR and LabChip at predicting the final concentration. We also found that MiSeq with insert size correction was the most accurate method for library quantification prior to target sequencing. This method makes it possible to correct for shifts in the ratios between samples caused by enrichment. Ultra-low coverage sequencing by Illumina MiSeq is the most accurate method for library quantification prior to pooling and post-pooling target enrichment.

INTRODUCTION

Accurate equimolar pooling is important for the equal distribution of reads among samples in a single batch [1]. Unequal combination of libraries leads to the biased representation of certain libraries over others. Underrepresented libraries need to be resequenced, costing time and money. Overrepresented libraries generate more sequence data than required, wasting sequencing capacity and decreasing the number of samples per batch. Given a fixed price per sequencing run, it is economically sound to pool as many samples as possible in each target sequencing run at concentrations that are as close to equal as possible.

Post-pooling target enrichment is more cost-effective than pre-pooling enrichment, but it can cause unpredictable shifts in the ratios between samples in the same enrichment batch. Bacterial contamination of initial samples (e.g., extracted from saliva samples), differences in the library insert length distribution and many other factors cannot be simultaneously considered using common library quantification methods.

The current methods for DNA library quantification use a variety of techniques, including UV absorption (e.g., Nanodrop, Thermo Fisher Scientific, USA) [2, 3], intercalating dyes (e.g., Qubit, Invitrogen, USA) [4, 5], capillary electrophoresis (Agilent Bioanalyzer 2100, Agilent Technologies Inc, USA) [6], 5'-hydrolysis probes (e.g., Taqman probes) coupled with quantitative PCR (e.g., qPCR assays by Roche) [7, 8] and droplet digital emulsion PCR (ddPCR, Bio-Rad Inc, USA) [9]. These common methods have several limitations and may provide inaccurate results [10]. For example, UV spectrophotometers detect not only DNA but also other UV-absorbing materials such as RNA, protein and phenol, and they are not sensitive enough to detect small amounts of DNA [11]. Fluorometric methods that detect only double-stranded DNA, such as Qubit, can overestimate the actual concentration of a library because the dyes also bind to partially ligated double-stranded libraries and adapter dimers. PicoGreen also binds to dsDNA, but this method is not specific for human DNA; any animal, bacterial or fungal DNA co-purified with the human DNA of interest contributes to the final reading and can give a falsely high DNA quantification. Several studies indicate that qPCR is the most effective method for library quantification [12–16].

Because the economic outcome of post-pooling capture target sequencing experiments depends on the library quantification accuracy, it is crucial to choose the most accurate, reliable and reproducible method. In this study, we compared several library quantification methods by their accuracy and cost to finally select the best method for library quantification prior to pooling before target capture and Illumina sequencing.

MATERIALS AND METHODS

DNA was extracted from both blood and saliva samples of patients using the QIAamp DNA Mini Kit (Qiagen, Germany) according to the manufacturer’s instructions. All samples were obtained with informed consent. DNA libraries were prepared using the NEBNext Ultra DNA Library Prep Kit for Illumina (New England Biolabs, MA, USA). The study was designed as a comparison of several techniques used for the quantification of libraries prior to pooling and target sequencing, including LabChip (PerkinElmer Inc., MA, USA), Qubit 3.0 (Thermo Fisher Scientific, MA, USA), several qPCR approaches, and Illumina MiSeq (with and without insert size correction according to our study [17]).

Qubit 3.0

Quantification using Qubit 3.0 was performed according to the manufacturer’s recommended protocol.

LabChip

Quantification using LabChip was performed according to the manufacturer’s recommended protocol. We estimated library concentrations for fragment sizes ranging from 200 to 1000 bp. This allowed us to exclude fragments that were too short or too long. Fragments that are too short drop out during enrichment, while fragments that are too long do not participate in sequencing because of the peculiarities of cluster generation during Illumina sequencing.
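For readers who want to turn a mass concentration into pooling volumes, the following is a minimal sketch and not the protocol used in this study. It assumes the common approximation of 660 g/mol per base pair of double-stranded DNA; the function names and the example values are hypothetical.

```python
# Average mass of one base pair of double-stranded DNA in g/mol
# (a widely used approximation, assumed here).
BP_MASS_G_PER_MOL = 660.0


def molarity_nM(conc_ng_per_ul: float, mean_fragment_bp: float) -> float:
    """Convert a dsDNA mass concentration (ng/uL) to molarity (nM)."""
    return conc_ng_per_ul * 1e6 / (BP_MASS_G_PER_MOL * mean_fragment_bp)


def equimolar_volumes_ul(libraries: dict, fmol_per_library: float) -> dict:
    """Volume (uL) of each library that contains the same molar amount.

    `libraries` maps a library name to (concentration in ng/uL,
    mean fragment size in bp).
    """
    volumes = {}
    for name, (conc, size_bp) in libraries.items():
        # 1 nM equals 1 fmol/uL, so volume = amount / molarity.
        volumes[name] = fmol_per_library / molarity_nM(conc, size_bp)
    return volumes


# Hypothetical example: two libraries with the same mean fragment size.
example = {"lib_a": (6.6, 1000.0), "lib_b": (13.2, 1000.0)}
print(equimolar_volumes_ul(example, fmol_per_library=20.0))
```

The more concentrated library contributes a proportionally smaller volume, which is the sense in which any error in the measured concentration propagates directly into the pooled ratio.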

qPCR Quantification

Library quantification was performed using the StepOnePlus Real-Time PCR System (Thermo Fisher Scientific, MA, USA) with SYBR Green I. The cycling conditions were 95°C for 5 min followed by 45 cycles of 95°C for 20 s, 62°C for 20 s and 72°C for 55 s. We used the following amplification primers:

(1) P5/P7—This primer set was used to detect Illumina-compatible libraries irrespective of their insert size and sequence. This is the most common principle for quantifying sequencing libraries, such as in the QIAseq™ Library Quant Assay Kit, NEBNext Library Quant Kit for Illumina, KAPA Library Quantification Kit Illumina platforms, PerfeCTa NGS Library Quantification Kit for Illumina and other commercially available kits.

P5 AATGATACGGCGACCACCGA

P7 CAAGCAGAAGACGGCATACGAGAT

(2) GHRf/GHRr—Both primers anneal to the human GHR gene; thus, we detected the amount of human DNA irrespective of the presence of Illumina sequencing adapters.

GHRf CCCCTCTAAGGAGTGTAGCA

GHRr CTTTTGGTGCCTGGTAAGTT

(3) P5/GHRf—The P5 primer anneals to the Illumina adapter, and GHRf anneals to the human GHR gene. This allowed us to detect Illumina-compatible library fragments containing GHR gene fragments.

P5 AATGATACGGCGACCACCGA

GHRf CCCCTCTAAGGAGTGTAGCA
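The text does not state how Ct values were converted to concentrations. One common approach is a standard curve built from a dilution series, sketched below under that assumption. The function names are hypothetical, and the dilution series would have to come from a standard of known concentration.

```python
import math


def fit_standard_curve(concentrations, ct_values):
    """Least-squares fit of Ct = slope * log10(concentration) + intercept."""
    xs = [math.log10(c) for c in concentrations]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amplification_efficiency(slope):
    """Efficiency implied by the slope; 1.0 means doubling every cycle."""
    return 10 ** (-1.0 / slope) - 1.0


def concentration_from_ct(ct, slope, intercept):
    """Invert the standard curve for an unknown sample."""
    return 10 ** ((ct - intercept) / slope)
```

The concentration returned is in the same units as the standards. With the P5/P7 primer set this reflects adapter-bearing molecules, whereas the GHRf/GHRr set reflects human DNA regardless of adapters, so the three curves are not expected to agree for the same library.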

Ultra-Low Coverage Illumina Sequencing

The libraries were sequenced using MiSeq [18] with paired-end (PE) reads of 150 bp on average. Only reads that mapped to the human genome were counted. We then calculated the relative concentration of each sample in the pool from its share of the mapped reads.
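The relative concentration described above can be sketched as a simple normalization of per-sample mapped read counts. The function name and the counts in the example are hypothetical.

```python
def relative_concentrations(mapped_reads: dict) -> dict:
    """Share of human-mapped reads contributed by each sample in the pool."""
    total = sum(mapped_reads.values())
    if total == 0:
        raise ValueError("no mapped reads in the pool")
    return {sample: count / total for sample, count in mapped_reads.items()}


# Hypothetical mapped read counts from a low-depth run.
print(relative_concentrations({"S1": 3000, "S2": 1000}))
```

Because only reads mapped to the human genome are counted, non-human DNA (for example bacterial DNA in saliva-derived samples) does not inflate a sample's share.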

Ultra-Low Coverage Illumina Sequencing with Insert Size Correction

Fragments with different insert lengths are enriched with different efficiencies [17, 19]. Therefore, we corrected the number of reads obtained for each sample by MiSeq sequencing using coefficients reflecting the enrichment efficiency of fragments with specific lengths.
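A minimal sketch of such a correction is given below. It assumes that each mapped fragment is weighted by an enrichment coefficient chosen by its insert length. The bin boundaries and coefficient values are purely illustrative placeholders; they are not the coefficients from [17].

```python
from bisect import bisect_right

# Hypothetical insert-length bins (bp) and enrichment coefficients.
# Bins: <150, 150-249, 250-349, 350-449, >=450.
BIN_EDGES_BP = [150, 250, 350, 450]
BIN_EFFICIENCY = [0.4, 0.8, 1.0, 0.9, 0.6]


def efficiency(insert_bp: int) -> float:
    """Look up the (illustrative) enrichment coefficient for one fragment."""
    return BIN_EFFICIENCY[bisect_right(BIN_EDGES_BP, insert_bp)]


def corrected_fractions(insert_sizes_by_sample: dict) -> dict:
    """Predicted share of each sample after enrichment.

    Each mapped fragment counts with the weight of its length bin instead
    of counting as exactly one read.
    """
    weighted = {
        sample: sum(efficiency(size) for size in sizes)
        for sample, sizes in insert_sizes_by_sample.items()
    }
    total = sum(weighted.values())
    return {sample: w / total for sample, w in weighted.items()}
```

Two samples with the same number of mapped reads then receive different predicted shares if one of them has more fragments in a poorly enriched length range.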

Post-Pooling Capture and Target Sequencing

After quantification, the libraries were pooled, and enrichment was performed with SureSelectXT2 Focused Exome (Agilent Technologies, CA, USA). Exome sequencing was performed using a HiSeq 2500 (Illumina, CA, USA). Reads were filtered and mapped to the human genome. The final distribution of reads was taken as the reference standard, as the purpose of this work was to determine which method most accurately predicts the data output from exome sequencing.

Statistical Analysis

We log-transformed the data to reduce skewness and applied the Shapiro–Wilk test to confirm that the data were normally distributed after outlier removal. We used Student’s t-test to check for bias. To estimate accuracy, we compared the quantification results from each studied method with the HiSeq results. The associations between the relative HiSeq concentration and each quantification method were evaluated by Pearson correlation and linear regression.
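The comparison against the HiSeq reference can be sketched with the standard library alone, as below. This is an illustration under assumptions rather than the analysis code of the study: the paired form of the t statistic is assumed, p-values are not computed, and the Shapiro–Wilk test is omitted because it is not in the Python standard library (it is available as `scipy.stats.shapiro`). All function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def linear_fit(x, y):
    """Ordinary least-squares slope and intercept of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def paired_t_statistic(x, y):
    """t statistic for the mean paired difference (x - y) being zero."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    mean = sum(d) / n
    var = sum((v - mean) ** 2 for v in d) / (n - 1)
    return mean / math.sqrt(var / n)


def compare_with_reference(predicted, reference):
    """Log-transform both series, then compute correlation, fit and bias."""
    lp = [math.log(v) for v in predicted]
    lr = [math.log(v) for v in reference]
    r = pearson_r(lp, lr)
    slope, intercept = linear_fit(lp, lr)
    return {
        "r": r,
        "r_squared": r * r,
        "slope": slope,
        "intercept": intercept,
        "t_bias": paired_t_statistic(lp, lr),
    }
```

For a simple linear regression with one predictor, the coefficient of determination equals the square of the Pearson correlation, which is why both analyses rank the methods in the same order.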

RESULTS

In this study, we compared several methods for library quantification, including LabChip, Qubit 3.0, qPCR with three primer sets, Illumina MiSeq and Illumina MiSeq with insert size correction. For each method, we analyzed the accuracy (Fig. 1), cost per sample and time (Table 1).

Table 1. Comparison of cost and time for the studied library quantification methods

We used the library concentration determined by HiSeq as the reference library concentration. All methods were compared by their ability to predict this concentration. A correlation analysis revealed that for four quantification methods (GHR qPCR, Qubit, MiSeq and MiSeq with insert size correction) the p-value was below 0.05, indicating a statistically significant association with the HiSeq concentration (Fig. 1).

Generally, Qubit and MiSeq were better than qPCR and LabChip at predicting the final concentration. Thus, these methods were chosen for further comparison.

In the additional investigation, the data from Qubit and MiSeq were analyzed using linear regression. The best correlation with HiSeq was revealed for MiSeq with insert size correction (R² = 85.63%, P < 0.001). There was a strong correlation between HiSeq and MiSeq data without insert size correction (R² = 80.48%, P < 0.001) and Qubit (R² = 81.12%, P < 0.001).

By comparing the accuracy of the different quantification methods, we revealed that MiSeq with insert size correction was the most accurate method for library quantification prior to post-pooling capture exome sequencing.

DISCUSSION

The instruments available for library quantification vary in accuracy, reproducibility and sensitivity, as well as in labor intensity, speed and cost. A reliable and accurate quantification strategy permits investigators to fully utilize sequencer capacity, reducing the costs of sequencing even further. Moreover, the basic chemistry of NGS requires that a narrow input range of library fragments be prepared for sequencing.

Many studies have compared different NGS library quantification methods, with contradictory results [16, 20–25]. Hussing and colleagues quantified dsDNA oligos and found that the BioAnalyzer, TapeStation and Qubit instruments gave concentrations closest to the expected values [21]. Katsuoka and colleagues showed that MiSeq works as an effective quantification method, but the authors did not compare MiSeq with other methods [22]. To our knowledge, there has been no comparative analysis of methods for library quantification prior to pooling before target capture and Illumina sequencing.

To identify the most accurate and suitable method for library quantification prior to target sequencing, we compared four quantification platforms: LabChip, Qubit, quantitative PCR (qPCR) with three primer sets, and Illumina MiSeq. Quantification using MiSeq was performed in two ways, with and without insert size correction. In total, we applied seven approaches to predict the number of reads per library and compared these predictions with the HiSeq data. We found that the MiSeq data correlated strongly with the HiSeq data, and most strongly when insert size correction was applied. This was confirmed by the linear regression analysis.

In addition to the library quantification itself, low-depth MiSeq sequencing allows us to determine the library insert size distribution in high detail. We have previously shown that this distribution affects the library enrichment efficiency and therefore the relative library representation in the resulting enriched pool [17]. Accounting for the differences in enrichment efficiency caused by the insert length distribution allowed us to further improve the accuracy with which the library concentration in the final pool was predicted.

When comparing the cost and time required for the different methods, MiSeq is costlier and more time consuming than the other quantification methods. However, more hands-on time and a higher price for more accurate quantification may be preferred compared to a higher risk of large variations in library coverage, especially in clinical and forensic genetic laboratories.

Using MiSeq to quantify NGS libraries decreases overall sequencing costs by ensuring accurate quantification upfront, which minimizes the need to re-run or repeat the sequencing of samples. Nevertheless, our work also found results of comparable quality from the Qubit assay, suggesting that this method can be used when the library is clean and homogeneous and has no primer dimer problems.

CONCLUSIONS

In summary, this work offers a comparative analysis of library quantification methods and reveals that MiSeq sequencing is the most accurate, reliable and reproducible method for library quantification prior to post-pooling capture target sequencing.

REFERENCES

1. Sham, P., Bader, J., Craig, I., O’Donovan, M., and Owen, M., DNA pooling: a tool for large-scale association studies, Nat. Rev. Genet., 2002, vol. 3, no. 11, pp. 862–871. https://doi.org/10.1038/nrg930
2. McGown, E., UV absorbance measurements of DNA in microplates, BioTechniques, 2000, vol. 28, no. 1, pp. 60–64. https://doi.org/10.2144/00281bm11
3. Ponti, G., Maccaferri, M., Manfredini, M., Kaleci, S., Mandrioli, M., Pellacani, G., Ozben, T., Depenni, R., Bianchi, G., Pirola, G., and Tomasi, A., The value of fluorimetry (Qubit) and spectrophotometry (NanoDrop) in the quantification of cell-free DNA (cfDNA) in malignant melanoma and prostate cancer patients, Clin. Chim. Acta, 2018, vol. 479, pp. 14–19. https://doi.org/10.1016/j.cca.2018.01.007
4. Ahn, S., PicoGreen quantitation of DNA: Effective evaluation of samples pre- or post-PCR, Nucleic Acids Res., 1996, vol. 24, no. 13, pp. 2623–2625.
5. Vitzthum, F., Geiger, G., Bisswanger, H., Brunner, H., and Bernhagen, J., A quantitative fluorescence-based microplate assay for the determination of double-stranded DNA using SYBR Green I and a standard ultraviolet transilluminator gel imaging system, Anal. Biochem., 1999, vol. 276, no. 1, pp. 59–64. https://doi.org/10.1006/abio.1999.4298
6. Panaro, N.J., Yuen, P.K., Sakazume, T., Fortina, P., Kricka, L.J., and Wilding, P., Evaluation of DNA fragment sizing and quantification by the Agilent 2100 Bioanalyzer, Clin. Chem., 2000, vol. 46, no. 11, pp. 1851–1853.
7. Bunce, M., Oskam, C.L., and Allentoft, M.E., Quantitative real-time PCR in aDNA research, Methods Mol. Biol., 2012, vol. 840, pp. 121–132. https://doi.org/10.1007/978-1-61779-516-9_16
8. Mardis, E. and McCombie, R.W., Library quantification using SYBR Green quantitative polymerase chain reaction (qPCR), Cold Spring Harbor Protoc., 2017, no. 6. https://doi.org/10.1101/pdb.prot094714
9. Aigrain, L., Gu, Y., and Quail, M.A., Quantitation of next generation sequencing library preparation protocol efficiencies using droplet digital PCR assays – a systematic comparison of DNA library preparation kits for Illumina sequencing, BMC Genomics, 2016, vol. 17, p. 458. https://doi.org/10.1186/s12864-016-2757-4
10. Haque, K.A., Pfeiffer, R.M., Beerman, M.B., Struewing, J.P., Chanock, S.J., and Bergen, A.W., Performance of high-throughput DNA quantification methods, BMC Biotechnol., 2003, vol. 3, p. 20. https://doi.org/10.1186/1472-6750-3-20
11. Nielsen, K., Mogensen, H.S., Hedman, J., Niederstätter, H., Parson, W., and Morling, N., Comparison of five DNA quantification methods, Forensic Sci. Int.: Genet., 2008, vol. 2, pp. 226–230. https://doi.org/10.1016/j.fsigen.2008.02.008
12. Buehler, B., Hogrefe, H.H., Scott, G., Ravi, H., Pabón-Peña, C., O’Brien, S., Formosa, R., and Happe, S., Rapid quantification of DNA libraries for next generation sequencing, Methods, 2010, vol. 50, no. 4, pp. S15–S18. https://doi.org/10.1016/j.ymeth.2010.01.004
13. Dang, J., Mendez, P., Lee, S., Kim, J.W., Yoon, J.H., Kim, T.W., Sailey, C.J., Jablons, D.M., and Kim, I.J., Development of a robust DNA quality and quantity assessment qPCR assay for targeted next-generation sequencing library preparation, Int. J. Oncol., 2016, vol. 49, no. 4, pp. 1755–1765. https://doi.org/10.3892/ijo.2016.3654
14. Hussing, C., Kampmann, M.L., Mogensen, H.S., Børsting, C., and Morling, N., Quantification of massively parallel sequencing libraries – a comparative study of eight methods, Sci. Rep., 2018, vol. 8, no. 1, p. 1110. https://doi.org/10.1038/s41598-018-19574-w
15. Meyer, M., Briggs, A.W., Maricic, T., Höber, B., Höffner, B., Krause, J., Weihmann, A., Pääbo, S., and Hofreiter, M., From micrograms to picograms: quantitative PCR reduces the material demands of high-throughput sequencing, Nucleic Acids Res., 2008, vol. 36, no. 1, p. e5. https://doi.org/10.1093/nar/gkm1095
16. Robin, J.D., Ludlow, A.T., La Ranger, R., Wright, W.E., and Shay, J.W., Comparison of DNA quantification methods for next generation sequencing, Sci. Rep., 2016, vol. 6, p. 24067. https://doi.org/10.1038/srep24067
17. Krasnenko, A., Tsukanov, K., Stetsenko, I., Klimchuk, O., Plotnikov, N., Surkova, E., and Ilinsky, V., Effect of DNA insert length on whole-exome sequencing enrichment efficiency: an observational study, Adv. Genomics Genet., 2018, vol. 8, pp. 13–15. https://doi.org/10.2147/AGG.S162531
18. Liu, L., Li, Y., Li, S., Hu, N., He, Y., Pong, R., Lin, D., Lu, L., and Law, M., Comparison of next-generation sequencing systems, J. Biomed. Biotechnol., 2012, vol. 2012, p. 251364. https://doi.org/10.1155/2012/251364
19. Head, S.R., Komori, H.K., LaMere, S.A., Whisenant, T., Van Nieuwerburgh, F., Salomon, D.R., and Ordoukhanian, P., Library construction for next-generation sequencing: overviews and challenges, Biotechniques, 2014, vol. 56, no. 2, pp. 61–64, 66, 68. https://doi.org/10.2144/000114133
20. Brzobohatá, K., Drozdová, E., Smutný, J., Zeman, T., and Beňuš, R., Comparison of suitability of the most common ancient DNA quantification methods, Genet. Test. Mol. Biomarkers, 2017, vol. 21, no. 4, pp. 265–271. https://doi.org/10.1089/gtmb.2016.0197
21. Hussing, C., Kampmann, M.L., Mogensen, H.S., Børsting, C., and Morling, N., Comparison of techniques for quantification of next-generation sequencing libraries, Forensic Sci. Int.: Genet. Suppl. Ser., 2015, vol. 5, pp. e276–e278. https://doi.org/10.1016/j.fsigss.2015.09.110
22. Katsuoka, F., Yokozawa, J., Tsuda, K., Ito, S., Pan, X., Nagasaki, M., Yasuda, J., and Yamamoto, M., An efficient quantitation method of next-generation sequencing libraries by using MiSeq sequencer, Anal. Biochem., 2014, vol. 466, pp. 27–29. https://doi.org/10.1016/j.ab.2014.08.015
23. Laurie, M.T., Bertout, J.A., Taylor, S.D., Burton, J.N., Shendure, J.A., and Bielas, J.H., Simultaneous digital quantification and fluorescence-based size characterization of massively parallel sequencing libraries, Biotechniques, 2013, vol. 55, no. 2, pp. 61–67. https://doi.org/10.2144/000114063
24. Nakayama, Y., Yamaguchi, H., Einaga, N., and Esumi, M., Pitfalls of DNA quantification using DNA-binding fluorescent dyes and suggested solutions, PLoS One, 2016, vol. 11, no. 3, p. e0150528. https://doi.org/10.1371/journal.pone.0150528
25. White, R.A. III, Blainey, P.C., Fan, H.C., and Quake, S.R., Digital PCR provides sensitive and absolute calibration for high throughput sequencing, BMC Genomics, 2009, vol. 10, p. 116. https://doi.org/10.1186/1471-2164-10-116

ACKNOWLEDGMENTS

Not applicable.

Funding

This work was supported by the Ministry of Science and Higher Education of the Russian Federation (ID RFMEFI60716X0152).

Author information

Authors and Affiliations

1. Genotek Ltd., 105120, Moscow, Russia
A. Yu. Krasnenko, I. F. Stetsenko, O. I. Klimchuk, A. S. Rakitko & E. I. Surkova
2. Pirogov Russian National Research Medical University, 117997, Moscow, Russia
A. Yu. Krasnenko
3. Vavilov Institute of General Genetics, 119333, Moscow, Russia
A. Yu. Krasnenko, I. F. Stetsenko, A. S. Rakitko & O. S. Druzhilovskaya
4. Institute of Molecular Genetics, 123182, Moscow, Russia
V. V. Demkin
5. Moscow State University, Faculty of Mechanics and Mathematics, 119991, Moscow, Russia
A. S. Rakitko

Contributions

AYu, IF, OI, VV, AS, EI and OS met the International Committee of Medical Journal Editors (ICMJE) criteria for authorship. AYu, IF, OI, VV, AS and EI contributed to data collection and the first draft of the manuscript. AS also carried out the statistical analysis. OS was the mentor. All authors read and approved the final manuscript.

Corresponding author

Correspondence to E. I. Surkova.

Ethics declarations

COMPLIANCE WITH ETHICAL STANDARDS

All research was approved by the ethics committee of Genotek Ltd. (08/2018). The patients or the patients’ parents gave written informed consent to the studies and to the publication of sequencing data.

CONFLICT OF INTEREST

AYu, IF, OI, AS and EI are employees of a commercial organization, Genotek Ltd. The authors declare that they have no other competing interests.

About this article

Cite this article

Krasnenko, A.Y., Stetsenko, I.F., Klimchuk, O.I. et al. Ultra-Low Coverage Sequencing as the Most Accurate Library Quantification Method Prior to Target Sequencing. Mol. Genet. Microbiol. Virol. 34, 118–123 (2019). https://doi.org/10.3103/S089141681902006X

Received: 21 December 2018
Revised: 21 December 2018
Accepted: 15 April 2019
Published: 14 October 2019
Issue Date: April 2019
DOI: https://doi.org/10.3103/S089141681902006X
