Abstract
Systematic integration of microarrays from different sources increases the statistical power to detect differentially expressed genes and allows heterogeneity between studies to be assessed. The challenge, however, lies in designing and implementing efficient analytic methodologies for combining data generated by different research groups and platforms. The most widely used strategy integrates preprocessed data without access to the original raw data that yielded the initial results. A main disadvantage of this strategy is that the quality of the data sets may be highly variable, yet this information is ignored during integration.
We have recently proposed a quality-weighting strategy for integrating Affymetrix microarrays. The quality measure is a function of the detection p-values, which indicate whether or not a transcript is reliably detected on an Affymetrix GeneChip. In this study, we compare the proposed quality-weighted strategy with the traditional quality-unweighted strategy, and examine how the quality weights influence two commonly used meta-analysis methods: combining p-values and combining effect size estimates. The methods are compared on a real data set for identifying biomarkers for lung cancer.
Our results show that the proposed quality-weighted strategy can lead to larger statistical power for identifying differentially expressed genes when integrating data from Affymetrix microarrays.
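As a rough illustration of how quality weights could enter the two meta-analysis approaches named above, the following sketch uses a weighted Z (Stouffer-type) method for combining p-values and a fixed-effect weighted average for combining effect sizes. These are generic textbook forms chosen for illustration, not necessarily the formulas used in the article; the function names and the per-study quality scores in [0, 1] are hypothetical, and the mapping from detection p-values to quality scores is not shown.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def weighted_stouffer(p_values, weights):
    """Combine one-sided p-values for one gene with a weighted Z method.

    Assumption: each p-value lies strictly between 0 and 1, and `weights`
    holds one non-negative quality score per study (hypothetical input).
    """
    z = [_STD_NORMAL.inv_cdf(1.0 - p) for p in p_values]
    numerator = sum(w * zi for w, zi in zip(weights, z))
    denominator = sqrt(sum(w * w for w in weights))
    return 1.0 - _STD_NORMAL.cdf(numerator / denominator)


def weighted_effect_size(effects, variances, quality):
    """Fixed-effect combined estimate with weights = quality / variance.

    Setting every quality score to 1 recovers the usual inverse-variance
    (quality-unweighted) estimate.
    """
    w = [q / v for q, v in zip(quality, variances)]
    total = sum(w)
    estimate = sum(wi * d for wi, d in zip(w, effects)) / total
    # Variance of a weighted mean of independent estimates.
    variance = sum(wi * wi * v for wi, v in zip(w, variances)) / total ** 2
    return estimate, variance


if __name__ == "__main__":
    # Two studies, the second judged to be of lower quality (made-up numbers).
    print(weighted_stouffer([0.01, 0.20], [1.0, 0.4]))
    print(weighted_effect_size([0.8, 0.3], [0.05, 0.05], [1.0, 0.4]))
```

In this sketch a study with a quality score of 0 contributes nothing to either combined statistic, so a low-quality data set is down-weighted rather than discarded outright.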
Hu, P., Greenwood, C.M.T. & Beyene, J. Statistical Methods for Meta-Analysis of Microarray Data: A Comparative Study. Inf Syst Front 8, 9–20 (2006). https://doi.org/10.1007/s10796-005-6099-z
DOI: https://doi.org/10.1007/s10796-005-6099-z