Abstract
Haplotype inference by pure parsimony (HIPP) is a well-known paradigm for haplotype inference. To assess the biological significance of this paradigm, we generalize HIPP to the problem of finding all optimal solutions, which we call complete HIPP. We study intrinsic haplotype features, such as backbone haplotypes and fat genotypes, as well as equal columns and decomposability. We explicitly exploit these features in three computational approaches: one based on integer linear programming, one based on depth-first branch-and-bound, and a hybrid algorithm that draws on the complementary strengths of the first two. Our experimental analysis shows that our optimized algorithms are significantly superior to the baseline algorithms, often running orders of magnitude faster. Finally, our experiments provide some useful insights into the intrinsic features of this interesting problem.
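To make the notion of complete HIPP concrete, the following is a minimal brute-force sketch and not one of the three approaches described in the abstract. It assumes the common encoding in which each genotype site is 0 or 1 (homozygous) or 2 (heterozygous), treats a solution as a set of distinct haplotypes, and assumes the usual meaning of backbone, namely the elements shared by every optimal solution. The function names are hypothetical and chosen for illustration only.

```python
from itertools import product


def explaining_pairs(genotype):
    """Return all unordered haplotype pairs that explain a genotype.

    Assumed encoding: 0 or 1 = homozygous site, 2 = heterozygous site.
    """
    het = [i for i, site in enumerate(genotype) if site == 2]
    pairs = set()
    for bits in product((0, 1), repeat=len(het)):
        h1, h2 = list(genotype), list(genotype)
        for i, b in zip(het, bits):
            h1[i], h2[i] = b, 1 - b  # the two haplotypes differ at every het site
        pairs.add(frozenset((tuple(h1), tuple(h2))))
    return pairs


def all_optimal_solutions(genotypes):
    """Enumerate every minimum-size haplotype set that explains all genotypes.

    Exhaustive search over one explaining pair per genotype; exponential,
    so only suitable for tiny illustrative instances.
    """
    options = [list(explaining_pairs(g)) for g in genotypes]
    best_size, best = None, set()
    for choice in product(*options):
        used = frozenset().union(*choice)
        if best_size is None or len(used) < best_size:
            best_size, best = len(used), {used}
        elif len(used) == best_size:
            best.add(used)
    return best_size, best


def backbone(solutions):
    """Haplotypes present in every optimal solution (assumed definition)."""
    return frozenset.intersection(*solutions)


if __name__ == "__main__":
    size, solutions = all_optimal_solutions([(2, 2, 0), (1, 1, 1)])
    print(size, sorted(sorted(s) for s in solutions), sorted(backbone(solutions)))
```

On this toy instance the genotype (2, 2, 0) can be explained by either {000, 110} or {010, 100}, while (1, 1, 1) forces the haplotype 111. There are therefore two optimal solutions of size 3, and 111 is the only haplotype common to both.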
Keywords
- Problem Instance
- Integer Linear Program
- Unique Haplotype
- Baseline Algorithm
- International HapMap Project
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jäger, G., Climer, S., Zhang, W. (2009). Complete Parsimony Haplotype Inference Problem and Algorithms. In: Fiat, A., Sanders, P. (eds) Algorithms - ESA 2009. ESA 2009. Lecture Notes in Computer Science, vol 5757. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04128-0_31
DOI: https://doi.org/10.1007/978-3-642-04128-0_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04127-3
Online ISBN: 978-3-642-04128-0
eBook Packages: Computer Science (R0)