Abstract
We have developed several distinct combinatorial approaches to the haplotype inference problem. I will talk about a few of the most recent of these approaches. One approach, the “pure parsimony” approach, is to find N pairs of haplotypes, one for each genotype, that explain the N genotypes and minimize the number of distinct haplotypes used. Solving this problem is NP-hard; however, for data of reasonable size (larger than what is in general use today), the “pure parsimony” solution can be found efficiently in practice. I will also talk about an approach that mixes pure parsimony with Clark’s subtraction method for haplotyping. Simulations show that the efficiency of both methods depends positively on the level of recombination (the more recombination, the more efficient the methods), but the accuracy depends inversely on the level of recombination. I will also discuss practical ways to greatly boost the accuracy of Clark’s subtraction method and to identify haplotype pairs with high confidence. This approach has been tested on molecularly determined data, which will be published along with the method. Comparisons are made with PHASE and HAPLOTYPER. I will also mention some recent developments in haplotype inference that are based on viewing the problem in the context of the perfect phylogeny problem. This builds on a near-linear-time algorithm to determine whether genotype (unphased) SNP data is consistent with the no-recombination, infinite-sites coalescent model of haplotype evolution. Stated differently, the algorithm determines whether there are haplotype pairs for the genotypes that satisfy the 4-gamete condition for tree-form evolution. The algorithm finds in linear time an implicit representation of the set of all solutions to the problem. A detailed treatment of a simple alternative algorithm for that problem will be given in the talk by V. Bafna.
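To make the pure parsimony objective concrete, the following is a minimal Python sketch, not the method presented in the talk. It assumes a common genotype encoding that the abstract does not specify (0 and 1 for homozygous sites, 2 for heterozygous sites), and the function names `explaining_pairs` and `pure_parsimony` are hypothetical. It searches exhaustively for a smallest set of haplotypes that explains every genotype, so it takes exponential time and is only suitable for toy inputs.

```python
from itertools import combinations, product


def explaining_pairs(genotype):
    """Return all unordered haplotype pairs that explain one genotype.

    Assumed encoding: 0 or 1 = homozygous site, 2 = heterozygous site.
    """
    het_sites = [i for i, g in enumerate(genotype) if g == 2]
    pairs = set()
    for bits in product((0, 1), repeat=len(het_sites)):
        h1 = list(genotype)
        h2 = list(genotype)
        for site, b in zip(het_sites, bits):
            # At a heterozygous site the two haplotypes must differ.
            h1[site] = b
            h2[site] = 1 - b
        pairs.add(frozenset((tuple(h1), tuple(h2))))
    return pairs


def pure_parsimony(genotypes):
    """Exhaustively find a smallest haplotype set explaining all genotypes."""
    options = [explaining_pairs(g) for g in genotypes]
    candidates = sorted({h for opts in options for pair in opts for h in pair})
    # Try candidate sets in order of increasing size; the first hit is minimum.
    for k in range(1, len(candidates) + 1):
        for subset in combinations(candidates, k):
            chosen = set(subset)
            if all(any(pair <= chosen for pair in opts) for opts in options):
                return chosen
    return set()


genotypes = [(2, 2, 0), (0, 1, 0), (2, 1, 0)]
solution = pure_parsimony(genotypes)
print(len(solution), sorted(solution))
```

In this toy input, the second genotype forces haplotype 010 and the third forces 110, so the first genotype can be explained at the cost of only one additional haplotype.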
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gusfield, D. (2004). Combinatorial Approaches to Haplotype Inference. In: Istrail, S., Waterman, M., Clark, A. (eds) Computational Methods for SNPs and Haplotype Inference. RSNPsH 2002. Lecture Notes in Computer Science, vol 2983. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24719-7_16
DOI: https://doi.org/10.1007/978-3-540-24719-7_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-21249-2
Online ISBN: 978-3-540-24719-7
eBook Packages: Springer Book Archive