Abstract
Scaffold filling is a new combinatorial optimization problem in genome sequencing. The one-sided scaffold filling problem can be described as: given an incomplete genome I and a complete (reference) genome G, fill the missing genes into I such that the number of common (string) adjacencies between the resulting genome Iā² and G is maximized. This problem is NP-complete for genome with duplicated genes and the best known approximation factor is 1.33, which uses a greedy strategy. In this paper, we prove a better lower bound of the optimal solution, and devise a new algorithm by exploiting the maximum matching method and a local improvement technique, which improves the approximation factor to 1.25.
Access provided by Autonomous University of Puebla. Download to read the full chapter text
Chapter PDF
Similar content being viewed by others
Keywords
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
References
Angibaud, S., Fertin, G., Rusu, I., Thevenin, A., Vialette, S.: On the approximability of comparing genomes with duplicates. J. Graph Algorithms and ApplicationsĀ 13(1), 19ā53 (2009)
Blin, G., Fertin, G., Sikora, F., Vialette, S.: The exemplar breakpoint distance for non-trivial genomes cannot be approximated. In: Das, S., Uehara, R. (eds.) WALCOM 2009. LNCS, vol.Ā 5431, pp. 357ā368. Springer, Heidelberg (2009)
Cormode, G., Muthukrishnan, S.: The string edit distance matching problem with moves. In: Proc. 13th ACM-SIAM Symp. on Discrete Algorithms (SODA 2002), pp. 667ā676 (2002)
Chen, Z., Fowler, R., Fu, B., Zhu, B.: On the inapproximability of the exemplar conserved interval distance problem of genomes. J. Combinatorial OptimizationĀ 15(2), 201ā221 (2008)
Chen, Z., Fu, B., Xu, J., Yang, B., Zhao, Z., Zhu, B.: Non-breaking similarity of genomes with gene repetitions. In: Ma, B., Zhang, K. (eds.) CPM 2007. LNCS, vol.Ā 4580, pp. 119ā130. Springer, Heidelberg (2007)
Chen, Z., Fu, B., Zhu, B.: The approximability of the exemplar breakpoint distance problem. In: Cheng, S.-W., Poon, C.K. (eds.) AAIM 2006. LNCS, vol.Ā 4041, pp. 291ā302. Springer, Heidelberg (2006)
Goldstein, A., Kolman, P., Zheng, J.: Minimum common string partition problem: Hardness and approximations. In: Fleischer, R., Trippen, G. (eds.) ISAAC 2004. LNCS, vol.Ā 3341, pp. 484ā495. Springer, Heidelberg (2004); also in: The Electronic Journal of Combinatorics 12, paper R50 (2005)
Jiang, M.: The zero exemplar distance problem. In: Tannier, E. (ed.) RECOMB-CG 2010. LNCS, vol.Ā 6398, pp. 74ā82. Springer, Heidelberg (2010)
Jiang, H., Zheng, C., Sankoff, D., Zhu, B.: Scaffold filling under the breakpoint distance. In: Tannier, E. (ed.) RECOMB-CG 2010. LNCS, vol.Ā 6398, pp. 83ā92. Springer, Heidelberg (2010)
Jiang, H., Zhong, F., Zhu, B.: Filling scaffolds with gene repetitions: Maximizing the number of adjacencies. In: Giancarlo, R., Manzini, G. (eds.) CPM 2011. LNCS, vol.Ā 6661, pp. 55ā64. Springer, Heidelberg (2011)
Jiang, H., Zheng, C., Sankoff, D., Zhu, B.: Scaffold filling under the breakpoint and related distances. IEEE/ACM Trans. Bioinformatics and Comput. BiologyĀ 9(4), 1220ā1229 (2012)
MuƱoz, A., Zheng, C., Zhu, Q., Albert, V., Rounsley, S., Sankoff, D.: Scaffold filling, contig fusion and gene order comparison. BMC BioinformaticsĀ 11, 304 (2010)
Sankoff, D.: Genome rearrangement with gene families. BioinformaticsĀ 15(11), 909ā917 (1999)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
Ā© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Liu, N., Jiang, H., Zhu, D., Zhu, B. (2013). An Improved Approximation Algorithm for Scaffold Filling to Maximize the Common Adjacencies. In: Du, DZ., Zhang, G. (eds) Computing and Combinatorics. COCOON 2013. Lecture Notes in Computer Science, vol 7936. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38768-5_36
Download citation
DOI: https://doi.org/10.1007/978-3-642-38768-5_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38767-8
Online ISBN: 978-3-642-38768-5
eBook Packages: Computer ScienceComputer Science (R0)