A Survey of Multiple Sequence Alignment Techniques

Wang, Xiao-Dan; Liu, Jin-Xing; Xu, Yong; Zhang, Jian

doi:10.1007/978-3-319-22180-9_52


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9225)

Included in the following conference series:

International Conference on Intelligent Computing


Abstract

Multiple sequence alignment (MSA) is a basic step in many bioinformatics analyses and is also an NP-hard problem. To improve speed and accuracy and to meet the demands of large-scale sequence alignment, a wide variety of MSA methods and software packages have been developed. In this article, we systematically review the widely used methods and report their practical results on the BAliBASE 3.0 benchmark references. We conclude that computational complexity is still the bottleneck of MSA. We also consider the future development of MSA methods, both the application of a wider range of technologies and the prospects for parallelization.



1 Introduction

With the rapid development of new sequencing technologies, biological applications of sequence data have become increasingly widespread, including the study of the relationship between nucleosome positioning and DNA methylation [1], the prediction of missense mutation functionality or protein function [2, 3], the assembly of new genomes [4], crop breeding [5], and so on. Multiple sequence alignment is fundamental to most of these applications.

For $ N $ sequences of length $ L $, computing an optimal alignment exactly by dynamic programming has a computational complexity of $ O(L^{N} ) $, which is prohibitive even for a small number of sequences. Unfortunately, the sequencing technologies in production, such as Illumina, Helicos, SOLiD and Roche/454, can produce thousands or millions of sequences concurrently [6, 7]. To overcome this difficulty, many heuristic methods have been developed, including progressive methods [8] and iterative refinement methods [9].

This article aims to systematically review recent advances in MSA methods. It is organized as follows. In Sect. 2, we introduce the basic theory of heuristic methods and review the development of widely used techniques, including Clustal, T-Coffee, MAFFT, MUSCLE and Kalign. In Sect. 3, we examine their programs on the BAliBASE 3.0 benchmark references [10], OXBench [11] and HOMSTRAD. Finally, we discuss the future development of multiple sequence alignment in Sect. 4.

2 Overview

2.1 Theory

Progressive Method.

The progressive method was the first practical strategy for constructing an MSA and remains the core of most MSA programs. A progressive method usually consists of four steps [12]:

Step 1: Calculate a distance matrix for the $ N $ input sequences. Each element of this matrix is the distance between a pair of input sequences. There are many ways to measure distance, for example cosine (angle) distance and Euclidean distance. In the exact approach, $ \binom{N}{2} $ pair-wise alignments are needed to count the numbers of matches, mismatches and indels, which are then converted into distance measures. This procedure is costly when $ N $ is large, as its time complexity is $ O(N^{2} L^{2} ) $;

Step 2: Construct a guide tree from the distance matrix calculated in Step 1 using a clustering method. The most widely used method is UPGMA (Unweighted Pair-Group Method with Arithmetic means) [13], which takes $ O(N^{2} ) $ time to construct the guide tree;

Step 3: Build the MSA by following the guide tree from the leaves to the root. In the guide tree, each external node represents an input sequence, while each internal node represents the MSA obtained by aligning its children;

Step 4: After the initial MSA has been constructed, repeat Step 1 and Step 2 using the pair-wise alignments derived from it.
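
Steps 1 and 2 can be illustrated with a minimal Python sketch. It rests on several assumptions that are not taken from the programs reviewed here: the pair-wise distance is a plain edit distance normalised by the longer sequence length (a stand-in for a distance derived from a scored pair-wise alignment), UPGMA is implemented naively rather than in $ O(N^{2} ) $ time, and the function names `edit_distance`, `distance_matrix` and `upgma` are hypothetical.

```python
from itertools import combinations

def edit_distance(a, b):
    """Levenshtein distance by dynamic programming, O(len(a) * len(b))."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # match or mismatch
        prev = curr
    return prev[-1]

def distance_matrix(seqs):
    """Step 1: all N*(N-1)/2 pair-wise distances (assumed normalisation)."""
    n = len(seqs)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        longest = max(len(seqs[i]), len(seqs[j]))
        d[i][j] = d[j][i] = edit_distance(seqs[i], seqs[j]) / longest
    return d

def upgma(d):
    """Step 2: naive UPGMA; the guide tree is nested tuples of sequence indices."""
    clusters = {i: (i, 1) for i in range(len(d))}  # id -> (subtree, size)
    dist = {(i, j): d[i][j] for i, j in combinations(range(len(d)), 2)}
    next_id = len(d)
    while len(clusters) > 1:
        a, b = min(dist, key=dist.get)             # closest pair of clusters
        (ta, na), (tb, nb) = clusters.pop(a), clusters.pop(b)
        del dist[(a, b)]
        for c in clusters:
            dac = dist.pop((min(a, c), max(a, c)))
            dbc = dist.pop((min(b, c), max(b, c)))
            # Size-weighted average keeps the distance an unweighted mean
            # over all original sequence pairs.
            dist[(c, next_id)] = (na * dac + nb * dbc) / (na + nb)
        clusters[next_id] = ((ta, tb), na + nb)
        next_id += 1
    return next(iter(clusters.values()))[0]
```

Calling `upgma(distance_matrix(seqs))` returns a nested tuple whose leaves are indices into `seqs`. In Step 3, a progressive aligner would visit the internal nodes of this tree from the leaves upwards and align the two children of each node.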

Iterative Refinement.

The progressive method is a “greedy algorithm”, so mistakes made in the early alignment stages cannot be corrected later [14]. An effective way to overcome this defect is a post-processing procedure known as iterative refinement, which also consists of four steps [12]:

Step 1: Construct an initial MSA;

Step 2: Divide the MSA constructed in Step 1 into two groups, then remove the columns that consist entirely of gaps from each group;

Step 3: Realign the two groups produced in Step 2 using a pair-wise sequence-to-group or group-to-group alignment method;

Step 4: Repeat Step 2 and Step 3 until the alignment score no longer improves or the number of iterations exceeds a predefined limit.
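
The refinement loop can be sketched in Python as follows. This is only an outline under stated assumptions: an MSA is a list of equal-length strings with `-` as the gap symbol, and the group-to-group aligner `realign` and the objective `score` are hypothetical callables supplied by the caller, since their details differ between programs. The names `split_and_strip` and `refine` are also ours.

```python
def split_and_strip(msa, group):
    """Step 2: split the MSA into two groups and drop all-gap columns in each.

    `group` is the set of row indices that form the first group.
    """
    def strip_gap_columns(rows):
        keep = [c for c in range(len(rows[0]))
                if any(row[c] != "-" for row in rows)]
        return ["".join(row[c] for c in keep) for row in rows]

    first = [row for i, row in enumerate(msa) if i in group]
    second = [row for i, row in enumerate(msa) if i not in group]
    return strip_gap_columns(first), strip_gap_columns(second)

def refine(msa, partitions, realign, score, max_iter=10):
    """Steps 2 to 4: keep a realignment only if it improves the score.

    `realign(a, b)` is assumed to return a full MSA with rows in the
    original order; `score(msa)` is assumed to be higher for better MSAs.
    """
    best, best_score = msa, score(msa)
    for _ in range(max_iter):
        improved = False
        for group in partitions:
            a, b = split_and_strip(best, group)
            candidate = realign(a, b)
            candidate_score = score(candidate)
            if candidate_score > best_score:
                best, best_score, improved = candidate, candidate_score, True
        if not improved:  # Step 4: stop when no partition yields a gain
            break
    return best
```

For example, `split_and_strip(["AC-GT", "AC-GA", "A-TGT"], {0, 1})` yields the groups `["ACGT", "ACGA"]` and `["ATGT"]`, because the third column is empty in the first group and the second column is empty in the second.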

Scoring Function.

A good scoring function is necessary for this procedure to work accurately. The most widely used functions are the sum-of-pairs (SP) score [15] and the weighted sum-of-pairs (WSP) score [16], both with affine gap penalties.

For a set $ A $ of $ N $ aligned sequences of length $ L $, we define WSP as follows:

$$ \begin{aligned} WSP(A) &= \sum\limits_{1 \le i < j \le N} w_{i,j} H(a_{i} ,a_{j} ) \\ &= \sum\limits_{1 \le l \le L} \sum\limits_{1 \le i < j \le N} w_{i,j} \left[ S(a_{i,l} ,a_{j,l} ) - v \cdot G(i,j,l) \right], \end{aligned} \tag{1} $$

where $ H(a_{i} ,a_{j} ) $ is the alignment score of the pair of sequences $ [a_{i} ,a_{j} ] $ in $ A $, $ w_{i,j} $ is the weight assigned to that pair ($ w_{i,j} = 1 $ gives the unweighted case), $ S(a_{i,l} ,a_{j,l} ) $ is the match score of the pair at position $ l $, $ G(i,j,l) $ is a Boolean variable with $ G(i,j,l) = 1 $ if a gap opens between $ a_{i} $ and $ a_{j} $ at position $ l $ and $ G(i,j,l) = 0 $ otherwise, and $ v $ is the gap penalty.
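
As a worked illustration of Eq. (1), the sketch below evaluates WSP for a small alignment in Python. Several choices are assumptions made only for this example: $ S $ is a simple match/mismatch score that is zero whenever either symbol is a gap, columns in which both sequences have a gap are skipped in the pair-wise projection, and a gap is taken to "open" whenever the gapped sequence differs from that of the previous projected column. The function name `wsp_score` and its default parameters are hypothetical.

```python
from itertools import combinations

def wsp_score(msa, match=1.0, mismatch=-1.0, gap_open=2.0, weights=None):
    """Weighted sum-of-pairs score following Eq. (1); gap_open plays the role of v."""
    total = 0.0
    for i, j in combinations(range(len(msa)), 2):
        w = 1.0 if weights is None else weights[i][j]
        prev_state = None  # state of the previous column in this pair's projection
        for x, y in zip(msa[i], msa[j]):
            if x == "-" and y == "-":
                continue   # column absent from the pair-wise projection
            if x == "-" or y == "-":
                state = "gap_in_i" if x == "-" else "gap_in_j"
                g = 1 if state != prev_state else 0  # G(i, j, l): a gap opens here
                total += w * (0.0 - gap_open * g)    # S is assumed 0 against a gap
            else:
                state = "residues"
                total += w * (match if x == y else mismatch)
            prev_state = state
    return total
```

With the default parameters and unit weights, the alignment `["AC-GT", "ACAGT", "A--GT"]` scores 2 + 1 + 1 = 4 over its three sequence pairs: each pair has four matching columns and exactly one gap opening, except that the first pair has no double-gap column to skip.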

2.2 Alignment Technique

Clustal.

The first Clustal program was written by Des Higgins in 1988 [17]. It combined a dynamic programming algorithm [18] with the progressive alignment strategy developed by Feng and Doolittle [8]. It used a word-based alignment algorithm [19] to calculate the distance matrix and the UPGMA method to construct the guide tree. In 1992, ClustalV [20] implemented profile alignments and generated trees from the multiple alignment using the Neighbour-Joining (NJ) method [21]. In 1994, ClustalW [22] improved the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. In 1997, ClustalX [23] provided a graphical interface in which the multiple alignment is displayed on screen and all parameters are exposed as options, which makes it much more convenient for the user to evaluate alignments. The latest member of the Clustal series is Clustal Omega [14], which can align virtually any number of protein sequences quickly and delivers accurate alignments. To construct the guide tree, Clustal Omega uses a modified version of mBed [24], which has a complexity of $ O(N\log N) $ and produces guide trees as accurate as those from conventional methods. The alignments are then computed with the very accurate HHalign package [25], which aligns two profile hidden Markov models [17].

T-Coffee.

The first version of T-Coffee (Tree-based Consistency Objective Function for alignment Evaluation) [26] dates back to 2000. It implements progressive alignment with a consistency-based objective function [27] and tries to maximize the score between the final multiple alignment and a library of pair-wise residue alignment scores derived from a mixture of local and global pair-wise alignments. M-Coffee [28] is an extension of T-Coffee. It is a meta-method that uses consistency to estimate a consensus alignment, combining the output of several individual MSA methods into a single MSA. TCS (Transitive Consistency Score) [29] is a more recent extension of the T-Coffee scoring scheme. It addresses the problem that homology and evolutionary modeling are sensitive to the accuracy of the underlying MSA, and it can also improve phylogenetic tree reconstruction.

MAFFT.

MAFFT [30], first released in 2002, is a method for rapid multiple protein sequence alignment based on the fast Fourier transform (FFT). An amino acid sequence is converted into a sequence of the volume and polarity values of its residues, and the FFT is then used to identify homologous regions rapidly. The original MAFFT included two kinds of heuristics: the progressive methods FFT-NS-1 and FFT-NS-2, and the iterative refinement method FFT-NS-i. In 2005, MAFFT version 5 [31] improved accuracy by offering new iterative refinement options (H-INS-i, F-INS-i and G-INS-i) and by incorporating pair-wise alignment information into the objective function. In 2007, MAFFT version 6 [32] improved the accuracy of multiple ncRNA alignment with two techniques: the PartTree algorithm and the four-way consistency objective function. In 2010, to speed up the program, two natural parallelization strategies (best-first and simple hill-climbing) were implemented for the iterative refinement stage of MAFFT version 6, with the simple hill-climbing approach selected as the default [33]. In 2012, two methods for adding unaligned sequences to an existing multiple sequence alignment were implemented as the ‘--add’ and ‘--addfragments’ options of the MAFFT package [34].

The newest version is MAFFT version 7 [35]. Besides the options for adding unaligned sequences to an existing alignment, it offers several new features, including adjustment of direction in nucleotide alignment, constrained alignment and parallel processing.
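
The idea of locating homologous regions by FFT can be illustrated with a small Python sketch. It is not MAFFT's implementation and simplifies heavily: a single property, the Kyte-Doolittle hydropathy index (centred on its mean over the 20 amino acids), stands in for the volume and polarity values; only the single best lag is reported rather than a set of candidate segments; and the FFT is a textbook recursive radix-2 routine. The names `fft`, `correlation_by_fft` and `best_lag` are hypothetical.

```python
import cmath

# Kyte-Doolittle hydropathy index, used here only as a stand-in property.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
MEAN = sum(HYDROPATHY.values()) / len(HYDROPATHY)

def fft(x, inverse=False):
    """Recursive radix-2 FFT; len(x) must be a power of two (unscaled inverse)."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2], inverse), fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + t, even[k] - t
    return out

def correlation_by_fft(seq1, seq2):
    """Return c with c[k] = sum_n v1[n] * v2[n + k], indices taken modulo len(c)."""
    size = 1
    while size < len(seq1) + len(seq2):  # zero-pad so that lags do not wrap around
        size *= 2
    v1 = [HYDROPATHY[a] - MEAN for a in seq1] + [0.0] * (size - len(seq1))
    v2 = [HYDROPATHY[a] - MEAN for a in seq2] + [0.0] * (size - len(seq2))
    product = [a.conjugate() * b for a, b in zip(fft(v1), fft(v2))]
    return [c.real / size for c in fft(product, inverse=True)]

def best_lag(seq1, seq2):
    """Signed offset of seq2 relative to seq1 at which the correlation peaks."""
    c = correlation_by_fft(seq1, seq2)
    k = max(range(len(c)), key=c.__getitem__)
    return k if k <= len(c) // 2 else k - len(c)
```

If `seq2` is `seq1` with four residues prepended, the correlation peaks at a lag of 4, which tells an aligner where the two sequences should be overlaid before dynamic programming is restricted to that neighbourhood. Computing all lags this way costs $ O(L\log L) $ instead of the $ O(L^{2} ) $ of a direct sum.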

MUSCLE.

MUSCLE (MUltiple Sequence Comparison by Log-Expectation) [36] is a multiple sequence alignment method for protein sequences. MUSCLE uses two distance measures for a pair of sequences: a k-mer distance (for an unaligned pair) and the Kimura distance (for an aligned pair). The guide tree is constructed using UPGMA, and profiles are aligned with a profile function called the log-expectation (LE) score. MUSCLE consists of three stages:

Stage 1: Draft progressive. This stage includes four steps (similarity measure, distance estimate, tree construction, progressive alignment) and produces a multiple alignment rapidly, emphasizing speed over accuracy.

Stage 2: Improved progressive. This stage also includes four steps (similarity measure, tree construction, tree comparison, progressive alignment). In Stage 1, the main source of error is the k-mer distance measure, which leads to a suboptimal tree. MUSCLE therefore re-estimates the tree using the Kimura distance, which is more accurate but requires an alignment.

Stage 3: Refinement. This stage is made up of four steps (choice of bipartition, profile extraction, re-alignment, accept/reject). It performs iterative refinement using an approximate tree-dependent restricted partitioning [21].
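
The two distance measures can be sketched in Python as follows. Both functions are simplified illustrations rather than MUSCLE's code: the k-mer distance is taken as one minus the fraction of k-mers shared with the shorter sequence and works on the plain amino acid alphabet, and the Kimura distance applies the correction $ d = -\ln (1 - D - D^{2} /5) $ to the fraction $ D $ of differing residues among the gap-free columns of an aligned pair. The function names and the default `k=3` are assumptions.

```python
import math
from collections import Counter

def kmer_distance(a, b, k=3):
    """Distance for an unaligned pair: 1 - shared k-mers / k-mers in the shorter sequence."""
    count_a = Counter(a[i:i + k] for i in range(len(a) - k + 1))
    count_b = Counter(b[i:i + k] for i in range(len(b) - k + 1))
    shared = sum((count_a & count_b).values())  # multiset intersection
    return 1.0 - shared / (min(len(a), len(b)) - k + 1)

def kimura_distance(row_a, row_b):
    """Distance for an aligned pair of rows (gap symbol '-').

    The logarithm is undefined once 1 - D - D^2/5 <= 0 (D above roughly 0.85),
    so very divergent pairs need separate handling, which is omitted here.
    """
    pairs = [(x, y) for x, y in zip(row_a, row_b) if x != "-" and y != "-"]
    d = sum(x != y for x, y in pairs) / len(pairs)
    return -math.log(1.0 - d - d * d / 5.0)
```

The k-mer distance needs no alignment, which is why it can be used for the draft tree in Stage 1, whereas `kimura_distance` can only be evaluated once the Stage 1 alignment exists, which is what Stage 2 exploits.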

Kalign.

Kalign [31] is an MSA algorithm proposed in 2005. It also implements progressive alignment. Unlike other progressive methods, Kalign employs the Wu-Manber approximate string-matching algorithm [37], which makes its distance estimation more accurate. In 2007, Emmanuelle Becker et al. proposed a tool called HMM-Kalign [38] for generating sub-optimal alignments. As the name implies, HMM-Kalign builds on the original Kalign by incorporating hidden Markov models. The newest improved edition of Kalign is Kalign-LCS [39]. It applies the longest common subsequence (LCS) in the similarity measure step and achieves a balance between accuracy and speed.

3 Practical Result

We examine ClustalW, Clustal Omega, T-Coffee, MAFFT:Auto, MAFFT:FFT-NS-1, MAFFT:G-INS-i, MUSCLE and Kalign on the BAliBASE 3.0 benchmark references, OXBench and HOMSTRAD.

We evaluate the alignment results with BaliScore, which reports two measures: the SP-score (sum-of-pairs score), which is the percentage of homologies (aligned residue pairs) in the reference alignment that are recovered in the estimated alignment, and the TC-score (total column score), which is the percentage of columns that are recovered entirely correctly in the estimated alignment (Tables 1, 2 and 3).
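
To make the two measures concrete, the following Python sketch computes them for a reference and an estimated alignment of the same sequences in the same row order. It is a simplified illustration and not BaliScore itself: every column of the reference is scored (no restriction to annotated core blocks), results are returned as fractions rather than percentages, and the function names are hypothetical.

```python
from itertools import combinations

def column_indices(msa):
    """For each column, the residue index of every sequence (None for a gap)."""
    counters = [0] * len(msa)
    columns = []
    for col in range(len(msa[0])):
        entry = []
        for s, row in enumerate(msa):
            if row[col] == "-":
                entry.append(None)
            else:
                entry.append(counters[s])
                counters[s] += 1
        columns.append(tuple(entry))
    return columns

def aligned_pairs(columns):
    """All residue pairs placed in the same column, as (seq_i, pos_i, seq_j, pos_j)."""
    pairs = set()
    for entry in columns:
        for i, j in combinations(range(len(entry)), 2):
            if entry[i] is not None and entry[j] is not None:
                pairs.add((i, entry[i], j, entry[j]))
    return pairs

def sp_and_tc(reference, estimated):
    """Return (SP-score, TC-score) of `estimated` with respect to `reference`."""
    ref_cols = column_indices(reference)
    est_cols = column_indices(estimated)
    ref_pairs = aligned_pairs(ref_cols)
    sp = len(ref_pairs & aligned_pairs(est_cols)) / len(ref_pairs)
    est_set = set(est_cols)
    tc = sum(col in est_set for col in ref_cols) / len(ref_cols)
    return sp, tc
```

For the reference `["ACGT", "A-GT", "ACG-"]` and the estimate `["ACGT", "AG-T", "ACG-"]`, six of the eight reference residue pairs and two of the four reference columns are recovered, so the sketch returns an SP-score of 0.75 and a TC-score of 0.5. The TC-score is the stricter measure, because one misplaced residue invalidates its whole column.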

Table 1. Summary of the techniques described in the review


Table 2. The SP-score of various individual methods on the benchmark Balibase 3.0 references


Table 3. The TC-score of various individual methods on the benchmark Balibase 3.0 references


From the SP-scores and TC-scores, we can see that none of the programs we examined is sensitive to sequence divergence. All of them are affected, to varying degrees, by a highly divergent “orphan” sequence, by residue differences between groups, by N/C-terminal extensions and by internal insertions. On the whole, Clustal Omega and T-Coffee perform well, and T-Coffee gives the best results.

4 Conclusion and Future Development

In past years, MSA has developed greatly and has been applied to good effect in many biological applications. But there is still plenty of room to improve multiple sequence alignment, especially with respect to robustness and accuracy. To address these problems, on the one hand we should continue to develop efficient existing MSA techniques such as T-Coffee; on the other hand we should change our way of thinking and apply techniques beyond heuristic methods, and even beyond bioinformatics, to improve MSA.

Encouragingly, many researchers are devoted to developing MSA methods. Sabari Pramanik and S.K. Setua [40] define a new form of chromosome representation and deploy it in a steady-state genetic algorithm, obtaining better results. Siavash Mirarab, Nam Nguyen and Tandy Warnow propose an algorithm called PASTA [41] for estimating large-scale multiple sequence alignments. There is also an interesting method called Phylo [42], a human-based computing framework that applies “crowd sourcing” techniques to the MSA problem. The key idea of Phylo is to convert the MSA problem into a casual game that can be played by ordinary web users with minimal prior knowledge of the biological context. Cactus [43] addresses the fact that much attention has been given to creating reliable multiple sequence alignments in a model incorporating substitutions, insertions and deletions, while far less attention has been paid to optimizing alignments in the presence of more general rearrangements and copy number variation.

Another trend is the parallelization of MSA. Because MSA is an NP-hard problem and the amount of data is huge, MSA programs are costly in terms of time. Hence, it is necessary to implement parallel solutions for MSA. Vasconcellos et al. [44] present two parallel solutions using the BSP/CGM model, with MPI and CUDA implementations. Their results show that parallel processing allows more and larger sequences to be handled. Evandro A. Marucci et al. [45] propose a parallel algorithm for calculating multiple sequence similarities based on the k-mer counting method and obtain very good scalability and a nearly linear speedup.

References

1. Chodavarapu, R.K., Feng, S., Bernatavichute, Y.V., Chen, P.-Y., Stroud, H., Yu, Y., et al.: Relationship between nucleosome positioning and DNA methylation. Nature 466, 388–392 (2010)
2. Hicks, S., Wheeler, D.A., Plon, S.E., Kimmel, M.: Prediction of missense mutation functionality depends on both the algorithm and sequence alignment employed. Hum. Mutat. 32, 661–668 (2011)
3. Wang, P., Hu, L., Liu, G., Jiang, N., Chen, X., Xu, J., et al.: Prediction of antimicrobial peptides based on sequence alignment and feature selection methods. PLoS One 6, e18476 (2011)
4. Brenchley, R., Spannagl, M., Pfeifer, M., Barker, G.L., D’Amore, R., Allen, A.M., et al.: Analysis of the bread wheat genome using whole-genome shotgun sequencing. Nature 491, 705–710 (2012)
5. Varshney, R.K., Terauchi, R., McCouch, S.R.: Harvesting the promising fruits of genomics: applying genome sequencing technologies to crop breeding. PLoS Biol. 12, e1001883 (2014)
6. Li, H., Homer, N.: A survey of sequence alignment algorithms for next-generation sequencing. Briefings Bioinform. 11, 473–483 (2010)
7. Zhou, X., Ren, L., Meng, Q., Li, Y., Yu, Y., Yu, J.: The next-generation sequencing technology and application. Protein Cell 1, 520–536 (2010)
8. Feng, D.-F., Doolittle, R.F.: Progressive sequence alignment as a prerequisite to correct phylogenetic trees. J. Mol. Evol. 25, 351–360 (1987)
9. Hogeweg, P., Hesper, B.: The alignment of sets of sequences and the construction of phyletic trees: an integrated method. J. Mol. Evol. 20, 175–186 (1984)
10. Thompson, J.D., Koehl, P., Ripp, R., Poch, O.: BAliBASE 3.0: latest developments of the multiple sequence alignment benchmark. Proteins Struct. Funct. Bioinf. 61, 127–136 (2005)
11. Raghava, G., Searle, S.M., Audley, P.C., Barber, J.D., Barton, G.J.: OXBench: a benchmark for evaluation of protein multiple sequence alignment accuracy. BMC Bioinf. 4, 47 (2003)
12. Gotoh, O.: Heuristic alignment methods. Multiple Seq. Alignment Meth. 1079, 29–43 (2014)
13. Kersters, K., De Ley, J., Sneath, P., Sackin, M.: Numerical taxonomic analysis of agrobacterium. J. Gen. Microbiol. 78, 227–239 (1973)
14. Sievers, F., Wilm, A., Dineen, D., Gibson, T.J., Karplus, K., Li, W., et al.: Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol. Syst. Biol. 7, 539 (2011)
15. Altschul, S.F.: Gap costs for multiple sequence alignment. J. Theor. Biol. 138, 297–309 (1989)
16. Altschul, S.F., Carroll, R.J., Lipman, D.J.: Weights for data related by a tree. J. Mol. Biol. 207, 647–653 (1989)
17. Eddy, S.R.: Profile hidden Markov models. Bioinformatics 14, 755–763 (1998)
18. Myers, E.W., Miller, W.: Optimal alignments in linear space. Comput. Appl. Biosci. CABIOS 4, 11–17 (1988)
19. Wilbur, W.J., Lipman, D.J.: Rapid similarity searches of nucleic acid and protein data banks. Proc. Natl. Acad. Sci. 80, 726–730 (1983)
20. Higgins, D.G.: CLUSTAL V: multiple alignment of DNA and protein sequences. Comput. Anal. Seq. Data 25, 307–318 (1994)
21. Saitou, N., Nei, M.: The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4, 406–425 (1987)
22. Thompson, J.D., Higgins, D.G., Gibson, T.J.: CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22, 4673–4680 (1994)
23. Thompson, J.D., Gibson, T.J., Plewniak, F., Jeanmougin, F., Higgins, D.G.: The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25, 4876–4882 (1997)
24. Blackshields, G., Sievers, F., Shi, W., Wilm, A., Higgins, D.G.: Sequence embedding for fast construction of guide trees for multiple sequence alignment. Algorithms Mol. Biol. 5, 21 (2010)
25. Söding, J.: Protein homology detection by HMM–HMM comparison. Bioinformatics 21, 951–960 (2005)
26. Notredame, C., Higgins, D.G., Heringa, J.: T-Coffee: a novel method for fast and accurate multiple sequence alignment. J. Mol. Biol. 302, 205–217 (2000)
27. Kececioglu, J.D.: The maximum weight trace problem in multiple sequence alignment. In: Apostolico, A., Crochemore, M., Galil, Z., Manber, U. (eds.) CPM 1993. LNCS, vol. 684, pp. 106–119. Springer, Heidelberg (1993)
28. Wallace, I.M., O’Sullivan, O., Higgins, D.G., Notredame, C.: M-Coffee: combining multiple sequence alignment methods with T-Coffee. Nucleic Acids Res. 34, 1692–1699 (2006)
29. Chang, J.-M., Di Tommaso, P., Notredame, C.: TCS: a new multiple sequence alignment reliability measure to estimate alignment accuracy and improve phylogenetic tree reconstruction. Mol. Biol. Evol. msu117 (2014)
30. Katoh, K., Misawa, K., Kuma, K.-i., Miyata, T.: MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 30, 3059–3066 (2002)
31. Katoh, K., Kuma, K.-i., Toh, H., Miyata, T.: MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Res. 33, 511–518 (2005)
32. Katoh, K., Toh, H.: Improved accuracy of multiple ncRNA alignment by incorporating structural information into a MAFFT-based framework. BMC Bioinform. 9, 212 (2008)
33. Katoh, K., Toh, H.: Parallelization of the MAFFT multiple sequence alignment program. Bioinform. 2, 1899–1900 (2010)
34. Katoh, K., Frith, M.C.: Adding unaligned sequences into an existing alignment using MAFFT and LAST. Bioinform. 28, 3144–3146 (2012)
35. Katoh, K., Standley, D.M.: MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 30, 772–780 (2013)
36. Edgar, R.C.: MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32, 1792–1797 (2004)
37. Wu, S., Manber, U.: Fast text searching: allowing errors. Commun. ACM 35, 83–91 (1992)
38. Becker, E., Cotillard, A., Meyer, V., Madaoui, H., Guérois, R.: HMM-Kalign: a tool for generating sub-optimal HMM alignments. Bioinform. 23, 3095–3097 (2007)
39. Deorowicz, S., Debudaj-Grabysz, A., Gudyś, A.: Kalign-LCS: a more accurate and faster variant of Kalign2 algorithm for the multiple sequence alignment problem. In: Gruca, A., Czachórski, T., Kozielski, S. (eds.) Man-Machine Interactions 3. AISC, vol. 242, pp. 499–506. Springer, Heidelberg (2014)
40. Pramanik, S., Setua, S.: A steady state genetic algorithm for multiple sequence alignment. In: International Conference on Advances in Computing, Communications and Informatics (ICACCI), pp. 1095–1099. IEEE (2014)
41. Mirarab, S., Nguyen, N., Warnow, T.: PASTA: ultra-large multiple sequence alignment. In: Sharan, R. (ed.) RECOMB 2014. LNCS, vol. 8394, pp. 177–191. Springer, Heidelberg (2014)
42. Kawrykow, A., Roumanis, G., Kam, A., Kwak, D., Leung, C., Wu, C., et al.: Phylo: a citizen science approach for improving multiple sequence alignment. PLoS One 7, e31362 (2012)
43. Paten, B., Earl, D., Nguyen, N., Diekhans, M., Zerbino, D., Haussler, D.: Cactus: algorithms for genome multiple sequence alignment. Genome Res. 21, 1512–1528 (2011)
44. Vasconcellos, J.F., Nishibe, C., Almeida, N.F., Cáceres, E.N.: Efficient parallel implementations of multiple sequence alignment using BSP/CGM model. In: Proceedings of Programming Models and Applications on Multicores and Manycores, 103. ACM (2014)
45. Marucci, E.A., Zafalon, G.F., Momente, J.C., Neves, L.A., Valêncio, C.R., Pinto, A.R., et al.: An efficient parallel algorithm for multiple sequence similarities calculation using a low complexity method. BioMed Research International (2014)


Acknowledgement

This work was supported by the Shenzhen Municipal Science and Technology Innovation Council (Grant No. CXZZ20140904154910774, Grant No. JCYJ20140417172417174, Grant No. JCYJ20140904154645958, Grant No. JCYJ20130329151843309) and a China Postdoctoral Science Foundation funded project (Grant No. 2014M560264).

Author information

Authors and Affiliations

Bio-Computing Research Center, Shenzhen Graduate School, Harbin Institute of Technology, Heilongjiang, China
Xiao-Dan Wang, Jin-Xing Liu, Yong Xu & Jian Zhang


Corresponding author

Correspondence to Xiao-Dan Wang.

Editor information

Editors and Affiliations

Tongji University, Shanghai, China
De-Shuang Huang
Polytechnic of Bari, Bari, Italy
Vitoantonio Bevilacqua
University of Wollongong, North Wollongong, New South Wales, Australia
Prashan Premaratne


About this paper

Cite this paper

Wang, XD., Liu, JX., Xu, Y., Zhang, J. (2015). A Survey of Multiple Sequence Alignment Techniques. In: Huang, DS., Bevilacqua, V., Premaratne, P. (eds) Intelligent Computing Theories and Methodologies. ICIC 2015. Lecture Notes in Computer Science(), vol 9225. Springer, Cham. https://doi.org/10.1007/978-3-319-22180-9_52


DOI: https://doi.org/10.1007/978-3-319-22180-9_52
Published: 11 August 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-22179-3
Online ISBN: 978-3-319-22180-9
eBook Packages: Computer Science (R0)
