Computational Molecular Biology

Multiple Sequence Alignment

Doug Brutlag

October 23, 2008

Armougom, F., Moretti, S., Poirot, O., Audic, S., Dumas, P., Schaeli, B., et al. (2006). Expresso: automatic incorporation of structural information in multiple sequence alignments using 3D-Coffee. Nucleic Acids Res, 34(Web Server issue), W604-608.

Campagne, F. and Maigret, B. (1998). Multiple sequence alignment in HTML: colored, possibly hyperlinked, compact representations. J Mol Graph Model, 16(1), 6-10, 34-15.

Carrillo, H. and Lipman, D. (1988). The multiple sequence alignment problem in biology. SIAM J. Appl. Math., 48, 1073-1082.

Corpet, F. (1988). Multiple sequence alignment with hierarchical clustering. Nucleic Acids Res., 16, 10881-10890.

Dolz, R. (1994). GCG: production of multiple sequence alignment. Methods Mol Biol, 24, 83-99.

Eddy, S. R. (1995). Multiple alignment using hidden Markov models. Ismb, 3, 114-20.

Eisen, J. A. (1997). The Genetic Data Environment. A user modifiable and expandable multiple sequence analysis package. Methods Mol Biol, 70, 13-38.

Feng, D. F., and Doolittle, R. F. (1987). Progressive sequence alignment as a prerequisite to correct phylogenetic trees. J. Mol. Evol. 25, 351-360.

Feng, D. F. and Doolittle, R. F. (1996). Progressive Alignment of Amino Acid Sequences and Construction of Phylogenetic Trees from Them. Methods in Enzymology, 266, 368-382.

Galas, D.J., Eggert, M. and Waterman, M.S. (1985). Rigorous pattern-recognition methods for DNA sequences. Analysis of promoter sequences from Escherichia coli. J. Mol. Biol. 186, 117-128.

Gotoh, O. (1993). Optimal alignment between groups of sequences and its application to multiple sequence alignment. Comput Appl Biosci, 9(3), 361-70.

Gotoh, O. (1996). Significant improvement in accuracy of multiple protein sequence alignments by iterative refinement as assessed by reference to structural alignments. J Mol Biol, 264(4), 823-38.

Henikoff, S., & Henikoff, J. G. (1997). Embedding strategies for effective use of information from multiple sequence alignments. Protein Sci, 6(3), 698-705.

Higgins, D. G. and Sharp, P. M. (1988). CLUSTAL: a package for performing multiple sequence alignment on a microcomputer. Gene, 73(1), 237-44.

Higgins, D. G. and Sharp, P. M. (1989). Fast and sensitive multiple sequence alignments on a microcomputer. Comput Appl Biosci, 5(2), 151-3.

Higgins, D. G., Bleasby, A. J. and Fuchs, R. (1992). Clustal V: improved software for multiple sequence alignment. CABIOS, 8(2), 189-191.

Higgins, D. G. (1994). CLUSTAL V: multiple alignment of DNA and protein sequences. Methods Mol Biol, 25, 307-18.

Higgins, D. G., Thompson, J. D. and Gibson, T. J. (1996). Using CLUSTAL for Multiple Sequence Alignments. Methods in Enzymology, 266, 383-401.

Hughey, R., & Krogh, A. (1996). Hidden Markov models for sequence analysis: extension and analysis of the basic method. Comput Appl Biosci, 12(2), 95-107.

Jeanmougin, F., Thompson, J. D., Gouy, M., Higgins, D. G. and Gibson, T. J. (1998). Multiple sequence alignment with Clustal X. Trends Biochem Sci, 23(10), 403-405.

Johnson, M. S., and Doolittle, R. F. (1986). A method for the simultaneous alignment of three or more amino acid sequences. J. Mol. Evol. 23, 267-278.

Karlin, S. and Ghandour, G. (1985). Comparative statistics for DNA and protein sequences: Multiple sequence analysis. Proc. Natl. Acad. Sci. USA 82, 6186-6190.

Karlin, S., Morris, D., Ghandour, G., and Leung, M. Y. (1988). Efficient algorithms for molecular sequence analysis. Proc. Natl. Acad. Sci. U. S. A. 85, 841-845.

Lawrence, C. E., Altschul, S. F., Boguski, M. S., Liu, J. S., Neuwald, A. F. and Wootton, J. C. (1993). Detecting subtle sequence signals: a Gibbs sampling strategy for multiple alignment. Science, 262(5131), 208-14.

Lipman, D. J., Altschul, S. F. and Kececioglu, J. D. (1989). A tool for multiple sequence alignment. Proc Natl Acad Sci U S A, 86(12), 4412-5.

Martinez, H. M. (1983). An efficient method for finding repeats in molecular sequences. Nucleic Acids Res. 11, 4629-4634.

Martinez, H. M. (1988). A flexible multiple sequence alignment program. Nucleic Acids Res. 16, 1683-1691.

Morgenstern, B., Atchley, W. R., Hahn, K. and Dress, A. (1998). Segment-based scores for pairwise and multiple sequence alignments. Ismb, 6, 115-121.

Murata, M., Richardson, J. S., and Sussman, J. L. (1985). Simultaneous comparison of three protein sequences. Proc. Natl. Acad. Sci. U. S. A. 82, 3073-3077.

Myers, G., Selznick, S., Zhang, Z., & Miller, W. (1996). Progressive multiple alignment with constraints. J Comput Biol, 3(4), 563-72.

Notredame, C., Higgins, D. G., & Heringa, J. (2000). T-Coffee: A novel method for fast and accurate multiple sequence alignment. J Mol Biol, 302(1), 205-217.

O'Sullivan, O., Suhre, K., Abergel, C., Higgins, D. G., & Notredame, C. (2004). 3DCoffee: combining protein sequences and structures within multiple sequence alignments. J Mol Biol, 340(2), 385-395.

Poirot, O., O'Toole, E., & Notredame, C. (2003). Tcoffee@igs: A web server for computing, evaluating and combining multiple sequence alignments. Nucleic Acids Res, 31(13), 3503-3506.

Russell, R. B. and Barton, G. J. (1992). Multiple protein sequence alignment from tertiary structure comparison: assignment of global and residue confidence levels. Proteins, 14(2), 309-23.

Shibuya, T. and Imai, H. (1997). New flexible approaches for multiple sequence alignment. J Comput Biol, 4(3), 385-413.

Shibuya, T. and Imai, H. (1997). Enumerating suboptimal alignments of multiple biological sequences efficiently. Pac Symp Biocomput, 409-420.

Sobel, E., and Martinez, H. M. (1986). A multiple sequence alignment program. Nucleic Acids Res. 14, 363-374.

Sonnhammer, E. L. et al. (1998). Pfam: multiple sequence alignments and HMM-profiles of protein domains. Nucleic Acids Res, 26(1), 320-322.

Srinivasarao, G. Y., Yeh, L. S., Marzec, C. R., Orcutt, B. C., Barker, W. C. and Pfeiffer, F. (1999). Database of protein sequence alignments: PIR-ALN. Nucleic Acids Res, 27(1), 284-285.

Subbiah, S. and Harrison, S. C. (1989). A method for multiple sequence alignment with gaps. J Mol Biol, 209(4), 539-48.

Taylor, W. R. (1986). Identification of protein sequence homology by consensus template alignment. J. Mol. Biol. 188, 233-258.

Taylor, W. R. (1987). Multiple sequence alignment by a pairwise algorithm. Comput. Appl. Biosci. 3, 81-87.

Taylor, W. R. (1996). Multiple Protein Sequence Alignment: Algorithms and Gap Insertion. Methods in Enzymology, 266, 343-367.

Thompson, J. D., Higgins, D. G. and Gibson, T. J. (1994). CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res, 22(22), 4673-80.

Thompson, J. D. et al. (1997). The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res, 25(24), 4876-4882.

Vingron, M. and Argos, P. (1989). A fast and sensitive multiple sequence alignment algorithm. Comput Appl Biosci, 5(2), 115-21.

Vingron, M., & Sibbald, P. R. (1993). Weighting in sequence space: a comparison of methods in terms of generalized sequences. Proc Natl Acad Sci U S A, 90(19), 8777-81.

Wallace, I. M., O'Sullivan, O., Higgins, D. G., & Notredame, C. (2006). M-Coffee: combining multiple sequence alignment methods with T-Coffee. Nucleic Acids Res, 34(6), 1692-1699.

Waterman, M., Arratia, R. and Galas, D.J. (1984). Pattern Recognition in Several Sequences: Consensus and Alignment. Bull. Math. Biol. 46, 515-527.

Waterman, M. S. (1986). Multiple sequence alignment by consensus. Nucleic Acids Res. 14, 9095-9102.

Zhang, C. and Wong, A. K. (1997). A genetic algorithm for multiple molecular sequence alignment. Comput Appl Biosci, 13(6), 565-581.
