Computational Molecular Biology

October 13, 2009

Doug Brutlag

Alignment of Biological Sequences

Allison, L., Wallace, C. S. and Yee, C. N. (1992). Finite-state models in the alignment of macromolecules. J Mol Evol, 35 (1), 77-89.

Apostolico, A., & Giancarlo, R. (1998). Sequence alignment in molecular biology. J Comput Biol, 5(2), 173-196.

Batzoglou, S. (2005). The many faces of sequence alignment. Brief Bioinform, 6(1), 6-22.

Brown, D. G., Li, M., & Ma, B. (2004). A tutorial of recent developments in the seeding of local alignment. J Bioinform Comput Biol, 2(4), 819-842.

Cantalloube, H., Labesse, G., Chomilier, J., Nahum, C., Cho, Y. Y., Chams, V., Achour, A., Lachgar, A., Mbika, J. P., Issing, W., et al. (1995). Automat and BLAST: comparison of two protein sequence similarity search programs. Comput Appl Biosci 11 (3), 261-72.

Chao, K. M., Zhang, J., Ostell, J. and Miller, W. (1995). A local alignment tool for very long DNA sequences. Comput Appl Biosci 11 (2), 147-53.

Dayhoff, M. O., Schwartz, R. M. and Orcutt, B. C. (1978). A model of evolutionary change in proteins. Atlas of Protein Structure 1978, 345-352.

Dayhoff, M. O., Barker, W. C. and Hunt, L. T. (1983). Establishing Homologies in Protein Sequences, in Methods in Enzymology, 91, 524-545.

DeLisi, C. and Kanehisa, M. (1984). Assessing the Significance of Local Sequence Homologies. Mathematical Biosciences 69, 77-85.

Doolittle, R. F. (1981). Similar amino acid sequences: chance or common ancestry? Science 214, 149-158.

Doolittle, R. F. (1986). Of Urfs and Orfs: A Primer on How to Analyze Derived Amino Acid Sequences. Mill Valley, California: University Science Books.

Doolittle, R. F. (1994). Protein sequence comparisons: searching databases and aligning sequences. Curr Opin Biotechnol, 5(1), 24-28.

Edwards, Y. J., & Cottage, A. (2003). Bioinformatics methods to predict protein structure and function. A practical approach. Mol Biotechnol, 23(2), 139-166.

Feng, D. F., Johnson, M. S. and Doolittle, R. F. (1985). Aligning amino acid sequences: comparison of commonly used methods. J. Mol. Evol. 21, 112-125.

Frazer, K. A., Elnitski, L., Church, D. M., Dubchak, I., & Hardison, R. C. (2003). Cross-species sequence comparisons: a review of methods and available resources. Genome Res, 13(1), 1-12.

Giegerich, R. (2000). A systematic approach to dynamic programming in bioinformatics. Bioinformatics, 16(8), 665-677.

Gribskov, M. (1994). Profile analysis. Methods Mol Biol 25, 247-66.

Grice, J. A., Hughey, R. and Speck, D. (1995). Parallel sequence alignment in limited space. Ismb 3, 145-53.

Hardison, R., & Miller, W. (1993). Use of long sequence alignments to study the evolution and regulation of mammalian globin gene clusters. Mol Biol Evol, 10(1), 73-102.

Henikoff, S. (1996). Scores for sequence searches and alignments. Curr Opin Struct Biol, 6(3), 353-360.

Heringa, J. (2000). Computational methods for protein secondary structure prediction using multiple sequence alignments. Curr Protein Pept Sci, 1(3), 273-301.

Huang, X. Q., Hardison, R. C. and Miller, W. (1990). A space-efficient algorithm for local similarities. Comput Appl Biosci 6 (4), 373-81.

Katayama, S., Kanamori, M., & Hayashizaki, Y. (2004). Integrated analysis of the genome and the transcriptome by FANTOM. Brief Bioinform, 5(3), 249-258.

Krogh, A., Brown, M., Mian, I. S., Sjolander, K. and Haussler, D. (1994). Hidden Markov models in computational biology. Applications to protein modeling. J Mol Biol 235 (5), 1501-31.

Landes, C. and Risler, J. L. (1994). Fast databank searching with a reduced amino-acid alphabet. Comput Appl Biosci 10 (4), 453-4.

Lawrence, C. E. and Reilly, A. A. (1990). An expectation maximization (EM) algorithm for the identification and characterization of common sites in unaligned biopolymer sequences. Proteins, 7 (1), 41-51.

Metzler, D. (2006). Robust E-values for gapped local alignments. J Comput Biol, 13(4), 882-896.

Needleman, S. B. and Wunsch, C. D. (1970). A general method applicable to the search for similarities in the amino acid sequence of two proteins. J. Mol. Biol. 48, 443-453.

Pearson, W. R. and Miller, W. (1992). Dynamic programming algorithms for biological sequence comparison. Methods Enzymol, 210, 575-601.

Pearson, W. R., & Sierk, M. L. (2005). The limits of protein sequence comparison? Curr Opin Struct Biol, 15(3), 254-260.

Pearson, W. R. (1991). Searching protein sequence libraries: comparison of the sensitivity and selectivity of the Smith-Waterman and FASTA algorithms. Genomics, 11 (3), 635-50.

Pearson, W. R. (1995). Comparison of methods for searching protein sequence databases. Protein Sci 4 (6), 1145-60.

Phillips, A. J. (2006). Homology assessment and molecular sequence alignment. J Biomed Inform, 39(1), 18-33.

Probst, W. C., Snyder, L. A., Schuster, D. I., Brosius, J., & Sealfon, S. C. (1992). Sequence alignment of the G-protein coupled receptor superfamily. DNA Cell Biol, 11(1), 1-20.

Rechid, R., Vingron, M. and Argos, P. (1989). A new interactive protein sequence alignment program and comparison of its results with widely used algorithms. Comput Appl Biosci, 5 (2), 107-13.

Resenchuk, S. M. and Blinov, V. M. (1995). ALIGNMENT SERVICE: creation and processing of alignments of sequences of unlimited length. Comput Appl Biosci 11 (1), 7-11.

Reeck, G. R., de Haen, C., Teller, D. C., Doolittle, R. F., Fitch, W. M., Dickerson, R. E. (1987). "Homology" in Proteins and Nucleic Acids: A Terminology Muddle and a Way out of It. Cell 50, 667.

Searls, D. B. and Murphy, K. P. (1995). Automata-theoretic models of mutation and alignment. Ismb 3, 341-9.

Smith, T. F. (1999). The art of matchmaking: sequence alignment methods and their structural implications. Structure, 7(1), R7-R12.

Smith, T. F. and Waterman, M. (1981). Identification of common molecular subsequences. J. Mol. Biol. 147, 195-197.

Smith, T., Waterman, M. and Fitch, W. (1981). Comparative biosequence metrics. J. Mol. Evol. 18, 38-46.

Streletc, V. B., Shindyalov, I. N., Kolchanov, N. A. and Milanesi, L. (1992). Fast, statistically based alignment of amino acid sequences on the base of diagonal fragments of DOT-matrices. Comput Appl Biosci, 8 (6), 529-34.

Vinga, S., & Almeida, J. (2003). Alignment-free sequence comparison-a review. Bioinformatics, 19(4), 513-523.

Vingron, M. (1996). Near-optimal sequence alignment. Curr Opin Struct Biol, 6(3), 346-352.

Waterman, M. S., Eggert, M. and Lander, E. (1992). Parametric sequence comparisons. Proc Natl Acad Sci U S A, 89 (13), 6090-3.

Scoring Systems

Allison, L. (1993). Normalization of affine gap costs used in optimal sequence alignment. J Theor Biol 161 (2), 263-9.

Altschul, S. F. (1993). A protein alignment scoring system sensitive at all evolutionary distances. J Mol Evol 36 (3), 290-300.

Benner, S. A., Cohen, M. A. and Gonnet, G. H. (1993). Empirical and structural models for insertions and deletions in the divergent evolution of proteins. J Mol Biol 229 (4), 1065-82.

Brutlag, D. L., Dautricourt, J. P., Maulik, S. and Relph, J. (1990). Improved sensitivity of biological sequence database searches. Comput Appl Biosci 6 (3), 237-45.

Gonnet, G. H., Cohen, M. A. and Benner, S. A. (1992). Exhaustive Matching of the Entire Protein Sequence Database. Science 256 (5062), 1443-5.

Henikoff, S. and Henikoff, J. G. (1993). Performance evaluation of amino acid substitution matrices. Proteins, 17(1), 49-61.

Henikoff, S. (1996). Scores for sequence searches and alignments. Curr Opin Struct Biol, 6(3), 353-360.

Johnson, M. S., Overington, J. P. and Blundell, T. L. (1993). Alignment and searching for common protein folds using a data bank of structural templates. J Mol Biol 231 (3), 735-52.

Jones, D. T., Taylor, W. R. and Thornton, J. M. (1992). The rapid generation of mutation data matrices from protein sequences. Comput Appl Biosci, 8 (3), 275-82.

Luthy, R., McLachlan, A. D. and Eisenberg, D. (1991). Secondary structure-based profiles: use of structure-conserving scoring tables in searching protein sequence databases for structural similarities. Proteins 10 (3), 229-239.

Overington, J., Donnelly, D., Johnson, M. S., Sali, A. and Blundell, T. L. (1992). Environment-specific amino acid substitution tables: tertiary templates and prediction of protein folds. Protein Sci 1 (2), 216-26.

Schwartz, R. M. and Dayhoff, M. O. (1979). Matrices for Detecting Distant Relationships. Atlas of Protein Structure 5 (Suppl. 3), 353-358.

Vogt, G., Etzold, T. and Argos, P. (1995). An assessment of amino acid exchange matrices in aligning protein sequences: the twilight zone revisited. J Mol Biol, 249(4), 816-31.

Wilbur, W. J. (1985). On the PAM matrix model of protein evolution. Mol Biol Evol 2 (5), 434-47.

Zhu, Z. Y., Sali, A. and Blundell, T. L. (1992). A variable gap penalty function and feature weights for protein 3-D structure comparisons. Protein Eng 5 (1), 43-51.

Aligning Sequences to Structures

Bryant, S. H. and Altschul, S. F. (1995). Statistics of sequence-structure threading. Curr Opin Struct Biol 5 (2), 236-44.

Casari, G., Sander, C. and Valencia, A. (1995). A method to predict functional residues in proteins. Nat Struct Biol 2 (2), 171-8.

Diederichs, K. (1995). Structural superposition of proteins with unknown alignment and detection of topological similarity using a six-dimensional search algorithm. Proteins 23 (2), 187-95.

Fischer, D., Rice, D., Bowie, J. U. and Eisenberg, D. (1996). Assigning amino acid sequences to 3-dimensional protein folds. Faseb J 10 (1), 126-36.

Godzik, A. and Skolnick, J. (1994). Flexible algorithm for direct multiple alignment of protein structures and sequences. Comput Appl Biosci 10 (6), 587-96.

Holm, L. and Sander, C. (1993). Protein structure comparison by alignment of distance matrices. J Mol Biol 233 (1), 123-38.

Holm, L. and Sander, C. (1996). The FSSP database: fold classification based on structure-structure alignment of proteins. Nucleic Acids Res. 24 (1), 206-209.

Lathrop, R. H. and Smith, T. F. (1996). Global optimum protein threading with gapped alignment and empirical pair score functions. J Mol Biol 255 (4), 641-65.

Miller, R. T., Jones, D. T. and Thornton, J. M. (1996). Protein fold recognition by sequence threading: tools and assessment techniques. Faseb J 10 (1), 171-8.

Rost, B. and Sander, C. (1994). Structure prediction of proteins--where are we now? Curr Opin Biotechnol 5 (4), 372-80.

Rost, B. (1995). TOPITS: threading one-dimensional predictions into three-dimensional structures. Ismb 3, 314-21.

Sayle, R., Saqi, M., Weir, M. and Lyall, A. (1995). PdbAlign, PdbDist and DistAlign: tools to aid in relating sequence variability to structure. Comput Appl Biosci 11 (5), 571-3.

Schneider, R. and Sander, C. (1996). The HSSP database of protein structure-sequence alignments. Nucleic Acids Res. 24 (1), 201-205.

Wilmanns, M. and Eisenberg, D. (1995). Inverse protein folding by the residue pair preference profile method: estimating the correctness of alignments of structurally compatible sequences. Protein Eng 8 (7), 627-39.
