Computational Molecular Biology

October 13, 2009
Doug Brutlag
Alignment of Biological Sequences
Allison, L., Wallace, C. S. and Yee, C. N. (1992). Finite-state models in the alignment of macromolecules. J Mol Evol, 35 (1), 77-89.
Apostolico, A., & Giancarlo, R. (1998). Sequence alignment in molecular biology. J Comput Biol, 5(2), 173-196.
Batzoglou, S. (2005). The many faces of sequence alignment. Brief bioinform, 6(1), 6-22.
Brown, D. G., Li, M., & Ma, B. (2004). A tutorial of recent developments in the seeding of local alignment. J Bioinform Comput Biol, 2(4), 819-842.
Cantalloube, H., Labesse, G., Chomilier, J., Nahum, C., Cho, Y. Y., Chams, V., Achour, A., Lachgar, A., Mbika, J. P., Issing, W. et al. (1995). Automat and BLAST: comparison of two protein sequence similarity search programs. Comput Appl Biosci 11 (3), 261-72.
Chao, K. M., Zhang, J., Ostell, J. and Miller, W. (1995). A local alignment tool for very long DNA sequences. Comput Appl Biosci 11 (2), 147-53.
Dayhoff, M. O., Schwartz, R. M. and Orcutt, B. C. (1978). A model of evolutionary change in proteins. Atlas of Protein Sequence and Structure 5 (Suppl. 3), 345-352.
Dayhoff, M. O., Barker, W. C. and Hunt, L. T. (1983). Establishing Homologies in Protein Sequences, in Methods in Enzymology, 91, 524-545.
DeLisi, C. and Kanehisa, M. (1984). Assessing the Significance of Local Sequence Homologies. Mathematical Biosciences 69, 77-85.
Doolittle, R. and Fairchild. (1981). Similar amino acid sequences: chance or common ancestry? Science 214, 149-158.
Doolittle, R. F. (1986). Of Urfs and Orfs: A Primer on How to Analyze Derived Amino Acid Sequences. Mill Valley, California: University Science Books.
Doolittle, R. F. (1994). Protein sequence comparisons: searching databases and aligning sequences. Curr Opin Biotechnol, 5(1), 24-28.
Edwards, Y. J., & Cottage, A. (2003). Bioinformatics methods to predict protein structure and function. A practical approach. Mol Biotechnol, 23(2), 139-166.
Feng, D. F., Johnson, M. S. and Doolittle, R. F. (1985). Aligning amino acid sequences: comparison of commonly used methods. J. Mol. Evol. 21, 112-125.
Frazer, K. A., Elnitski, L., Church, D. M., Dubchak, I., & Hardison, R. C. (2003). Cross-species sequence comparisons: a review of methods and available resources. Genome Res, 13(1), 1-12.
Giegerich, R. (2000). A systematic approach to dynamic programming in bioinformatics. Bioinformatics, 16(8), 665-677.
Gribskov, M. (1994). Profile analysis. Methods Mol Biol 25, 247-66.
Grice, J. A., Hughey, R. and Speck, D. (1995). Parallel sequence alignment in limited space. Ismb 3, 145-53.
Hardison, R., & Miller, W. (1993). Use of long sequence alignments to study the evolution and regulation of mammalian globin gene clusters. Mol Biol Evol, 10(1), 73-102.
Henikoff, S. (1996). Scores for sequence searches and alignments. Curr Opin Struct Biol, 6(3), 353-360.
Heringa, J. (2000). Computational methods for protein secondary structure prediction using multiple sequence alignments. Curr Protein Pept Sci, 1(3), 273-301.
Huang, X. Q., Hardison, R. C. and Miller, W. (1990). A space-efficient algorithm for local similarities. Comput Appl Biosci 6 (4), 373-81.
Katayama, S., Kanamori, M., & Hayashizaki, Y. (2004). Integrated analysis of the genome and the transcriptome by FANTOM. Brief Bioinform, 5(3), 249-258.
Krogh, A., Brown, M., Mian, I. S., Sjolander, K. and Haussler, D. (1994). Hidden Markov models in computational biology. Applications to protein modeling. J Mol Biol 235 (5), 1501-31.
Landes, C. and Risler, J. L. (1994). Fast databank searching with a reduced amino-acid alphabet. Comput Appl Biosci 10 (4), 453-4.
Lawrence, C. E. and Reilly, A. A. (1990). An expectation maximization (EM) algorithm for the identification and characterization of common sites in unaligned biopolymer sequences. Proteins, 7 (1), 41-51.
Metzler, D. (2006). Robust E-values for gapped local alignments. J Comput Biol, 13(4), 882-896.
Needleman, S. B. and Wunsch, C. D. (1970). A general method applicable to the search for similarities in the amino acid sequence of two proteins. J. Mol. Biol. 48, 443-453.
Pearson, W. R. and Miller, W. (1992). Dynamic programming algorithms for biological sequence comparison. Methods Enzymol, 210, 575-601.
Pearson, W. R., & Sierk, M. L. (2005). The limits of protein sequence comparison? Curr Opin Struct Biol, 15(3), 254-260.
Pearson, W. R. (1991). Searching protein sequence libraries: comparison of the sensitivity and selectivity of the Smith-Waterman and FASTA algorithms. Genomics, 11 (3), 635-50.
Pearson, W. R. (1995). Comparison of methods for searching protein sequence databases. Protein Sci 4 (6), 1145-60.
Phillips, A. J. (2006). Homology assessment and molecular sequence alignment. J Biomed Inform, 39(1), 18-33.
Probst, W. C., Snyder, L. A., Schuster, D. I., Brosius, J., & Sealfon, S. C. (1992). Sequence alignment of the G-protein coupled receptor superfamily. DNA Cell Biol, 11(1), 1-20.
Rechid, R., Vingron, M. and Argos, P. (1989). A new interactive protein sequence alignment program and comparison of its results with widely used algorithms. Comput Appl Biosci, 5 (2), 107-13.
Resenchuk, S. M. and Blinov, V. M. (1995). ALIGNMENT SERVICE: creation and processing of alignments of sequences of unlimited length. Comput Appl Biosci 11 (1), 7-11.
Reeck, G. R., de Haen, C., Teller, D. C., Doolittle, R. F., Fitch, W. M., Dickerson, R. E. et al. (1987). "Homology" in Proteins and Nucleic Acids: A Terminology Muddle and a Way out of It. Cell 50, 667.
Searls, D. B. and Murphy, K. P. (1995). Automata-theoretic models of mutation and alignment. Ismb 3, 341-9.
Smith, T. F. (1999). The art of matchmaking: sequence alignment methods and their structural implications. Structure, 7(1), R7-R12.
Smith, T. F. and Waterman, M. (1981). Identification of common molecular subsequences. J. Mol. Biol. 147, 195-197.
Smith, T., Waterman, M. and Fitch, W. (1981). Comparative biosequence metrics. J. Mol. Evol. 18, 38-46.
Streletc, V. B., Shindyalov, I. N., Kolchanov, N. A. and Milanesi, L. (1992). Fast, statistically based alignment of amino acid sequences on the base of diagonal fragments of DOT-matrices. Comput Appl Biosci, 8 (6), 529-34.
Vinga, S., & Almeida, J. (2003). Alignment-free sequence comparison-a review. Bioinformatics, 19(4), 513-523.
Vingron, M. (1996). Near-optimal sequence alignment. Curr Opin Struct Biol, 6(3), 346-352.
Waterman, M. S., Eggert, M. and Lander, E. (1992). Parametric sequence comparisons. Proc Natl Acad Sci U S A, 89 (13), 6090-3.
Scoring Systems
Allison, L. (1993). Normalization of affine gap costs used in optimal sequence alignment. J Theor Biol 161 (2), 263-9.
Altschul, S. F. (1993). A protein alignment scoring system sensitive at all evolutionary distances. J Mol Evol 36 (3), 290-300.
Benner, S. A., Cohen, M. A. and Gonnet, G. H. (1993). Empirical and structural models for insertions and deletions in the divergent evolution of proteins. J Mol Biol 229 (4), 1065-82.
Brutlag, D. L., Dautricourt, J. P., Maulik, S. and Relph, J. (1990). Improved sensitivity of biological sequence database searches. Comput Appl Biosci 6 (3), 237-45.
Gonnet, G. H., Cohen, M. A. and Benner, S. A. (1992). Exhaustive Matching of the Entire Protein Sequence Database. Science 256 (5062), 1443-5.
Henikoff, S. and Henikoff, J. G. (1993). Performance evaluation of amino acid substitution matrices. Proteins, 17(1), 49-61.
Henikoff, S. (1996). Scores for Sequence Searches. Current Opinion in Structural Biology 6 (3), 353-360.
Johnson, M. S., Overington, J. P. and Blundell, T. L. (1993). Alignment and searching for common protein folds using a data bank of structural templates. J Mol Biol 231 (3), 735-52.
Jones, D. T., Taylor, W. R. and Thornton, J. M. (1992). The rapid generation of mutation data matrices from protein sequences. Comput Appl Biosci, 8 (3), 275-82.
Luthy, R., McLachlan, A. D. and Eisenberg, D. (1991). Secondary structure-based profiles: use of structure-conserving scoring tables in searching protein sequence databases for structural similarities. Proteins 10 (3), 229-239.
Overington, J., Donnelly, D., Johnson, M. S., Sali, A. and Blundell, T. L. (1992). Environment-specific amino acid substitution tables: tertiary templates and prediction of protein folds. Protein Sci 1 (2), 216-26.
Schwartz, R. M. and Dayhoff, M. O. (1979). Matrices for Detecting Distant Relationships. Atlas of Protein Sequence and Structure 5 (Suppl. 3), 353-358.
Vogt, G., Etzold, T. and Argos, P. (1995). An assessment of amino acid exchange matrices in aligning protein sequences: the twilight zone revisited. J Mol Biol, 249(4), 816-31.
Wilbur, W. J. (1985). On the PAM matrix model of protein evolution. Mol Biol Evol 2 (5), 434-47.
Zhu, Z. Y., Sali, A. and Blundell, T. L. (1992). A variable gap penalty function and feature weights for protein 3-D structure comparisons. Protein Eng 5 (1), 43-51.
Aligning Sequences to Structures
Bryant, S. H. and Altschul, S. F. (1995). Statistics of sequence-structure threading. Curr Opin Struct Biol 5 (2), 236-44.
Casari, G., Sander, C. and Valencia, A. (1995). A method to predict functional residues in proteins. Nat Struct Biol 2 (2), 171-8.
Diederichs, K. (1995). Structural superposition of proteins with unknown alignment and detection of topological similarity using a six-dimensional search algorithm. Proteins 23 (2), 187-95.
Fischer, D., Rice, D., Bowie, J. U. and Eisenberg, D. (1996). Assigning amino acid sequences to 3-dimensional protein folds. Faseb J 10 (1), 126-36.
Godzik, A. and Skolnick, J. (1994). Flexible algorithm for direct multiple alignment of protein structures and sequences. Comput Appl Biosci 10 (6), 587-96.
Holm, L. and Sander, C. (1993). Protein structure comparison by alignment of distance matrices. J Mol Biol 233 (1), 123-38.
Holm, L. and Sander, C. (1996). The FSSP database: fold Classification based on structure-structure alignment of proteins. Nucleic Acids Res. 24 (1), 206-209.
Lathrop, R. H. and Smith, T. F. (1996). Global optimum protein threading with gapped alignment and empirical pair score functions. J Mol Biol 255 (4), 641-65.
Miller, R. T., Jones, D. T. and Thornton, J. M. (1996). Protein fold recognition by sequence threading: tools and assessment techniques. Faseb J 10 (1), 171-8.
Rost, B. and Sander, C. (1994). Structure prediction of proteins--where are we now? Curr Opin Biotechnol 5 (4), 372-80.
Rost, B. (1995). TOPITS: threading one-dimensional predictions into three-dimensional structures. Ismb 3, 314-21.
Sayle, R., Saqi, M., Weir, M. and Lyall, A. (1995). PdbAlign, PdbDist and DistAlign: tools to aid in relating sequence variability to structure. Comput Appl Biosci 11 (5), 571-3.
Schneider, R. and Sander, C. (1996). The HSSP database of protein structure-sequence alignments. Nucleic Acids Res. 24 (1), 201-205.
Wilmanns, M. and Eisenberg, D. (1995). Inverse protein folding by the residue pair preference profile method: estimating the correctness of alignments of structurally compatible sequences. Protein Eng 8 (7), 627-39.