Result: Exact computation of protein structure similarity

Title:

Exact computation of protein structure similarity

Authors:

CHEW, L. Paul

Source:

Computational geometry (SCG'06) (June 5-7, 2006, Sedona AZ). :468-474

Publisher Information:

New York NY: ACM Press, 2006.

Publication Year:

2006

Physical Description:

print, 27 ref 1

Original Material:

INIST-CNRS

Subject Terms:

Computer science, Informatique, Sciences exactes et technologie, Exact sciences and technology, Sciences appliquees, Applied sciences, Informatique; automatique theorique; systemes, Computer science; control theory; systems, Automatique théorique. Systèmes, Control theory. Systems, Modélisation et identification, Modelling and identification, Algorithme évolutionniste, Evolutionary algorithm, Algoritmo evoluciónista, Biologie moléculaire, Molecular biology, Biología molecular, Chaîne caractère, Character string, Cadena carácter, Géométrie algorithmique, Computational geometry, Geometría computacional, Optimisation, Optimization, Optimización, Programmation discrète, Discrete programming, Programación discreta, Programmation dynamique, Dynamic programming, Programación dinámica, Protéine, Protein, Proteína, Raisonnement basé sur cas, Case based reasoning, Razonamiento fundado sobre caso, Similitude, Similarity, Similitud, Solution optimale, Optimal solution, Solución óptima, Structure protéine, Protein structure

Document Type:

Conference Conference Paper

File Description:

text

Language:

English

Author Affiliations:

Computer Science Dept Cornell University, Ithaca, NY 14853, United States

Access URL:

http://pascal-francis.inist.fr/vibad/index.php?action=search&terms=18297856

Rights:

Copyright 2006 INIST-CNRS
CC BY 4.0
Sauf mention contraire ci-dessus, le contenu de cette notice bibliographique peut être utilisé dans le cadre d’une licence CC BY 4.0 Inist-CNRS / Unless otherwise stated above, the content of this bibliographic record may be used under a CC BY 4.0 licence by Inist-CNRS / A menos que se haya señalado antes, el contenido de este registro bibliográfico puede ser utilizado al amparo de una licencia CC BY 4.0 Inist-CNRS

Notes:

Computer science; theoretical automation; systems

Accession Number:

edscal.18297856

Database:

PASCAL Archive

Further Information

A protein can be considered as a string (on the alphabet of 20 amino acids) or as a structure (each protein folds into a particular 3D configuration). Consider the following string-based problem: Given two protein strings that are not necessarily similar in their entirety, determine the most similar contiguous substrings, one from each protein. The exact meaning of most similar here is determined by the user; it is based on user-specified scores for character vs. character similarity and for character vs. space similarity. It is important to allow for spaces or gaps because evolutionary changes to proteins often involve insertion or deletion of one or more individual amino acids. For this kind of string-based similarity, the most-similar substrings can be determined in time O(mn) using Dynamic Programming (DP). The goal here is to design an algorithm for similarity of protein structures as opposed to protein strings. The inspiration for our algorithm is drawn from the DP-based similarity algorithm for strings. Instead of comparing sequences of characters, we compare sequences of vectors. One complication for working with structures instead of strings is the problem of orientation: basically, two structures that have similar shape can look different if they are at different orientations. Algorithmically, this means that we must establish the optimal orientations for our two proteins as well as finding the similar subsequences. In other words, an algorithm for similarity of structures involves both discrete optimization (to find the corresponding subsequences) and continuous optimization (to find the optimal orientation). Interestingly, if the correspondence is given then the optimal orientation (for that correspondence) is easy to find, and if the the orientation is given then the optimal correspondence (for that orientation) is easy to find. The challenge is to accomplish both optimizations at once. Note that the technique presented here produces a globally optimal solution; there are no approximations or assumptions of randomness.

Result: Exact computation of protein structure similarity

Further Information

Links

Additional functions