Result: Alternative target functions for protein structure prediction with neural networks

Title:
Alternative target functions for protein structure prediction with neural networks
Source:
Data mining and knowledge discovery: theory, tools, and technology VI (Orlando FL, 12-13 April 2004). SPIE proceedings series: 100-107
Publisher Information:
Bellingham WA: SPIE, 2004.
Publication Year:
2004
Physical Description:
print, 10 ref
Original Material:
INIST-CNRS
Subject Terms:
Electronics, Computer science, Optics, Physics, Telecommunications, Exact sciences and technology, Applied sciences, Computer science; control theory; systems, Software, Memory organisation. Data processing, Data processing. List processing. Character string processing, Gastropoda, Helicidae, Invertebrata, Mollusca, Pulmonata, Backpropagation algorithm, Database, Bioinformatics, Carbon, Function decomposition, Knowledge discovery, Data mining, Helix, Three dimensional model, Least squares method, Multilayer perceptrons, Protein, Backbone, Multilayer network, Neural network, Residue, Protein structure, Secondary structure
Document Type:
Conference Paper
File Description:
text
Language:
English
Author Affiliations:
Department of Computer Science, Georgia State University, Atlanta, GA 30303, United States
Department of Biology, Georgia State University, Atlanta, GA 30303, United States
Rights:
Copyright 2004 INIST-CNRS
CC BY 4.0
Unless otherwise stated above, the content of this bibliographic record may be used under a CC BY 4.0 licence by Inist-CNRS
Notes:
Computer science; theoretical automation; systems
Accession Number:
edscal.16107001
Database:
PASCAL Archive

Further Information

The prediction and modeling of protein structure is a central problem in bioinformatics. Neural networks have been used extensively to predict the secondary structure of proteins. While significant progress has been made by using multiple sequence data, the ability to predict secondary structure from a single sequence with a single prediction network has stagnated at an accuracy of about 75%. This implies that there is some limit to the accuracy of the prediction. To understand this behavior, we asked what happens when the target function for the prediction is changed. Instead of predicting a derived quantity, such as whether a given chain is a helix, sheet, or turn, we tested whether a more directly observed quantity, such as the distance between a pair of α-carbon atoms, could be predicted with reasonable accuracy. The α-carbon atom is central to each residue in the protein, and the distances between α-carbon atoms along the sequence define the backbone of the protein. Knowledge of the distances between the α-carbon atoms is sufficient to determine the three-dimensional structure of the protein. We trained a multilayer perceptron (MLP) feedforward neural network with backpropagation on distance data derived from the complete protein structure database (PDB). With orthogonal coding of the protein primary sequence, the root-mean-square error is 4.4 Å, while the mean of the actual output is 11.5 Å. Other coding schemes, including BLOSUM62 coding and linear coding, were tested with two further target functions: cutoff accuracy and correlation coefficient. The best correlation coefficient was achieved with the BLOSUM62 coding scheme, and the cutoff accuracy reached about 60%.
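
The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. All function names are hypothetical, the residue alphabet ordering is an arbitrary choice, and the definition of cutoff accuracy used here (the fraction of predicted distances within a chosen tolerance of the actual distance) is an assumption, since the record does not define it. The network itself is not shown.

```python
import math

# Assumption: the 20 standard amino acids in an arbitrary fixed order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def orthogonal_encode(sequence):
    """Orthogonal (one-hot) coding: one 20-element 0/1 vector per residue."""
    return [[1.0 if aa == res else 0.0 for aa in AMINO_ACIDS] for res in sequence]


def ca_distance(p, q):
    """Euclidean distance between two alpha-carbon positions (x, y, z)."""
    return math.dist(p, q)


def rmse(predicted, actual):
    """Root-mean-square error between predicted and actual distances."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def cutoff_accuracy(predicted, actual, cutoff):
    """Assumed definition: share of predictions within `cutoff` of the actual value."""
    hits = sum(1 for p, a in zip(predicted, actual) if abs(p - a) <= cutoff)
    return hits / len(actual)


def correlation(predicted, actual):
    """Pearson correlation coefficient between predicted and actual distances."""
    n = len(actual)
    mean_p = sum(predicted) / n
    mean_a = sum(actual) / n
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(predicted, actual))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    return cov / math.sqrt(var_p * var_a)


if __name__ == "__main__":
    # Made-up illustrative values, not data from the paper.
    actual = [ca_distance((0.0, 0.0, 0.0), (3.0, 4.0, 0.0)), 10.0, 12.0]
    predicted = [6.0, 9.0, 15.0]
    print(len(orthogonal_encode("ACD")[0]))
    print(rmse(predicted, actual))
    print(cutoff_accuracy(predicted, actual, cutoff=2.0))
    print(correlation(predicted, actual))
```

Comparing the error to the scale of the target helps interpret the reported figures: an RMSE of 4.4 Å against a mean actual distance of 11.5 Å is a relative error of roughly 38%.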