Treffer: Knowledge Extraction for RNA Secondary Structure Prediction Using Heterogeneous and Dynamic Programming.

Title:
Knowledge Extraction for RNA Secondary Structure Prediction Using Heterogeneous and Dynamic Programming.
Authors:
Gruzewski, Mateusz1 (AUTHOR), Palkowski, Marek1 (AUTHOR) mpalkowski@zut.edu.pl
Source:
Procedia Computer Science. 2025, Vol. 270, p3372-3381. 10p.
Database:
Supplemental Index

Weitere Informationen

RNA folding can be compared to the process of knowledge extraction, as it involves extracting hidden information from the RNA nucleotide sequence to predict its 3D structure. Just like in knowledge extraction from large datasets, RNA folding uses various algorithms and mathematical models to analyze input data (RNA sequence) and uncover how the RNA molecule adopts its final form. Thus, RNA folding aims to understand how genetic information encoded in the nucleotide sequence translates into a functional molecular structure. RNA folding algorithms focus on constructing a large matrix using dynamic programming, which accounts for the majority of the computations. The Nussinov-like algorithms for RNA folding are fundamental non-serial polyadic dynamic programming codes (NPDP) for testing non-uniform dependency analysis, multi-core CPUs and GPUs, loop tiling transformations, and source-to-source techniques, as well as manual approaches. In this article, we first analyze the achievements in optimizing this code over the past two decades, discussing fully automated methods based on loop tiling and the polyhedral model, manual methods involving transposition, row- and square blocking, as well as the Four Russians technique, its variations, and hybrids that reduce the algorithm's complexity. Second, we propose a novel method that extends the polyhedral model through dependency analysis, combines features of loop tiling and manual techniques, and can be applied to both GPUs and CPUs. Moreover, the adopted approach allows for optimizing other similar bioinformatics codes. Our research is based on multi-core processors and the OpenMP and OpenCL standards to demonstrate usability across multiple platforms. We will also discuss its simplicity and performance compared to other reference models. Finally, we conduct an experimental study on six multi-core machines, including both CPUs and GPUs, to demonstrate that the proposed approach outperforms related methods. [ABSTRACT FROM AUTHOR]