Result: Improving Software Defect Detection With LSTM-Based Semantic Modeling and Class Imbalance Handling

Title:
Improving Software Defect Detection With LSTM-Based Semantic Modeling and Class Imbalance Handling
Source:
IEEE Open Journal of the Computer Society, Vol 6, Pp 1501-1511 (2025)
Publisher Information:
IEEE
Publication Year:
2025
Collection:
Directory of Open Access Journals: DOAJ Articles
Document Type:
Academic journal; article in journal/newspaper
Language:
English
DOI:
10.1109/OJCS.2025.3613134
Accession Number:
edsbas.70FC35AC
Database:
BASE

Further information

Software Defect Prediction (SDP) plays a vital role in maintaining software quality, especially as modern systems grow in size and complexity. Traditional SDP models that rely on static code metrics often fail to capture the semantic and contextual relationships inherent in source code, limiting their prediction accuracy and ability to generalize across projects. In this study, we propose a Deep Learning (DL)-based approach that combines Long Short-Term Memory (LSTM) networks with semantic feature extraction to improve the effectiveness of defect prediction. Our method utilizes representations derived from Abstract Syntax Trees (ASTs) to capture structural and contextual information from the code. To address the challenge of class imbalance—common in SDP datasets—we apply the Synthetic Minority Oversampling Technique (SMOTE) and cost-sensitive learning, enhancing the model’s sensitivity to defective code components. Experiments on the PROMISE dataset, covering multiple versions of Java projects, show that our approach significantly outperforms models based solely on static metrics. Comparative analysis with recent studies further highlights the strengths of our method in capturing long-range code dependencies and improving defect detection accuracy. These results support the potential of integrating LSTM-based semantic modeling and class imbalance handling to advance the state of SDP.
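
The two class imbalance techniques named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, and the record gives no parameters: the function names, the neighbour count `k`, the toy feature vectors, and the inverse-frequency weighting formula are all assumptions chosen for illustration. The first function shows the core SMOTE idea (interpolating between a minority sample and a nearby minority sample); the second shows one common way to derive per-class costs for cost-sensitive training.

```python
import random
from collections import Counter


def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (the core SMOTE idea).
    Hypothetical helper, not taken from the paper."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count).
    Assumed weighting scheme; the abstract does not state which one is used."""
    counts = Counter(labels)
    n, c = len(labels), len(counts)
    return {cls: n / (c * cnt) for cls, cnt in counts.items()}


# Toy data: three "defective" feature vectors and an 8:2 label split.
defective = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2]]
new_points = smote_like_oversample(defective, n_new=4)
weights = balanced_class_weights([0] * 8 + [1] * 2)
```

With the 8:2 split above, the minority (defective) class receives a weight of 2.5 and the majority class 0.625, so a misclassified defective sample would contribute four times as much to a weighted loss. In practice a maintained library such as imbalanced-learn would likely be preferable to a hand-written oversampler.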