Result: A Cross-Project Defect Prediction Approach Based on Code Semantics and Cross-Version Structural Information.

Title:
A Cross-Project Defect Prediction Approach Based on Code Semantics and Cross-Version Structural Information.
Authors:
Zou, Yifan (ivan@hrbeu.edu.cn); Wang, Huiqiang (wanghuiqiang@hrbeu.edu.cn); Lv, Hongwu (lvhongwu@hrbeu.edu.cn); Zhao, Shuai (zhaoshuai@hrbeu.edu.cn); Tian, Haoye (haoye.tian@unimelb.edu.au)
Source:
International Journal of Software Engineering & Knowledge Engineering. Jul 2024, Vol. 34, Issue 7, pp. 1135-1171. 37 pp.
Database:
Business Source Premier

Further Information

Context: Cross-project defect prediction (CPDP) has gained significant attention from the research community because of its potential for adoption by industry in realistic scenarios. Existing CPDP studies use static statistical features designed by experts, which might not capture the semantic and structural aspects of software, resulting in low defect prediction accuracy. They also tend to overlook the valuable iterative information brought about by version updates in mature software projects. Objective: This paper introduces DETECTOR, a novel CPDP approach based on coDE semanTic and cross-vErsion struCTural infORmation, to leverage cross-version features of the software and improve the performance of CPDP. Methods: DETECTOR parses source code to obtain Abstract Syntax Trees (ASTs) and a cross-version software network (Cross-SN) that consists of an internal class dependency network and cross-version class dependency edges. It uses an attention-based Bi-LSTM and simplified graph convolutional neural networks to automatically extract software features from the ASTs and the Cross-SN. The extracted features are fused using gate(⋅) to generate more effective cross-version features. Finally, a source project is selected to provide the data used to train the classifier that predicts defects. Results: Empirical studies on seven open-source Java projects show that: (1) DETECTOR outperforms the state-of-the-art models in CPDP; (2) the proposed cross-version dependency edges contribute positively to DETECTOR's performance; (3) gate(⋅) outperforms existing feature fusion strategies; (4) more multi-version information enhances DETECTOR's performance. Conclusion: DETECTOR can predict more defects in CPDP and improve the accuracy and effectiveness of prediction. [ABSTRACT FROM AUTHOR]
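
The abstract does not spell out how the structural features are propagated or how gate(⋅) is defined, so the following is only an illustrative sketch under stated assumptions, not the authors' implementation. It assumes the "simplified graph convolution" is the common form S^K X with a symmetrically normalized adjacency matrix, and that the gate is a sigmoid-weighted convex combination of semantic and structural features of equal dimension. The function names, the toy four-node Cross-SN, and the random weights are all hypothetical.

```python
import numpy as np

def sgc_propagate(adj, feats, k=2):
    """Assumed simplified graph convolution: S^k X,
    with S = D^-1/2 (A + I) D^-1/2 (self-loops added)."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    s = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    out = feats
    for _ in range(k):
        out = s @ out
    return out

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(sem, struct, w, b):
    """Assumed gate: g = sigmoid([sem; struct] W + b),
    fused = g * sem + (1 - g) * struct (element-wise)."""
    g = sigmoid(np.concatenate([sem, struct], axis=1) @ w + b)
    return g * sem + (1.0 - g) * struct

# Hypothetical toy Cross-SN with nodes A_v1, B_v1, A_v2, B_v2:
# internal dependency edges A-B in each version, plus cross-version
# edges linking each class to its next-version counterpart.
# Treated as undirected here for simplicity (an assumption).
adj = np.array([[0, 1, 1, 0],
                [1, 0, 0, 1],
                [1, 0, 0, 1],
                [0, 1, 1, 0]], dtype=float)

rng = np.random.default_rng(0)
dim = 3
sem = rng.normal(size=(4, dim))         # stand-in for AST/Bi-LSTM features
node_feats = rng.normal(size=(4, dim))  # stand-in for initial node features
struct = sgc_propagate(adj, node_feats, k=2)

w = rng.normal(size=(2 * dim, dim))     # gate weights (learned in practice)
b = np.zeros(dim)
fused = gated_fusion(sem, struct, w, b)
print(fused.shape)  # (4, 3)
```

In a real pipeline the gate weights would be trained jointly with the classifier; here they are random only to keep the sketch self-contained.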

Copyright of International Journal of Software Engineering & Knowledge Engineering is the property of World Scientific Publishing Company and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)