Result: Exploring Transfer Learning for Multilingual Software Quality: Code Smells, Bugs, and Harmful Code

Title:
Exploring Transfer Learning for Multilingual Software Quality: Code Smells, Bugs, and Harmful Code
Source:
Journal of Software Engineering Research and Development. 13
Publisher Information:
Sociedade Brasileira de Computacao - SB, 2025.
Publication Year:
2025
Document Type:
Journal Article
ISSN:
2195-1721
DOI:
10.5753/jserd.2025.4593
Rights:
CC BY
Accession Number:
edsair.doi...........8feaf8b64ca6740887b2e7df2abc8cb0
Database:
OpenAIRE

More Information

Code smells are indicators of poor design and implementation decisions that can harm software quality, so detecting them is crucial. Some studies aim to understand the impact of code smells on software quality, while others propose rule-based or machine learning-based approaches to identify them. Previous research has used machine learning techniques to label and analyze code snippets that significantly impair software quality. These snippets are classified as Clean, Smelly, Buggy, or Harmful Code. Harmful Code refers to smelly code segments that have one or more reported bugs, whether fixed or not. Consequently, the presence of Harmful Code increases the risk of introducing new defects and/or design issues during remediation. We conduct our study as an extension of that previous study, covering 5 smell types. In total, the Java, C++, C#, and Python datasets cover 641,736 commits, versions of 91 open-source projects, 17,022 bugs, and 24,737 code smells. The findings reveal promising transferability of knowledge between Java and C# across various code smell types, whereas transfer involving C++ and Python proved more challenging. Our study also found that a sample size of 32 yielded favorable outcomes for most Harmful Code types, underscoring the efficiency of transfer learning even with limited data. Moreover, transfer learning between bugs and code smells appears to be a reasonably effective avenue for software engineering research.
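
The four-way classification described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Snippet` record, its field names, and the `label` function are hypothetical. Only the Harmful rule (smelly code with at least one reported bug) is stated in the abstract; the rules for Clean, Smelly, and Buggy are assumed here by analogy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snippet:
    """Hypothetical record for one code snippet; field names are assumptions."""
    smell_count: int  # number of code smells detected in the snippet
    bug_count: int    # number of reported bugs, whether fixed or not


def label(snippet: Snippet) -> str:
    """Assign one of the four classes named in the abstract."""
    smelly = snippet.smell_count > 0
    buggy = snippet.bug_count > 0
    if smelly and buggy:
        # Stated in the abstract: smelly code with one or more reported bugs.
        return "Harmful"
    if smelly:
        return "Smelly"  # assumed: smells present, no reported bugs
    if buggy:
        return "Buggy"   # assumed: reported bugs, no smells
    return "Clean"       # assumed: neither smells nor reported bugs
```

For example, `label(Snippet(smell_count=2, bug_count=1))` falls into the Harmful class, because the snippet is both smelly and linked to a reported bug.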