Result: Machine Learning-Based Approach for Classifying the Source Code Using Programming Keywords.

Title:
Machine Learning-Based Approach for Classifying the Source Code Using Programming Keywords.
Source:
IUP Journal of Information Technology; Mar2022, Vol. 18 Issue 1, p7-25, 19p
Database:
Complementary Index

More Information

The implementation phase is one of the most critical periods in software development. Developers write new source code or reuse functionality from existing source code according to the system's requirements. Most developers spend more time searching and navigating existing source code than writing new code. An efficient method for quickly locating source code functionality is therefore essential. Topic modeling of source code is an approach used to extract topics from source code. Many topic modeling approaches have been implemented using statistical techniques, which have several drawbacks: their results rely on informal code elements such as identifier names and comments. Our novel approach uses a machine-learning algorithm to address these issues, so that the identified functionality depends only on the algorithm or the syntax of the source code. Three functionalities implemented in Java (prime number, Fibonacci number, and selection sort) were evaluated in this study. A Java parser library is used to extract the source code elements, and an algorithm is created to build the count matrix of the source code features. The dataset was then fed to three models: Artificial Neural Network (ANN), Random Forest (RF), and an Ensemble Approach. The Ensemble Approach achieved 96.7% accuracy, surpassing both ANN and RF. [ABSTRACT FROM AUTHOR]
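
The count matrix mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract states that a Java parser library extracts the source code elements, whereas the sketch below swaps in a rough regular-expression tokenizer. The keyword list `KEYWORDS` and the function names `keyword_counts` and `count_matrix` are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical feature vocabulary: a small subset of Java reserved words.
# The feature set actually used in the paper is not given in the abstract.
KEYWORDS = ["for", "while", "if", "else", "return", "int", "boolean"]
KEYWORD_SET = set(KEYWORDS)

# Approximation: matches line comments, block comments, and double-quoted
# string literals so that keywords inside them are not counted.
NON_CODE = re.compile(r'//[^\n]*|/\*.*?\*/|"(?:\\.|[^"\\])*"', re.DOTALL)

# Identifier-like tokens (keywords are a subset of these).
TOKEN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def keyword_counts(source):
    """Return one feature row: the count of each keyword in KEYWORDS."""
    cleaned = NON_CODE.sub(" ", source)
    counts = Counter(t for t in TOKEN.findall(cleaned) if t in KEYWORD_SET)
    return [counts[k] for k in KEYWORDS]


def count_matrix(sources):
    """Return one row of keyword counts per source snippet."""
    return [keyword_counts(s) for s in sources]
```

Each row of the resulting matrix ignores identifier names and comments, which matches the abstract's stated goal of depending only on the syntax of the code. Such rows could then be passed, together with functionality labels, to any classifier; the model configurations used in the study are not described in the abstract.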

Copyright of IUP Journal of Information Technology is the property of IUP Publications and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)