Result: Authorship attribution of source code by using back propagation neural network based on particle swarm optimization.

Title:
Authorship attribution of source code by using back propagation neural network based on particle swarm optimization.
Authors:
Yang X; National Engineering Lab for Mobile Network Technologies, Beijing University of Posts and Telecommunications, Beijing, China.
Xu G; National Engineering Lab for Mobile Network Technologies, Beijing University of Posts and Telecommunications, Beijing, China.
Li Q; National Engineering Lab for Mobile Network Technologies, Beijing University of Posts and Telecommunications, Beijing, China.
Guo Y; National Engineering Lab for Mobile Network Technologies, Beijing University of Posts and Telecommunications, Beijing, China.
Zhang M; National Engineering Lab for Mobile Network Technologies, Beijing University of Posts and Telecommunications, Beijing, China.
Source:
PloS one [PLoS One] 2017 Nov 02; Vol. 12 (11), pp. e0187204. Date of Electronic Publication: 2017 Nov 02 (Print Publication: 2017).
Publication Type:
Journal Article
Language:
English
Journal Info:
Publisher: Public Library of Science
Country of Publication: United States
NLM ID: 101285081
Publication Model: eCollection
Cited Medium: Internet
ISSN: 1932-6203 (Electronic)
Linking ISSN: 19326203
NLM ISO Abbreviation: PLoS One
Subsets: MEDLINE
Imprint Name(s):
Original Publication: San Francisco, CA : Public Library of Science
References:
PLoS One. 2015 Jun 23;10(6):e0129363. (PMID: 26103634)
Grant Information:
R01 AA017202 United States AA NIAAA NIH HHS
Entry Date(s):
Date Created: 20171103
Date Completed: 20171204
Latest Revision: 20201214
Update Code:
20250114
PubMed Central ID:
PMC5667828
DOI:
10.1371/journal.pone.0187204
PMID:
29095934
Database:
MEDLINE

Further Information

Authorship attribution is the task of identifying the most likely author of a given sample among a set of known candidate authors. It can be applied not only to discover the original author of plain text, such as novels, blogs, emails, and posts, but also to identify the programmers of source code. Authorship attribution of source code is required in diverse applications, ranging from malicious code tracking to resolving authorship disputes and detecting software plagiarism. This paper proposes a new method to identify the programmer of Java source code samples with higher accuracy. To this end, it first introduces a back propagation (BP) neural network based on particle swarm optimization (PSO) into authorship attribution of source code. The method begins by computing a set of defined feature metrics, including lexical and layout metrics and structure and syntax metrics, 19 dimensions in total. These metrics are then fed to a neural network for supervised learning, the weights of which are produced by a hybrid PSO and BP algorithm. The effectiveness of the proposed method is evaluated on a collected dataset of 3,022 Java files belonging to 40 authors. Experimental results show that the proposed method achieves 91.060% accuracy. A comparison with previous work on authorship attribution of Java source code shows that the proposed method outperforms the others overall, with an acceptable overhead.
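The abstract does not spell out how PSO and BP are combined. One common arrangement, which may differ from the authors' scheme, is to let PSO search for a good starting weight vector and then refine it with back propagation. The sketch below illustrates that arrangement under loudly stated assumptions: the toy data, the two made-up metrics per file, the two-author (single-output) setup, the network size, and every hyperparameter are hypothetical and are not taken from the paper, which uses 19 metrics and 40 authors.

```python
import math
import random

def forward(w, x, n_in, n_hid):
    """One hidden layer of sigmoid units; w is a flat weight list (biases included)."""
    sig = lambda s: 0.5 * (1.0 + math.tanh(0.5 * s))  # overflow-safe sigmoid
    hid = []
    for j in range(n_hid):
        base = j * (n_in + 1)
        hid.append(sig(w[base + n_in] + sum(w[base + i] * x[i] for i in range(n_in))))
    base = n_hid * (n_in + 1)
    out = sig(w[base + n_hid] + sum(w[base + j] * hid[j] for j in range(n_hid)))
    return hid, out

def mse(w, data, n_in, n_hid):
    return sum((forward(w, x, n_in, n_hid)[1] - y) ** 2 for x, y in data) / len(data)

def pso(data, n_in, n_hid, rng, particles=12, iters=40, inertia=0.7, c1=1.5, c2=1.5):
    """Global-best PSO over the flat weight vector (hyperparameters are assumptions)."""
    dim = n_hid * (n_in + 1) + n_hid + 1
    pos = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(particles)]
    vel = [[0.0] * dim for _ in range(particles)]
    best_pos = [p[:] for p in pos]
    best_val = [mse(p, data, n_in, n_hid) for p in pos]
    g = min(range(particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]
    start_val = g_val  # best loss among the random initial particles
    for _ in range(iters):
        for k in range(particles):
            for d in range(dim):
                vel[k][d] = (inertia * vel[k][d]
                             + c1 * rng.random() * (best_pos[k][d] - pos[k][d])
                             + c2 * rng.random() * (g_pos[d] - pos[k][d]))
                pos[k][d] += vel[k][d]
            val = mse(pos[k], data, n_in, n_hid)
            if val < best_val[k]:
                best_pos[k], best_val[k] = pos[k][:], val
                if val < g_val:
                    g_pos, g_val = pos[k][:], val
    return g_pos, start_val, g_val

def backprop(w, data, n_in, n_hid, epochs=200, lr=0.5):
    """Per-sample gradient descent starting from w; returns the best weights seen."""
    w = w[:]
    best_w, best_val = w[:], mse(w, data, n_in, n_hid)
    out_base = n_hid * (n_in + 1)
    for _ in range(epochs):
        for x, y in data:
            hid, out = forward(w, x, n_in, n_hid)
            d_out = (out - y) * out * (1 - out)
            # Hidden deltas are computed before the output weights are updated.
            d_hid = [d_out * w[out_base + j] * hid[j] * (1 - hid[j]) for j in range(n_hid)]
            for j in range(n_hid):
                w[out_base + j] -= lr * d_out * hid[j]
                base = j * (n_in + 1)
                for i in range(n_in):
                    w[base + i] -= lr * d_hid[j] * x[i]
                w[base + n_in] -= lr * d_hid[j]
            w[out_base + n_hid] -= lr * d_out
        val = mse(w, data, n_in, n_hid)
        if val < best_val:
            best_w, best_val = w[:], val
    return best_w, best_val

rng = random.Random(0)
# Toy data: two hypothetical authors, two made-up metrics per file.
data = [([rng.gauss(0.3, 0.05), rng.gauss(0.7, 0.05)], 0.0) for _ in range(10)]
data += [([rng.gauss(0.7, 0.05), rng.gauss(0.3, 0.05)], 1.0) for _ in range(10)]
w_pso, loss_start, loss_pso = pso(data, 2, 3, rng)
w_final, loss_final = backprop(w_pso, data, 2, 3)
print(loss_start, loss_pso, loss_final)
```

Because the PSO stage tracks the best position ever evaluated and the BP stage returns the best weights it has seen, the three printed losses are non-increasing by construction. That property belongs to this sketch only and says nothing about the accuracy reported in the paper.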