An Investigation into Misuse of Java Security APIs by Large Language Models

Title:
An Investigation into Misuse of Java Security APIs by Large Language Models
Source:
Proceedings of the 19th ACM Asia Conference on Computer and Communications Security, pp. 1299-1315
Publication Status:
Preprint
Publisher Information:
ACM, 2024.
Publication Year:
2024
Document Type:
Conference Paper
DOI:
10.1145/3634737.3661134
DOI (arXiv):
10.48550/arxiv.2404.03823
Rights:
CC BY
Accession Number:
edsair.doi.dedup.....cf5e18b59cdac8d60f8e1f21947cfee9
Database:
OpenAIRE

Further Information

The increasing trend of using Large Language Models (LLMs) for code generation raises the question of their capability to generate trustworthy code. While many researchers are exploring the utility of code generation for uncovering software vulnerabilities, one crucial but often overlooked aspect is security Application Programming Interfaces (APIs). These APIs play an integral role in upholding software security, yet integrating them correctly presents substantial challenges. This leads to inadvertent misuse by developers, thereby exposing software to vulnerabilities. To overcome these challenges, developers may seek assistance from LLMs. In this paper, we systematically assess ChatGPT's trustworthiness in code generation for security API use cases in Java. To conduct a thorough evaluation, we compile an extensive collection of 48 programming tasks covering 5 widely used security APIs. We employ both automated and manual approaches to detect security API misuse in the code ChatGPT generates for these tasks. Our findings are concerning: around 70% of the code instances across 30 attempts per task contain security API misuse, with 20 distinct misuse types identified. Moreover, for roughly half of the tasks this rate reaches 100%, indicating that there is a long way to go before developers can rely on ChatGPT to implement security API code securely.
This paper has been accepted by ACM ASIACCS 2024.
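
To make the notion of a security API misuse concrete, the following Java sketch shows one pattern that is widely documented for the Java Cryptography Architecture (JCA). It is an illustrative assumption on our part, not an example drawn from the paper's 48 tasks: requesting Cipher.getInstance("AES") leaves the mode and padding choice to the provider, which on common JDKs resolves to ECB, a mode that leaks plaintext structure; naming an authenticated transformation explicitly avoids this.

import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Base64;

// Illustrative sketch only; not taken from the paper's task set.
public class CipherMisuseDemo {
    public static void main(String[] args) throws Exception {
        KeyGenerator keyGen = KeyGenerator.getInstance("AES");
        keyGen.init(256);
        SecretKey key = keyGen.generateKey();
        byte[] plaintext = "attack at dawn".getBytes(StandardCharsets.UTF_8);

        // MISUSE: "AES" alone lets the provider choose mode and padding;
        // on common JDKs this resolves to AES/ECB/PKCS5Padding, and ECB
        // encrypts equal plaintext blocks to equal ciphertext blocks.
        Cipher insecure = Cipher.getInstance("AES");
        insecure.init(Cipher.ENCRYPT_MODE, key);
        byte[] bad = insecure.doFinal(plaintext);

        // SAFER: name an authenticated transformation explicitly and use
        // a fresh random 12-byte IV with a 128-bit authentication tag.
        byte[] iv = new byte[12];
        new SecureRandom().nextBytes(iv);
        Cipher secure = Cipher.getInstance("AES/GCM/NoPadding");
        secure.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
        byte[] good = secure.doFinal(plaintext);

        System.out.println("ECB: " + Base64.getEncoder().encodeToString(bad));
        System.out.println("GCM: " + Base64.getEncoder().encodeToString(good));
    }
}

A static checker or a careful reviewer would flag the first call; this family of misuse is the kind of issue the paper's automated and manual analyses look for in ChatGPT's generated code.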