Research on an Android malware detection method based on API semantic clustering.
To address the declining detection accuracy and model aging that traditional Android malware detection methods exhibit when they encounter variant samples, a malware detection method based on API semantic clustering was proposed. Redundant nodes were eliminated through a function call graph optimization algorithm, and API call sequences were embedded into semantic vectors using the Skip-Gram model to capture functional and contextual similarities between APIs. Semantically similar APIs were then grouped via K-means clustering, and a dynamic extension algorithm was introduced to adapt to newly emerged APIs. To verify the effectiveness of the proposed method, the model was trained on a large-scale dataset from 2017 to 2018 and then used to detect evolved malicious samples from 2019 to 2023. The proposed method was compared experimentally with classic detection methods such as MaMaDroid and MalScan. The results showed that the average F1 score of the proposed method reached 96.6%. In the model aging experiments, the proposed method improved the average F1 score by 8.3% to 23.1%. Furthermore, comparative experiments on API semantic features and an analysis of the number of clusters confirmed that the precision of semantic extraction and the cluster configuration significantly improve detection performance. Overall, the experimental results indicate that the proposed method achieves higher detection accuracy and ages more slowly. [ABSTRACT FROM AUTHOR]
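To address the declining detection accuracy and model aging that traditional Android malware detection methods suffer when facing variant samples, a malware detection method based on application programming interface (API) semantic clustering is proposed. First, redundant nodes are removed by a function call graph optimization algorithm, and API call sequences are embedded as semantic vectors with the Skip-Gram model to capture functional associations and contextual similarity between APIs. Then, K-means clustering groups semantically similar APIs into the same class, and a dynamic extension algorithm adapts to newly emerging APIs. To verify the effectiveness of the proposed method, the model is first trained on a large dataset from 2017 to 2018 and then used to detect evolved malicious samples from 2019 to 2023. Comparative experiments against classic detection methods such as MaMaDroid and MalScan show that the proposed method reaches an average F1 score of 96.6%. In the model aging experiments, the proposed method improves the average F1 score by 8.3% to 23.1%. In addition, comparison experiments on API semantic features and experiments with different numbers of clusters further verify that semantic extraction precision and cluster settings improve detection performance. Taken together, the experimental results show that the proposed method achieves higher detection accuracy and a slower aging rate. [ABSTRACT FROM AUTHOR]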
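The clustering and extension steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the two-dimensional vectors are hypothetical stand-ins for Skip-Gram embeddings, the K-means initialization is simplified, and the rule for handling a new API (assign it to the nearest existing centroid) is only an assumption about what a dynamic extension step might look like, since the abstract does not specify the algorithm.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest(vec, centroids):
    """Index of the centroid closest to vec."""
    return min(range(len(centroids)), key=lambda i: dist(vec, centroids[i]))

def kmeans(vectors, k, iters=20):
    """Plain K-means with a deterministic init (the first k vectors)."""
    centroids = [list(v) for v in vectors[:k]]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in vectors:
            groups[nearest(v, centroids)].append(v)
        for i, g in enumerate(groups):
            if g:  # keep the old centroid if a cluster went empty
                centroids[i] = [sum(col) / len(g) for col in zip(*g)]
    return centroids

# Hypothetical embeddings; real ones would come from a Skip-Gram model
# trained on API call sequences and would have many more dimensions.
embeddings = {
    "sendTextMessage": [0.9, 0.1],
    "getDeviceId": [0.1, 0.9],
    "sendDataMessage": [0.8, 0.2],
    "getSubscriberId": [0.2, 0.8],
}

centroids = kmeans(list(embeddings.values()), k=2)
api_to_cluster = {api: nearest(v, centroids) for api, v in embeddings.items()}

def extend(api, vec):
    """Assumed extension rule: map an unseen API to its nearest cluster."""
    api_to_cluster[api] = nearest(vec, centroids)
    return api_to_cluster[api]

def abstract_sequence(calls):
    """Replace each known API in a call sequence with its cluster ID."""
    return [api_to_cluster[c] for c in calls if c in api_to_cluster]
```

Under these assumptions, an API that first appears after training, such as a new SMS-sending call whose embedding lies close to the existing SMS APIs, is mapped by `extend` to the same cluster, so features built on cluster IDs rather than raw API names stay usable as the platform evolves.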
Copyright of Chinese Journal of Network & Information Security is the property of Beijing Xintong Media Co., Ltd. and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)