Predicting vulnerable software components through n-gram analysis and statistical feature selection
Resource type
Authors/contributors
- Pang, Yulei (Author)
- Xue, Xiaozhen (Author)
- Namin, Akbar Siami (Author)
Title
Predicting vulnerable software components through n-gram analysis and statistical feature selection
Abstract
Vulnerabilities need to be detected and removed from software. Although previous studies have demonstrated the usefulness of prediction techniques in identifying vulnerable software components, improving the accuracy and effectiveness of these techniques remains a major open research challenge. This paper proposes a hybrid technique that combines N-gram analysis with feature selection algorithms for predicting vulnerable software components, where features are defined as contiguous sequences of tokens in source code files, i.e., Java class files. Machine learning-based feature selection algorithms are then employed to reduce the feature and search space. We evaluated the proposed technique on a set of Java Android applications, and the results demonstrated that it could predict vulnerable classes, i.e., software components, with high precision, accuracy, and recall. © 2015 IEEE.
Proceedings Title
International Conference on Machine Learning and Applications
Publisher
Institute of Electrical and Electronics Engineers Inc.
Date
2015
Pages
543-548
ISBN
9781509002870
Citation Key
pop00094
Language
English
Extra
36 citations (Crossref) [2023-10-31]
tex.type: Proceedings paper
Citation
Pang, Y., Xue, X., & Namin, A. S. (2015). Predicting vulnerable software components through n-gram analysis and statistical feature selection. International Conference on Machine Learning and Applications, 543–548. https://doi.org/10.1109/ICMLA.2015.99