Theses and Dissertations

Issuing Body

Mississippi State University


Williams, Byron J.

Committee Member

Hansen, Eric

Committee Member

Phillips, Mike J.

Committee Member

Lee, Sarah B.

Committee Member

Iannucci, Stefano

Date of Degree


Original embargo terms


Document Type

Dissertation - Open Access


Computer Science

Degree Name

Doctor of Philosophy


James Worth Bagley College of Engineering


Department of Computer Science and Engineering


Software security is an important aspect of ensuring software quality. The goal of this study is to help developers evaluate software security at the early stage of development using traceable patterns and software metrics. The concept of traceable patterns is similar to design patterns, but they can be automatically recognized and extracted from source code. If these patterns can better predict vulnerable code compared to the traditional software metrics, they can be used in developing a vulnerability prediction model to classify code as vulnerable or not. By analyzing and comparing the performance of traceable patterns with metrics, we propose a vulnerability prediction model. Objective: This study explores the performance of code patterns in vulnerability prediction and compares them with traditional software metrics. We have used the findings to build an effective vulnerability prediction model. Method: We designed and conducted experiments on the security vulnerabilities reported for Apache Tomcat (Releases 6, 7 and 8), Apache CXF and three stand-alone Java web applications of Stanford Securibench. We used machine learning and statistical techniques for predicting vulnerabilities of the systems using traceable patterns and metrics as features. Result: We found that patterns have a lower false negative rate and higher recall in detecting vulnerable code than the traditional software metrics. We also found a set of patterns and metrics that shows higher recall in vulnerability prediction. Conclusion: Based on the results of the experiments, we proposed a prediction model using patterns and metrics to better predict vulnerable code with higher recall rate. We evaluated the model for the systems under study. We also evaluated their performance in the cross-dataset validation.