Theses and Dissertations

Issuing Body

Mississippi State University

Advisor

Dampier, David

Committee Member

Butler, Cary

Committee Member

Vaughn, Rayford

Committee Member

Jankun-Kelly, T.J.

Date of Degree

12-9-2011

Document Type

Dissertation - Open Access

Major

Computer Science

Degree Name

Doctor of Philosophy

College

James Worth Bagley College of Engineering

Department

Department of Computer Science and Engineering

Abstract

Identification of source code authorship can be a useful tool in security and forensic investigation, helping to create corroborating evidence that may send a suspected cyber terrorist, hacker, or malicious code writer to jail. In academia, it can also be useful to professors who suspect students of academic dishonesty, plagiarism, or modification of source code in programming assignments. The purpose of this dissertation is to determine whether cross-entropy approaches to source code authorship analysis will succeed in predicting the correct author of a given piece of source code. If so, this work will try to identify factors that affect the accuracy of the algorithm, how programmer experience affects accuracy, and whether a cross-entropy approach performs better than some known source code authorship approaches. The research will construct a corpus of source code written by various authors, based on both identical and varying system descriptions, against which different approaches can be benchmarked.

URI

https://hdl.handle.net/11668/17009
