
Theses and Dissertations
ORCID
https://orcid.org/0009-0003-3600-1531
Issuing Body
Mississippi State University
Advisor
Rahimi, Shahram
Committee Member
Gudla, Charan
Committee Member
Mittal, Sudip
Date of Degree
12-13-2024
Original embargo terms
Visible to MSU only for 6 months
Document Type
Graduate Thesis - Campus Access Only
Major
Computer Science (Research)
Degree Name
Master of Science (M.S.)
College
James Worth Bagley College of Engineering
Department
Department of Computer Science and Engineering
Abstract
As the field of Natural Language Processing (NLP) continues to evolve, evaluating the performance of both proprietary and open-source language models has become increasingly critical. This research provides a comprehensive analysis of proprietary models such as GPT-3.5 Turbo, GPT-4, and GPT-4 Turbo alongside open-source models such as FLAN-T5, GPT-Neo, and GPT-2. Leveraging traditional metrics such as ROUGE and BLEU, as well as custom metrics including ReGrAde, Contextual Precision, and Faithfulness, the study evaluates these models across closed-domain tasks (e.g., factual question answering) and open-domain tasks (e.g., creative writing and brainstorming). The proprietary models excelled in structured, fact-based tasks, while the open-source models showed strengths in creative and open-ended tasks. The study also highlights the limitations of both model types, particularly in summarization and in maintaining faithfulness to the input data. The findings offer key insights into the future development and application of language models, advancing NLP capabilities for both research and practical implementations.
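The metric-based evaluation described in the abstract can be illustrated with a minimal sketch. The snippet below assumes the Hugging Face `evaluate` library and uses placeholder model outputs and references; it is not the thesis's evaluation pipeline, only an example of how ROUGE and BLEU scores of the kind discussed here are typically computed.

```python
# Minimal sketch of metric-based evaluation, assuming the Hugging Face
# `evaluate` library (pip install evaluate rouge_score). The texts below are
# placeholders, not data or outputs from the thesis.
import evaluate

# Hypothetical model outputs and gold references for a closed-domain QA task.
predictions = [
    "The Eiffel Tower is located in Paris, France.",
    "Water boils at 100 degrees Celsius at sea level.",
]
references = [
    "The Eiffel Tower stands in Paris, France.",
    "At sea level, water boils at 100 degrees Celsius.",
]

# ROUGE measures n-gram and longest-common-subsequence overlap with the reference.
rouge = evaluate.load("rouge")
rouge_scores = rouge.compute(predictions=predictions, references=references)

# BLEU measures n-gram precision against one or more references per prediction.
bleu = evaluate.load("bleu")
bleu_scores = bleu.compute(predictions=predictions,
                           references=[[r] for r in references])

print("ROUGE:", rouge_scores)
print("BLEU:", bleu_scores)
```

Custom metrics such as ReGrAde, Contextual Precision, and Faithfulness are defined in the thesis itself and are not reproduced here.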
Recommended Citation
Kanduri, Abhilash, "A comprehensive performance evaluation of proprietary and open-source language models in closed and open-domain tasks" (2024). Theses and Dissertations. 6343.
https://scholarsjunction.msstate.edu/td/6343