Publications

Beyond Accuracy: Evaluating LLMs for Validating Community Service Provider Directory

ORCID

Saei: https://orcid.org/0000-0001-8125-435X; Anreddy: https://orcid.org/0000-0003-3362-1332

MSU Affiliation

James Worth Bagley College of Engineering; Department of Industrial and Systems Engineering

Creation Date

2026-06-01

Abstract

As artificial intelligence tools are increasingly adopted to validate community service provider directories, it is critical to assess whether large language models (LLMs) can reliably verify structured data in these systems. This study evaluates five LLMs (LLaMA 3.3 70B Versatile, LLaMA 3.1 8B Instant, LLaMA 3 70B 8192, LLaMA 3 8B 8192, and Gemma2 9B IT) using community service provider data from Mississippi across three evaluation conditions: clean records (baseline), systematically corrupted entries, and records with missing fields. Model responses were categorized as “Verified,” “Not Verified,” or “Needs Checking” to assess each model’s ability to confirm correct data, reject erroneous records, and handle uncertainty, respectively. Among the models tested, LLaMA 3.3 70B Versatile demonstrated the most robust overall performance, achieving high verification accuracy on clean data (96%) and the strongest error detection capabilities by rejecting 47% of corrupted entries. In contrast, LLaMA 3 8B 8192 incorrectly verified 79% of corrupted records, indicating unsafe over-permissiveness and weak anomaly detection. These results underscore that high verification accuracy alone is insufficient; effective referral system design must prioritize models that exhibit strong error detection capabilities and appropriately defer uncertain cases to human oversight.
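
The three-category scoring described in the abstract can be illustrated with a minimal sketch. It tallies the share of “Verified,” “Not Verified,” and “Needs Checking” responses under each evaluation condition. The function name `category_rates`, the condition keys, and the toy responses are all hypothetical, chosen for illustration; they are not the study's code or data.

```python
from collections import Counter

# The three response categories named in the abstract.
CATEGORIES = ("Verified", "Not Verified", "Needs Checking")


def category_rates(responses):
    """Return the share of each response category in a list of responses."""
    counts = Counter(responses)
    total = len(responses)
    return {c: (counts[c] / total if total else 0.0) for c in CATEGORIES}


# Hypothetical toy responses per condition (NOT data from the study).
toy_responses = {
    "clean": ["Verified", "Verified", "Verified", "Needs Checking"],
    "corrupted": ["Verified", "Not Verified", "Not Verified", "Needs Checking"],
    "missing": ["Needs Checking", "Needs Checking", "Verified", "Not Verified"],
}

rates = {cond: category_rates(resp) for cond, resp in toy_responses.items()}

# On corrupted records, "Verified" is the unsafe outcome: the model has
# accepted a record that contains an injected error.
false_verify_rate = rates["corrupted"]["Verified"]
```

Reporting the per-condition rates separately, rather than a single accuracy figure, is what lets over-permissive behavior on corrupted records show up alongside good performance on clean ones.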

Publication Date

10-13-2025

Publication Title

Software and Data Engineering: 34th International Conference, SEDE 2025, New Orleans, LA, USA, October 20-21, 2025, Proceedings

Publisher

Springer

First Page

373

Last Page

380

Recommended Citation

Saei, S., Ghimire, S., Anreddy, S. (2026). Beyond Accuracy: Evaluating LLMs for Validating Community Service Provider Directory. In: Rahimi, N., Margapuri, V., Golilarz, N.A. (eds) Software and Data Engineering. SEDE 2025. Communications in Computer and Information Science, vol 2720. Springer, Cham. https://doi.org/10.1007/978-3-032-08649-5_23

Digital Object Identifier (DOI)

https://doi.org/10.1007/978-3-032-08649-5_23
