Beyond Accuracy: Evaluating LLMs for Validating Community Service Provider Directory

ORCID

Saei: https://orcid.org/0000-0001-8125-435X; Anreddy: https://orcid.org/0000-0003-3362-1332

MSU Affiliation

James Worth Bagley College of Engineering; Department of Industrial and Systems Engineering

Creation Date

2026-06-01

Abstract

As artificial intelligence tools are increasingly adopted to validate community service provider directories, it is critical to assess whether large language models (LLMs) can reliably verify structured data in these systems. This study evaluates five LLMs (LLaMA 3.3 70B Versatile, LLaMA 3.1 8B Instant, LLaMA 3 70B 8192, LLaMA 3 8B 8192, and Gemma2 9B IT) on community service provider data from Mississippi under three evaluation conditions: clean records (baseline), systematically corrupted entries, and records with missing fields. Model responses were categorized as “Verified,” “Not Verified,” or “Needs Checking” to assess each model’s ability to confirm correct data, reject erroneous records, and handle uncertainty, respectively. Among the models tested, LLaMA 3.3 70B Versatile demonstrated the most robust overall performance, achieving high verification accuracy on clean data (96%) and the strongest error detection by rejecting 47% of corrupted entries. In contrast, LLaMA 3 8B 8192 incorrectly verified 79% of corrupted records, indicating unsafe over-permissiveness and weak anomaly detection. These results underscore that high verification accuracy alone is insufficient; effective referral system design must prioritize models that exhibit strong error detection and appropriately defer uncertain cases to human oversight.
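
The evaluation described in the abstract amounts to tallying the three response labels separately for each condition. A minimal sketch of that tally follows. The function name `label_rates`, the condition keys, and the toy responses are hypothetical and chosen for illustration only. They are not the authors' code or data.

```python
from collections import Counter

# The three response categories named in the abstract.
LABELS = ("Verified", "Not Verified", "Needs Checking")


def label_rates(responses):
    """Return the fraction of responses that fall under each label."""
    counts = Counter(responses)
    total = len(responses)
    if total == 0:
        # No responses: report zero for every label rather than dividing by zero.
        return {label: 0.0 for label in LABELS}
    return {label: counts[label] / total for label in LABELS}


# Hypothetical toy responses for one model (NOT data from the study).
toy_responses = {
    "clean": ["Verified", "Verified", "Verified", "Needs Checking"],
    "corrupted": ["Verified", "Not Verified", "Not Verified", "Needs Checking"],
    "missing_fields": ["Needs Checking", "Needs Checking", "Verified", "Not Verified"],
}

rates = {cond: label_rates(resp) for cond, resp in toy_responses.items()}

for cond, by_label in rates.items():
    print(cond, by_label)
```

In such a tally, a high "Verified" rate is desirable only on clean records. On corrupted records the same rate measures over-permissiveness, which is why the abstract reports rejection and false-verification rates alongside accuracy.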

Publication Date

2025-10-13

Publication Title

Software and Data Engineering: 34th International Conference, SEDE 2025, New Orleans, LA, USA, October 20-21, 2025, Proceedings

Publisher

Springer

First Page

373

Last Page

380

Rights

© 2026 The Author(s), under exclusive license to Springer Nature Switzerland AG

Digital Object Identifier (DOI)

https://doi.org/10.1007/978-3-032-08649-5_23