Graph Symbolic Regression to Interpret the Propagation of Vesicular Stomatitis Virus Across the U.S. and Mexico
ORCID
Rashme: https://orcid.org/0009-0009-1032-5005; Zhang (Zonghan): https://orcid.org/0009-0008-1578-5556; Weeks: https://orcid.org/0009-0006-8353-8165; Benbrahim: https://orcid.org/0009-0000-3728-9870; Zhang (Zijian): https://orcid.org/0009-0009-1099-4364; Chen: https://orcid.org/0000-0003-4112-9647; Pillai: https://orcid.org/0000-0002-2275-6998; Ramkumar: https://orcid.org/0000-0003-3183-0165; Nanduri: https://orcid.org/0000-0002-9996-2976
MSU Affiliation
James Worth Bagley College of Engineering; Department of Computer Science and Engineering; College of Veterinary Medicine; Department of Comparative Biomedical Sciences
Creation Date
2026-01-15
Abstract
Vesicular stomatitis virus (VSV) causes livestock disease every year in regions of Mexico. Every few years, VSV spreads northward into the U.S. in large outbreaks that affect hundreds of livestock premises across multiple states, leading to significant economic losses from quarantines, trade restrictions, and veterinary expenses. VSV cases are mainly driven by biting arthropod vectors from multiple genera with different ecologies, which makes outbreak control challenging. The sporadic nature of outbreaks and the limited understanding of transmission dynamics further hinder containment and reduce the effectiveness of preemptive measures. In this paper, we propose an interpretable model to elucidate the key rules governing the spread of VSV. The model uses sparse symbolic regression, SINDy (Sparse Identification of Nonlinear Dynamical Systems), to identify the ecological variables that matter most to spread dynamics, considering both spatial and temporal factors. Because many counties had no VSV cases during the study period, counties were clustered into 40 regions using spatially constrained Agglomerative Clustering based on geographic adjacency and on static environmental variables (land cover, soil properties, livestock density, and climate data), resulting in an average region size of approximately 90 counties. Ecological variables included dynamic and static variables such as temperature, humidity, wind, soil characteristics, and altitude associated with vectors and hosts (cattle, horses, and mules). The month-to-month change in cases per region was modeled using two SINDy variants: a baseline model with only ecological features (Normal) and an extended model incorporating spatially derived graph features (Graph). Each model's alpha was chosen to minimize the cross-validated mean squared error (CV-MSE) while retaining fewer than 11 terms.
Graph features greatly reduced model error, and the SINDy model with selected graph features achieved a slightly better CV-MSE than the model that included all graph features. All models identified the infected species as important for capturing the dynamics of case differences between regions.
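The clustering step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not state the linkage criterion or the library used, so the centroid-distance merge rule, the function name `constrained_agglomerative`, and the toy counties and feature values below are all assumptions chosen for illustration. The only idea taken from the abstract is that two groups of counties may merge only if they are geographically adjacent.

```python
from itertools import combinations


def constrained_agglomerative(features, adjacency, n_regions):
    """Greedily merge adjacent clusters until n_regions remain.

    features:  dict mapping unit name -> list of floats (hypothetical static variables)
    adjacency: set of frozenset({a, b}) pairs for units that share a border
    Returns a dict mapping unit name -> region label (0, 1, ...).
    """
    # Every unit starts as its own cluster.
    clusters = {i: [u] for i, u in enumerate(sorted(features))}

    def centroid(members):
        dims = len(features[members[0]])
        return [sum(features[m][d] for m in members) / len(members)
                for d in range(dims)]

    def touching(a, b):
        # Two clusters are adjacent if any pair of their members shares a border.
        return any(frozenset((u, v)) in adjacency
                   for u in clusters[a] for v in clusters[b])

    while len(clusters) > n_regions:
        best = None
        for a, b in combinations(sorted(clusters), 2):
            if not touching(a, b):
                continue  # the spatial constraint: never merge non-neighbors
            ca, cb = centroid(clusters[a]), centroid(clusters[b])
            dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
            if best is None or dist < best[0]:
                best = (dist, a, b)
        if best is None:
            break  # no adjacent clusters left to merge
        _, a, b = best
        clusters[a].extend(clusters.pop(b))

    return {u: label
            for label, key in enumerate(sorted(clusters))
            for u in clusters[key]}


# Toy example (assumed data): five counties on a chain A-B-C-D-E.
# E resembles A and B in feature space but borders only D.
features = {"A": [0.0], "B": [0.1], "C": [5.0], "D": [5.2], "E": [0.3]}
adjacency = {frozenset(p) for p in [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]}
regions = constrained_agglomerative(features, adjacency, n_regions=2)
print(regions)
```

In this toy case, county E ends up in the same region as C and D even though its feature value is closer to A and B, because the adjacency constraint forbids merging it with counties it does not border. An unconstrained clustering of the same values would group E with A and B.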
Publication Date
2025-12-12
Publication Title
SIGSPATIAL '25: Proceedings of the 33rd ACM International Conference on Advances in Geographic Information Systems
Publisher
ACM
First Page
977
Last Page
980
Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.
Recommended Citation
Rashme, T., Zhang, Z., Weeks, J., Benbrahim, M., Zhang, Z., Chen, Z., Pillai, N., Ramkumar, R., & Nanduri, B. (2025). Graph symbolic regression to interpret the propagation of Vesicular Stomatitis Virus across the U.S. and Mexico. Proceedings of the 33rd ACM International Conference on Advances in Geographic Information Systems, 977-980. https://doi.org/10.1145/3748636.3764166