Mississippi State University
Mohammad Sepehrifar, Prakash Patil, Qian Zhou
Date of Degree
Original embargo terms
Complete embargo for 1 year
Dissertation - Open Access
Doctor of Philosophy
College of Arts and Sciences
Department of Mathematics and Statistics
Novel clustering methods are presented and applied to financial data. First, a scan-statistics method for detecting price point clusters in financial transaction data is considered. The method is applied to Electronic Business Transfer (EBT) transaction data of the Supplemental Nutrition Assistance Program (SNAP). For a given vendor, transaction amounts are fit via maximum likelihood estimation which are then converted to the unit interval via a natural copula transformation. Next, a new Markov type relation for order statistics on the unit interval is developed. The relation is used to characterize the distribution of the minimum exceedance of all copula transformed transaction amounts above an observed order statistic. Conditional on observed order statistics, independent and asymptotically identical indicator functions are constructed and the success probably as a function of the gaps in consecutive order statistics is specified. The success probabilities are shown to be a function of the hazard rate of the transformed transaction distribution. If gaps are smaller than expected, then the corresponding indicator functions are more likely to be one. A scan statistic is then applied to the sequence of indicator functions to detect locations where too many gaps are smaller than expected. These sets of gaps are then flagged as being anomalous price point clusters. It is noted that prominent price point clusters appearing in the data may be a historical vestige of previous versions of the SNAP program involving outdated paper "food stamps". The second part of the project develops a novel clustering method whereby the time series of daily total EBT transaction amounts are clustered by periodicity. The schemeworks by normalizing the time series of daily total transaction amounts for two distinct vendors and taking daily differences in those two series. The difference series is then examined for periodicity via a novel F statistic. We find one may cluster the monthly periodicities of vendors by type of store using the F statistic, a proxy for a distance metric. This may indicate that spending preferences for SNAP benefit recipients varies by day of the month, however, this opens further questions about potential forcing mechanisms and the apparent changing appetites for spending.
Zhao, Zhicong, "Some new anomaly detection methods with applications to financial data" (2021). Theses and Dissertations. 5294.