Modeling of count data is a topic of interest in many applications.
Traditional time series models assume continuous data with a normal distribution, which is not appropriate for count data. In this thesis we focus on linear and log-linear count models, also known as Integer-valued Generalized Autoregressive Conditional Heteroscedastic (INGARCH) models, with Poisson and NB2 distributions, with or without zero-inflation. These models provide a parsimonious way to account for serial correlation in count data through the conditional mean and distribution.
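As an illustration of how such a model generates serially correlated counts, the following is a minimal sketch of a Poisson linear model of order (1,1). The parameterization (intercept `d`, intensity feedback `a`, observation feedback `b`), the function name, and the choice to start the recursion at the stationary mean are assumptions made for this example, not notation taken from the thesis.

```python
import numpy as np

def simulate_poisson_ingarch(d, a, b, n, seed=0):
    """Simulate a Poisson linear INGARCH(1,1) path (assumed parameterization).

    lam[t] = d + a * lam[t-1] + b * y[t-1],   y[t] | past ~ Poisson(lam[t])
    """
    # Assumed first-order stationarity region for the linear model.
    if d <= 0 or a < 0 or b < 0 or a + b >= 1:
        raise ValueError("need d > 0, a >= 0, b >= 0 and a + b < 1")
    rng = np.random.default_rng(seed)
    lam = np.empty(n)
    y = np.empty(n, dtype=np.int64)
    lam[0] = d / (1.0 - a - b)  # start at the stationary mean (illustrative choice)
    y[0] = rng.poisson(lam[0])
    for t in range(1, n):
        # The conditional mean depends on the previous intensity and count.
        lam[t] = d + a * lam[t - 1] + b * y[t - 1]
        y[t] = rng.poisson(lam[t])
    return y, lam

y, lam = simulate_poisson_ingarch(d=1.0, a=0.3, b=0.4, n=5000)
```

Under these assumptions the sample mean of `y` should settle near `d / (1 - a - b)`, while the sample variance exceeds the mean, which is the overdispersion induced by the feedback in the conditional mean.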
Current research on these models provides theoretical results for model analysis, estimation, and use.
This thesis provides a unified framework for the linear and log-linear models based on the current literature. We also provide several new results for these models. First, we develop a simple heuristic evaluation of the Poisson model. This approximation of the marginal distribution can be used to help visualize the range of values a given Poisson model is likely to produce. This method can also be used as a horizon forecast when the forecast horizon is far enough from the present that current observations have little to no effect on the forecast. Second, we exploit the similarities between these models and ARMA models to find a minimum bound on the dispersion parameter required to ensure second-order stationarity of the NB2 linear model. This bound is an important contribution because it helps ensure that estimation techniques are constrained to the stationarity region of the model.
We also extend estimation methods for the linear and log-linear models via conditional maximum likelihood estimation. This estimation method has been studied for the Poisson linear and log-linear models in the literature. Here we use this technique to develop estimators for the linear and log-linear NB2 models as well as the Poisson and NB2 models with zero-inflation. We evaluate the estimators for consistency as well as asymptotic performance, and find that they perform as expected in most cases. We compare the estimator for the NB2 model to quasi-maximum likelihood estimation and find that the two perform comparably. In addition, we develop approximations for the limiting information matrix for two cases of the Poisson linear model. We evaluate these approximations and find that they perform well. We then use them to develop a better understanding of how the true parameter values affect estimation for these models.
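To make the idea of conditional maximum likelihood concrete, here is a minimal sketch of the conditional log-likelihood for a Poisson linear model of order (1,1). The parameter names `d`, `a`, `b`, the function name, and the initialization of the intensity at `d / (1 - a - b)` are assumptions for illustration; the thesis may condition on different starting values.

```python
import math

def poisson_ingarch_loglik(params, y):
    """Conditional log-likelihood of a Poisson linear INGARCH(1,1) (sketch).

    Conditions on the first observation y[0] and an assumed initial intensity.
    """
    d, a, b = params
    # Outside the assumed stationarity region, return -inf so that a
    # maximizer never accepts such a point.
    if d <= 0 or a < 0 or b < 0 or a + b >= 1:
        return -math.inf
    lam = d / (1.0 - a - b)  # assumed starting intensity
    total = 0.0
    for t in range(1, len(y)):
        # Update the conditional mean, then add the Poisson log-pmf of y[t].
        lam = d + a * lam + b * y[t - 1]
        total += y[t] * math.log(lam) - lam - math.lgamma(y[t] + 1)
    return total
```

An estimate would then be obtained by maximizing this function over `(d, a, b)` with a numerical optimizer of the reader's choice. Returning negative infinity outside the region `a + b < 1` is one simple way to keep the search inside the stationarity region.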
Finally, we study the use of linear and log-linear models for forecasting. We study the different types of forecasters that can be used, focusing predominantly on probabilistic forecasts. We discuss and evaluate the theoretical framework of probabilistic forecasting with these models, as well as its practical counterpart. We then apply these methods to a real-world data set to show how the models handle the correlation and variability of real-world data.
Advisor: Professor Vinay Ingle
Professor Vinay Ingle
Professor Hanoch Lev-Ari
Dr. Dimitris Manolakis