Abstract
A discrepancy measure to assess model fitness against a nonparametric alternative is proposed. First, a Polya tree prior is constructed so that the centering distribution is the null. Second, the prior is updated in the light of data to obtain the posterior centering distribution as the alternative. Third, a Kullback-Leibler divergence type of test statistic is derived to assess the discrepancy between the two centering distributions. The properties of the test statistic are derived, and a power comparison with several well-known test statistics is conducted. The use of the test statistic is illustrated using network traffic data.
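The three steps described in the abstract can be sketched numerically. What follows is a minimal illustration under stated assumptions, not the test statistic derived in the article: it assumes a finite Polya tree truncated at `depth` levels, partition sets formed by dyadic quantiles of the null distribution, and branch parameters `c * m**2` at level `m` (a common convention in the Polya tree literature). The function name `polya_tree_kl`, the exponential null `exp_cdf`, and the simulated samples are all hypothetical choices made for the example.

```python
import numpy as np


def polya_tree_kl(x, null_cdf, depth=4, c=1.0):
    """KL divergence of the posterior mean of a finite Polya tree from its
    prior mean (the null), computed on the 2**depth finest partition sets.

    Assumptions (not taken from the article): sets are dyadic null
    quantiles and every branch at level m has prior parameter c * m**2.
    """
    # Step 1: map data through the null CDF so that the partition sets
    # become equal-width bins on [0, 1); the prior mean is then uniform.
    u = np.clip(null_cdf(np.asarray(x, dtype=float)), 0.0, 1.0 - 1e-12)

    # Step 2: conjugate update of each branch probability, level by level.
    probs = np.ones(1)
    for m in range(1, depth + 1):
        idx = np.floor(u * 2**m).astype(int)
        counts = np.bincount(idx, minlength=2**m).astype(float)
        parent = counts.reshape(-1, 2).sum(axis=1)  # counts at level m - 1
        alpha = c * m**2
        branch = (alpha + counts) / (2.0 * alpha + np.repeat(parent, 2))
        probs = np.repeat(probs, 2) * branch  # posterior mean set masses

    # Step 3: KL divergence between posterior and prior centering masses.
    prior = np.full(2**depth, 2.0**-depth)
    return float(np.sum(probs * np.log(probs / prior)))


def exp_cdf(x):
    """CDF of the unit-rate exponential distribution (hypothetical null)."""
    return 1.0 - np.exp(-np.maximum(x, 0.0))


# Simulated data only: one sample drawn from the null, one from a
# differently scaled exponential that the null should fit poorly.
rng = np.random.default_rng(0)
kl_null = polya_tree_kl(rng.exponential(size=500), exp_cdf)
kl_alt = polya_tree_kl(rng.exponential(scale=5.0, size=500), exp_cdf)
print(kl_null, kl_alt)
```

With no data the posterior mean equals the prior mean and the divergence is zero; larger values indicate that the updated centering distribution has moved further from the null. Turning this quantity into a test would still require a reference distribution, which this sketch does not attempt.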
Cite this article
Hsieh, PH. A nonparametric assessment of model adequacy based on Kullback-Leibler divergence. Stat Comput 23, 149–162 (2013). https://doi.org/10.1007/s11222-011-9298-0
DOI: https://doi.org/10.1007/s11222-011-9298-0