Abstract
Interval-valued data are observed as ranges rather than single values, and they arise frequently as advanced technologies are adopted in data collection. Regression analysis of interval-valued data has been studied in the literature, but the work has mostly focused on parametric linear regression models. In this paper, we study interval-valued data regression based on nonparametric additive models. Building on one of the existing methods based on linear regression, we propose a nonparametric additive approach for analyzing interval-valued data with a possibly nonlinear pattern. We demonstrate the proposed approach using a simulation study and a real data example, and compare its performance with that of existing methods.
Cite this article
Lim, C. Interval-valued data regression using nonparametric additive models. J. Korean Stat. Soc. 45, 358–370 (2016). https://doi.org/10.1016/j.jkss.2015.12.003