Abstract
Zighera (App Stoch Mod Data Anal 1:93–108, 1985) introduced a new parameterization of log-linear models for analyzing categorical data, directly linked to a thorough analysis of discrimination information through the Kullback-Leibler divergence. The method mainly aims at quantifying, in terms of information, the variations of a binary variable of interest by comparing two contingency tables (or sub-tables) through the effects of explanatory categorical variables. The present paper establishes the mathematical background necessary to apply Zighera’s parameterization rigorously to any categorical data. In particular, identifiability and good properties of asymptotically χ²-distributed test statistics are proven to hold. The parameters are determined and all effects of the explanatory variables are tested simultaneously. Application to classical data sets illustrates the contribution of the method with respect to existing methods.
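As a rough illustration of the quantity the abstract refers to, the sketch below computes the Kullback-Leibler divergence between two contingency tables and the associated likelihood-ratio statistic G² = 2N·K. It is not Zighera’s parameterization, only the generic discrimination information it builds on. The function names `kl_divergence` and `g2_statistic` and the two 2×2 tables are hypothetical, chosen for illustration.

```python
import math


def kl_divergence(p_counts, q_counts):
    """Kullback-Leibler divergence K(p, q) between two tables of counts.

    Both tables are flattened lists of non-negative counts in the same
    cell order; each is normalized to a probability distribution first.
    """
    n_p = sum(p_counts)
    n_q = sum(q_counts)
    total = 0.0
    for a, b in zip(p_counts, q_counts):
        if a == 0:
            continue  # convention: 0 * log(0 / q) = 0
        if b == 0:
            return math.inf  # p puts mass on a cell where q has none
        p, q = a / n_p, b / n_q
        total += p * math.log(p / q)
    return total


def g2_statistic(observed, expected):
    """Likelihood-ratio statistic G^2 = 2 * N * K(observed, expected)."""
    return 2.0 * sum(observed) * kl_divergence(observed, expected)


# Hypothetical 2x2 tables (binary variable of interest by a binary
# explanatory variable), flattened row by row. Illustrative data only.
table_a = [30, 20, 10, 40]
table_b = [25, 25, 25, 25]

print(kl_divergence(table_a, table_b))
print(g2_statistic(table_a, table_b))
```

When both tables have the same total count, `g2_statistic` coincides with the usual 2·Σ O·ln(O/E) form. The reference distribution and degrees of freedom depend on the hypothesis being tested and are not covered by this sketch.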
References
Agresti A (2002) Categorical data analysis, 2nd edn. Wiley, New York
Bhapkar VP, Koch GG (1968) Hypotheses of no interaction in multidimensional contingency tables. Technometrics 10:107–123
Bishop YMM, Fienberg SE, Holland PW (1975) Discrete multivariate analysis. MIT Press, Cambridge
Brooks SP, King R (2001) Prior induction in log-linear models for general contingency table analysis. Ann Stat 29:715–747
Calamai PH, Moré JJ (1987) Projected gradient methods for linearly constrained problems. Math Program 39(1):93–116
Christensen R (1990) Log-linear models. Springer Verlag, New York
Cressie N, Read TRC (1989) Pearson’s χ² and the loglikelihood ratio statistic G²: a comparative review. Int Stat Rev 57:19–43
Csiszár I (1975) I-divergence geometry of distributions. Ann Probab 3:146–158
Deming WE, Stephan FF (1940) On the least squares adjustment of a sampled frequency table when the expected marginal totals are known. Ann Math Stat 11:427–444
Fienberg SE, Rinaldo A (2007) Three centuries of categorical data analysis: loglinear models and maximum likelihood estimation. J Statist Plann Inference 137:3430–3445
Girardin V, Ricordeau A (1996) Analyse d’information sur les marges. Actes Journées Association Statisticiens Universitaires, Québec, pp 171–75
Girardin V, Lequesne J, Thévenon O (2017) How variation of scores of the Programme for International Student Assessment can be explained through analysis of information. In: Bozeman J, Olivera T, Skiadas C (eds) Stochastic modeling, data analysis with demography applications. Springer, New York
Gokhale DV, Kullback S (1978) The information in contingency tables. Marcel Dekker, New York
Good IJ (1963) Maximum entropy for hypothesis formulation, especially for multidimensional contingency tables. Ann Math Stat 34:911–934
Kateri M (2014) Contingency table analysis: methods and implementation using R. Birkhäuser. Springer, New York
Ku HH, Kullback S (1968) Interaction in multidimensional contingency tables: an information theoretic approach. J Res Nat Bur Standards 72:159–199
Ku HH, Varner R, Kullback S (1968) Analysis of multi-dimensional contingency tables. In: Proceedings of the 14th conference on the design of experiments in army research, development and testing. Maryland, pp 157–180
Kullback S (1978) Information theory and statistics. Peter Smith Pub., Gloucester
Lequesne J (2015) Tests statistiques basés sur la théorie de l’information, applications en biologie et en démographie. PhD Thesis, Université de Caen Normandie, France
Little RJA, Wu MM (1991) Models for contingency tables with known margins. J Am Statist Assoc Theory Methods 86:87–95
Rüschendorf L (1995) Convergence of the iterative proportional fitting procedure. Ann Stat 23:1160–1174
Thévenon O (2009) Increased women’s labour force participation in Europe: progress in the work-life balance or polarization of behaviors? Population 64:235–272
Vallet LA (2004) Modélisation log-linéaire et log-multiplicative des tableaux de contingence. Eléments de Cours Revue française de sociologie 3681–121
Zighera JA (1985) Partitioning information in a multidimensional contingency table and centering of loglinear parameters. App Stoch Mod Data Anal 1:93–108
Cite this article
Girardin, V., Lequesne, J. & Ricordeau, A. Information-based Parameterization of the Log-linear Model for Categorical Data Analysis. Methodol Comput Appl Probab 20, 1105–1121 (2018). https://doi.org/10.1007/s11009-017-9597-9
DOI: https://doi.org/10.1007/s11009-017-9597-9