Identification of the Minimal Set of Attributes That Maximizes the Information towards the Author of a Political Discourse: The Case of the Candidates in the Mexican Presidential Elections

Neme, Antonio; Hernández, Sergio; Carrión, Vicente

doi:10.1007/978-3-642-37798-3_8

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7630)

Included in the following conference series:

Mexican International Conference on Artificial Intelligence

Abstract

Authorship attribution has attracted the attention of the natural language processing and machine learning communities in the past few years. Here we are interested in finding a general measure of the style followed in the texts of the three main candidates in the Mexican presidential elections of 2012. We analyzed dozens of texts (discourses) from the three authors. We applied tools from the time series processing field and the machine learning community in order to identify the overall attributes that define the writing style of the three authors. Several attributes and time series were extracted from each text. A novel methodology, based on mutual information, was applied to those time series and attributes to explore the relevance of each attribute for linearly separating the texts according to their authorship. We show that fewer than 20 variables are enough to identify, by means of a linear recognizer, the author of a text among the three considered authors.
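
The abstract does not give the details of the mutual information procedure, so the following is only a minimal sketch of the general idea, not the authors' method: each attribute is scored by a plug-in estimate of its mutual information with the author label, and attributes are ranked by that score. The function names (`mutual_information`, `discretize`, `rank_attributes`), the equal-width binning, the attribute names, and the toy values are all assumptions made for illustration.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in bits for two equal-length discrete sequences."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x, y) * log2(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


def discretize(values, bins=3):
    """Map real values to equal-width bin indices in 0..bins-1 (an assumed binning)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]


def rank_attributes(table, labels, bins=3):
    """Return (attribute name, MI with labels) pairs, highest MI first."""
    scores = {
        name: mutual_information(discretize(column, bins), labels)
        for name, column in table.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy, made-up data: six texts, two per hypothetical author.
authors = ["A", "A", "B", "B", "C", "C"]
attributes = {
    "mean_sentence_length": [10.0, 11.0, 20.0, 21.0, 30.0, 31.0],
    "comma_rate": [0.5, 0.9, 0.5, 0.9, 0.5, 0.9],
}
ranking = rank_attributes(attributes, authors)
for name, score in ranking:
    print(f"{name}: {score:.3f} bits")
```

In this toy case the first attribute separates the three hypothetical authors and ranks first, while the second is independent of the label and scores zero. A selection step in the spirit of the abstract would keep only the highest-ranked attributes before training a linear classifier.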



Author information

Authors and Affiliations

Complex Systems Group, Universidad Autónoma de la Ciudad de México, San Lorenzo 290, México, D.F., México
Antonio Neme
Institute for Molecular Medicine, Finland
Antonio Neme
Postgraduation Program in Complex Systems, Universidad Autónoma de la Ciudad de México, México
Sergio Hernández
CINVESTAV IDS, México D.F., México
Vicente Carrión

Editor information

Editors and Affiliations

Mexican Petroleum Institute, Eje Central Lazaro Cardenas Norte, 152, Col. San Bartolo Atepehuacan, CP 07730, México D.F., Mexico
Ildar Batyrshin
Tecnológico de Monterrey, Campus Estado de México, Carretera Lago de Guadalupe Km 3.5, CP 52926, Atizapán de Zaragoza, Estado de México, Mexico
Miguel González Mendoza

About this paper

Cite this paper

Neme, A., Hernández, S., Carrión, V. (2013). Identification of the Minimal Set of Attributes That Maximizes the Information towards the Author of a Political Discourse: The Case of the Candidates in the Mexican Presidential Elections. In: Batyrshin, I., Mendoza, M.G. (eds) Advances in Computational Intelligence. MICAI 2012. Lecture Notes in Computer Science, vol 7630. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37798-3_8

DOI: https://doi.org/10.1007/978-3-642-37798-3_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37797-6
Online ISBN: 978-3-642-37798-3
eBook Packages: Computer Science, Computer Science (R0)
