Abstract
The availability of digitized collections of historical data, such as newspapers, increases every day, and with it the wish of historians to explore these collections. Methods traditionally used to examine a collection do not scale to today’s collection sizes. We propose a method that combines text mining with exploratory search to give historians a means of interactively selecting and inspecting relevant documents from very large collections. We assess our proposal with a case study on a prototype system.
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Odijk, D., de Rooij, O., Peetz, MH., Pieters, T., de Rijke, M., Snelders, S. (2012). Semantic Document Selection. In: Zaphiris, P., Buchanan, G., Rasmussen, E., Loizides, F. (eds) Theory and Practice of Digital Libraries. TPDL 2012. Lecture Notes in Computer Science, vol 7489. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33290-6_24
DOI: https://doi.org/10.1007/978-3-642-33290-6_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33289-0
Online ISBN: 978-3-642-33290-6
eBook Packages: Computer Science, Computer Science (R0)