Abstract
The black box application evaluation methodology described in this tutorial is applicable to a broad range of operational information retrieval (IR) applications. In contrast to popular, traditional IR evaluation approaches, which are limited to measuring the performance of the IR system on a test collection, the black box evaluation methodology considers an IR application in its entirety: the underlying system, the corresponding document collection, and its configuration/application layer. A comprehensive set of quality criteria is used to estimate the user’s perception of the application. Scores are assigned as a weighted average of the results of tests that each evaluate an individual aspect. The methodology was validated in a small evaluation campaign. An analysis of this campaign shows a correlation between the testers’ perception of the applications and the evaluation scores. Moreover, functional weaknesses of the tested IR applications can be identified and then systematically targeted.
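The weighted-average scoring mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual criteria, weights, or score ranges, so the criterion names, the weights, and the assumption that each test result lies in [0, 1] are hypothetical and serve only to show the arithmetic.

```python
def weighted_score(results, weights):
    """Combine per-criterion test results into one score by weighted average.

    results: dict mapping a criterion name to its test result
             (assumed here to lie in [0, 1]; the range is an assumption).
    weights: dict mapping the same criterion names to non-negative weights.
    """
    # Every criterion needs both a result and a weight.
    mismatched = set(results) ^ set(weights)
    if mismatched:
        raise ValueError(f"criteria without result or weight: {sorted(mismatched)}")

    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")

    # Weighted average: sum(w_i * r_i) / sum(w_i)
    return sum(results[c] * weights[c] for c in results) / total_weight


# Hypothetical criteria and weights, chosen only for illustration.
example_results = {"index_quality": 0.5, "query_matching": 1.0, "result_presentation": 0.0}
example_weights = {"index_quality": 2.0, "query_matching": 1.0, "result_presentation": 1.0}

print(weighted_score(example_results, example_weights))  # 0.5
```

Dividing by the sum of the weights keeps the combined score in the same range as the individual results, so scores remain comparable if the weights are rescaled.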
© 2014 Springer-Verlag Berlin Heidelberg
Cite this chapter
Braschler, M., Imhof, M., Rietberger, S. (2014). Black Box Evaluation for Operational Information Retrieval Applications. In: Ferro, N. (eds) Bridging Between Information Retrieval and Databases. PROMISE 2013. Lecture Notes in Computer Science, vol 8173. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-54798-0_9
DOI: https://doi.org/10.1007/978-3-642-54798-0_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-54797-3
Online ISBN: 978-3-642-54798-0
eBook Packages: Computer Science (R0)