Abstract
We revisit the effects that various characteristics of the topic documents have on system effectiveness for the task of finding prior art in the patent domain. In doing so, we give readers approaching the domain a guide to the issues that need to be addressed in this context.
For this study, we select two patent-based test collections with a common document representation schema and examine topic characteristics specific to the objectives of each collection. We look at the effect of language on retrieval and at the length of the topic documents. We present the correlations between these topic facets and their retrieval results, as well as their relevant documents.
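As an illustration only, and not the analysis code used in the study, the kind of correlation described above can be sketched as a rank correlation between a topic facet (here, topic length) and a per-topic effectiveness score. The choice of Spearman's coefficient, the function names, and the toy numbers are all assumptions made for this sketch; they are not taken from the paper or its test collections.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical toy data: one entry per topic (invented, not from the paper).
topic_lengths = [1200, 3400, 560, 7800, 2100]        # e.g. tokens per topic
average_precision = [0.12, 0.30, 0.05, 0.22, 0.41]   # e.g. per-topic AP

rho = spearman(topic_lengths, average_precision)
print(f"Spearman rho between topic length and AP: {rho:.2f}")
```

A rank-based coefficient is used in this sketch because topic lengths in patent collections can be heavily skewed; whether the study itself used a rank-based or a linear coefficient is not stated in the abstract.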
© 2012 Springer-Verlag Berlin Heidelberg
Cite this paper
Piroi, F., Lupu, M., Hanbury, A. (2012). Effects of Language and Topic Size in Patent IR: An Empirical Study. In: Catarci, T., Forner, P., Hiemstra, D., Peñas, A., Santucci, G. (eds) Information Access Evaluation. Multilinguality, Multimodality, and Visual Analytics. CLEF 2012. Lecture Notes in Computer Science, vol 7488. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33247-0_7
DOI: https://doi.org/10.1007/978-3-642-33247-0_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33246-3
Online ISBN: 978-3-642-33247-0
eBook Packages: Computer Science (R0)