Abstract
The Topic Detection and Tracking (TDT) research program has been running for five years, starting with a pilot study and continuing with yearly open, competitive evaluations since then. In this chapter we define the basic concepts of TDT and place them in historical context. In describing the various TDT evaluation tasks and workshops, we provide an overview of the technical approaches that have been used and of those that have succeeded.
Copyright information
© 2002 Springer Science+Business Media New York
About this chapter
Cite this chapter
Allan, J. (2002). Introduction to Topic Detection and Tracking. In: Allan, J. (eds) Topic Detection and Tracking. The Information Retrieval Series, vol 12. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-0933-2_1
DOI: https://doi.org/10.1007/978-1-4615-0933-2_1
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4613-5311-9
Online ISBN: 978-1-4615-0933-2
eBook Packages: Springer Book Archive