Introduction to Topic Detection and Tracking

  • Chapter
Topic Detection and Tracking

Part of the book series: The Information Retrieval Series ((INRE,volume 12))

Abstract

The Topic Detection and Tracking (TDT) research program has been running for five years, starting with a pilot study and including yearly open and competitive evaluations since then. In this chapter we define the basic concepts of TDT and provide historical context for the concepts. In describing the various TDT evaluation tasks and workshops, we provide an overview of the technical approaches that have been used and that have succeeded.



Copyright information

© 2002 Springer Science+Business Media New York

About this chapter

Cite this chapter

Allan, J. (2002). Introduction to Topic Detection and Tracking. In: Allan, J. (ed.) Topic Detection and Tracking. The Information Retrieval Series, vol 12. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-0933-2_1

  • DOI: https://doi.org/10.1007/978-1-4615-0933-2_1

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4613-5311-9

  • Online ISBN: 978-1-4615-0933-2

  • eBook Packages: Springer Book Archive
