Generic Audio Data Segmentation and Indexing

Zhang, Tong; Kuo, C.-C. Jay

doi:10.1007/978-1-4757-3339-6_4

Part of the book series: The Springer International Series in Engineering and Computer Science (SECS, volume 606)

Abstract

In the coarse-level segmentation and indexing stage, audio data are segmented and classified into basic audio types. The classification is based on morphological and statistical analysis of the temporal curves of three short-time features (the energy function, the average zero-crossing rate, and the fundamental frequency) together with the spectral peak tracks of the audio signal. Threshold-based heuristic rules are derived empirically to guide the classification procedures. The approach is therefore generic and model-free and can be applied under any circumstances. An illustration of the scheme is shown in Figure 4.1.
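
As a rough illustration of two of the features named in the abstract, the sketch below computes the short-time energy and the short-time zero-crossing rate of a frame and applies a toy threshold rule. It is not the chapter's procedure: the function names, the frame length, the threshold values, and the labels "silence", "tonal-like", and "noise-like" are all assumptions made for this example, and the fundamental frequency and spectral peak tracks are left out.

```python
import math

def frames(signal, frame_len, hop):
    """Split samples into overlapping frames; a trailing partial frame is dropped."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def label_frame(energy, zcr, energy_floor=1e-4, zcr_ceiling=0.3):
    """Toy threshold rule. Both thresholds and all labels are hypothetical,
    chosen for illustration only, not taken from the chapter."""
    if energy < energy_floor:
        return "silence"
    return "tonal-like" if zcr < zcr_ceiling else "noise-like"

# Synthetic test signals, 256 samples each (assumed sampling rate: 8000 Hz).
silence = [0.0] * 256
sine = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(256)]
alternating = [0.5 if n % 2 == 0 else -0.5 for n in range(256)]

for name, sig in [("silence", silence), ("sine", sine), ("alternating", alternating)]:
    for frame in frames(sig, frame_len=256, hop=128):
        e, z = short_time_energy(frame), zero_crossing_rate(frame)
        print(name, round(e, 3), round(z, 3), label_frame(e, z))
```

In practice, rules of this kind would be applied to the temporal curves of the features across many frames rather than to isolated frames, and the thresholds would have to be tuned empirically on real recordings.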

Author information

Authors and Affiliations

Tong Zhang: Integrated Media Systems Center, University of Southern California, Los Angeles, CA 90089-2564, USA
C.-C. Jay Kuo: Department of Electrical Engineering-Systems, University of Southern California, Los Angeles, CA 90089-2564, USA

About this chapter

Cite this chapter

Zhang, T., Kuo, C.-C.J. (2001). Generic Audio Data Segmentation and Indexing. In: Content-Based Audio Classification and Retrieval for Audiovisual Data Parsing. The Springer International Series in Engineering and Computer Science, vol 606. Springer, Boston, MA. https://doi.org/10.1007/978-1-4757-3339-6_4

DOI: https://doi.org/10.1007/978-1-4757-3339-6_4
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4419-4878-6
Online ISBN: 978-1-4757-3339-6
eBook Packages: Springer Book Archive