Second-Generation Web Interface to Correcting ASR Output

Krůza, Oldřich; Kuboň, Vladislav

doi:10.1007/978-3-030-02686-8_56

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 880)

Included in the following conference series:

Proceedings of the Future Technologies Conference

Abstract

This paper presents a next-generation web application that enables users to contribute corrections to automatically acquired transcriptions of long speech recordings. We describe how our setting differs from similar ones, compare our solution with others, and reflect on the development since the six-year-old work we build upon, in light of the progress made, the lessons learned, and the new technologies available in the browser.

Notes

1. https://lindat.mff.cuni.cz/repository/xmlui/handle/11372/LRT-1455.
2. trans.sourceforge.net.
3. otranscribe.com.
4. transcribe.wreally.com.
5. Transcriber explicitly aligns the text with speech, while the other two merely support adding timestamps to the transcription.
6. Transcribe supports team co-operation.
7. In our data, other speakers represent a negligible fraction, but we may later add support for speaker annotation.
8. The current word is on the top line in the screenshot because it is at the beginning of the recording.
9. The median number of chunks is 1 (most recordings have no manually corrected segments); the maximum is 1109. The median counting only touched recordings is 8.
10. We could even stop the recalculation as soon as we find that the new horizontal coordinate of a word is unchanged, and add the difference in the vertical coordinate to all subsequent words, i.e. when a line stays the same, so do all lines below it.
11. https://github.com/sixtease/MakonReact.
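
The early-exit idea in note 10 can be illustrated with a minimal sketch. It assumes a greedy left-to-right line wrap with a fixed line height and integer coordinates. The `Word` class, the functions `layout_all` and `relayout_from`, and all parameters are hypothetical names chosen for illustration; they are not taken from the MakonReact code base, and the actual layout in the application may work differently.

```python
from dataclasses import dataclass


@dataclass
class Word:
    width: int
    x: int = 0
    y: int = 0


def _place(x, y, width, line_width, line_height):
    # Wrap to the start of the next line if the word does not fit
    # (a word at the start of a line is never wrapped).
    if x > 0 and x + width > line_width:
        return 0, y + line_height
    return x, y


def layout_all(words, line_width, line_height, gap):
    """Full greedy layout of every word from scratch."""
    x, y = 0, 0
    for w in words:
        x, y = _place(x, y, w.width, line_width, line_height)
        w.x, w.y = x, y
        x += w.width + gap


def relayout_from(words, i, line_width, line_height, gap):
    """Update positions after words[i].width changed.

    Words before index i are unaffected. Returns how many words were
    recomputed one by one before the early exit applied.
    """
    if i == 0:
        x, y = 0, 0
    else:
        prev = words[i - 1]
        x, y = prev.x + prev.width + gap, prev.y
    for j in range(i, len(words)):
        w = words[j]
        x, y = _place(x, y, w.width, line_width, line_height)
        # Early exit: a later word keeps its horizontal coordinate, so
        # every following word keeps its x as well; only y can shift.
        if j > i and x == w.x:
            dy = y - w.y
            if dy:
                for later in words[j:]:
                    later.y += dy
            return j - i
        w.x, w.y = x, y
        x += w.width + gap
    return len(words) - i


if __name__ == "__main__":
    ws = [Word(3) for _ in range(8)]
    layout_all(ws, line_width=10, line_height=1, gap=1)
    ws[1].width = 8  # simulate an edit that makes the second word wider
    recomputed = relayout_from(ws, 1, line_width=10, line_height=1, gap=1)
    print(recomputed, [(w.x, w.y) for w in ws])
```

The exit test is skipped for the edited word itself: its width has changed, so an unchanged horizontal coordinate there does not guarantee that the following words stay in place.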

Acknowledgments

The research was supported by SVV project number 260 453. This work used language resources stored and distributed by the LINDAT/CLARIN project of the Ministry of Education, Youth and Sports of the Czech Republic (project LM2015071).

Author information

Authors and Affiliations

Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics, Charles University, Malostranské nám. 25, Prague, Czech Republic
Oldřich Krůza & Vladislav Kuboň

Corresponding author

Correspondence to Oldřich Krůza.

Editor information

Editors and Affiliations

Saga University, Saga, Japan
Kohei Arai
The Science and Information (SAI) Organization, Bradford, West Yorkshire, UK
Rahul Bhatia
The Science and Information (SAI) Organization, Bradford, UK
Supriya Kapoor

About this paper

Cite this paper

Krůza, O., Kuboň, V. (2019). Second-Generation Web Interface to Correcting ASR Output. In: Arai, K., Bhatia, R., Kapoor, S. (eds) Proceedings of the Future Technologies Conference (FTC) 2018. FTC 2018. Advances in Intelligent Systems and Computing, vol 880. Springer, Cham. https://doi.org/10.1007/978-3-030-02686-8_56

DOI: https://doi.org/10.1007/978-3-030-02686-8_56
Published: 18 October 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-02685-1
Online ISBN: 978-3-030-02686-8
eBook Packages: Intelligent Technologies and Robotics (R0)
