Definition

Hidden Markov models (HMMs) form a class of statistical models in which the system being modeled is assumed to be a Markov process with hidden states. From observed output sequences generated by the Markov process, both the output emission probabilities from the hidden states and the transition probabilities between the hidden states can be estimated with dynamic programming methods. The estimated model parameters can then be used for various sequence analysis purposes.

Motivation and Background

The states of a regular Markov model, named after Russian mathematician Andrey Markov (1865–1922), are directly observable; hence its only parameters are the state transition probabilities. In many real-world cases, however, the states of the system that one wants to model are not directly observable. For instance, in speech recognition, the audio is the observable stream, while the goal is to discover the phonemes (the categorical elements of speech) that emitted the audio. Hidden Markov models offer one type of architecture to estimate hidden states through indirect means. Dynamic programming methods have been developed that can estimate both the output emission probabilities and the transition probabilities between the hidden states, either from observations of output sequences only (an unsupervised learning setting) or from pairs of aligned output sequences and gold-standard hidden sequences (a supervised learning setting).

Structure of the Learning System

Figure 1 displays the general architecture of a hidden Markov model. Each circle represents a variable x_i or y_i occurring at time i; x_t is the discrete value of the hidden variable at time t. The variable y_t is the output variable observed at the same time t, said to be emitted by x_t. Arrows denote conditional dependencies. Any hidden variable is only dependent on its immediate predecessor; thus, the value of x_t is only dependent on that of x_{t-1} occurring at time t − 1. This deliberate simplicity is referred to as the Markov assumption. Analogously, observed variables such as y_t are conditionally dependent only on the hidden variables occurring at the same time t, i.e., x_t in this case.
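
Written out, these two assumptions mean that the joint probability of a hidden state sequence x_1, ..., x_n and the output sequence y_1, ..., y_n it emits factorizes into initial, transition, and emission terms (in LaTeX notation, ignoring the dedicated start and end states introduced below):

P(x_1, \dots, x_n, y_1, \dots, y_n) = P(x_1)\, P(y_1 \mid x_1) \prod_{t=2}^{n} P(x_t \mid x_{t-1})\, P(y_t \mid x_t)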

Hidden Markov Models, Fig. 1: Architecture of a hidden Markov model

Typically, a start state x_0 is used as the first hidden state (not conditioned by any previous state), as well as an end state x_{n+1} that closes the hidden state sequence of length n. Start and end states usually emit meta-symbols signifying the “start” and “end” of the sequence.

An important constraint on the data that can in principle be modeled in a hidden Markov model is that the hidden and output sequences need to be discrete, aligned (i.e., one y_t for each x_t), and of equal length. Sequence pairs that do not conform to these constraints need to be discretized (e.g., in equal-length time slices) or aligned where necessary.
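
As a concrete illustration of what such a discretized, aligned representation and the associated parameters look like, here is a minimal sketch in Python with NumPy; the two states, three output symbols, and all probability values are invented for illustration only:

```python
import numpy as np

# Hypothetical two-state model emitting three discrete output symbols.
states = ["Rainy", "Sunny"]            # hidden states x_t
symbols = ["walk", "shop", "clean"]    # observable outputs y_t

pi = np.array([0.6, 0.4])              # initial distribution P(x_1)

A = np.array([[0.7, 0.3],              # transitions A[i, j] = P(x_t = j | x_{t-1} = i)
              [0.4, 0.6]])

B = np.array([[0.1, 0.4, 0.5],         # emissions B[i, k] = P(y_t = symbols[k] | x_t = states[i])
              [0.6, 0.3, 0.1]])

obs = [0, 2, 1]                        # an observed sequence as symbol indices: walk, clean, shop
```

The same toy parameterization (pi, A, B) is reused in the sketches that follow.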

Training and Using Hidden Markov Models

Hidden Markov models can be trained in both an unsupervised and a supervised fashion. First, when only observed output sequences are available for training, the model’s conditional probabilities can be estimated from this indirect evidence with the Baum-Welch algorithm (Baum et al. 1970), a form of unsupervised learning and an instantiation of the expectation-maximization (EM) algorithm (Dempster et al. 1977).
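
A minimal sketch of this re-estimation loop, assuming the discrete parameterization (pi, A, B) from the earlier example and a single observation sequence of symbol indices; it omits the numerical scaling and multi-sequence handling that practical implementations need, so it is illustrative rather than a reference implementation:

```python
import numpy as np

def baum_welch(obs, A, B, pi, n_iter=10):
    """Unsupervised re-estimation of (A, B, pi) from one observation sequence."""
    obs = np.asarray(obs)
    N, T = A.shape[0], len(obs)
    for _ in range(n_iter):
        # E-step: forward (alpha) and backward (beta) probabilities.
        alpha = np.zeros((T, N))
        alpha[0] = pi * B[:, obs[0]]
        for t in range(1, T):
            alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        beta = np.ones((T, N))
        for t in range(T - 2, -1, -1):
            beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
        likelihood = alpha[-1].sum()

        # Expected state occupancies (gamma) and expected transitions (xi).
        gamma = alpha * beta / likelihood
        xi = (alpha[:-1, :, None] * A[None, :, :]
              * (B[:, obs[1:]].T * beta[1:])[:, None, :]) / likelihood

        # M-step: re-estimate parameters from the expected counts.
        pi = gamma[0]
        A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
        B_new = np.zeros_like(B)
        for k in range(B.shape[1]):
            B_new[:, k] = gamma[obs == k].sum(axis=0)
        B = B_new / gamma.sum(axis=0)[:, None]
    return A, B, pi
```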

When instead aligned sequences of gold-standard hidden variables and output variables are given as supervised training data, both the output emission probabilities and the state transition probabilities can be straightforwardly estimated from frequencies of co-occurrence in the training data.
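
A sketch of this relative-frequency estimation, assuming aligned lists of gold-standard state sequences and output sequences (plain maximum-likelihood counts; practical systems usually add smoothing for unseen events):

```python
from collections import Counter

def estimate_supervised(state_seqs, output_seqs):
    """Estimate transition and emission probabilities from aligned
    gold-standard (state sequence, output sequence) pairs by counting."""
    trans, emit = Counter(), Counter()
    from_count, state_count = Counter(), Counter()
    for states, outputs in zip(state_seqs, output_seqs):
        for prev, nxt in zip(states, states[1:]):   # transition counts
            trans[(prev, nxt)] += 1
            from_count[prev] += 1
        for state, out in zip(states, outputs):     # emission counts
            emit[(state, out)] += 1
            state_count[state] += 1
    p_trans = {pair: n / from_count[pair[0]] for pair, n in trans.items()}
    p_emit = {pair: n / state_count[pair[0]] for pair, n in emit.items()}
    return p_trans, p_emit
```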

Once a model is trained, the Viterbi algorithm (Viterbi 1967) can be used to find the most likely sequence of hidden states that could have generated a particular (test) output sequence.
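
A compact sketch of Viterbi decoding under the same discrete parameterization as above (probabilities are multiplied directly for brevity; practical implementations work in log space to avoid underflow):

```python
import numpy as np

def viterbi(obs, A, B, pi):
    """Return the most likely hidden state index sequence for `obs`."""
    obs = np.asarray(obs)
    N, T = A.shape[0], len(obs)
    delta = np.zeros((T, N))             # best path probability ending in each state
    psi = np.zeros((T, N), dtype=int)    # back-pointers to the best predecessor
    delta[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] * A      # scores[i, j]: come from state i, move to state j
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) * B[:, obs[t]]
    # Trace back from the best final state.
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t, path[-1]]))
    return path[::-1]
```

With the toy parameters defined earlier, viterbi(obs, A, B, pi) returns a list of indices into states, giving the most probable hidden state sequence for the observed symbol sequence.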

Applications of Hidden Markov Models

Hidden Markov models are known for their successful application in pattern recognition tasks such as speech recognition (Rabiner 1989) and DNA sequencing (Kulp et al. 1996), but also in sequential pattern analysis tasks such as part-of-speech tagging (Church 1988).

Their introduction in speech recognition in the 1970s (Jelinek 1998) led the way toward the introduction of stochastic methods in general in the field of natural language processing in the 1980s and 1990s (Charniak 1993; Manning and Schütze 1999) and into text mining and information extraction in the late 1990s and onward (Freitag and McCallum 1999). In a similar way, hidden Markov models started to be used in DNA pattern recognition in the mid-1980s and have gained widespread usage throughout bioinformatics since (Durbin et al. 1998; Burge and Karlin 1997).

Programs

Many implementations of hidden Markov models exist. Three noteworthy packages are the following:

  • UMDHMM by Tapas Kanungo. Implements the forward-backward, Viterbi, and Baum-Welch algorithms (Kanungo 1999)

  • JAHMM by Jean-Marc François. A versatile Java implementation of algorithms related to hidden Markov models (François 2006)

  • HMMER by Sean Eddy. An implementation of profile HMM software for protein sequence analysis (Eddy 2007)

Cross-References