Abstract
In this paper we describe a system that reliably localizes the speaker's face and mouth in videophone sequences. A statistical scheme based on a subspace method is presented for detecting human faces under varying poses. We propose a new matching criterion based on the Generalized Likelihood Ratio. The criterion is optimized efficiently with respect to the parameters of a similarity, affine or perspective transform, using a coarse-to-fine search strategy combined with a simulated annealing algorithm. We also propose extracting a vector of geometrical features (four points) on the outline of the mouth. The extraction consists of analyzing amplitude projections in the mouth region. All computations are performed on H.263-coded frames at QCIF spatial resolution (176 × 144 pixels). To this end, we propose algorithms adapted to the poor quality of the images and suited to a future real-time application.
This work is supported by the European Commission via the ACTS project VIDAS.
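As a rough illustration of what "analyzing amplitude projections" might look like, the sketch below sums pixel darkness along the rows and columns of a mouth region and reads four outline points off the extents of the two profiles. It is not the authors' algorithm, which the abstract does not detail: the darkness measure, the thresholding rule, the `frac` parameter and all function names are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]  # grayscale rows, pixel values 0..255


def projections(region: Image) -> Tuple[List[int], List[int]]:
    """Return (row_profile, col_profile) of darkness, i.e. 255 - intensity."""
    rows = [sum(255 - p for p in row) for row in region]
    cols = [sum(255 - row[x] for row in region) for x in range(len(region[0]))]
    return rows, cols


def extent(profile: List[int], frac: float) -> Tuple[int, int]:
    """First and last index above a threshold set between profile min and max.

    Assumed rule (not from the paper): threshold = min + frac * (max - min).
    """
    lo, hi = min(profile), max(profile)
    if hi == lo:  # flat profile: no structure to locate
        return 0, len(profile) - 1
    thresh = lo + frac * (hi - lo)
    idx = [i for i, v in enumerate(profile) if v > thresh]
    return idx[0], idx[-1]


def mouth_points(region: Image, frac: float = 0.5) -> Dict[str, Tuple[int, int]]:
    """Estimate four (x, y) outline points from the two projection profiles."""
    rows, cols = projections(region)
    top, bottom = extent(rows, frac)    # vertical extent of the dark band
    left, right = extent(cols, frac)    # horizontal extent of the dark band
    cx, cy = (left + right) // 2, (top + bottom) // 2
    return {
        "left": (left, cy),
        "right": (right, cy),
        "top": (cx, top),
        "bottom": (cx, bottom),
    }
```

Calling `mouth_points(region)` on a cropped grayscale mouth window returns the left and right corners and the top and bottom midpoints in window coordinates. On heavily quantized H.263 frames the profiles would likely need smoothing before thresholding, which this sketch omits.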
Copyright information
© 1997 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kervrann, C., Davoine, F., Pérez, P., Li, H., Forchheimer, R., Labit, C. (1997). Generalized likelihood ratio-based face detection and extraction of mouth features. In: Bigün, J., Chollet, G., Borgefors, G. (eds) Audio- and Video-based Biometric Person Authentication. AVBPA 1997. Lecture Notes in Computer Science, vol 1206. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0015976
DOI: https://doi.org/10.1007/BFb0015976
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-62660-2
Online ISBN: 978-3-540-68425-1
eBook Packages: Springer Book Archive