Abstract
This paper addresses the difficult problem of indexing ancient graphic images. It focuses on the particular case of drop caps (also called lettrines) and, specifically, on extracting the letter from these complex graphic images. Based on an analysis of the features of the images to be indexed, an original strategy is proposed. The approach first filters the relevant information using the Meyer decomposition. Then, to accommodate the variability in how the information is represented, a model based on Zipf’s law detects the regions belonging to the letter, which allows the letter to be segmented. The overall process is evaluated on a relevant set of images, and the results support the approach.
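The abstract does not detail how the Zipf's law model is built, so the following is only a minimal sketch of the general idea, under stated assumptions: gray levels are quantized into a few classes, every small square pattern is counted, and the log-frequency versus log-rank curve is fitted with a line. The 3x3 pattern size, the three quantization levels, and the function names `zipf_curve` and `zipf_slope` are illustrative choices, not the authors' implementation.

```python
import numpy as np
from collections import Counter


def zipf_curve(image, levels=3, size=3):
    """Return (log rank, log frequency) for quantized size x size patterns.

    Assumption: `levels` and `size` are illustrative defaults, not values
    taken from the paper.
    """
    img = np.asarray(image, dtype=np.float64)
    lo, hi = img.min(), img.max()
    if hi > lo:
        # Quantize gray levels into a few classes so that patterns repeat.
        q = ((img - lo) / (hi - lo) * levels).astype(np.int64)
        q = np.clip(q, 0, levels - 1)
    else:
        q = np.zeros(img.shape, dtype=np.int64)

    # Count every pattern seen by a sliding window.
    counts = Counter()
    h, w = q.shape
    for i in range(h - size + 1):
        for j in range(w - size + 1):
            counts[q[i:i + size, j:j + size].tobytes()] += 1

    # Rank patterns from most to least frequent.
    freqs = np.array(sorted(counts.values(), reverse=True), dtype=np.float64)
    ranks = np.arange(1, len(freqs) + 1, dtype=np.float64)
    return np.log(ranks), np.log(freqs)


def zipf_slope(image, levels=3, size=3):
    """Slope of the least-squares line through the log-log Zipf curve."""
    log_r, log_f = zipf_curve(image, levels, size)
    if len(log_r) < 2:
        return 0.0  # a single pattern gives no curve to fit
    slope, _ = np.polyfit(log_r, log_f, 1)
    return float(slope)
```

In such a scheme, regions whose curves behave differently (for example, different slopes over local windows) could be separated by a clustering or thresholding step; how the paper actually does this is not stated in the abstract.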
Cite this article
Coustaty, M., Pareti, R., Vincent, N. et al. Towards historical document indexing: extraction of drop cap letters. IJDAR 14, 243–254 (2011). https://doi.org/10.1007/s10032-011-0152-x
DOI: https://doi.org/10.1007/s10032-011-0152-x