Abstract
This paper describes a two-stage method of document image compression wherein a grayscale document image is first processed to improve its compressibility, then losslessly compressed. The initial processing involves hierarchical, coarse-to-fine morphological operations designed to combat the noiselike variability of the low-order bits while attempting to preserve or even improve intelligibility. The result of this stage is losslessly compressed by an arithmetic coder that uses a mixture model to derive context-conditional graylevel probabilities. The lossless stage is compared experimentally with several reference methods, and is found to be competitive at all rates. The overall system is found to be comparable with JPEG in terms of mean-square error performance, but appears to outperform JPEG in terms of subjectively judged document image intelligibility.
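The two-stage idea can be illustrated with a small sketch. This is not the authors' implementation, and it rests on several assumptions. It uses a single flat 3x3 structuring element rather than the hierarchical, coarse-to-fine scheme. zlib's DEFLATE stands in for the mixture-model arithmetic coder. The test page is synthetic, with noise confined to the low-order bits. All function names (`open_close`, `compressed_size`, `make_noisy_page`) are hypothetical.

```python
import random
import zlib


def _filter3x3(img, pick):
    """Apply pick (min or max) over each pixel's 3x3 window, clipped at borders."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(pick(window))
        out.append(row)
    return out


def erode(img):
    """Grayscale erosion with a flat 3x3 structuring element."""
    return _filter3x3(img, min)


def dilate(img):
    """Grayscale dilation with a flat 3x3 structuring element."""
    return _filter3x3(img, max)


def open_close(img):
    """Stage 1 (assumed form): opening, then closing, to flatten low-order noise."""
    opened = dilate(erode(img))
    return erode(dilate(opened))


def compressed_size(img):
    """Stage 2 stand-in: byte count after lossless DEFLATE compression."""
    raw = bytes(v for row in img for v in row)
    return len(zlib.compress(raw, 9))


def make_noisy_page(size=64, seed=0):
    """Synthetic page: dark 8-pixel-wide bars on a light background, +/-3 noise."""
    rng = random.Random(seed)
    page = []
    for y in range(size):
        row = []
        for x in range(size):
            in_stroke = (x // 8) % 2 == 1 and 8 <= y < size - 8
            base = 40 if in_stroke else 200
            row.append(base + rng.randint(-3, 3))
        page.append(row)
    return page


page = make_noisy_page()
smoothed = open_close(page)
print("bytes before stage 1:", compressed_size(page))
print("bytes after stage 1: ", compressed_size(smoothed))
```

In this sketch the bars are wider than the structuring element, so opening and closing leave them in place and flatten only the small fluctuations around each gray level. Strokes thinner than the structuring element would be erased, which is one reason a real system has to choose its operations with intelligibility in mind.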
Copyright information
© 2002 Kluwer Academic/Plenum Publishers
About this chapter
Cite this chapter
Popat, K., Bloomberg, D.S. (2002). Two-Stage Lossy/Lossless Compression of Grayscale Document Images. In: Goutsias, J., Vincent, L., Bloomberg, D.S. (eds) Mathematical Morphology and its Applications to Image and Signal Processing. Computational Imaging and Vision, vol 18. Springer, Boston, MA. https://doi.org/10.1007/0-306-47025-X_39
DOI: https://doi.org/10.1007/0-306-47025-X_39
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-7923-7862-4
Online ISBN: 978-0-306-47025-7
eBook Packages: Springer Book Archive