Abstract
In this paper, we propose a clustering approach for reconstructing cross-cut shredded documents, a problem of importance in forensic science. Unlike other clustering approaches, which are applied as a preprocessing step before the actual reconstruction algorithm, our clustering approach is part of the reconstruction process itself. We define a new cost function that relies mainly on black pixels to measure the cost of pairing two shreds. The reconstruction algorithm creates multiple clusters, which grow by adding shreds according to the cost function. Adding a shred may merge two or more clusters into a larger one. We also propose a way to involve the user in the reconstruction process. We compare our approach with a recent proposal and conclude that ours gives better solutions in less time.
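The abstract does not give the cost function or the merging rule, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes each shred is reduced to two binary pixel columns (1 = black) for its left and right edges, uses a hypothetical cost that counts rows where the facing pixels disagree on blackness, and handles only left-to-right adjacency. Every shred starts as its own cluster, and accepting a pairing merges the two clusters involved. The names `edge_cost`, `find`, and `cluster_shreds` are illustrative.

```python
from itertools import permutations


def edge_cost(right_edge, left_edge):
    """Hypothetical cost: rows where exactly one facing pixel is black (1)."""
    return sum(1 for a, b in zip(right_edge, left_edge) if (a == 1) != (b == 1))


def find(parent, i):
    """Return the representative of the cluster containing shred i."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]  # path halving
        i = parent[i]
    return i


def cluster_shreds(shreds):
    """Greedily pair shreds left-to-right, merging clusters as pairs are accepted.

    shreds: list of dicts with 'left' and 'right' edge pixel columns.
    Returns a dict mapping a shred index to the index placed on its right.
    """
    n = len(shreds)
    parent = list(range(n))  # each shred starts as its own cluster
    # All ordered pairs (i placed to the left of j), cheapest first.
    pairs = sorted(
        (edge_cost(shreds[i]["right"], shreds[j]["left"]), i, j)
        for i, j in permutations(range(n), 2)
    )
    right_of, left_of = {}, {}
    for _cost, i, j in pairs:
        if i in right_of or j in left_of:
            continue  # one of the two facing edges is already taken
        root_i, root_j = find(parent, i), find(parent, j)
        if root_i == root_j:
            continue  # same cluster: pairing would close a loop
        right_of[i] = j
        left_of[j] = i
        parent[root_i] = root_j  # merge the two clusters
    return right_of
```

A cross-cut reconstruction would also need top-to-bottom adjacency and a way to stop before forcing poor matches; both are omitted here to keep the sketch short.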
Cite this article
Sleit, A., Massad, Y. & Musaddaq, M. An alternative clustering approach for reconstructing cross cut shredded text documents. Telecommun Syst 52, 1491–1501 (2013). https://doi.org/10.1007/s11235-011-9626-x
DOI: https://doi.org/10.1007/s11235-011-9626-x