1 Introduction

The seeping of ink from the reverse side (bleed-through) is a severe distortion affecting most ancient archival manuscripts. This degradation greatly impairs the legibility and enjoyment of the manuscript contents, besides being highly unpleasant. Physical restoration of these manuscripts cannot be performed, as the chemical substances used to remove the unwanted ink could also damage or even destroy the foreground ink. Thus, digitization and digital image processing are the only viable way to manage this problem. On the one hand, digital acquisitions, especially multispectral ones, could be a solution per se. Indeed, under some specific wavelengths, the seeping ink may tend to fade. More frequently, image processing techniques applied to the digital images offer a number of different strategies to approach the problem. When the aim is simply to improve readability, thresholding techniques, either global or local, could be the first attempt to make. However, the usually high variability of the intensity of the bleed-through pattern, which can sometimes be as dark as the foreground text, can make thresholding ineffective. Moreover, thresholding unavoidably also removes other features of the manuscript, such as paper watermarks and texture, and other marks (e.g., stamps or annotations), which could help the scholar study the origin of the document.

Since our aim is not only to improve readability, but also to remove the unwanted, uninformative interferences while preserving the original appearance and color of the manuscript, more sophisticated image processing techniques must be employed. Most methods working on a single side are based on the classification of the pixels into background, foreground and bleed-through, followed by the inpainting of the identified bleed-through areas with a simulated background [1]. However, the techniques that use both recto and verso are often more effective, since they exploit all the available information [2,3,4,5,6,7,8,9,10,11,12,13]. As a drawback, using recto–verso images requires their preliminary, very accurate alignment. Nowadays many libraries and archives have specialized digitization devices, e.g., high-resolution CCD cameras, mounted on mechanical equipment that guarantees a stable setup. Nonetheless, misalignments frequently occur, due to the human intervention for repositioning the sheet when acquiring the verso image, or to accidental movements of the camera. Furthermore, the sheet might be intrinsically distorted in a non-rigid way.

Registration of recto–verso images is a challenging task, due to sparsity of bleed-through, different intensity of the same stroke in the two sides, and different degree of curvature of recto and verso in books. Most methods consider a global affine transformation, estimated by minimizing the intensity differences between the two sides [14], by using the Fourier–Mellin transform [15], or through a block-by-block strategy, where for each block in the recto the corresponding verso block is searched for in a larger window by spatial cross-correlation [16]. Projective transformations have also been proposed, e.g., in [17], where a feature-based method employs the corners detected from extracted character contours, and in [18], where the features are corresponding points located from the relative shifts of pairs of small recto–verso patches, computed via cross-power spectrum. Non-rigid registration is proposed in [19], combining a global affine transformation with a free-form, hierarchical transformation based on B-splines and detected recto–verso corresponding points. A thin-plate spline smoothness constraint is then applied for minimizing the residual complexity between the two sides, as done in [20]. In [21], a similar method is proposed, where the two sides are first globally aligned using the page outline and then a local grid point warp is applied, combining the norm of differences between both intensity and gradient with a content preserving smoothness penalty.

In any case, the registration of the recto and verso images of a manuscript is a time-consuming task that can even become impossible if local deformations are present, as may happen when the documents are acquired from bound books, or when humidity produced bubbles on the paper. On the other hand, registration is not necessary per se, but it is just a way to make possible the spatial match of the information contained in the two sides, for analysis tasks that require its joint use. Thus, recto–verso registration can be considered as part of the whole bleed-through removal process.

The bleed-through cancelation method proposed in this paper overcomes the above-discussed limitations and difficulties, since it does not require the global registration of the two entire document sides. Indeed, it is based on local registration of small recto–verso patches, performed jointly with their restoration. We assume that the deformation between the two sides, though not describable by a unique geometrical transformation, is locally rigid. This implies that, at the very local level, it can be approximated by a translation only. Thus, for every individual pair of image patches at the same location and of a given small size, we compute their relative shift through cross-correlation of their gradients. The verso patch exactly aligned with the recto patch is then found by simple translation. Any bleed-through cancelation algorithm, which exploits information from both sides but acts locally, can then be applied to the patches so aligned. By spanning the images with the sequence of all the adjacent, non-overlapping patches, this procedure returns the restored versions of the entire images.

We compare this patch-by-patch alignment plus restoration modality with the classical one, where the two images are first globally registered and then globally restored. The quality of the final restored images is taken as a measure of the quality of the registration as well. For a quantitative quality measure, we applied our method to the 25 pairs of the bleed-through database described in [22] and available at [23]. The pairs of this database are already registered, so that we artificially distorted each verso side through a projective transformation that is typical of acquisitions of recto–verso flat manuscripts. We also provide qualitative results obtained for real documents, affected by an unknown and probably non-rigid misalignment.

The paper is organized as follows. In Sect. 2, the whole procedure is described, and the method for computing the relative shift of the recto–verso patches is formalized and discussed. Section 3 presents the quantitative evaluation of the method on the bleed-through database of [22]. Section 4 discusses the application of the method to real manuscript pairs. Finally, Sect. 5 concludes the paper with a description of possible improvements and extensions.

Fig. 1

Diagram illustrating the single-step restoration process of one recto patch

2 Patch-by-patch recto–verso alignment and bleed-through cancelation

As anticipated in Sect. 1, the method proposed here assumes that, for each side, joint registration and bleed-through removal is performed sequentially on each pair of the small, non-overlapping patches in which the images can be subdivided. We assume that, at the very local level, the reciprocal deformation between the two sides amounts to a translation only. For each small patch of one side, the homologous (i.e., of same size and at the same location) patch in the opposite side is first selected. The relative displacement between the two patches is estimated from the cross-correlation of their gradients, relying on the shift theorem of the Fourier transform. The estimated shift is then used to locate the best matching opposite patch. We assume herein a perfect registration of the three color planes in the two sides, which means that the geometric deformation between the recto and verso side is unique for the three pairs of homologous channels. As a consequence, we can compute the relative shifts between homologous patches in a single pair of channels only.

The bleed-through cancelation algorithm that we chose is an improvement of the one detailed in [12]. We would like, however, to highlight that any of the existing recto–verso restoration algorithms can be applied to the aligned patches, provided that it works locally. The specific algorithm used here was originally designed to simultaneously restore registered whole recto–verso pairs, either grayscale or RGB. This algorithm first identifies the bleed-through areas, based on a pixel-by-pixel criterion, and then fills them in with estimated background values. In this work, we apply the algorithm in a patch-by-patch modality and improve the local estimation of the background values, which are now computed at the patch level.

We will distinguish between single-step restoration, to indicate the joint patch-by-patch alignment plus restoration method proposed here, and two-step restoration, to indicate that a preliminary, global registration of the two sides is performed prior to patch-by-patch restoration.

The whole registration–restoration procedure is applied twice. The first time, we consider the recto as the reference image and detect the pairs of patches by simultaneously moving a square window across the two sides so as to span the whole image domain. For each pair, we align the verso patch on the recto patch by simple translation and then restore only the recto patch. Since the reference recto patches being restored are adjacent and non-overlapping, once all of them have been processed, the whole restored recto is readily obtained, without any need for interpolation across the restored patches themselves. Note that the restored recto remains geometrically unaltered. The second time, the procedure is repeated by inverting the roles of recto and verso. Thus, this time, we obtain the geometrically unaltered restored verso.

In case of RGB images, the restoration of every patch is performed independently for the three color channels.

The diagram in Fig. 1 illustrates the single-step restoration process of one recto patch. Phase 1 shows the grid of all the patches into which the reference recto image (top) and the verso image (bottom) are subdivided. Note that the number and size of the patches are not the actual ones used for this manuscript; we depicted larger patches to make the diagram easier to interpret. The two homologous (at the same position) patches at hand are highlighted with a red box and individually shown, enlarged, at Phase 2. Their misalignment is apparent. The two patches are used as input for the computation of their cross-correlation and then of their mutual shift (Phase 3). Based on the estimated shift, a different verso patch that best matches the given recto patch can now be located in the verso image. For illustration purposes, in the whole verso image (Phase 1) we highlighted with a yellow box the location of this matching patch. The aligned verso patch is shown enlarged at Phase 4 and given as input, along with the original recto patch, to the restoration algorithm of Phase 5. This produces their interference-free versions, shown at Phase 6 (only the recto is shown). Finally, the restored recto patch is put back into the whole recto image being restored (the patch highlighted with the green box at Phase 7). The procedure illustrated in the diagram is repeated for each pair of homologous patches, selected in any order. For clarity's sake, in the diagram we refer to patches picked in sequence from top to bottom and from left to right. Accordingly, in the recto of Phase 7 the patches that precede the one at hand are shown in their restored version, while the following ones are still degraded.

2.1 Alignment of the patches

With reference to the physical manuscript, given two patches \(f_r\) and \(f_v\) such that, on a common support, \(f_r(x+\varDelta x_0,y+\varDelta y_0)=f_v(x,y)\), the following relationship between their Fourier transforms (FTs) holds true:

$$\begin{aligned} F_v(\omega _x,\omega _y)=F_r(\omega _x,\omega _y)e^{j(\omega _x\varDelta x_0+\omega _y\varDelta y_0)} \end{aligned}$$
(1)

from which:

$$\begin{aligned} \frac{F_v(\omega _x,\omega _y)}{F_r(\omega _x,\omega _y)}=e^{j(\omega _x\varDelta x_0+\omega _y\varDelta y_0)} \end{aligned}$$
(2)

Theoretically, the inverse FT of the ratio in Eq. (2) would return a Dirac delta impulse located in \((\varDelta x_0,\varDelta y_0)\). Nevertheless, as always happens with inverse filtering, the presence of random noise, dissimilar parts and gain changes makes this operation produce a very noisy map, in which a reliable peak cannot be located. A more robust estimate of the cross-correlation between \(f_r\) and \(f_v\) is given by the inverse FT of the cross-power spectrum:

$$\begin{aligned} \frac{F_v(\omega _x,\omega _y)\cdot F_r^*(\omega _x,\omega _y)}{|F_v(\omega _x,\omega _y)|\cdot |F_r^*(\omega _x,\omega _y)|}=e^{j(\omega _x\varDelta x_0+\omega _y\varDelta y_0)} \end{aligned}$$
(3)

where \(*\) denotes the complex conjugate. The location of the now well-emerging peak of the cross-correlation function so computed defines the relative displacement between the two patches.
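As an illustration of Eq. (3), the following is a minimal NumPy sketch of shift estimation through the normalized cross-power spectrum. It is not the authors' implementation: the function name `phase_correlation_shift` and the regularization constant `eps` are hypothetical. One detail is worth noting: with NumPy's FFT sign convention, the peak of the inverse transform of Eq. (3) appears at \((-\varDelta x_0,-\varDelta y_0)\) modulo the patch size, so the sketch wraps the peak position to a signed value and negates it.

```python
import numpy as np

def phase_correlation_shift(recto, verso, eps=1e-12):
    """Estimate (dy, dx) such that recto[y + dy, x + dx] ~= verso[y, x].

    Both patches must have the same shape; `eps` (hypothetical) avoids
    division by zero at frequencies where the spectrum vanishes.
    """
    F_r = np.fft.fft2(recto)
    F_v = np.fft.fft2(verso)
    cross = F_v * np.conj(F_r)
    cross /= np.abs(cross) + eps          # normalized cross-power spectrum, Eq. (3)
    corr = np.real(np.fft.ifft2(cross))   # correlation map with a single sharp peak
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    shift = []
    for p, n in zip(peak, corr.shape):
        p = int(p)
        if p > n // 2:                    # wrap circular index to a signed offset
            p -= n
        shift.append(-p)                  # discrete peak sits at minus the shift
    return tuple(shift), corr
```

For example, if `verso = np.roll(recto, shift=(-3, 5), axis=(0, 1))`, the function is expected to return the shift `(3, -5)`.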

Fig. 2

Correlation matrix computed on the intensities (a) and on the gradients (b)

To further improve the estimation of the shifts, we compute the cross-correlation of the gradients of the patches, rather than of their intensities. Indeed, though depicting the same scene, recto–verso images have different intensities. Specifically, dark strokes in one side (foreground text) are lighter in the other (bleed-through pattern), and vice versa. As for images of the same scene taken with different sensors (e.g., RGB color channels), we thus inferred that recto and verso mainly correlate at object borders or textures and that a measure of correlation is more reliable when performed on the gradients of the patches.

Figure 2 shows the correlation matrices computed on the intensities and on the gradients, respectively, for a typical pair of patches. Note how the peak of the gradient correlation matrix is much better defined than that of the intensity correlation matrix.
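The gradient-based variant can be sketched as follows. The text does not state how the gradients are computed, so this sketch assumes the gradient magnitude obtained with finite differences (`np.gradient`); the names `gradient_magnitude`, `peak_shift` and `calculate_shift_on_gradients` are hypothetical, and the compact phase-correlation helper is repeated only to keep the example self-contained.

```python
import numpy as np

def gradient_magnitude(patch):
    """Finite-difference gradient magnitude (an assumed choice of gradient)."""
    gy, gx = np.gradient(np.asarray(patch, dtype=float))
    return np.hypot(gy, gx)

def peak_shift(a, b, eps=1e-12):
    """Phase-correlation shift (dy, dx) with a[y + dy, x + dx] ~= b[y, x]."""
    cross = np.fft.fft2(b) * np.conj(np.fft.fft2(a))
    corr = np.real(np.fft.ifft2(cross / (np.abs(cross) + eps)))
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Wrap each circular peak index to a signed offset, then negate it.
    return tuple(-(int(p) - n if p > n // 2 else int(p))
                 for p, n in zip(peak, corr.shape))

def calculate_shift_on_gradients(recto_patch, verso_patch):
    """Correlate gradient magnitudes instead of raw intensities."""
    return peak_shift(gradient_magnitude(recto_patch),
                      gradient_magnitude(verso_patch))
```

Because the gradient magnitude discards a constant offset and the phase normalization discards a global gain, a verso patch that is a fainter, brighter-background copy of the recto strokes should still yield the same peak location.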

Whether using intensities or gradients, for an effective estimation of the relative shift the two patches must share some common strokes, either see-through in one side and foreground in the opposite side, or mixed see-through and foreground in both sides. Indeed, when the common portion of text is very small, the estimate might be inaccurate. Thus, the patches must be large enough to share a significant portion of text, while they must be small enough to assume that their misalignment can be approximated by a translation only. To simultaneously satisfy these opposite conditions, a good compromise is to make the patch size depend on the character size. For recto–verso pairs as large as around \(3000\times 4500\) pixels, which is typical of manuscript letters of the sixteenth–seventeenth centuries acquired at very high resolution, we experimentally found that patches whose size is between \(150\times 150\) and \(200\times 200\) pixels give satisfactory results. When it is apparent that in some regions the misalignment is high, so that the two patches could share too small a portion of common text, it may also be convenient to compute the cross-correlation between the recto patch and an enlarged window containing the homologous verso patch.

As a further drawback, it is clear that when one of the two patches or both are pure background, the shift computed with the method above is meaningless.

To cope with these inaccurate or meaningless shift estimates, the shifts for all the pairs of patches are computed off-line; possible outliers are identified on the basis of their deviation from local means and then corrected by averaging over their four neighbors. In Fig. 3, the maps of the (x, y) values of the shifts of a few adjacent recto–verso patches are shown before and after the correction of the outliers.

Fig. 3

Typical map of the relative shifts estimated for a few adjacent recto–verso patches: a with outliers, b after correction of the outliers

2.2 Implementation of the single-step restoration algorithm

The pseudocode of the function SSR, which implements the single-step restoration algorithm, is shown in Fig. 4. The input parameters of the function are the Recto and Verso images and the size in pixels of the patches (PatchSize). The function returns the restored recto image (RectoRestored). For easier understanding, in the reported pseudocode we made some simplifying assumptions: the images are considered to be graylevel, and the numbers of rows and columns in the images are assumed to be exact multiples of the patch size. The pseudocode can be easily modified to account for color images of any size, as the actual implemented algorithm does.

Initially, the function determines the number of horizontal and vertical patches (N and M). The subsequent loop on N and M selects the various recto–verso patch pairs in the images and calculates their mutual shifts through the function calculateShift, which implements the method described in Sect. 2.1. The shift of each verso patch with respect to the homologous recto patch is stored in the matrix Shift, whose size is NxM. The shift matrix is then corrected for possible outliers, as explained at the end of Sect. 2.1, through the function correctOutliers.

The second loop selects again the recto patches and, by means of function shiftPatch, the shifted verso patches. Then, for each pair of patches, it estimates the average background values RectoBg and VersoBg, respectively, through the computeBackground function, and performs the restoration of the recto, through the function restorePatch. This function implements the bleed-through cancelation algorithm detailed in [12]. Each restored recto patch is then inserted in the proper position into the restored recto image RectoRestored, through function composeImage.
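The two loops described above can be sketched in Python as follows. This is an assumption-laden illustration, not a transcription of Fig. 4: the helper functions are passed in as callables (`calculate_shift`, `correct_outliers`, `compute_background`, `restore_patch` are placeholders for the functions named in the text, and the restoration algorithm of [12] is not reproduced), shifts are rounded to whole pixels, and patches that fall outside the verso are filled by edge replication, which is an assumed border policy. Here `calculate_shift(r, v)` is assumed to return `(dy, dx)` such that `r[y + dy, x + dx]` matches `v[y, x]`.

```python
import numpy as np

def ssr(recto, verso, patch_size, calculate_shift, correct_outliers,
        compute_background, restore_patch):
    """Single-step restoration of a graylevel recto (hypothetical sketch)."""
    rows, cols = recto.shape
    n, m = rows // patch_size, cols // patch_size   # patch grid size
    shifts = np.zeros((n, m, 2))

    # First loop: estimate the shift of every homologous patch pair.
    for i in range(n):
        for j in range(m):
            top, left = i * patch_size, j * patch_size
            r = recto[top:top + patch_size, left:left + patch_size]
            v = verso[top:top + patch_size, left:left + patch_size]
            shifts[i, j] = calculate_shift(r, v)

    # Fix unreliable estimates, then round to whole-pixel translations.
    shifts = np.rint(correct_outliers(shifts)).astype(int)
    pad = int(np.abs(shifts).max())
    verso_padded = np.pad(verso, pad, mode="edge")  # assumed border handling

    # Second loop: pick the shifted verso patch, restore, and compose.
    restored = np.empty_like(recto, dtype=float)
    for i in range(n):
        for j in range(m):
            top, left = i * patch_size, j * patch_size
            dy, dx = shifts[i, j]
            r = recto[top:top + patch_size, left:left + patch_size]
            vt, vl = top - dy + pad, left - dx + pad
            v = verso_padded[vt:vt + patch_size, vl:vl + patch_size]
            restored[top:top + patch_size, left:left + patch_size] = \
                restore_patch(r, v, compute_background(r), compute_background(v))
    return restored
```

Passing the helpers as arguments keeps the sketch independent of any particular restoration algorithm, in line with the remark that any locally acting recto–verso method can be plugged in.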

Fig. 4

Pseudocode of the function SSR implementing the single-step restoration algorithm

3 Quantitative analysis

We quantitatively measured the results of our single-step method on the 25 pairs of the entire database in [23] and compared them with the results of the two-step method, based on the registration algorithm proposed in [18]. The pairs of this database are already registered, so that, assuming the alignment to be optimal, we also measured on them the quality of the restoration algorithm alone and used these measurements as a baseline.

In order to test the single-step and the two-step methods on this database, we artificially distorted the verso side of each image pair through a typical projective transformation previously estimated on a real pair of manuscripts. The matrix P of the used projective transformation is the following:

$$\begin{aligned} P=\begin{bmatrix} 0.969&\quad -0.016&\quad -1.071\times 10^{-5} \\ -0.002&\quad 0.983&\quad 3.621\times 10^{-7}\\ 16.181&\quad 19.539&\quad 0.999 \end{bmatrix} \end{aligned}$$
(4)

where the various coefficients account for translations, scale factors and projective deformations.
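To make the role of the coefficients concrete, the following sketch maps a point through the matrix of Eq. (4). The text does not state the multiplication convention; since the translation terms appear in the third row, this sketch assumes the row-vector convention \([u\ v\ w]=[x\ y\ 1]\,P\) followed by division by \(w\). The function name `apply_projective` is hypothetical.

```python
# Matrix of Eq. (4); row-vector convention is an assumption of this sketch.
P = [
    [0.969, -0.016, -1.071e-05],
    [-0.002, 0.983, 3.621e-07],
    [16.181, 19.539, 0.999],
]

def apply_projective(x, y, P=P):
    """Map (x, y) to (u / w, v / w), where [u, v, w] = [x, y, 1] . P."""
    u = x * P[0][0] + y * P[1][0] + P[2][0]
    v = x * P[0][1] + y * P[1][1] + P[2][1]
    w = x * P[0][2] + y * P[1][2] + P[2][2]
    return u / w, v / w
```

Under this convention the origin is mapped to approximately (16.2, 19.6), i.e., the third row acts as the translation, the upper-left block as scale and shear, and the third column as the projective terms.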

To illustrate the complete procedure described in the previous section, we refer to the third pair of the database shown in Fig. 5a, b.

We first considered the undistorted, pre-registered pair available in the database and compared the results of the patch-by-patch restoration algorithm (Fig. 5e, f) with those furnished by the algorithm in [11] (Fig. 5c, d).

Fig. 5

Application of the patch-by-patch restoration to a real pre-registered graylevel recto–verso pair: a original degraded recto, b original degraded verso, c recto restored with the algorithm in [11], d verso restored with the algorithm in [11], e recto restored with our patch-by-patch restoration algorithm, f verso restored with our patch-by-patch restoration algorithm. Original images a and b digitized by Irish Script On Screen (www.isos.dias.ie)

Note that, in the verso side, a few bleed-through strokes left in the result of [11] (highlighted with the red box) are fully removed by our algorithm, while some foreground strokes at the bottom of the recto side that are lost in the result of [11] are preserved by our algorithm.

We then show, in Fig. 6b, the verso distorted according to the projective transform of Eq. (4). Figure 6e, f shows the result of the application of the patch-by-patch, single-step alignment/restoration procedure proposed (only the recto is shown). For comparison, Fig. 6c, d shows the result of the two-step restoration, i.e., restoration applied after the preliminary global registration of the pair in Fig. 6a, b.

Fig. 6

Application of the single-step alignment/restoration to the pair of Fig. 5a, b, where the verso side has been geometrically distorted by the projective transformation of Eq. (4): a original degraded recto, b original degraded and distorted verso, c two-step restored recto, d two-step restored verso, e single-step restored recto, f single-step restored verso

Note that the quality of the results provided by the two methods is similar. This was largely expected, since, in this case, the artificial deformation is globally rigid. However, in the result of the two-step method some bleed-through borders are not removed, and some strokes of the foreground text are not preserved. This seems to indicate that, due to computational approximations when applying the deformation to the verso, and to the interpolation needed to render the digital image, the actual deformation is not truly global. Thus, acting locally proves beneficial anyway.

For each image in the database, a binary ground-truth mask of the foreground text is provided. Although these ground-truth images are synthetic, i.e., created manually, some authors have used them for a quantitative analysis of the results, comparing them with the binarized versions of the restored images. Since the ground-truth masks are available for the undistorted images, we can compare them with our binarized results only for the recto side. Figure 7 shows the available ground truth for the original recto of Fig. 6a compared with the binarized restored recto of Fig. 6e.

Fig. 7

Comparison between the binarized version of the restored recto of Fig. 6e and the available ground truth: a manually generated ground truth of the undegraded recto, b binarized restored recto. Original image a provided by Irish Script On Screen (www.isos.dias.ie)

Fig. 8

Plots of the weighted total errors for the 25 images of the database in [23]. Blue line: restoration of the undistorted images; red line: single-step restoration of the distorted images; gray line: two-step restoration of the distorted images (color figure online)

As the binarization algorithm, we used the adaptive Sauvola algorithm [24]. As quality indices we computed the probability FgError that a pixel in the foreground text was classified as background, the probability BgError that a background or bleed-through pixel was classified as foreground, and WTotError, that is, the weighted mean of FgError and BgError, with the weights being the numbers of foreground and background pixels in the corresponding ground-truth images. The weighted total error WTotError indicates the probability that any pixel in the image was misclassified. According to [22], these quality indices are defined as:

$$\begin{aligned} \begin{array}{ll} FgError &{}=\frac{1}{N_{Fg}}\sum _ {t\in GT(Fg)}|GT(t)-B(t)| \\ BgError &{}=\frac{1}{N_{Bg}}\sum _ {t\in GT(Bg)}|GT(t)-B(t)| \\ WTotError &{}=\frac{N_{Fg}FgError+N_{Bg}BgError}{N} \end{array} \end{aligned}$$
(5)

where GT is the ground truth, B is the binarized restoration result, GT(Fg) is the foreground region of the ground-truth image constituted of \(N_{Fg}\) pixels, GT(Bg) is the complementary background region of the ground-truth image constituted of \(N_{Bg}\) pixels and N is the total number of pixels in the image.
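The indices of Eq. (5) translate directly into a few array operations. The sketch below assumes that both the ground truth and the binarized result are given as masks in which foreground is encoded as 1 (or True); the function name `bleed_through_errors` is hypothetical.

```python
import numpy as np

def bleed_through_errors(gt, binarized):
    """Return (FgError, BgError, WTotError) as defined in Eq. (5).

    Assumes foreground = True/1 and background = False/0 in both masks.
    """
    gt = np.asarray(gt, dtype=bool)
    b = np.asarray(binarized, dtype=bool)
    mismatch = gt != b                    # |GT(t) - B(t)| for binary masks
    n_fg = int(gt.sum())                  # N_Fg
    n_bg = int((~gt).sum())               # N_Bg
    fg_error = mismatch[gt].sum() / n_fg if n_fg else 0.0
    bg_error = mismatch[~gt].sum() / n_bg if n_bg else 0.0
    wtot_error = (n_fg * fg_error + n_bg * bg_error) / gt.size
    return fg_error, bg_error, wtot_error
```

Since \(N_{Fg}+N_{Bg}=N\), WTotError reduces to the overall fraction of misclassified pixels, which is consistent with its interpretation as the probability that any pixel was misclassified.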

Fig. 9

Plots of the execution times (in s) for the 25 images of the database in [23]. Blue line: restoration of the undistorted images; red line: single-step restoration of the distorted images; gray line: two-step restoration of the distorted images (color figure online)

The plots of Fig. 8 show the comparison of the WTotError quality measure obtained, at each image, with the results of three methods: (1) the patch-by-patch restoration applied to the original, undistorted pair (blue line), which we consider as our reference, (2) the single-step alignment/restoration applied to the distorted pair (red line) and (3) the two-step registration/restoration applied to the distorted pair (gray line).

From the plots, it is apparent that the proposed combined alignment plus restoration method performs almost identically to the restoration algorithm applied to the original aligned pairs, whereas it performs much better than the two-step method, where the pairs are registered off-line. We already commented on the probable reasons for this behavior in the qualitative analysis of the results of Fig. 6.

Finally, it is worth highlighting again the simplicity of our method, which leads to a very fast algorithm. The execution times for each of the 25 images are reported in the plots of Fig. 9, compared with those of the two-step algorithm.

To conclude the analysis of our method on artificially distorted images of the database in [23], we attempted bleed-through cancelation on the same pair shown in Fig. 5, where the verso has been now distorted through an elastic warping operator available as a tool of Photoshop. This warping operator was manually applied in different ways to various areas of the image.

In a first example, we tried to simulate the typical situation occurring when the acquisition is made from a book. We left the recto side flat and reproduced the effect due to the curvature of the verso side in correspondence of the binding regions of the book. Figure 10a, b shows the distorted verso and its superposition in transparency with the original, flat verso. Figure 10c, d shows the restored recto and verso sides obtained through the single-step method.

In a second example, we applied to the verso a stronger local, elastic deformation, not necessarily corresponding to a real situation. Our aim was just to qualitatively evaluate the robustness of our shift-based patch-by-patch alignment against locally elastic deformations. Figure 11a, b shows the distorted verso and its superposition in transparency with the original, flat verso. The non-stationarity and non-rigidity of the deformation, which produces significant scale changes in some image areas, are apparent.

Fig. 10

Application of a local warping to the verso of Fig. 5b: a distorted verso, b superposition in transparency of the original verso and the distorted verso, c restored recto, d restored verso

Fig. 11

Application of local elastic warpings to the verso of Fig. 5b: a distorted verso, b superposition in transparency of the original verso and the distorted verso, c recto restored by the single-step method, d verso restored by the single-step method, e recto restored after global registration through the Elastix toolbox, f verso restored after global registration through the Elastix toolbox

Fig. 12

Evaluation of the robustness of patch alignment for different degrees of bleed-through. First and second rows: recto and undeformed verso with (from left to right) \(10\%\), \(20\%\), \(30\%\), \(40\%\), \(50\%\) of bleed-through; third and fourth rows: corresponding histograms of the horizontal and vertical shifts computed on the entire set of patches

Although our method, at present, does not treat scale changes, working on small patches partially overcomes this limitation. The results of the single-step method, shown in Fig. 11c, d, can be considered satisfactory, though the bleed-through is not perfectly removed. It is worth highlighting that no rigid global registration technique could flatten the distorted image, so we did not apply the two-step method in this situation. Instead, we attempted to flatten the distorted verso (i.e., to align it on the flat recto) by using a recently proposed image registration software toolbox, namely Elastix [25], designed for the non-rigid registration of elastically distorted medical images.

Elastix exploits local similarity measures (local correlation and local mutual information), which are well suited to cope with image intensity inhomogeneity and non-stationarity, to iteratively update the transformation using gradient descent. Elastix provides a broad range of intensity-based registration options and settings, so that many parameters have to be set. We ran the code several times to empirically find the parameter values that produced the best results. The aligned recto–verso pair was given as input to the patch-by-patch restoration algorithm, whose results are shown in Fig. 11e, f. As can be seen, the quality of the restored images obtained with the two procedures is similar.

Fig. 13

A real RGB manuscript: a original degraded recto, b original degraded (mirrored) verso, c superposition in transparency of the original recto and the mirrored verso

Fig. 14

Restoration of the misaligned pair of Fig. 13a, b: a recto restored with the two-step method, b recto restored by the proposed single-step method, c verso restored with the two-step method, d verso restored by the proposed single-step method

Fig. 15

An enlarged detail of the manuscript in Figs. 13 and 14: a original recto, b original verso, c verso aligned on the recto through global image registration, d verso restored with the single-step proposed method, e verso restored after preliminary, global image registration

3.1 Robustness of patch alignment

In Sect. 2.1, we discussed the necessary conditions for a reliable estimation of the mutual shifts between pairs of recto–verso patches through cross-correlation of the intensities or the edges. These conditions mainly regard the size of the patches, which must be large enough to make the two recto–verso patches share a significant portion of text, and small enough to assume that their misalignment can be approximated by a translation only. We also provided practical remedies to fix possibly inaccurate or meaningless shift estimates. However, the measurement of correlation between two signals can itself be unreliable, for instance, in case of large differences in the amplitude of the two signals. In our specific application, this happens when the degree of ink seeping is low, so that there is a large difference between the intensity (or gradient) of the bleed-through pattern and that of the foreground text that generated it.

To test the robustness of the patch alignment strategy against different degrees of bleed-through, we performed the following synthetic experiment. We used the first pair of images of the public dataset in [22, 23], since for those images binary ground truths of the clean recto and verso sides are available. We generated a synthetic, typical document background, one for the recto and another one for the verso, and then placed on them the two clean foreground texts. In this way, we obtained clean recto and verso images. We then mixed the two clean images according to the density model described in [12], by using different percentages of ink seeping. We considered values of bleed-through percentage ranging from 5 to \(50\%\). For simplicity's sake, and without loss of generality, the mixing model was considered space invariant in this experiment. As the generated recto and verso images are perfectly aligned, we then applied to the verso image a fixed, global shift, whose values are comparable to the typical shifts that we found in real recto–verso document images. We applied a deformation consisting of a translation only, since this allows us to precisely evaluate the quality of the patch alignment process. Figure 12, first and second rows, shows the unshifted recto–verso images for some of the degrees of bleed-through analyzed.

At each percentage of bleed-through, we computed the cross-correlation between the two patches of every recto–verso pair, taking adjacent and non-overlapping patches over the two images. The estimated mutual shift values were compared with the true shift applied to the verso image, which is known by construction. We computed the average of the shifts estimated for the entire set of patches, and their histogram, i.e., the percentage of patches returning each shift value. In case of perfect shift estimation for all patches, the average should be equal to the true shift, and the histogram should consist of a single impulse located at the true shift value.
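
The evaluation just described can be sketched as follows. This is a minimal illustration and not the implementation used in the experiments: the FFT-based circular cross-correlation, the 64-pixel patch size, the finite-difference edge map, and all function names are assumptions, and a random texture stands in for the synthetic document images.

```python
import numpy as np

def edge_map(img):
    """Gradient magnitude from finite differences (a simple edge map)."""
    gy, gx = np.gradient(np.asarray(img, dtype=float))
    return np.hypot(gx, gy)

def estimate_shift(ref, mov):
    """Return integer (dy, dx) such that mov is approximately
    np.roll(ref, (dy, dx), axis=(0, 1)), using the peak of the
    circular cross-correlation computed via the FFT."""
    a = ref - ref.mean()
    b = mov - mov.mean()
    xcorr = np.fft.ifft2(np.fft.fft2(b) * np.conj(np.fft.fft2(a))).real
    h, w = xcorr.shape
    dy, dx = np.unravel_index(np.argmax(xcorr), xcorr.shape)
    # Peak positions beyond half the patch size correspond to negative shifts.
    if dy > h // 2:
        dy -= h
    if dx > w // 2:
        dx -= w
    return int(dy), int(dx)

def patch_shifts(recto, verso, patch=64, use_edges=True):
    """Estimate one shift per pair of adjacent, non-overlapping patches."""
    shifts = []
    rows, cols = recto.shape
    for y in range(0, rows - patch + 1, patch):
        for x in range(0, cols - patch + 1, patch):
            r = recto[y:y + patch, x:x + patch]
            v = verso[y:y + patch, x:x + patch]
            if use_edges:
                r, v = edge_map(r), edge_map(v)
            shifts.append(estimate_shift(r, v))
    return np.array(shifts)

def shift_statistics(shifts):
    """Average shift, most frequent shift, and percentage of patches returning it."""
    values, counts = np.unique(shifts, axis=0, return_counts=True)
    k = np.argmax(counts)
    mode = tuple(int(c) for c in values[k])
    return shifts.mean(axis=0), mode, 100.0 * counts[k] / len(shifts)

# Stand-in data: a random texture and a copy displaced by a known shift.
rng = np.random.default_rng(0)
recto = rng.random((256, 256))
true_shift = (3, -5)
verso = np.roll(recto, true_shift, axis=(0, 1))
mean_shift, mode_shift, mode_pct = shift_statistics(patch_shifts(recto, verso))
print(mean_shift, mode_shift, mode_pct)
```

Comparing `mean_shift` and `mode_shift` with `true_shift` reproduces, on toy data, the two indicators used in the experiment: the average of the estimated shifts and the location of the histogram peak.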

Fig. 16 An enlarged detail of the manuscript in Figs. 13 and 14: a original verso, b verso restored with the proposed single-step method, c verso restored with the two-step method

We found experimentally that, for bleed-through percentages higher than \(20\%\), the average shift values in the x and y directions are always nearly equal to the true shift, whereas the histogram peak is located exactly at the true shift, for both the x and y directions (see Fig. 12, third and fourth rows). For each percentage of bleed-through, we also compared the quality of restoration in the cases of shifted and unshifted verso image. For bleed-through percentages higher than \(20\%\), the weighted total errors for shifted and unshifted verso images are comparable.

When the bleed-through percentage is lower than \(20\%\), shift estimation returns random values, so that applying the restoration algorithm is meaningless. On the other hand, at those percentages bleed-through is practically imperceptible, as can be seen in Fig. 12, and it can be removed by simple thresholding.

4 Experiments on real color manuscripts

In this section, we will discuss the results of an experiment performed on a recto–verso RGB manuscript. Figure 13 shows the original recto and reflected verso images, together with their superposition in transparency.

Figure 13c clearly shows a misalignment between the two sides, whose nature is not readily identifiable. We first attempted the two-step restoration modality, assuming a projective deformation between the two sides. This produced the results in Fig. 14a, c, where some bleed-through strokes are still visible. We inferred that the relative deformation between the two sides is likely not to be globally rigid. The application of the single-step method proposed in this paper gives the much better results of Fig. 14b, d.

The observation of enlarged details allows a more comprehensive discussion of the differences between the two methods. Figure 15a, b shows two homologous areas in the original recto and verso sides. Figure 15c shows the verso area after the global registration of the two whole images. It is apparent that, while the horizontal shift component of the mutual deformation has been corrected, the vertical shift component is only slightly reduced. As a consequence, the restoration algorithm was not able to remove some bleed-through strokes (see Fig. 15e). The improved result obtained with the single-step algorithm is shown in Fig. 15d. Note also that, besides exhibiting an imperfect correction of the geometrical deformation, the detail of Fig. 15c appears manifestly smoother than the original. This degradation persists in its poorly restored version of Fig. 15e. Oversmoothing is a typical effect of applying an off-line registration to the image pair, since interpolation of the verso pixels is necessary in order to correct the geometric deformation. This unpleasant effect is absent in the result of the single-step method, since we only admit translations between the patches. Note also that the restored area has maintained its original geometrical layout.
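
The smoothing introduced by interpolation can be illustrated with a toy example. The sketch below is not part of either restoration method: it assumes a half-pixel horizontal shift implemented by linear interpolation (the average of two neighboring pixels, with circular borders) and compares it with an integer translation. The function names and the random test texture are hypothetical.

```python
import numpy as np

def half_pixel_shift(img):
    """Shift by half a pixel along x using linear interpolation (circular borders)."""
    return 0.5 * (img + np.roll(img, 1, axis=1))

def integer_shift(img, dy, dx):
    """Shift by whole pixels: values are only moved, never recomputed."""
    return np.roll(img, (dy, dx), axis=(0, 1))

rng = np.random.default_rng(0)
texture = rng.random((128, 128))

var_original = texture.var()
var_interpolated = half_pixel_shift(texture).var()
var_translated = integer_shift(texture, 2, -3).var()
print(var_original, var_interpolated, var_translated)
```

Averaging neighboring pixels lowers the variance of the texture, which is the loss of fine detail perceived as smoothing, whereas an integer translation only moves the pixel values and leaves them unchanged.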

Another detail is shown in Fig. 16. Here, the two-step method performs better than in the previous detail (see Fig. 16c), meaning that the local misalignment has been satisfactorily corrected by the global registration process. Consequently, the bleed-through strokes are almost completely removed. However, because of the small residual misalignment, their borders are not removed, so the bleed-through remains visible. The result of the single-step method is superior, as shown in Fig. 16b.

5 Conclusions and future work

We proposed a fully automatic procedure for the joint registration and restoration of misaligned recto–verso manuscripts affected by bleed-through, assuming the relative deformation to be locally rigid. For each pair of small homologous patches in the two sides, the verso is registered on the recto by a simple translation, estimated by cross-correlation of image gradients rather than image intensities. Then, a pixel-by-pixel identification of the bleed-through pattern is performed, followed by inpainting with locally estimated background values. The procedure is then repeated with the two sides swapped, thus yielding their restored versions while leaving the original geometric appearance unaltered. The results show a significant improvement in the local registration performance, and consequently a much more effective removal of the bleed-through, when compared with the classical two-step procedure that globally aligns the two sides off-line. Furthermore, the single-step procedure is much faster. We also tested the procedure in the case of mild elastic deformations, again obtaining satisfactory performance.

Straightforward future studies could consist in testing this patch-by-patch joint alignment and restoration with other bleed-through removal algorithms available in the literature. Furthermore, the restoration algorithm currently adopted could be improved, with respect to both the identification step and the inpainting step. For instance, in [26] we proposed an inpainting technique based on image sparse representation and dictionary learning.

As for the registration, future investigations could concern the extension of the method to stronger and locally varying elastic deformations of the sheet. At the patch level, such deformations could be approximated by affine transformations that also account for scale changes, and estimated, e.g., by a Fourier–Mellin transform.