Geometric Total Variation for Image Vectorization, Zooming and Pixel Art Depixelizing

Kerautret, Bertrand; Lachaud, Jacques-Olivier

doi:10.1007/978-3-030-41404-7_28

Bertrand Kerautret¹² &
Jacques-Olivier Lachaud¹³

Part of the book series: Lecture Notes in Computer Science ((LNIP,volume 12046))

Included in the following conference series:

Asian Conference on Pattern Recognition

1586 Accesses
1 Citations

Abstract

We propose an original method for vectorizing an image or zooming it at an arbitrary scale. The core of our method relies on the resolution of a geometric variational model and therefore offers theoretic guarantees. More precisely, it associates a total variation energy to every valid triangulation of the image pixels. Its minimization induces a triangulation that reflects image gradients. We then exploit this triangulation to precisely locate discontinuities, which can then simply be vectorized or zoomed. This new approach works on arbitrary images without any learning phase. It is particularly appealing for processing images with low quantization like pixel art and can be used for depixelizing such images. The method can be evaluated with an online demonstrator, where users can reproduce results presented here or upload their own images.

This work has been partly funded by CoMeDiC ANR-15-CE40-0006 research grant.

Access provided by Autonomous University of Puebla. Download conference paper PDF

Depth-aware image vectorization and editing

Article 07 May 2019

Gradient Art: Creation and Vectorization

Infinite Dimensional Optimization Models and PDEs for Dejittering

Keywords

1 Introduction

There exist many methods for zooming into raster images, and they can be divided into two groups. The first group gathers “image super-resolution” or “image interpolation” techniques and tries to deduce colors on the zoomed image from nearby colors in the original image. These methods compute in the raster domain and their output is not scalable. The second group is composed of “image vectorization” approaches and tries to build a higher level representation of the image made of painted polygons and splines. These representations can be zoomed, rotated or transformed. Each approach has its own merits, but none perform well both on camera pictures and on drawings or pixel art.

Image Interpolation/Super Resolution. These approaches construct an interpolated image with higher resolution. A common trait is that they see the image as a continuous function, such that we only see its sampling in the original image. Their diversity lies generally in the choice of the space of continuous functions. For instance linear/cubic interpolation just fits linear/cubic functions, but much more complicated spaces have been considered. For instance the method proposed by Getreuer introduces the concept of contour stencils [6, 7] to accurately estimate the edge orientation. Roussos and Maragos proposed another strategy using a tensor driven diffusion [19] (see also [8]). Due to the increasing popularity of convolutional neural network (CNN), several authors propose super resolution networks to obtain super resolution image. For instance Shi et al. define a CNN architecture exploiting feature maps computed in low resolution space [22]. After learning, this strategy reduces the computational complexity for a fixed super-resolution and gives interesting results on photo pictures. Of course, all these approaches do not provide any vectorial representation and give low quality results on images with low quantization or pixel art images. Another common feature is their tendency to “invent” false oscillating contours, like in Fourier super-resolution.

Pixel Art Super-Resolution. A subgroup of these methods is dedicated to the interpolation of pixel art images, like the classic HqX magnification filter [23] where X represents the scale factor. We may quote a more advanced approach which proposes as well a full vectorization of pixel art image [13]. Even if the resulting reconstruction for tiny images is nice looking, the proposed approach is designed specifically for hand-crafted pixel art image and is not adaptable to photo pictures.

Image Vectorization. Recovering a vector-based representation from a bitmap image is a classical problem in the field of digital geometry, computer graphics and pattern recognition. Various commercial design softwares propose such a feature, like Indesign™, or Vector Magic [10] and their algorithms are naturally not disclosed. Vectorization is also important for technical document analysis and is often performed by decomposition into arcs and straight line segments [9]. For binary images, many vectorization methods are available and may rely on dominant points detection [16, 17], relaxed straightness properties [1], multi-scale analysis [5] or digital curvature computation [11, 15].

Extension to grey level images is generally achieved by image decomposition into level sets, vectorization of each level, then fusion (e.g. [12]). Extension to color images is not straightforward. Swaminarayan and Prasad proposed to use contour detection associated to a Delaunay triangulation [25] but despite the original strategy, the digital geometry of the image is not directly exploited like in above-mentionned approaches. Other methods define the vectorization as a variational problem, like Lecot and Levy who use the Mumford Shah piecewise smooth model and an intermediate triangulation step to best approximate the image [14]. Other comparable methods construct the vectorization from topological preservation criteria [24], from splines and adaptive Delaunay triangulation [4] or from Bézier-curves-patch-based representation [26]. We may also cite the interactive method proposed by Price and Barrett that let a user edit interactively the reconstruction from a graph cut segmentation [18].

Our Contribution. We propose an original approach to this problem, which borrows and mixes techniques from both groups of methods, thus making it very versatile. We start with some notations that will be used thoughout the paper. Let $\varOmega $ be some open subset of $\mathbb {R}^2$, here a rectangular subset defining the image domain. Let K be the image range, a vector space that is generally $\mathbb {R}$ for monochromatic images and $\mathbb {R}^3$ for color image, equipped with some norm $\left\| \cdot \right\| _K$. We assume in the following that gray-level and color components lie in $[0, 1 ]$.

First our approach is related with the famous Total Variation ($\mathrm {TV}$), a well known variational model for image denoising or inpainting [20]. Recall that if f is a differentiable function in $\varOmega $, its total variation can be simply defined as:

$$\begin{aligned} \mathrm {TV}(f) := \int _\varOmega \left\| \nabla f (x) \right\| _K \mathrm {d}x. \end{aligned}$$

(1)

In a more general setting, the total variation is generally defined by duality. As noted by many authors [3], different discretizations of $\mathrm {TV}$ are not equivalent. However they are all defined locally with neighboring pixels, and this is why they are not able to fully capture the structure of images.

In Sect. 2, we propose a geometric total variation. Its optimization seeks the triangulation of lattice points of $\varOmega $ that minimizes a well-defined $\mathrm {TV}$ corresponding to a piecewise linear function interpolating the image. The optimal triangulation does not always connect neighboring pixels and align its edges with image discontinuities. For instance digital straight segments are recognized by the geometric total variation.

In Sect. 3, we show how to construct a vector representation from the obtained triangulation, by regularizing a kind of graph dual to the triangulation. Section 4 shows a super-resolution algorithm that builds a zoomed raster image with smooth Bezier discontinuities from the vector representation. Finally, in Sect. 5 we compare our approach to other super-resolution and vectorization methods and discuss the results.

2 Geometric Total Variation

The main idea of our geometric total variation is to structure the image domain into a triangulation, whose vertices are the pixel centers located at integer coordinates. Furthermore, any triangle of this triangulation should cover exactly three lattice points (its vertices). The set of such triangulations of lattice points in $\varOmega $ is denoted by ${\mathcal {T}(\varOmega )}$.

Let s be an image, i.e. a function from the set of lattice points of $\varOmega $ to K. We define the geometric total variation of s with respect to an arbitrary triangulation T of ${\mathcal {T}(\varOmega )}$ as

$$\begin{aligned} \mathrm {GTV}(T, s) := \int _\varOmega \left\| \nabla \varPhi _{T,s}(x) \right\| _K dx, \end{aligned}$$

(2)

where $\varPhi _{T,s}(x)$ is the linear interpolation of s in the triangle of T containing x. Note that points of $\varOmega $ whose gradient is not defined have null measure in the above integral (they belong to triangle edges of T).

And the Geometric Total Variation (GTV) of s is simply the smallest among all possible triangulations:

$$\begin{aligned} \mathrm {GTV}_{{\mathcal {T}(\varOmega )}} (s) := \min _{T \in {\mathcal {T}(\varOmega )}} \mathrm {GTV}(T,s). \end{aligned}$$

(3)

In other words, the GTV of a digital image s is the smallest continuous total variation among all natural triangulations sampling s. Since ${\mathcal {T}(\varOmega )}$ will not change in the following, we will omit it as subscript and write simply $\mathrm {GTV}(s)$.

One should note that classical discrete TV models associated to a digital image generally overestimate its total variation (in fact, the perimeter of its level-sets due to the co-area formula) [3]. Classical discrete TV models $\mathrm {TV}(s)$ are very close to $\mathrm {GTV}(T,s)$ when T is any trivial Delaunay triangulation of the lattice points. By finding more meaningful triangulation, $\mathrm {GTV}(s)$ will often be much smaller than $\mathrm {TV}(s)$.

Combinatorial Expression of Geometric Total Variation. Since $\varPhi _{T,s}(x)$ is the linear interpolation of s in the triangle $\tau $ of T containing x, it is clear that its gradient is constant within the interior of $\tau $. If $\tau =\mathbf {p}\mathbf {q}\mathbf {r}$ where $\mathbf {p}, \mathbf {q}, \mathbf {r}$ are the vertices of $\tau $, one computes easily that for any x in the interior of $\tau $, we have

$$\begin{aligned} \nabla \varPhi _{T,s}(x)&= s(\mathbf {p}) (\mathbf {r}-\mathbf {q})^\perp + s(\mathbf {q}) (\mathbf {p}-\mathbf {r})^\perp + s(\mathbf {r}) (\mathbf {q}-\mathbf {p})^\perp . \end{aligned}$$

(4)

We thus define the discrete gradient of s within a triangle $\mathbf {p}\mathbf {q}\mathbf {r}$ as

$$\begin{aligned} \varvec{\nabla }s (\mathbf {p}\mathbf {q}\mathbf {r}) := s(\mathbf {p}) (\mathbf {r}-\mathbf {q})^\perp + s(\mathbf {q}) (\mathbf {p}-\mathbf {r})^\perp + s(\mathbf {r}) (\mathbf {q}-\mathbf {p})^\perp , \end{aligned}$$

(5)

where $\mathbf {x}^\perp $ is the vector $\mathbf {x}$ rotated by $\frac{\pi }{2}$. Now it is well known that any lattice triangle that has no integer point in its interior has an area $\frac{1}{2}$ (just apply Pick’s theorem for instance). It follows that for any triangulation T of ${\mathcal {T}(\varOmega )}$, every triangle has the same area $\frac{1}{2}$. We obtain the following expression for the $\mathrm {GTV}$:

$$\begin{aligned} \mathrm {GTV}(T, s)&= \int _\varOmega \left\| \nabla \varPhi _{T,s}(x) \right\| _K dx \nonumber \\&= \sum _{\mathbf {p}\mathbf {q}\mathbf {r}\in T} \int _{\mathbf {p}\mathbf {q}\mathbf {r}} \left\| \nabla \varPhi _{T,s}(x) \right\| _K dx&\text {(integral is additive)} \nonumber \\&= \sum _{\mathbf {p}\mathbf {q}\mathbf {r}\in T} \int _{\mathbf {p}\mathbf {q}\mathbf {r}} \left\| \varvec{\nabla }s (\mathbf {p}\mathbf {q}\mathbf {r}) \right\| _K dx&\text {(by (4) and (5))} \nonumber \\&= \frac{1}{2}\sum _{\mathbf {p}\mathbf {q}\mathbf {r}\in T} \left\| \varvec{\nabla }s (\mathbf {p}\mathbf {q}\mathbf {r}) \right\| _K&\text {(since }\mathrm {Area}(\mathbf {p}\mathbf {q}\mathbf {r}) = \frac{1}{2}) \end{aligned}$$

(6)

Minimization of GTV. Since we have a local combinatorial method to compute the $\mathrm {GTV}$ of an arbitrary triangulation, our approach to find the optimal $\mathrm {GTV}$ of some image s will be greedy, iterative and randomized. We will start from an arbitrary triangulation (here the natural triangulation of the grid where each square has been subdivided in two triangles) and we will flip two triangles whenever it decreases the GTV (Table 1).

Table 1. Illustration of geometric total variation on a simple black and white image. Even if the pixel values of the image are the same (represented by black and white disks), different triangulations of the pixels yield different $\mathrm {GTV}$. In this case, the right one is the optimal triangulation and is clearly more aligned with the underlying contour. Vectorizing or zooming the right triangulation will provide more interesting results.

Full size table

This approach is similar in spirit to the approach of Bobenko and Springborn [2], whose aim is to define discrete minimal surfaces. However they minimize the Dirichlet energy of the triangulation (i.e. squared norm of gradient) and they show that the optimum is related to the intrinsic Delaunay triangulation of the sampling points, which can be reached by local triangle flips. On the contrary, our energy has not this nice convexity property and it is easy to see for instance that minima are generally not unique. We have also checked that there is a continuum of energy if we minimize the p-norm $\left\| \cdot \right\| _K^p$ for $1 \le p \le 2$. For $p=2$ the optimum is trivially the Delaunay triangulation of the lattice points. When p is decreased toward 1, we observe that triangle edges are more and more perpendicular to the discrete gradient.

Table 1 shows the $\mathrm {GTV}$ of several triangulations in the simple case of a binary image representing an edge contour with slope $\frac{2}{3}$. It shows that the smaller the $\mathrm {GTV}$ the more align with the digital contour is the triangulation.

Algorithm 1 is the main function that tries to find a triangulation T which is as close as possible to $\mathrm {GTV}(s)$. It starts from any triangulation of s (line 6 : we just split into two triangles each square between four adjacent pixels). Then it proceeds by passes (line 7). At each pass (line 10), every potential arc is checked for a possible flip by a call to CheckArc (Algorithm 2) at line 13. The arc is always flipped if it decreases the $\mathrm {GTV}$ and randomly flipped if it does not change the $\mathrm {GTV}$ (line 14). Arcs close to the flip are queued (line 16) since their further flip may decrease the $\mathrm {GTV}$ later. The algorithm stops when the number of flips decreasing the $\mathrm {GTV}$ is zero (flips keeping the same $\mathrm {GTV}$ are not counted).

Function CheckArc (Algorithm 2) checks if flipping some arc would decrease the $\mathrm {GTV}$. It first checks if the arc/edge is flippable at line 3 (e.g. not on the boundary). Then it is enough to check only one arc per edge (line 5). Afterwards the edge may be flipped only if the four points surrounding the faces touching the edge form a strictly convex quadrilateron (line 6). Finally it computes the local $\mathrm {GTV}$ of the two possible decompositions of this quadrilateron using formula (6) and outputs the corresponding case (from line 7).

Note that the randomization of flips in the case where the energy $\mathrm {GTV}$ is not changed is rather important in practice. As said above, it is not a “convex” energy since it is easy to find instances where there are several optima (of course, our search space being combinatorial, convexity is not a well defined notion). Randomization helps in quitting local minima and this simple trick gives nice results in practice.

Figure 1 illustrates the capacity of geometric total variation to capture the direction of image discontinuities. In this simple application, we use the discrete gradient on each triangle to display it with the corresponding linear shaded gradient. Hence we can zoom arbitrarily any image. The results show that, if we stay with the initial triangulation (as does standard discretization of $\mathrm {TV}$), zoomed image are not great. On the contrary, results are considerably improved if we use the optimized triangulation of $\mathrm {GTV}$. Last, we can simply render triangles with a crisped version of this gradient. Results are very nice in case of images with few colors like in pixel art.

3 Contour Regularization

Contours within images are in-between pixels. In some sense they are dual to the structure of the pixels. Since the Geometric Total Variation has structured the relations between pixels, it is natural to define the contours as a kind of dual graph to the optimal triangulation $T^*$. We wish also that these contour lines align nicely with discontinuities.

To do so, we introduce a variable $t_a$ on each arc whose value is kept in $[0,1 ]$. If the arc a is the oriented edge $(\mathbf {p},\mathbf {q})$ then the position of the contour crossing this edge will be $\mathbf {x}_a = t_a \mathbf {p}+ (1-t_a) \mathbf {q}$ (see the upper floating figure).

We denote by $a'$ the arc opposite to a, i.e. $(\mathbf {q},\mathbf {p})$. The face that is to the left of a is denoted $\mathrm {face}(a)$ while the one to its right is $\mathrm {face}'(a)$. We will guarantee that, at the end of each iteration, $\mathbf {x}_a = \mathbf {x}_{a'}$. We associate to each arc $a=(\mathbf {p},\mathbf {q})$ a dissimilarity weight $w_a := \left\| s(\mathbf {q}) - s(\mathbf {p}) \right\| _K = w_{a'}$. We introduce also a point $\mathbf {b}_f$ for each face f of T, which will lie at a kind of weighted barycenter of the vertices of f.

Algorithm 3 regularizes these points and provides a graph of contours that is a meaningful vectorization of the input bitmap image s. Note that the function Intersection$(a,\mathbf {y},\mathbf {z})$ returns the parameter t such that $\mathbf {x}_a$ lies at the intersection of straight line $( \mathbf {y}\mathbf {z})$ and the arc a.

First it starts with natural positions for $\mathbf {x}$ and $\mathbf {b}$ (resp. middle of arcs and barycenter of faces) at line 3 and line 4. Then it proceeds iteratively until stabilization in the loop at line 8. Each iteration consists of three steps: (i) update barycenters such that they are a convex combination of surrounding contour points weighted by their dissimilarities (line 9), (ii) update contour points such that they lie at the crossing of the arc and the nearby barycenters (line 11), (iii) average the two displacements along each edge according to respective area (line 13), thus guaranteeing that $\mathbf {x}_a=\mathbf {x}_{a'}$ for every arc. The area $\textsc {Area}(a)$ associated to an arc a is the area of the quadrilateron formed by the tail of a, barycenters $\mathbf {b}_{\mathrm {face}(a)}$ and $\mathbf {b}_{\mathrm {face}'(a)}$ and $\mathbf {x}_a$. Note that the coefficients $\alpha $ and $\alpha '$ computed at line 14 have value 1 whenever the area is $\frac{1}{6}$.

Since this process is always computing convex combinations of points with convex constraints, it converges quickly to a stable point. In all our experiments, we choose $\epsilon =0.001$ and the process converges in a dozen of iterations. Figure 2 illustrates it. The contour mesh is defined simply as follows: there is one cell per pixel, and for each pixel $\mathbf {p}$, its cell is obtained by connecting the sequence of points $\mathbf {x}_{a_0}, \mathbf {b}_{\mathrm {face}(a_0)},\mathbf {x}_{a_1}, \mathbf {b}_{\mathrm {face}(a_1)}\ldots $ for every arc $a_0,a_1,\ldots $ coming out of $\mathbf {p}$. Each cell of the contour mesh is displayed painted with the color of its pixel. It is clear that the contour mesh is much more meaningful after $\mathrm {GTV}$ optimization, and its further regularization remove some artefacts induced by the discretization. Last but not least, our approach guarantees that sample points always keep their original color.

4 Raster Image Zooming with Smooth Contours

Contour meshes as presented in Sect. 3 are easily vectorized as polylines. It suffices to gather cells with same pixel color as a regions and extract the common boundaries of these regions. Furthermore, these polylines are easily converted to smooth splines. We will not explore this track in this paper but rather present a raster approach with similar objectives and features.

From the image s with lattice domain $\varOmega $, we wish to build a zoomed image $s'$ with lattice domain $\varOmega '$. If the zoom factor is the integer z and $\varOmega $ has width w and height h, then $\varOmega '$ has width $z(w-1)+1$ and height $z(h-1)+1$. The canonical injection of $\varOmega $ into $\varOmega '$ is $\iota _z: (x,y) \mapsto (zx,zy)$. We use two auxiliary binary images S (for similarity image) and D for (for discontinuity image) with domain $\varOmega '$. We also define the tangent $\mathbf {T}_a$ at an arc a as the normalization of vector $\mathbf {b}_{\mathrm {face}(a)} - \mathbf {b}_{\mathrm {face}'(a)}$.

The zoomed image $s'$ is constructed as follows:

Similarity set. We set to 1 the pixels of S that are in $\iota _z(\varOmega )$. Furthermore, for every arc $a=(\mathbf {p},\mathbf {q})$, if $w_a=0$ then the digital straight segment between $\iota _z(\mathbf {p})$ and $\iota _z(\mathbf {q})$ is also set to 1. Last, we set the color $s'$ at these pixels to their color in s.
Discontinuity set. For every face f, we count the number n of arcs whose weight is not null. If $n=0$ we simply set $D(\iota _z(\mathbf {b}_f))=1$. $n=1$ is impossible. If $n=2$, let $a_1$ and $a_2$ be the two arcs with dissimilarities. We set $D(\mathbf {p})=1$ for every pixel $\mathbf {p}\in \varOmega '$ that belongs to the digitized Bezier curve of order 3, that links the points $\iota _z(\mathbf {x}_{a_1})$ to $\iota _z(\mathbf {x}_{a_2})$, is tangent to $\mathbf {T}_{a_1}$ and $\mathbf {T}_{a_2}$, and pass through point $\iota _z(\frac{1}{2}(\mathbf {b}_f+I))$, with I the intersection of the two lines defined by $\mathbf {x}_{a_i}$ and tangent $\mathbf {T}_{a_i}$, $i=1,2$. If $n=3$, for every arc a of f, we set $D(\mathbf {p})=1$ for every pixel $\mathbf {p}\in \varOmega '$ that belongs to the digitized Bezier curve of order 2 connecting $\iota _z(\mathbf {x}_{a})$ to $\iota _z(\mathbf {b}_f)$ and tangent to $\mathbf {T}_{a}$. Last we set the color $s'$ at all these pixels by linear interpolation of the colors of pixels surrounding s.
Voronoi maps. We then compute the voronoi map $\mathrm {Vor}(S)$ (resp. $\mathrm {Vor}(D)$), which associates to each pixel of $\varOmega '$ the closest point $\mathbf {p}$ of $\varOmega '$ such that $S(\mathbf {p})=1$ (resp. such that $D(\mathbf {p})=1$).
Image interpolation. For all pixel $\mathbf {p}$ of $\varOmega '$ such that $S(\mathbf {p})=0$ and $D(\mathbf {p})=0$, let $\mathbf {q}=\mathrm {Vor}(S)(\mathbf {p})$ and $\mathbf {q}'=\mathrm {Vor}(D)(\mathbf {p})$. We compute the distances $d = \Vert \mathbf {q}- \mathbf {p}\Vert $ and $d' = \Vert \mathbf {q}' - \mathbf {p}\Vert $. We use an amplification factor $\beta \in [0,1]$: when close to 0, it tends to make linear shaded gradient while close to 1, it makes contours very crisp.
$$\begin{aligned} s'(\mathbf {p}) = \left\{ \begin{array}{l} (1-\frac{2d'}{d+d'})s'(\mathbf {q}') + \frac{2d'}{d+d'}( \beta s'(\mathbf {q}) +(1-\beta )s'(\mathbf {q}') ) \text { when }d' \le d, \\ (1-\frac{2d}{d+d'})s'(\mathbf {q}) + \frac{2d}{d+d'}( \beta s'(\mathbf {q}) +(1-\beta )s'(\mathbf {q}') ) \text { otherwise}. \end{array} \right. \end{aligned}$$
In experiments, we always set $\beta =0.75$ which gives good results, especially for image with low quantization. Of course, many other functions can be designed and the crispness could also be parameterized locally, for instance as a function of the $\mathrm {GTV}$ of the triangle.
Antialiasing discontinuities. To get images that are slightly more pleasant to the eye, we perform a last pass where we antialias every pixel $\mathbf {p}$ such that $D(\mathbf {p})=1$. We simply perform a weighted average within a $3 \times 3$ window, with pixels in S having weight 4, pixels in D having weight 0.25 and other pixels having weight 1.

Figure 3 illustrates this method of raster image zooming which provides crisp discontinuities that follow Bezier curves. Comparing with Fig. 2, image contours are no more polygonal lines but look like smooth curves. Last this method still guarantees that original pixel colors are kept.

5 Experimental Evaluation and Discussion

Our method can both produce vectorized image or make rasterized zoomed images, either from camera pictures or tiny image with low quantization like pixel art images. To measure its performance, several other super-resolution methods have been tested on the set of images given in the first column of Fig. 4, with parameters kept constant across all input images. First, we experimented the method based on geometric stencils proposed by Getreuer [6] with default parameters, which is implemented in the online demonstrator [7]. As shown in the second column of the figure, results appear noisy with oscillations near edges, which are well visible on pixel art images (e.g. x-shape or dolphin images). Such defaults are also visible on ara or barbara image near the white area. Another super-resolution method was experimented which uses on a convolutional neural network [22]. Like the previous method, numerous perturbations are also visible both on pixel art images (x-shape or dolphin) but also in homogeneous areas close to strong gradients of ipol-coarsened image. We have tried different parameters but they lead to images with comparable quality. On the contrary, due to its formulation, our method does not produce false contours or false colors, and works indifferently on pixel art images or camera pictures.

Other comparisons are presented on Fig. 5 in order to give an overview of the behavior of five other methods. The two first methods complement the comparisons on pixel art image with respectively the depixelizing method proposed by Kopf and Lischinski [13] implemented in Inkscape, and a commercial software proposed by Vector Magic Inc [10]. Our method captures better the direction of discontinuities of the underlying shape with less contour oscillations (see the border of dolphin or x-shape). Two other methods were tested on barbara image: Roussos and Maragos tensor-driven method [8] and Potrace vectorization software [21]. Again, results appear with oscillations around strong gradients with Roussos-Maragos algorithm, while Potrace software tends to smooth too much the image. Finally we also applied the Hqx magnification filter proposed by Stepin [23] that provides interesting zoom results but presents some artefacts (see for instance the X center of Hq4x) and limited scale factor (i.e. 4). Note that other comparisons can easily be done with the following online demonstrator: https://ipolcore.ipol.im/demo/clientApp/demo.html?id=280 and source code is available on a GitHub repository: https://github.com/kerautret/GTVimageVect.

6 Conclusion

We have presented an original approach to the problem of image vectorization and image super-resolution. It is based on a combinatorial variational model, which introduces geometry into the classical total variation model. We have compared our method both to state-of-the-art vectorization and super-resolution methods and it behaves well both for camera pictures and pixel art images. We have also provided an online demonstrator, which allows users to reproduce results or test our method with new images. In future works, we plan to compare quantitatively our method with state-of-the-art techniques, and also use our approach to train a CNN for zooming into pixel art images.

References

Bhowmick, P., Bhattacharya, B.B.: Fast polygonal approximation of digital curves using relaxed straightness properties. IEEE Trans. Pattern Anal. Mach. Intell. 29(9), 1590–1602 (2007)
Article Google Scholar
Bobenko, A.I., Springborn, B.A.: A discrete Laplace-Beltrami operator for simplicial surfaces. Discret. Comput. Geom. 38(4), 740–756 (2007)
Article MathSciNet Google Scholar
Condat, L.: Discrete total variation: new definition and minimization. SIAM J. Imaging Sci. 10(3), 1258–1290 (2017)
Article MathSciNet Google Scholar
Demaret, L., Dyn, N., Iske, A.: Image compression by linear splines over adaptive triangulations. Sig. Process. 86(7), 1604–1616 (2006)
Article Google Scholar
Feschet, F.: Multiscale analysis from 1D parametric geometric decomposition of shapes. In: IEEE, ICPR, pp. 2102–2105 (2010)
Google Scholar
Getreuer, P.: Contour stencils: total variation along curves for adaptive image interpolation. SIAM J. Imaging Sci. 4(3), 954–979 (2011)
Article MathSciNet Google Scholar
Getreuer, P.: Image interpolation with geometric contour stencils. Image Process. On Line 1, 98–116 (2011)
MATH Google Scholar
Getreuer, P.: Roussos-maragos tensor-driven diffusion for image interpolation. Image Process. On Line 1, 178–186 (2011)
Google Scholar
Hilaire, X., Tombre, K.: Robust and accurate vectorization of line drawings. IEEE TPAMI 28(6), 890–904 (2006)
Article Google Scholar
Vector Magic Inc. Vector magic (2010). http://vectormagic.com
Kerautret, B., Lachaud, J.-O., Naegel, B.: Curvature based corner detector for discrete, noisy and multi-scale contours. IJSM 14(2), 127–145 (2008)
MathSciNet MATH Google Scholar
Kerautret, B., Ngo, P., Kenmochi, Y., Vacavant, A.: Greyscale image vectorization from geometric digital contour representations. In: Kropatsch, W.G., Artner, N.M., Janusch, I. (eds.) DGCI 2017. LNCS, vol. 10502, pp. 319–331. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-66272-5_26
Chapter Google Scholar
Kopf, J., Lischinski, D.: Depixelizing pixel art. In: ACM SIGGRAPH 2011 Papers, pp. 99:1–99:8. Vancouver, British Columbia (2011)
Google Scholar
Lecot, G., Levy, B.: ARDECO: automatic region detection and conversion. In: 17th Eurographics Symposium on Rendering-EGSR 2006, pp. 349–360 (2006)
Google Scholar
Liu, H., Latecki, L.J., Liu, W.: A unified curvature definition for regular, polygonal, and digital planar curves. Int. J. Comput. Vis. 80(1), 104–124 (2008)
Article Google Scholar
Marji, M., Siy, P.: Polygonal representation of digital planar curves through dominant point detection – a nonparametric algorithm. Pattern Recogn. 37(11), 2113–2130 (2004)
Article Google Scholar
Nguyen, T.P., Debled-Rennesson, I.: A discrete geometry approach for dominant point detection. Pattern Recogn. 44(1), 32–44 (2011)
Article Google Scholar
Price, B., Barrett, W.: Object-based vectorization for interactive image editing. Vis. Comput. 22(9–11), 661–670 (2006)
Article Google Scholar
Roussos, A., Maragos, P.: Vector-valued image interpolation by an anisotropic diffusion-projection PDE. In: Sgallari, F., Murli, A., Paragios, N. (eds.) SSVM 2007. LNCS, vol. 4485, pp. 104–115. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-72823-8_10
Chapter Google Scholar
Rudin, L.I., Osher, S., Fatemi, E.: Nonlinear total variation based noise removal algorithms (1992)
Article MathSciNet Google Scholar
Selinger, P.: Potrace (2001–2017). http://potrace.sourceforge.net
Shi, W., et al.: Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network. In: CVPR, pp. 1874–1883. IEEE, Las Vegas, June 2016
Google Scholar
Stepin, M.: HQx magnification filter (2003). http://web.archive.org/web/20070717064839/www.hiend3d.com/hq4x.html
Sun, J., Liang, L., Wen, F., Shum, H.-Y.: Image vectorization using optimized gradient meshes. Trans. Graph. 26, 11 (2007)
Article Google Scholar
Swaminarayan, S., Prasad, L.: Rapid automated polygonal image decomposition. In: 35th Workshop AIPR 2006, pp. 28–28, October 2006
Google Scholar
Xia, T., Liao, B., Yu, Y.: Patch-based image vectorization with automatic curvilinear feature alignment. Trans. Graph. 28, 115 (2009)
Article Google Scholar

Download references

Author information

Authors and Affiliations

Univ Lyon, Lyon 2, LIRIS, 69676, Lyon, France
Bertrand Kerautret
Univ. Grenoble Alpes, Univ. Savoie Mont Blanc, CNRS, LAMA, 73000, Chambéry, France
Jacques-Olivier Lachaud

Authors

Bertrand Kerautret
View author publications
You can also search for this author in PubMed Google Scholar
Jacques-Olivier Lachaud
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Jacques-Olivier Lachaud .

Editor information

Editors and Affiliations

University of Malaya, Kuala Lumpur, Malaysia
Shivakumara Palaiahnakote
Consiglio Nazionale delle Ricerche, ICAR, Naples, Italy
Gabriella Sanniti di Baja
Chinese Academy of Sciences, Beijing, China
Liang Wang
Auckland University of Technology, Auckland, New Zealand
Wei Qi Yan

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Kerautret, B., Lachaud, JO. (2020). Geometric Total Variation for Image Vectorization, Zooming and Pixel Art Depixelizing. In: Palaiahnakote, S., Sanniti di Baja, G., Wang, L., Yan, W. (eds) Pattern Recognition. ACPR 2019. Lecture Notes in Computer Science(), vol 12046. Springer, Cham. https://doi.org/10.1007/978-3-030-41404-7_28

Download citation

DOI: https://doi.org/10.1007/978-3-030-41404-7_28
Published: 23 February 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-41403-0
Online ISBN: 978-3-030-41404-7
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Geometric Total Variation for Image Vectorization, Zooming and Pixel Art Depixelizing

Abstract

Similar content being viewed by others