Alternative patterns of the multidimensional Hilbert curve

Franco, Patrick; Nguyen, Giap; Mullot, Remy; Ogier, Jean-Marc

doi:10.1007/s11042-017-4744-4

Alternative patterns of the multidimensional Hilbert curve

Application in image retrieval

Published: 10 May 2017

Volume 77, pages 8419–8440, (2018)
Multimedia Tools and Applications

Patrick Franco ORCID: orcid.org/0000-0002-3566-1429¹,
Giap Nguyen¹,
Remy Mullot¹ &
…
Jean-Marc Ogier¹

Abstract

Locality preservation (distance-preserving mapping) is a useful property for managing multidimensional data: points that are close in space remain, as far as possible, close after being mapped onto the curve. This is why the Hilbert space-filling curve is used in many domains and applications. The Hilbert curve preserves locality well because its construction is guided by an adjacency constraint on the ordering of points: the curve connects all points of a D-dimensional discrete space, without favoring any direction, under the constraint that two successive points are separated by a unit distance. The curve was originally defined in 2-D, and all existing multidimensional extensions satisfy adjacency by using the RBG pattern (based on the Reflected Binary Gray code). The RBG pattern is then duplicated and arranged (by geometrical transformations) to build the multidimensional Hilbert curve at a given order. In this paper, we emphasize that other patterns can also satisfy adjacency. A formulation is given, an algorithm to find solutions is provided, and their respective levels of locality preservation are estimated through a standard criterion. Results show that some new patterns achieve a level of locality comparable to, and sometimes better than, that of RBG. Moreover, selecting the best locality-preserving pattern allows one to design, through orders, a new curve whose overall locality preservation is comparable to that of the Hilbert curve. The contribution of the new patterns is tested through a CBIR (Content-Based Image Retrieval) application. Large-scale image retrieval tests show that exploring the image feature space along an alternative to the classical Hilbert curve can improve image search performance.

1 Introduction

A space-filling curve is a path going through every point of a given space. It is represented by a function which maps each point in the space to its index position along the curve. Space-filling curves are used in many applications of computer science [5, 28, 51] and in other domains such as mathematics [6, 48] and electronics [14, 33]. In the field of image processing, the Hilbert curve is useful for several tasks such as compression, image representation, matching (extended to 3-D objects) and image feature mapping [2, 5, 42, 52, 53].

An interesting property of this kind of curve is locality preservation, i.e. the capacity to map close points to close indexes and vice versa. With such a property, part of the spatial relationship between multidimensional points in the original space is retained in 1-D space. It is a key property because it determines whether an interpretation of the data structure based on an analysis of the one-dimensional space remains reliable. Among the famous space-filling curves (Peano [45], Hilbert [18], Lebesgue [30] etc.), the curve proposed by Hilbert is considered the most locality-preserving [35, 37, 46]. In the original article [18], the Hilbert curve fills a 2-D space, so a multidimensional extension is needed to fill higher-dimensional spaces. There are several proposals for generating multidimensional versions of the Hilbert curve [7, 22, 29]. From a construction point of view, these models start from an m-D basic pattern, copies of which are later combined to define an m-D Hilbert curve at order n. However, only one basic pattern is generally used in the literature: the RBG pattern (Reflected Binary Gray code). We think that this restricts the ways to fill a space and consequently limits the level of locality preservation that can be reached.

In this paper, the main contributions are:

the definition of new patterns, different from the classical RBG, to build multidimensional extensions of the popular Hilbert space-filling curve. These new patterns achieve a level of locality comparable to that of RBG, and sometimes better.
the demonstration that selecting the best locality-preserving pattern can contribute to designing, across orders, a new curve whose overall level of locality preservation is comparable to, and sometimes better than, that of the reference Hilbert curve.

To assess the relevance of our proposition, a CBIR (Content-Based Image Retrieval) experiment was conducted on a collection of 19270 images (283 classes) drawn from three databases (GREC,^{Footnote 1} MPEG-7,^{Footnote 2} and LBID^{Footnote 3}). The curve generated from the most locality-preserving pattern was used to explore the multidimensional image feature space. Comparative large-scale image retrieval tests show that a gain in locality preservation can translate into a gain in image search performance.

The rest of the paper is organised as follows. Section 2 introduces some generalities on space-filling curves, and Section 3 highlights the role of the pattern in locality preservation. Section 4 is dedicated to our proposition: a formulation of new basic patterns to fill a space is given, an algorithm to build them is provided and their level of locality is measured according to the standard criterion in [12].

The application of the curve generated from the most locality-preserving pattern in a CBIR framework is presented in Section 5. Results compared with the multidimensional Hilbert curve are presented and analyzed in Section 6.

2 Space-filling curve

A space-filling curve is a self-similar path which goes through every point of a given multidimensional grid space. Hence, it is a bijection f which maps each D-dimensional point p belonging to the space to its index I (an integer) on the curve. An n-order space-filling curve is defined by:

$$ \begin{array}{rcl} f: \{0,1,..., 2^{n}-1\}^{D} & \to & \{0,1,...,2^{nD}-1\} \\ p & \mapsto & I=f(p) \end{array} $$

(1)

where I = f(p) is the index of a point p on the curve f.

When the space is filled by f, the location of a D-dimensional point p on the grid space (square, cube or hypercube) is replaced by its 1-D position I on the curve. For example, two famous space-filling curves are illustrated in Fig. 1; other curves can be found in [3, 47].
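As an illustration of the mapping f in (1), the following sketch computes the index I of a point p = (x, y) on the 2-D Hilbert curve. It uses the widely known iterative bit-manipulation scheme, not a construction taken from this paper; the function name `hilbert_index_2d` and its arguments are chosen for the example only.

```python
def hilbert_index_2d(order, x, y):
    """Index I = f(p) of the point p = (x, y) on the 2-D Hilbert curve.

    The grid has side N = 2**order; x and y must lie in {0, ..., N - 1}.
    """
    side = 2 ** order
    index = 0
    s = side // 2
    while s > 0:
        # Quadrant of (x, y) at the current scale.
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        index += s * s * ((3 * rx) ^ ry)
        # Rotate/reflect so that the sub-curve has the standard orientation.
        if ry == 0:
            if rx == 1:
                x, y = side - 1 - x, side - 1 - y
            x, y = y, x
        s //= 2
    return index


order = 2
side = 2 ** order
indexes = {(x, y): hilbert_index_2d(order, x, y)
           for x in range(side) for y in range(side)}
# f is a bijection onto {0, ..., 2**(order * D) - 1}, here with D = 2.
is_bijection = sorted(indexes.values()) == list(range(side * side))
```

At order 1 the four points are visited in the order (0, 0), (0, 1), (1, 1), (1, 0), which is the 2-D basic pattern discussed in the following sections.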

3 Locality preservation

3.1 Locality preservation

Using a space-filling curve can be seen as a way of linearly ordering multidimensional points along a 1-D line, where the order of a point p is coded by its index I.

Once a 1-D ordering of the points is found, analyzing the index neighborhood of a given point gives a view of where the multidimensional data are located, and nearest-neighbor queries can be performed [12, 21]. This approach is reliable only under the assumption that distances are globally preserved by the mapping, i.e. that points close in space are close on the line: this is the so-called locality-preserving (or distance-preserving mapping) property. Several works have studied, analyzed and compared classical curves (sweep, scan, etc.) with space-filling curves (Peano, Hilbert, Lebesgue etc.) and it is now accepted that the Hilbert space-filling curve achieves the best locality preservation [35,36,37, 46]. However, in a multidimensional grid space, there is no path that visits every point exactly once, without crossing itself, and totally preserves spatial locality. Whatever curve is used, topology breaks can be observed (cf. Fig. 2), and they are taken into account by the locality preservation measure used here.

In this paper, the criterion provided by Faloutsos et al. in [12] is used to measure the locality preservation reached by our curves. Described below, L is the average farthest distance between all D-dimensional points p(i) and p(j) on the grid space whose corresponding indexes i and j are within the neighborhood radius N/2 :

$$ L(c)=\frac{1}{N^{D}}\sum\limits_{i,j \in [N^{D}],\; i<j \text{ with } \mathcal{v}(i,j)\leq N/2} \max\{ \mathcal{v}(p(i),p(j)) \} $$

(2)

where L(c) is the level of locality preservation of a curve c, N^D is the number of points of the grid space of size N = 2ⁿ, N/2 is the radius of interest, $\mathcal {v}$ is the metric used, and i and j are the indexes on the curve c of the D-dimensional points p(i) and p(j), respectively.

This criterion is practical and well suited to the proposed CBIR application because, for a given point's index, L indicates how many of the points whose indexes fall within a fixed neighborhood radius are really inside the same radius in space. The radius, initially fixed to N/2 in [12], is extended here because many applications need a larger neighborhood. How does the best locality-preserving curve behave in this case? It is therefore of interest to assess the locality preservation level of the Hilbert curve, compared with our curves, when the neighborhood radius increases (cf. Section 4.4).
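A minimal sketch of how a measure in the spirit of (2) could be computed is given below. It follows one possible reading of the criterion and is not the authors' implementation: for each index i, it takes the farthest spatial distance to the points whose index j satisfies i < j ≤ i + radius, and averages over all points. The one-sided window, the Manhattan metric in space and the names `manhattan` and `locality` are assumptions made for the example, so the values need not match those reported in the tables.

```python
def manhattan(p, q):
    """Manhattan distance between two points given as coordinate tuples."""
    return sum(abs(a - b) for a, b in zip(p, q))


def locality(points, radius):
    """Average farthest spatial distance within an index window.

    points[i] is the D-dimensional point whose index on the curve is i.
    Assumption: for each i, only indexes j with i < j <= i + radius are used.
    """
    total = 0
    for i, p in enumerate(points):
        window = points[i + 1:i + 1 + radius]
        if window:
            total += max(manhattan(p, q) for q in window)
    return total / len(points)


# 2-D basic pattern (order-1 curve), listed in curve order.
rbg_2d = [(0, 0), (0, 1), (1, 1), (1, 0)]
score_radius_1 = locality(rbg_2d, 1)
score_radius_3 = locality(rbg_2d, 3)
```

Under this reading, a smaller score means that points with nearby indexes are also nearby in space.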

3.2 Preserving locality property: role of the basic pattern

A space-filling curve is self-similar, i.e. at a given order n it is composed of sub-curves which have the same shape as the curve itself but on a smaller scale. From a construction point of view, this property is naturally coded by recursive functions. Starting from an m-D basic pattern at the first order n = 1, building the order-n curve consists of recursively copying, transforming and scaling the curve obtained at the previous order n − 1. An illustration of this process applied to the 2-D Hilbert curve is given in Fig. 3.

The role of the basic pattern is emphasized in this construction process:

the basic pattern is the first-order curve, i.e. the path ordering 2^D points;
it is copied many times to form the order-n curve (n > 1). The number of patterns which compose a D-dimensional space-filling curve at order n is 2^{(n−1)D}. For example, the 2-dimensional Hilbert curve is built from 2^D = 4 copies of the basic pattern at order n = 2 and from 16 copies at order n = 3 (cf. Figs. 3, 4).

Since it is the input of the construction process and is copied many times, the basic pattern fixes the point ordering of the 2^{(n−1)D} sub-curves and consequently must have an impact on the overall locality preservation. Picking a better locality-preserving pattern, or defining a new one (cf. Section 4), could help to design a curve with a better overall level of locality preservation. This idea is confirmed on two space-filling curves (cf. Fig. 4).
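The recursive construction described above can be sketched for the 2-D case as follows. The particular choice of transformations (transpose, two translations, anti-transpose) is one standard way of arranging the four copies and is an assumption of this example, as are the function names `hilbert_2d` and `pattern_copies`.

```python
def hilbert_2d(order):
    """Points of the 2-D Hilbert curve of the given order, in curve order."""
    if order == 0:
        return [(0, 0)]
    prev = hilbert_2d(order - 1)
    s = 2 ** (order - 1)  # side length of each quadrant
    lower_left = [(y, x) for x, y in prev]                        # transposed copy
    upper_left = [(x, y + s) for x, y in prev]                    # translated copy
    upper_right = [(x + s, y + s) for x, y in prev]               # translated copy
    lower_right = [(2 * s - 1 - y, s - 1 - x) for x, y in prev]   # anti-transposed copy
    return lower_left + upper_left + upper_right + lower_right


def pattern_copies(order, dim):
    """Number of basic patterns composing a dim-dimensional curve at a given order."""
    return 2 ** ((order - 1) * dim)


curve = hilbert_2d(3)
# Every two successive points of the curve are separated by a unit distance.
unit_steps = all(abs(x1 - x2) + abs(y1 - y2) == 1
                 for (x1, y1), (x2, y2) in zip(curve, curve[1:]))
```

Replacing the order-1 result by another basic pattern changes the ordering inside every sub-curve, which is why the choice of pattern affects the overall locality.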

4 Proposition of new basic patterns for multidimensional extension of the Hilbert space-filling curve

Our proposition is oriented toward the curve that maximizes locality preservation, i.e. the Hilbert curve, and focuses on the definition of new alternative basic patterns. In this paper, the pattern is the first-order (n = 1) space-filling curve: it is the model used to create the sub-curves of the curve through orders, and it is the path connecting all 2^D points (exactly once, without crossing itself) in a D-dimensional grid space of size N = 2.

4.1 The Hilbert curve and multidimensional extensions

In response to Peano [45], Hilbert presented in 1891 another way to fill a 2-D space, which better preserves locality [18].

Since then, several multidimensional extensions of the 2-D Hilbert curve have been developed and several models have been proposed: sets of low-level binary operations are provided in [7], state-diagram-based models are given in [28], a table-driven framework is suggested in [22], etc. To our knowledge, all these contributions rely on the RBG code to order points in the first-order curve (RBG^{Footnote 4} pattern). More recently, some works have explored other ways to fill a space. Liu [32] puts forward four alternative sets of geometric transformations to arrange the 2-D RBG pattern and build a curve at higher orders. Depending on the set of transformations selected (L configuration), new paths are outlined, which together with the Hilbert and Moore curves form a set named the “complete set of the Hilbert curve”. This is a relevant study, but it still focuses on 2-D space and still starts from the RBG pattern.

A novel geometric approach, aimed at defining a multidimensional compact Hilbert index dedicated to spaces with dimensions of unequal size, is established in [15]. The lines usually used on the Hilbert curve to symbolize point ordering are replaced by arcs. We note that this geometric change does not modify the order in which points (cells in the article) are visited. The cell ordering is still initialized by the standard RBG code, i.e. it follows the RBG pattern.

4.2 Adjacency and pattern solution

In Hilbert's original paper [18], the construction of the 2-D curve is driven by an implicit rule: “each square must have a common side with the previous square”. Extended to higher dimensions, this rule conveys the idea that each dimension must be sequentially visited. The RBG code, in which exactly one bit changes between two successive words, offers an intuitive response. But, as the dimension increases, there are solutions other than the RBG code that connect all 2^D points (exactly once, without crossing itself) while satisfying Hilbert's original idea. We propose to formalize the original 2-D Hilbert rule in higher dimensions as adjacency constraints on D-dimensional points (cf. (3)).

New pattern solutions result from this formalization, and we show later that they reach different levels of locality preservation (cf. Section 4.4). Access to various solutions opens the possibility of selecting the pattern which best preserves locality. Moreover, we have seen in Section 3.2 that selecting a better locality-preserving pattern leads, through orders, to a curve with better overall locality preservation. The 2-D and 3-D RBG patterns are shown in Figs. 5 and 6.

In the case of the multidimensional Hilbert curve, the pattern $\mathcal {P}$ must satisfy the adjacency condition, i.e.

$$ \sum\limits_{i=1}^{D}|p(k)_{i}-p(k+1)_{i}|=1, ~~ 0 \leq k < 2^{D}-1 $$

(3)

where p(k) is the k-th point in the pattern $\mathcal {P}$ and p(k)_i is the i-th coordinate of the point p(k). Therefore, the pattern $\mathcal {P}$ of the D-dimensional Hilbert curve is a path going through every point in {0, 1}^D. In Section 4.3, an algorithm is provided to generate, for a given dimension D, all the patterns $\mathcal {P}$ that are solutions of (3). In this article, two solutions are considered identical if they are isometric. An isometry is a distance-preserving transformation such as a rotation or a reflection. Because only isometrically independent patterns can modify the locality preservation of the curve, among all solutions of (3) only the independent solutions $\mathcal {P^{\ast }}$ are retained.

For example, in the 2-D case, there is only one isometrically independent pattern $\mathcal {P^{\ast }}$ satisfying (3), i.e. the 2-D RBG pattern (Fig. 5, left). From 3-D onwards, there are many independent patterns that are solutions of (3) (cf. Fig. 6).
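The adjacency condition (3) can be checked mechanically. The sketch below generates the RBG ordering from the standard reflected binary Gray code formula k XOR (k >> 1) and tests whether a candidate pattern visits every point of {0, 1}^D exactly once with unit steps. The function names and the bit-to-coordinate convention are assumptions of this example; a different convention gives an isometric pattern.

```python
def rbg_pattern(dim):
    """Points of {0, 1}^dim ordered by the reflected binary Gray code."""
    points = []
    for k in range(2 ** dim):
        gray = k ^ (k >> 1)  # successive Gray words differ in exactly one bit
        points.append(tuple((gray >> i) & 1 for i in range(dim)))
    return points


def satisfies_adjacency(pattern):
    """True if the pattern is a path through all of {0, 1}^D obeying condition (3)."""
    dim = len(pattern[0])
    on_grid = all(len(p) == dim and set(p) <= {0, 1} for p in pattern)
    visits_all = len(pattern) == 2 ** dim and len(set(pattern)) == 2 ** dim
    unit_steps = all(
        sum(abs(a - b) for a, b in zip(p, q)) == 1
        for p, q in zip(pattern, pattern[1:])
    )
    return on_grid and visits_all and unit_steps


rbg_ok = all(satisfies_adjacency(rbg_pattern(d)) for d in range(1, 6))
# The lexicographic ordering of {0, 1}^2 jumps from (0, 1) to (1, 0), so it fails (3).
lexicographic_ok = satisfies_adjacency([(0, 0), (0, 1), (1, 0), (1, 1)])
```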

4.3 Generating adjacent patterns

A recursive algorithm is proposed in Fig. 7 to list all patterns $\mathcal {P}$ that are solutions of (3) for a given dimension. A pattern solution is a sequence of 2^D points in which every two consecutive points are separated by a unit distance. It is built by successively appending a new point which is adjacent to the last point of the current sequence. To satisfy the adjacency condition, the new point is obtained from the last point by modifying only one coordinate. Therefore, there are D − 1 possibilities for the new point: D − 1 and not D, because modifying the coordinate that was changed to reach the last point would lead back to a point already visited. Accordingly, for a curve in dimension D, with the first point fixed (the sequence is initialized with the origin (0, 0, ..., 0)), listing all patterns, which require 2^D − 1 successive adjacent points to be appended, makes the algorithm scan at most $(D-1)^{2^{D}-1}$ possibilities. The algorithm complexity is high but in practice it remains computable; the search for patterns is carried out off-line and for dimensions not exceeding 20 (cf. Section 5).
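The algorithm of Fig. 7 is not reproduced here. As a rough illustration of the idea, the following generic backtracking sketch enumerates every sequence that starts at the origin, visits each point of {0, 1}^D once and flips exactly one coordinate per step. It is not the authors' algorithm: it simply skips points already visited, it does not filter out isometric duplicates, and the name `adjacent_patterns` is hypothetical.

```python
def adjacent_patterns(dim):
    """Yield every pattern of {0, 1}^dim that starts at the origin and satisfies (3)."""
    total = 2 ** dim
    path = [(0,) * dim]
    visited = {path[0]}

    def extend():
        if len(path) == total:
            yield list(path)  # copy, since path is modified while backtracking
            return
        last = path[-1]
        for i in range(dim):
            # Candidate obtained by modifying only coordinate i of the last point.
            candidate = last[:i] + (1 - last[i],) + last[i + 1:]
            if candidate not in visited:
                visited.add(candidate)
                path.append(candidate)
                yield from extend()
                path.pop()
                visited.remove(candidate)

    yield from extend()


patterns_2d = list(adjacent_patterns(2))
patterns_3d = list(adjacent_patterns(3))
```

In 2-D the two sequences found are mirror images of each other, in line with the single isometrically independent pattern mentioned in Section 4.2. The number of sequences grows very quickly with the dimension, so such an exhaustive search is only practical off-line and for small D.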

4.4 Measure of Locality preservation of new patterns

The preservation of locality is a highly sought property needed by applications managing multidimensional data [12, 21, 37]. But, what is the level of locality preservation reached by the new patterns?

Some comparative results are summarized in Tables 1 and 2 for the 4-D case and in Tables 3 and 4 for the 5-D case. Only isometrically independent patterns $\mathcal {P^{\ast }}$, solutions of (3) generated by the algorithm given in Fig. 7, are taken into account. The locality preservation of the candidates is estimated through the criterion described and justified in Section 3.1. L is assessed not only for a single neighborhood radius, as originally in [12], but for several radius values in $[1, \frac {N^{D}}{4}]$. Note that a smaller L value corresponds to better locality preservation.

Table 1 Locality preservation L of the new 4-D patterns: RBG is not the only most locality-preserving pattern; a new way to fill the space, $\mathcal {P}_{1}^{\ast }$, reaches the same level of locality

Table 2 Identification of the two 4-D spatial orderings which lead to the best locality preservation: RBG and $\mathcal {P}_{1}^{\ast }$ (cf. Table 1)

Table 3 Locality preservation L of the new 5-D patterns: RBG, so far used in previous multidimensional extensions of the Hilbert curve, is not the most locality preserving

From these results, we see that:

From 3-D onwards, more than one pattern satisfies (3). In fact, the number of new (isometrically independent) pattern solutions increases with the space dimension, which widens the possible ways to fill a given space and to find a good pattern with respect to the locality criterion. For example, more than four patterns, including the classical RBG, have been identified in the 4-D case (cf. Table 1), and a new pattern $\mathcal {P}_{1}^{\ast }$ reaches the same level of locality as RBG.
From 5-D onwards, even though the L values remain of the same order of magnitude^{Footnote 5} (cf. Table 3), a separation between the new solutions and RBG is observed, especially from medium to high neighborhood radii. For example, in the 5-D case, RBG is the best locality-preserving pattern for radius ∈ [3, 5], but when locality is studied for radius > 5, four new competitive patterns ($\mathcal {P}_{1}^{\ast },\mathcal {P}_{2}^{\ast },\mathcal {P}_{3}^{\ast },\mathcal {P}_{4}^{\ast }$) emerge.

The existence of patterns that improve locality not only for a single radius value but for several consecutive neighborhood values (6, 7, 8) could be of interest for a CBIR application. The results of image queries are often presented as a top ranking, which leads us to consider a range of neighborhoods rather than a single value. For example, $\mathcal {P}_{1}^{\ast }$, which is the best locality-preserving pattern for radii 6 and 7 (L = 3.468 vs. 3.750 and L = 3.500 vs. 3.750, respectively), remains competitive for radius 8 (L = 3.656 vs. 3.750).

A smaller L value means that, when the points are ordered according to the $\mathcal {P}_{1}^{\ast }$ model (cf. Table 4), few points located outside the radius of interest in space are viewed as neighbors on the 1-D line. In other words, a smaller L value indicates that points considered as neighbors according to their respective indexes are really neighbors in the multidimensional space. So, improving the level of locality preservation could have a positive impact on image retrieval results.

Table 4 Identification of the five 5-D point orderings which lead to the best locality preservation: RBG and $\mathcal {P}_{1}^{\ast },\mathcal {P}_{2}^{\ast },\mathcal {P}_{3}^{\ast },\mathcal {P}_{4}^{\ast }$ (cf. Table 3)

Moreover, among the new solutions, there are patterns whose locality remains competitive through orders. The evolution of the locality score over the orders n = 1, 2, 3 of the 5-D curves built from $\mathcal {P}_{1}^{\ast }$ and $\mathcal {P}_{2}^{\ast }$ is reported in Table 5. This behaviour, also observed for higher dimensions, tends to confirm that selecting a better locality-preserving pattern contributes to designing a curve with good overall locality.

Table 5 Locality preservation L, across orders n = 1, 2, 3, of 5-D curves built from some new patterns: $\mathcal {P}_{1}^{\ast }$ and $\mathcal {P}_{2}^{\ast }$ compared to RBG

5 Using new pattern-based space-filling curve for CBIR application

The contribution of the new patterns is evaluated through a CBIR application. An overview of existing systems, questions of performance evaluation and future challenges can be found in [11, 26, 38, 54].

The curve generated from the most locality-preserving 20-D pattern (denoted Max) is tested as a means of quickly exploring the image feature space. With an efficient image descriptor, two visually similar images are described by two close vectors. With the new pattern-based space-filling curve, close vectors (feature space) are mapped to close indexes (index space). Hence, similar images correspond to close indexes. After ordering the images in the database according to their respective indexes (off-line indexing), searching for images similar to the query image (on-line searching) involves selecting the subset of images whose 1-D indexes are closest to the index of the query. The similarity of two images is measured by the distance (Manhattan metric) between their respective indexes I on the curve used to scan the image feature space (Zernike moments). This key idea is used to set up the CBIR system summarized in Fig. 8.
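The on-line search step can be sketched as a nearest-neighbor lookup on sorted 1-D indexes. In this illustration a sorted Python list with binary search stands in for the B+Tree of the actual system, and the image identifiers and index values are made up; `build_index` and `nearest_by_index` are hypothetical names.

```python
import bisect


def build_index(curve_indexes):
    """Off-line step: sort (index, image_id) pairs by their 1-D index on the curve."""
    return sorted((idx, image_id) for image_id, idx in curve_indexes.items())


def nearest_by_index(entries, query_index, k):
    """On-line step: the k images whose 1-D indexes are closest to query_index."""
    keys = [idx for idx, _ in entries]
    pos = bisect.bisect_left(keys, query_index)
    lo, hi = pos - 1, pos
    result = []
    while len(result) < k and (lo >= 0 or hi < len(keys)):
        # Take the closer of the two candidates on either side of the query.
        take_left = hi >= len(keys) or (
            lo >= 0 and query_index - keys[lo] <= keys[hi] - query_index
        )
        if take_left:
            result.append(entries[lo][1])
            lo -= 1
        else:
            result.append(entries[hi][1])
            hi += 1
    return result


# Toy database: image id -> index I on the curve (illustrative values only).
entries = build_index({"a": 3, "b": 10, "c": 11, "d": 40, "e": 12})
top3 = nearest_by_index(entries, 11, 3)
```

Because only integer indexes are compared, the cost of a query does not depend on the dimension of the feature space once the mapping onto the curve has been computed.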

The following sections specify the conditions under which the experiments were conducted.

5.1 Image characterization via Zernike moments

Extracting the Zernike moments of an image is now a usual operation for image description. It consists of projecting an image (or shape) onto a family of complex orthogonal functions (on the unit disk) called Zernike polynomials. Zernike moments were introduced by Teague [49] and are considered an efficient image descriptor, as confirmed by many comparative studies [27, 34, 43]. Working with the Zernike decomposition involves answering the question: what is the highest moment order (denoted q) required for an accurate image representation? The value of q is determined according to the criterion in [25], based on the quality of the image reconstructed from the decomposition results.

This approach^{Footnote 6} leads to q = 7, which yields a 20-D feature vector for each image. This size achieves a trade-off between effectiveness and robustness to noise. In a noisy context (cf. the GREC database, Section 5.2), a compact description seems to be required because the higher-order moments code the details, which are precisely those likely to be affected by degradations. For fast calculation, Zernike moments were implemented following the algorithm in [20].
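The link between q = 7 and the 20-D feature vector can be checked by counting the moments. The sketch below assumes the common convention in which one value is kept per moment of order n and repetition m with 0 ≤ m ≤ n and n − m even; the text does not spell this convention out, so it is an assumption, although it is consistent with the stated dimension.

```python
def zernike_moment_count(max_order):
    """Number of Zernike moments (n, m) with 0 <= m <= n <= max_order and n - m even."""
    return sum(
        1
        for n in range(max_order + 1)
        for m in range(n + 1)
        if (n - m) % 2 == 0
    )


feature_dimension = zernike_moment_count(7)
```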

5.2 Image sources and dataset

The image dataset is built from a mixture of three image collections: GREC, MPEG-7 and LBID. It is composed of 19270 low-resolution binary images (∼300 dpi) covering 283 classes of objects. Details on the collections and the constitution of the dataset can be found in Tables 6 and 7. This choice leads to handling various kinds of images: MPEG-7 and LBID refer to natural scenes, while GREC is clearly related to technical drawing (architectural and electrical symbols). Furthermore, degraded images also appear: more than 7000 noisy images are part of GREC (Kanungo noise model [24]).

Table 6 Image dataset : heterogeneous and noisy sources

Table 7 Details on the constitution of the image dataset

Building a database from such sources makes the same system process several flows of images of various kinds. These are realistic experimental conditions for estimating the ability of the proposed curves to re-order images coming from heterogeneous and noisy sources.

5.3 Metrics for CBIR performance evaluation

The following criteria have been estimated:

Computing time:^{Footnote 7}
- the indexing time: the time required for the system (cf. Fig. 8) to index images, including Zernike moment computation, mapping of points to indexes on the selected curve and insertion into the B+Tree.
- the image-searching time: the time needed to output a ranked list of 20 images whose indexes are the nearest neighbors of the query index (computation of moments, mapping, read access to the B+Tree).
Image searching performance:
- precision: the average fraction of relevant images among a fixed number of outputs (20).
- recall: the average ratio of the number of relevant images retrieved to the total number of relevant images available in the dataset.
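For a single query, the two image searching criteria above can be sketched as follows; the reported scores would then be averages over all queries. The function names and the toy data are assumptions of this example.

```python
def precision_at_k(retrieved, relevant, k=20):
    """Fraction of the first k outputs that are relevant to the query."""
    hits = sum(1 for image in retrieved[:k] if image in relevant)
    return hits / k


def recall_at_k(retrieved, relevant, k=20):
    """Fraction of all relevant images that appear among the first k outputs."""
    hits = sum(1 for image in retrieved[:k] if image in relevant)
    return hits / len(relevant)


# Toy query: 4 outputs, 3 relevant images in the dataset, 2 of them retrieved.
retrieved = ["a", "b", "c", "d"]
relevant = {"a", "c", "e"}
p = precision_at_k(retrieved, relevant, k=4)
r = recall_at_k(retrieved, relevant, k=4)
```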

6 Using new pattern-based space-filling curve for CBIR application : comparative results

The performance of the curve generated from the most locality-preserving 20-D pattern^{Footnote 8} (denoted Max) is compared to that of Hilbert^{Footnote 9} in large-scale image retrieval tests. The mapping of features onto the selected curve is performed with the linear-time algorithm published by the authors in previous work [41]. The Hilbert and Max curves do not need to be built and stored in advance; only the mapping operation needs to be computed, and for a fixed dimension and order. These practical considerations reduce the processing time.

6.1 Building the database: time for image indexing

The indexing of 18870 images (cf. Table 7) was performed in less than 12 minutes (400 other images were set aside for search tests), including Zernike moment computation, mapping of features to indexes on the curve and insertion into the B+Tree. Whatever the curve used, the time consumed is of the same order of magnitude: the average indexing time per image is approximately 36.771 ms with the Max or Hilbert curve.

Furthermore, in the framework of this experiment, the system runs in linear time ${\mathcal O}(N)$ with respect to the number of input images N.

6.2 Retrieval results: precision, recall and query run time

Precision and recall are estimated on 400 queries randomly selected from the dataset (cf. Table 7). These are the sample images that have not undergone the indexing process. For a query image, the response of the system is composed of the 20 images whose indexes are the nearest neighbors of the query index, according to the curve used to scan the feature space. Examples of outputs are illustrated in Fig. 9, and large-scale scores are summarized in Table 8.

Table 8 Comparative results: precision, recall and query run time over a sample of 400 queries randomly selected

6.3 Results analysis

Compared to the classic multidimensional Hilbert curve, the Max curve, which is generated from the optimised pattern, gives better overall performance in search precision (+9.2%) and recall (+5.7%), cf. Table 8. These results confirm the influence of locality preservation on image search performance. More broadly, the proposed CBIR system takes advantage of:

the capability of space-filling curves to cluster data: the clustering properties of the Hilbert curve were analysed in [37]. Locality between groups of points (i.e. image clusters) in the multidimensional space (feature space) is preserved as much as possible in the linear space. Moreover, clustering techniques are known to be useful in CBIR to partition visual data into groups, organizing the multidimensional feature space with the aim of designing a scalable system [9, 39, 40].
dimensionality reduction: the 1-D value is a simple and efficient data structure used to index clusters of images quickly and so reduce query run time. This was confirmed, in the framework of these tests (fixed dimension, curves computed off-line etc.), by a short response time: a query is run in less than 41 ms (cf. Table 8). Moreover, linear dimensionality reduction approaches are known to be suitable for CBIR problems [10, 17, 23]. For example, Locality Preserving Projection [17] is designed to preserve in low dimension the local structure of the data observed in high dimension.
incremental mode: the databases are dynamic, and new flows of images can be inserted without modifying existing entities of the system. Furthermore, even though the decision-making is naive, it does not depend on any training process.

However, although the use of the Max curve increases image search performance, using a space-filling curve for a CBIR application also has limitations. From a theoretical point of view, in a multidimensional space there is no total ordering of points that preserves the overall spatial locality [21]. This explains why the level of precision does not currently exceed 67.3%, even after the improvement (+9.2%). Furthermore, retrieving all the images that belong to the same class as the query requires exploring the 1-D line deeply, because topology breaks remain (cf. Fig. 2), which consequently affects the recall score reached (42.5%).

In the field of CBIR, some interesting properties can be highlighted. Concerning the clustering stage (on image features), many state-of-the-art CBIR methods [1, 19, 39, 40] still rely, partially or totally, on the k-means algorithm [16] and its variants (fuzzy C-means [31], PCK-means [4] etc.). This observation also applies to recent models such as bag-of-words [50], where k-means is used to quantize multidimensional image descriptors into words that form a visual vocabulary. However, with such approaches, the quality of the results is known to depend on several parameters (number of classes, initialization of class centers, definition of the membership function, initial choice of weights). Here, even though the search performance is probably lower than that of a fine-tuned k-means-based system, the clustering properties of the proposed system are related to the curve itself and are independent of a priori knowledge of the dataset. Consequently, the results of the partition do not change with the number of classes or other parameters related to the incoming data. Furthermore, no training process is needed; coupled with the fast computation discussed above, our approach makes it possible to design a system suited to managing a dynamic dataset. This is clearly an advantage given the amount and diversity of available images (development of the Internet, image capturing devices etc.).

Another interesting characteristic is the possibility of connecting the system to a decision-making loop driven by users. Compared with closed CBIR approaches, our system can be embedded within a user-centered scheme inspired by interactive clustering (relevance feedback [8, 11, 26]). More precisely, users could select the most relevant images obtained from a preliminary response and move them on the 1-D line. The resulting new indexes would then be updated inside the system, and the retrieval performance should increase incrementally. So, the performance of the proposed system could benefit from user interactions. This is a recent direction that has emerged in CBIR.

Improvements can be made at every stage of the proposed system. For example, even though Zernike moments were adopted in the MPEG-7 standard (region-based shape descriptors), it was shown in [55] that the generic Fourier descriptor can give better image retrieval results. A panorama of shape descriptors can be found in [56].

We emphasize that a more intelligent strategy, such as the selection of discriminative features, could be used. Our objective was to measure, under realistic experimental conditions (including heterogeneous sources and noisy images), the intrinsic ability of the proposed curve to rapidly explore a multidimensional space and to support decision-making without an optimal image representation.

Currently, decision-making is guided by a conventional rule: images are sorted according to their respective indexes on the curve. The response of the system can be improved by a post-processing refinement on the feature vectors of the previously filtered images. Complementary tests have shown that spending 3 ms more (7.3% of 41 ms) can lead to an average precision gain of about 5% ($67.3\% \rightarrow 72.3\%$). This result was obtained by sorting the distances (computed on the moments with the Euclidean metric) between the query and the subset of 40 images with the closest indexes on the curve. More sophisticated approaches, such as re-ranking algorithms, could also be investigated; see [13, 44].
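The two-step response (filtering on curve indexes, then sorting on feature distances) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `refine`, the record layout `(curve_index, feature_vector, image_id)` and the toy data are hypothetical; only the candidate count (40) and the Euclidean metric come from the text.

```python
import math

def refine(query_index, query_features, database, k=40):
    """Re-rank the k images whose curve indexes are closest to the query.

    database is a list of (curve_index, feature_vector, image_id) records;
    this layout is hypothetical, chosen only for the illustration.
    """
    # Step 1: coarse filtering on the 1-D curve index.
    candidates = sorted(database, key=lambda rec: abs(rec[0] - query_index))[:k]
    # Step 2: fine ranking by Euclidean distance on the feature vectors.
    candidates.sort(key=lambda rec: math.dist(rec[1], query_features))
    return [image_id for _, _, image_id in candidates]

# Toy data (made up): image "d" is close in feature space but far on the curve.
toy_db = [
    (10, (0.0, 0.0), "a"),
    (11, (5.0, 5.0), "b"),
    (12, (1.0, 0.0), "c"),
    (50, (0.0, 0.1), "d"),
]
print(refine(11, (0.0, 0.0), toy_db, k=3))
```

In this sketch only step 2 touches the feature vectors, and only for the k retained candidates, so the extra cost stays bounded by k. An image that is far away on the curve, such as "d" in the toy data, is never reconsidered, so the refinement can reorder the filtered set but cannot recover images the curve has already discarded.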

7 Conclusion

This paper deals with new ways to fill a multidimensional space, as alternatives to the famous Hilbert space-filling curve, while maintaining locality as much as possible. Work in the same direction can be found in [32]. In previous works, favourably received by the community, the authors verified the utility of the Hilbert curve for managing multidimensional data [41, 42]. Here the focus is on the role of basic patterns in setting up a curve with an interesting level of locality preservation. We show that new patterns can satisfy the adjacency condition, a multidimensional generalization of the original 2-D Hilbert curve condition. An algorithm is provided to find solutions, and examples of outputs are given, supplemented by illustrations. The number of new pattern solutions increases with the space dimension, and locality measures, estimated via the standard Faloutsos criterion, show that among all solutions only isometric independent patterns lead to a better level of locality preservation. Experiments have identified patterns that preserve locality better than the classical RBG pattern, which has been used in every previous multidimensional extension of the Hilbert curve. Moreover, selecting the most locality-preserving pattern contributes to designing (through orders) a new curve with good overall locality. In the CBIR application, this translated into an image searching gain over the reference Hilbert curve (+9.2% on precision and +5.7% on recall). Even though the scores reached are not those of specialized CBIR systems, these are encouraging results obtained on large-scale tests, under realistic conditions including various classes of images from heterogeneous sources. The capability of a system to handle different types of images (natural / technical) is a trend observed in the multimedia domain, due to the multiplicity of digital image acquisition devices now available.
Limitations have been identified and justified, and some paths to improvement have been indicated. Providing new ways to fill a multidimensional space while maintaining a level of locality comparable to that of the Hilbert curve, and sometimes better, can benefit applications, and some earlier studies can be revisited.

First, in the field of image processing, Yasser et al. [53] proposed to scan the image space via the Hilbert curve to capture the spatial distribution of pixels. Selecting a curve based on a new pattern should lead to more precise pixel localisation and consequently to a more effective shape representation. Applied to image compression, improved locality preservation should reduce the number of runs of similar pixel intensity values, and good compression rates should result [5, 37].

Secondly, the space can now be filled by several curves rather than, as is usual, by a single curve. This opens the possibility of defining and combining complementary views on the same multidimensional dataset, an approach that has often led to better decision-making. These directions are our future work.

Notes

http://grec2013.loria.fr/GREC2013/node/16
http://mpeg.chiariglione.org/standards/mpeg-7
http://www.lems.brown.edu/~dmc/
The RBG pattern is founded on the Reflected Binary Gray code. The RBG code generates binary words of the given size D (i.e. each word contains D bits) satisfying the condition that two successive words have D − 1 identical bits.
This is not surprising: by satisfying the proposed multidimensional generalization of the Hilbert rule, the pattern solutions belong to the same family of curves, i.e. the family of Hilbert-like space-filling curves.
The sample test was composed of 400 images randomly selected from the database.
Our tests were run on a standard laptop PC (Intel Core 2 Duo T9800 2.93 GHz x 2, 8 GB RAM) running Ubuntu 12.10. The B+tree data structure is used for an efficient implementation of the large-scale database.
In practice, the pattern considered as the most locality-preserving is the best solution from Fig. 7, obtained off-line after twenty-four hours of calculation.
The multidimensional Hilbert curve is the RBG-based curve.
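
The Reflected Binary Gray code mentioned in the note on the RBG pattern can be illustrated with a short sketch. It generates the 2^D binary words of size D and checks that two successive words have D − 1 identical bits. The function names are hypothetical, and the binary-to-Gray conversion `i ^ (i >> 1)` is the standard textbook construction, used here as an assumption about how the code is generated rather than as the authors' implementation.

```python
def reflected_gray_code(d):
    """Return the 2**d words of the reflected binary Gray code as bit tuples."""
    words = []
    for i in range(2 ** d):
        g = i ^ (i >> 1)  # standard binary-to-Gray conversion
        # Most significant bit first, so each word has exactly d bits.
        words.append(tuple((g >> k) & 1 for k in reversed(range(d))))
    return words

def is_adjacent(a, b):
    """True if two words differ in exactly one bit (d - 1 identical bits)."""
    return sum(x != y for x, y in zip(a, b)) == 1

words = reflected_gray_code(3)
for previous, current in zip(words, words[1:]):
    assert is_adjacent(previous, current)
print(words)
```

Read as coordinates, each word is a vertex of the D-dimensional unit hypercube, so the one-bit difference between successive words corresponds to the unit distance between successive points required by the adjacency condition.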

References

Amory A, Sammouda R, Mathkour H, Jomaa R (2012) A content based image retrieval using k-means algorithm. In: IEEE 7th International Conference on Digital Information Management (ICDIM), pp 221–225
Armstrong J, Ahmed M, Chau S (2009) A rotation-invariant approach to 2d shape representation using the hilbert curve. In: Springer (ed) Proceedings of the 6th International Conference on Image Analysis and Recognition, pp 594–603
Bader M (2012) Space-Filling Curves: An Introduction with Applications in Scientific Computing. Springer Science and Business Media. doi:10.1007/978-3-642-31046-1
Bilenko M, Basu S, Mooney R (2004) Integrating constraints and metric learning in semi-supervised clustering. In: Proceedings of the 21st International Conference on Machine Learning (ICML), pp 81–88
Biswas S (2000) Hilbert scan and image compression. In: Proceedings of the 15th International Conference on Pattern Recognition, vol 3, pp 207–210
Butz A (1969) Convergence with Hilbert’s space filling curve. J Comput Syst Sci 3(2):128–146
Butz A (1971) Alternative algorithm for Hilbert’s space-filling curve. IEEE Trans Comput 20(4):424–426
Chatzichristofis SA, Zagoris K, Boutalis YS, Papamarkos N (2010) Accurate image retrieval based on compact composite descriptors and relevance feedback information. Int J Pattern Recognit Artif Intell 24:207–244
Chierichetti F, Panconesi A, Raghavan P, Sozio M, Tiberi A, Upfal E (2007) Finding near neighbors through cluster pruning. In: Proceedings of the 26th ACM SIGMOD- SIGACT-SIGART symposium on Principles of Database Systems (PODS), pp 103–112
Cox T, Cox M (1994) Multidimensional scaling. Chapman & Hall, London
Doulamis N, Doulamis A (2006) Evaluation of relevance feedback schemes in content-based in retrieval systems. Signal Process Image Commun 21:334–357
Faloutsos C, Roseman S (1989) Fractals for secondary key retrieval. In: Proceedings of the 8th ACM SIGACT-SIGMOD-SIGART symposium on Principles of Database Systems (PODS), pp 247–252
Guimarães Pedronette DC, S Torres R (2013) Image re-ranking and rank aggregation based on similarity of ranked lists. Pattern Recogn 46(8):2350–2360
Haji-Hashemi M, Mir-Mohammad Sadeghi H, Moghtadai V (2006) Space-filling patch antennas with cpw feed. In: Progress in Electromagnetics Research Symposium. Cambridge, USA, pp 69–73
Hamilton CH, Rau-Chaplin A (2008) Compact Hilbert indices: Space-filling curves for domains with unequal side lengths. Inf Process Lett 105(5):155–163
Hartigan J, Wong M (1979) Algorithm as136: a k-means clustering algorithm. R Stat Soc Ser C 28:100–108
He X, Niyogi P (2003) Locality preserving projection. In: Advances in Neural Information Processing Systems, vol 16. MIT Press, pp 153–160
Hilbert D (1891) Ueber die stetige abbildung einer line auf ein flächenstück. Math Ann 38(3):459–460
Ho J, Lin S, Fann C, Wang Y, Chang R (2012) A novel content based image retrieval system using k-means with feature extraction. In: IEEE International Conference on Systems and Informatics (ICSAI), pp 785–790
Hosny KM (2008) Fast computation of accurate zernike moments. J Real-Time Image Proc 3(1-2):97–107
Hue-Ling C, Ye-In C (2005) Neighbor-finding based on space-filling curves. Inf Syst 30:205–226
Jin G, Mellor-Crummey J (2005) Sfcgen: a framework for efficient generation of multi-dimensional space-filling curves by recursion. ACM Trans Math Softw (TOMS) 31:120–148
Jolliffe I (2002) Principal component analysis, 2nd edn. Springer Series in Statistics, Springer
Kanungo T, Haralick R, Baird H, Stuezle W (2000) A statistical, nonparametric methodology for document degradation model validation. IEEE Trans Pattern Anal Mach Intell 22(11):1209–1223
Khotanzad A, Hong Y (1990) Invariant image recognition by zernike moments. IEEE Trans Pattern Anal Mach Intell 12(5):489–497
Kim D, Chung C, Barnard K (2005) Relevance feedback using adaptive clustering for image similarity retrieval. J Syst Softw 78(1):9–23
Kim WY, Kim YS (2000) A region-based shape descriptor using zernike moments. Signal Process Image Commun 16(1-2):95–102
Lawder J, King P (2000) Using space-filling curves for multi-dimensional indexing. In: Advances in databases, vol 1832, pp 20–35
Lawder J, King P (2001) Using state diagrams for hilbert curve mappings. Int J Comput Math 78(3):327–342
Lebesgue H (1904) Leçons sur l’intégration. Gauthier-Villars, Paris
Liu P, Jia K, Lv Z (2008) An effective and fast retrieval algorithm for content-based image retrieval. In: IEEE Congress on Image and Signal Processing (CISP’08), vol 2, pp 471–474
Liu X (2004) Four alternative patterns of the hilbert curve. Appl Math Comput 147(3):741–752
McVay J, Hoorfar A, Engheta N (2005) Thin absorbers using space-filling-curve high-impedance surfaces. In: IEEE International Symposium on Antennas and Propagation, vol 2A, pp 22–25
Mehtre BM, Kankanhalli MS, Lee WF (1997) Shape measures for content based image retrieval: a comparison. Inf Process Manag 33(3):319–337
Mitchison G, Durbin R (1986) Optimal numberings of an n*n array. SIAM Journal on Algebraic Discrete Methods 7(4):571–582
Mokbel MF, Aref WG (2003) Analysis of multi-dimensional space-filling curves. GeoInformatica 7(3):179–209
Moon B, Jagadish H, Faloutsos C, Saltz J (2001) Analysis of the clustering properties of the hilbert space-filling curve. IEEE Trans Knowl Data Eng 13(1):124–141
Müller H, Müller W, McG Squire D, Marchand-Maillet S, Pun T (2001) Performance evaluation in content-based image retrieval: overview and proposal. Pattern Recogn Lett 22:593–601
Murthy V, Vamsidhar E, Swarup Kumar J, Sankara Rao P (2010) Content based image retrieval using hierarchical and k-means and clustering techniques. Int J Eng Sci Technol 2(3):209–212
Nagthane D (2013) Content based image retrieval system using k-means clustering technique. Int J Comput Appl Inf Technol 3(1):22–29
Nguyen G (2013) Space-filling curves and their application in image processing. PhD thesis, University of La Rochelle (France), Laboratoire Informatique Image et Interactions (L3I). HAL Id: tel-01174960
Nguyen G, Franco P, Mullot R, Ogier JM (2012) Mapping high dimensional image features onto hilbert curve: applying to fast image retrieval. In: Proceedings of the 21st International Conference on Pattern Recognition. IEEE Computer Society, pp 425–428
Novotni M, Klein R (2004) Shape retrieval using 3d zernike descriptors. Comput Aided Des 36(11):1047–1062
Park G, Baek Y, Lee HK (2005) Re-ranking algorithm using post-retrieval clustering for content-based image retrieval. Inf Process Manag 41(2):177–194
Peano G (1890) Sur une courbe, qui remplit toute une aire plane. Math Ann 36(1):157–160
Perez A, Kamata S, Kawaguchi E (1992) Peano scanning of arbitrary size images. In: Proceedings of the International Conference on Pattern Recognition, pp 565–568
Sagan H (2012) Space-Filling Curves. Springer Science and Business Media. doi:10.1007/978-1-4612-0871-6
Sergeyev YD, Strongin RG, Lera D (2013) Introduction to Global Optimization Exploiting Space-Filling Curves. Springer Science & Business Media
Teague M (1980) Image analysis via the general theory of moments. J Opt Soc Am 70:920–930
Tsai C (2012) Bag-of-words representation in image annotation: a review. ISRN Artif Intell 2012
Velho L, Gomes JdM (1991) Digital halftoning with space filling curves. ACM SIGGRAPH Computer Graphics 25(4):81–90
Yasser E, Maher A, Siu-Cheung C, Wegdan A (2007) A view-based 3d object shape representation technique. In: Image Analysis and Recognition, vol 4633. Springer, Berlin Heidelberg, pp 411–422
Yasser E, Maher A, Wegdan A, Siu-Cheung C (2009) Shape representation and description using the hilbert curve. Pattern Recogn Lett 30(4):348–358
Ying L, Dengsheng Z, Guojun L, Wei-Ying M (2007) A survey of content-based image retrieval with high-level semantics. Pattern Recogn 40(1):262–282
Zhang D, Lu G (2002) Shape-based image retrieval using generic fourier descriptor. Signal Process Image Commun 17(10):825–848
Zhang D, Lu G (2004) Review of shape representation and description techniques. Pattern Recogn 37(1):1–19


Author information

Authors and Affiliations

Laboratoire Informatique, Image, Interaction (L3i), EA 2118- University of La Rochelle (France), Avenue M. Crepeau, 17000, La Rochelle, France
Patrick Franco, Giap Nguyen, Remy Mullot & Jean-Marc Ogier


Corresponding author

Correspondence to Patrick Franco.


About this article

Cite this article

Franco, P., Nguyen, G., Mullot, R. et al. Alternative patterns of the multidimensional Hilbert curve. Multimed Tools Appl 77, 8419–8440 (2018). https://doi.org/10.1007/s11042-017-4744-4


Received: 09 February 2016
Revised: 01 March 2017
Accepted: 21 April 2017
Published: 10 May 2017
Issue Date: April 2018
DOI: https://doi.org/10.1007/s11042-017-4744-4
