1 Introduction

A space-filling curve is a path going through every point of a given space. It is represented by a function that maps each point of the space to its index position along the curve. Space-filling curves appear in many applications of computer science [5, 28, 51] and of other domains such as mathematics [6, 48], electronics [14, 33], etc. In the field of image processing, the Hilbert curve is useful for several tasks such as compression, image representation, matching (extended to 3-D objects), image feature mapping [2, 5, 42, 52, 53], etc.

An interesting property of this kind of curve is locality preservation, i.e. the ability to map close points (locality) to close indexes and vice versa. With such a property, part of the spatial relationship between multidimensional points in the original space survives in the 1-D space. It is a key property because it determines whether any interpretation of the data built from an analysis in the one-dimensional space remains reliable. Among the famous space-filling curves (Peano [45], Hilbert [18], Lebesgue [30], etc.), the curve proposed by Hilbert is considered the most locality-preserving [35, 37, 46]. In the original article [18], the Hilbert curve fills a 2-D space, so a multidimensional extension is needed to fill higher-dimensional spaces. There are several proposals for generating multidimensional versions of the Hilbert curve [7, 22, 29]. From a construction point of view, these models start from an m-D basic pattern that is later combined in order to define an m-D Hilbert curve at order n. However, only one basic pattern is generally used in the literature: the RBG pattern (Reflected Binary Gray code). We think this restricts the ways to fill a space and consequently limits the level of locality preservation reached.

In this paper, the main contributions are:

  • the definition of new patterns, different from the classical RBG, in order to build multidimensional extensions of the popular Hilbert space-filling curve. These new patterns achieve a level of locality comparable to that of RBG, and sometimes better;

  • we also show that selecting the best locality-preserving pattern can contribute to designing, across orders, a new curve with an overall locality-preservation level comparable to, and sometimes better than, that of the reference Hilbert curve.

To assess the relevance of our proposition, a CBIR (Content-Based Image Retrieval) experiment was conducted on a collection of 19270 images (283 classes) drawn from three databases (GREC, MPEG-7, and LBID). The curve generated from the most locality-preserving pattern was used to explore the multidimensional image feature space. Comparative large-scale image retrieval tests show that a gain in locality preservation can be translated into image search performance.

The rest of the paper is organised as follows: first, in Section 2, some generalities about space-filling curves are introduced before highlighting, in Section 3, the role of the pattern in locality preservation. Section 4 is dedicated to our proposition: a formulation of new basic patterns to fill a space is given, an algorithm to build them is provided, and their level of locality is measured according to the standard criterion in [12].

The application of the curve generated from the most locality-preserving pattern in a CBIR framework is presented in Section 5. Comparative results with respect to the multidimensional Hilbert curve are presented and analyzed in Section 6.

2 Space-filling curve

A space-filling curve is a self-similar path which goes through every point of a given multidimensional grid space. Hence, it is a bijection f which maps each D-dimensional point p of the space to its index I (an integer) on the curve. An n-order space-filling curve is defined by:

$$ \begin{array}{rcl} f: \{0,1,..., 2^{n}-1\}^{D} & \to & \{0,1,...,2^{nD}-1\} \\ p & \mapsto & I=f(p) \end{array} $$
(1)

where I = f(p) is the index of a point p on the curve f.

When the space is filled by f, the localisation of a D-dimensional point p on the grid space (square, cube or hypercube) is substituted by its 1-D position I on the curve. For example, two famous space-filling curves are illustrated in Fig. 1; other curves can be found in [3, 47].
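For concreteness, the 2-D mapping f of (1) can be sketched with the classical bit-manipulation algorithm for the Hilbert curve (a minimal illustration of the point-to-index mapping, not the construction used later in the paper):

```python
def xy2d(n, x, y):
    """Map a point (x, y) of an n x n grid (n a power of 2) to its
    index d on the 2-D Hilbert curve, i.e. d = f((x, y))."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) > 0 else 0
        ry = 1 if (y & s) > 0 else 0
        d += s * s * ((3 * rx) ^ ry)
        # rotate the quadrant so the recursive pattern repeats
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        s //= 2
    return d

print(xy2d(4, 1, 1))  # -> 2, as in the example of Fig. 1
```

Sorting the grid points by their index reproduces the curve: successive points are always at Manhattan distance 1, which is the edge-connectedness exploited in the rest of the paper.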

Fig. 1

2-D Hilbert and Z-order curves expressed at order n = 2. The Hilbert curve [18], proposed in 1891, is detailed in this paper; the Z-order curve is a discrete version of the Lebesgue curve [30]. Example of the mapping of a point p with coordinates (1,1) onto the Hilbert curve: the localisation of p in the grid space is replaced by its 1-D position I = 2 on the curve

3 Locality preservation

3.1 Locality preservation

Using a space-filling curve can be seen as a way of linearly ordering multidimensional points along a 1-D line, where the order of a point p is coded by its index I.

Once a 1-D ordering of the points is found, analyzing the index neighborhood of a given point can provide a view of the multidimensional data localization, and nearest-neighbor point queries can be performed [12, 21]. This approach remains reliable under the assumption that distances are globally preserved after the mapping, i.e. if close points in space are close on the line: this is the so-called locality preserving (or preserving mapping) property. Several works have studied, analyzed and compared classical curves (sweep, scan, etc.) with space-filling curves (Peano, Hilbert, Lebesgue, etc.), and it is now admitted that the Hilbert space-filling curve achieves the best locality preservation [35,36,37, 46]. However, in a multidimensional grid space, a path visiting every point exactly once, without crossing itself, and which would totally preserve the spatial locality does not exist. Whatever curve is used, cases of topology breaks can be observed (cf. Fig. 2) and they are taken into account by the locality preservation measure used.

Fig. 2

Two worst cases where the topology between the spaces is broken. (a) p1 and p2 are two points far away in the space (d(p1,p2) = 4, Manhattan metric) but they have adjacent indexes on the 1-D line, respectively I = 7 and I = 8. (b) p1 and p2 are two close points in the grid space (d(p1,p2) = 1) but they are far away on the 1-D line (respectively I = 1 and I = 14)

In this paper, the criterion provided by Faloutsos et al. in [12] is used to measure the locality preservation reached by our curves. As described below, L is the average farthest distance between D-dimensional points p(i) and p(j) on the grid space whose corresponding indexes i and j are within the neighborhood radius N/2:

$$ L(c)=\frac{1}{N^{D}}\sum\limits_{i,j \in [N^{D}], i<j ~ with ~\mathcal{v}(i,j)\leq N/2} Max\{ \mathcal{v}(p(i),p(j)) \} $$
(2)

where L is the level of locality preservation of a curve c, \(N^{D}\) is the number of points of the grid space of size \(N = 2^{n}\), N/2 is the radius of interest, \(\mathcal {v}\) is the metric used, and i (resp. j) is the index of the D-dimensional point p(i) (resp. p(j)) on the curve c.

This is a practical criterion, well adapted to the proposed CBIR application because, for a given point index, L determines how many points whose indexes fall within a fixed neighborhood radius are really inside the same radius in space. Initially fixed to N/2 in [12], the radius is extended here because in many applications a larger neighborhood is usually needed. What is the behavior of the best locality-preserving curve in this case? It is thus interesting to assess the locality preservation level of the Hilbert curve (compared with our curves) when the neighborhood radius increases (cf. Section 4.4).
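As an illustration of the criterion, the sketch below (our own minimal implementation, reading (2) as the average, over indexes i, of the farthest distance within the index window) compares the order-2 Hilbert curve with a plain row-major sweep under the Manhattan metric; `xy2d` is the classical Hilbert bit-manipulation mapping:

```python
def manhattan(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))

def locality(points, radius, dist=manhattan):
    """L: average, over every index i, of the farthest distance in space
    to a point whose index j satisfies |i - j| <= radius."""
    n = len(points)
    total = 0
    for i in range(n):
        window = [j for j in range(max(0, i - radius), min(n, i + radius + 1)) if j != i]
        total += max(dist(points[i], points[j]) for j in window)
    return total / n

def xy2d(n, x, y):
    # classical 2-D Hilbert point-to-index mapping
    d, s = 0, n // 2
    while s > 0:
        rx = 1 if (x & s) > 0 else 0
        ry = 1 if (y & s) > 0 else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        s //= 2
    return d

N = 4  # grid side for order n = 2
grid = [(x, y) for x in range(N) for y in range(N)]
hilbert = sorted(grid, key=lambda p: xy2d(N, *p))  # points in curve order
sweep = [(i % N, i // N) for i in range(N * N)]    # row-major ordering
L_h = locality(hilbert, N // 2)
L_s = locality(sweep, N // 2)
print(L_h < L_s)  # -> True: the Hilbert ordering preserves locality better
```

A smaller L is better: the Hilbert ordering never moves more than one grid step per index step, while the row-major sweep makes a long jump at each row boundary, which inflates its farthest-neighbor distances.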

3.2 Preserving locality property: role of the basic pattern

A space-filling curve is self-similar, i.e. at a given order n it is composed of sub-curves which have the same shape as the curve itself but on a smaller scale. From a construction point of view, this property is naturally coded in computer science by recursive functions. Starting from an m-D basic pattern at the first order n = 1, building the n-order curve consists of recursively copying, transforming and scaling the curve obtained at the previous order n − 1. An illustration of this process applied to the 2-D Hilbert curve is given in Fig. 3.

Fig. 3

Recursive construction of the 2-dimensional Hilbert curve for n = 1, 2. The basic pattern is the first-order curve. The second-order curve is composed of 4 copies of a scaled version of the basic pattern; geometrical transformations may also be applied. For example, the bottom-left sub-curve is the result of a rotation and reflection of the basic pattern, and the bottom-right one of a reflection

The role of the basic pattern is emphasized in this construction process:

  • the basic pattern is the first-order curve; it is the path ordering the \(2^{D}\) points;

  • it is widely copied to build the n-order curve (n > 1). The number of patterns which compose a D-dimensional space-filling curve at order n is \(2^{(n-1)D}\). For example, for n = 2, the 2-dimensional Hilbert curve is set up from \(2^{D} = 4\) copies of the basic pattern, and from 16 copies at order n = 3 (cf. Figs. 3 and 4).

Fig. 4

Selecting a better locality-preserving basic pattern leads to designing, through orders, an overall better locality-preserving curve: case of the 2-D Hilbert and Z-order space-filling curves for n = 1, 2, 3. L_n is the level of locality preservation of the n-order curve via the criterion described in Section 3.1 under the Manhattan metric. A smaller L value corresponds to a better locality preservation. The locality superiority observed at the starting step for the Hilbert curve (50%) is conserved across the different orders (resp. 37.5%, 47.5%). This is also confirmed for higher dimensions

Since it is the input of the construction process and it is widely copied, the basic pattern fixes the point ordering of the \(2^{(n-1)D}\) sub-curves and consequently has an impact on the overall locality preservation. Picking out a better locality-preserving pattern, or defining a new one (cf. Section 4), can help design a curve with an overall better level of locality preservation. This idea is confirmed on two space-filling curves below (cf. Fig. 4).

4 Proposition of new basic patterns for multidimensional extension of the Hilbert space-filling curve

Our proposition is oriented toward the curve that maximizes locality preservation, i.e. the Hilbert curve, and it is focused on the definition of new alternative basic patterns. In this paper, the pattern is the first-order (n = 1) space-filling curve; it is the model used to create the sub-curves of the curve through orders, and it is the path connecting all \(2^{D}\) points (exactly once, without crossing itself) in a D-dimensional grid space of size N = 2.

4.1 The Hilbert curve and multidimensional extensions

In response to Peano [45], Hilbert presented in 1891 another way to fill a 2-D space which better preserves locality [18].

Since then, several multidimensional extensions of the 2-D Hilbert curve have been developed and several models have been proposed. In [7] sets of low-level binary operations are provided, state-diagram-based models are given in [28], a table-driven framework is suggested in [22], etc. To our knowledge, all these contributions rely on the RBG code to order points on the first-order curve (the RBG pattern). More recently, some works have attempted to find other ways to fill a space. Liu in [32] puts forward four alternative sets of geometric transformations to arrange the 2-D RBG pattern to set up a curve at upper orders. According to the set of transformations selected (L configuration), new paths are outlined, which form, together with the Hilbert and Moore curves, a set named the “complete set of the Hilbert curve”. This is a relevant study, but it still focuses on 2-D space and it still starts from the RBG pattern.

A novel geometric approach aiming to define a multidimensional compact Hilbert index dedicated to spaces with unequally sized dimensions is established in [15]. The lines usually used on the Hilbert curve to symbolize the point ordering are replaced by arcs. We note that this geometric change does not modify the order in which the points (cells in the article) are visited. The cell ordering is still initialized by the standard RBG code, i.e. it follows the RBG pattern.

4.2 Adjacency and pattern solution

In Hilbert’s original paper [18], the construction of the 2-D curve is driven by an implicit rule: “each square must have a common side with the previous square”. Extended to higher dimensions, this rule conveys the idea that each dimension must be sequentially visited. The RBG code, in which only one bit changes between two successive words, offers an intuitive response. But, as the dimension increases, there are other solutions (than the RBG code) to connect all \(2^{D}\) points (exactly once, without crossing itself) while satisfying Hilbert's original idea. We propose to formalize the original 2-D Hilbert rule in higher dimensions as adjacency constraints on D-dimensional points (cf. (3)).
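The RBG response can be sketched in a few lines: the k-th point of the pattern has the bits of the reflected binary Gray code of k as coordinates, so two successive points differ in exactly one coordinate, which is precisely the adjacency constraint (a minimal illustration, not the paper's implementation):

```python
def rbg_pattern(D):
    """D-dimensional RBG pattern: the k-th point has the bits of the
    reflected binary Gray code of k as coordinates."""
    gray = lambda k: k ^ (k >> 1)
    return [tuple((gray(k) >> i) & 1 for i in range(D)) for k in range(2 ** D)]

print(rbg_pattern(2))  # -> [(0, 0), (1, 0), (1, 1), (0, 1)]
```

By construction every pair of consecutive points is at Manhattan distance 1, for any dimension D.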

First, new pattern solutions result from this formalization and, secondly, we show later that they reach different levels of locality preservation (cf. Section 4.4). Having access to various solutions opens the possibility to select the pattern which better preserves locality. On the other hand, we have checked in Section 3.2 that selecting a better locality-preserving pattern leads to designing, through orders, an overall better locality-preserving curve. The 2-D and 3-D RBG patterns are shown in Figs. 5 and 6.

Fig. 5

Example of pattern solutions of (3) in a 2-D space. They are mutually isometric; as a result, they lead to the same level of locality preservation (L = 1.0), so only isometrically independent pattern solutions are selected

Fig. 6

New ways to fill a 3-D space: example of 3 independent patterns \(\mathcal {P^{\ast }}\) generated by the algorithm of Fig. 7. \(\mathcal {P}_{1}^{\ast }\) is the classical RBG way, while \(\mathcal {P}_{2}^{\ast }\) and \(\mathcal {P}_{3}^{\ast }\) define new ways

In the case of the multidimensional Hilbert curve, the pattern \(\mathcal {P}\) must satisfy the adjacency condition, i.e.

$$ \sum\limits_{i=1}^{D}|p(k)_{i}-p(k+1)_{i}|=1, ~~ 0 \leq k < 2^{D}-1 $$
(3)

where p(k) is the k-th point in the pattern \(\mathcal {P}\) and \(p(k)_{i}\) is the i-th coordinate of the point p(k). Therefore, the pattern \(\mathcal {P}\) of the D-dimensional Hilbert curve is a path going through every point in \(\{0, 1\}^{D}\). In Section 4.3 an algorithm is provided to generate, for a given dimension D, all the patterns \(\mathcal {P}\) that are solutions of (3). In this article, two solutions are considered identical if they are isometric. An isometry is a distance-preserving transformation such as a rotation or a reflection. Because only isometrically independent patterns can modify the locality preservation of the curve, among all solutions of (3) only the independent solutions \(\mathcal {P^{\ast }}\) are retained.

For example, in the 2-D case, there is only one isometrically independent pattern \(\mathcal {P^{\ast }}\) satisfying (3), i.e. the 2-D RBG pattern (Fig. 5, left). From 3-D onwards, there are many independent patterns that are solutions of (3) (cf. Fig. 6).

4.3 Generating adjacent patterns

A recursive algorithm is proposed in Fig. 7 to list all patterns \(\mathcal {P}\) corresponding to a given dimension that are solutions of (3). A pattern solution is a sequence composed of \(2^{D}\) points in which every two consecutive points are separated by a unit distance. It is built by successively appending a new point adjacent to the last point of the current sequence. To satisfy the adjacency condition, the new point is obtained from the last point by modifying only one coordinate. Therefore, we have D − 1 possibilities for the new point: D − 1 and not D, because the coordinate previously modified to reach the last point must be excluded, under penalty of passing twice through the same point. Accordingly, for a curve evolving in dimension D, with the first point fixed (the sequence is initialized with the origin (0, 0, ..., 0)), listing all patterns, which contain \(2^{D}-1\) adjacent steps, requires the algorithm to scan over \((D-1)^{2^{D}-1}\) possibilities at most. The algorithm complexity is high but in practice it remains computable: the search for patterns is run off-line and for dimensions not exceeding 20 (cf. Section 5).
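The enumeration of Fig. 7 can be sketched as a standard backtracking over \(\{0,1\}^{D}\) (a simplified reading of the algorithm: here candidate points are rejected by a visited-set test, which subsumes the exclusion of the previously modified coordinate):

```python
def adjacent_patterns(D):
    """List every pattern P solution of (3): paths starting at the origin
    that visit all 2**D points of {0,1}**D with unit Manhattan steps."""
    origin = (0,) * D
    solutions = []

    def extend(path, visited):
        if len(path) == 2 ** D:
            solutions.append(list(path))
            return
        last = path[-1]
        for i in range(D):          # flip one coordinate of the last point
            nxt = last[:i] + (1 - last[i],) + last[i + 1:]
            if nxt not in visited:  # never pass twice through the same point
                visited.add(nxt)
                path.append(nxt)
                extend(path, visited)
                path.pop()
                visited.remove(nxt)

    extend([origin], {origin})
    return solutions

print(len(adjacent_patterns(2)))  # -> 2 (the two mirror-image 2-D paths)
```

The solutions returned here are not yet filtered for isometric independence; for D = 2 the two paths found are mutually isometric, consistent with Fig. 5.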

4.4 Measure of Locality preservation of new patterns

The preservation of locality is a highly sought property, needed by applications managing multidimensional data [12, 21, 37]. But what is the level of locality preservation reached by the new patterns?

Some comparative results are synthesized in Tables 1, 2, 3 and 4, respectively for the 4-D and 5-D cases. Only isometrically independent patterns \(\mathcal {P^{\ast }}\), solutions of (3) and generated by the algorithm given in Fig. 7, are taken into account. The locality preservation of the candidates is estimated through the criterion described and justified in Section 3.1. L is assessed not only for a single neighborhood radius, as originally in [12], but for several radius values \(\in [1,\frac {N^{D}}{4}]\). Note that a smaller L value corresponds to a better locality preservation.

Fig. 7

Algorithm to list D-dimensional patterns \(\mathcal {P}\) solution of (3)

Table 1 Locality preservation L of the new 4-D patterns: RBG is not the only most locality-preserving pattern; there is a new way to fill the space, \(\mathcal {P}_{1}^{\ast }\), that reaches the same level of locality
Table 2 Identification of the two 4-D spatial orderings which lead to the best locality preservation: RBG and \(\mathcal {P}_{1}^{\ast }\) (cf. Table 1)
Table 3 Locality preservation L of the new 5-D patterns: RBG, used so far in previous multidimensional extensions of the Hilbert curve, is not the most locality-preserving

From these results, we see that:

  • From 3-D onwards, there exists more than one pattern verifying (3). In fact, the number of new pattern solutions (isometrically independent) increases with the space dimension, which widens the possibilities for filling a given space and finding a good pattern according to the locality criterion. For example, more than four patterns have been identified in the 4-D case, among which the classical RBG (cf. Table 1), and there is a new pattern \(\mathcal {P}_{1}^{\ast }\) that reaches the same level of locality.

  • From 5-D onwards, even if the L values evolve in the same order of magnitude (cf. Table 3), a partition between the new solutions and RBG is observed, especially from medium to high neighborhood radii. For example, in the 5-D case, RBG is the best-preserving pattern for radius ∈ [3, 5], but when the locality is studied for radius > 5 then four new competitive patterns (\(\mathcal {P}_{1}^{\ast },\mathcal {P}_{2}^{\ast },\mathcal {P}_{3}^{\ast },\mathcal {P}_{4}^{\ast }\)) emerge.

The existence of patterns that improve the locality not only for a single radius value but for several consecutive neighborhood values (6, 7, 8) is interesting for a CBIR application. The results of image queries are often presented as a top ranking, which leads us to consider a range of neighborhoods rather than a single value. For example, \(\mathcal {P}_{1}^{\ast }\), which is the best-preserving pattern for radii 6 and 7 (with resp. L = 3.468 vs. 3.750 and L = 3.500 vs. 3.750), remains competitive for radius 8 (L = 3.656 vs. 3.750).

A smaller L value means that, when the points are ordered according to the \(\mathcal {P}_{1}^{\ast }\) model (cf. Table 4), few points located outside the radius of interest in space are viewed as neighbors on the 1-D line. In other words, a smaller L value indicates that points considered neighbors according to their respective indexes really are neighbors in the multidimensional space. So, improving the level of locality preservation can have a positive impact on the results of image retrieval.

Table 4 Identification of the five 5-D point orderings which lead to the best locality preservation: RBG and \(\mathcal {P}_{1}^{\ast },\mathcal {P}_{2}^{\ast },\mathcal {P}_{3}^{\ast },\mathcal {P}_{4}^{\ast }\) (cf. Table 3)

Moreover, among the new solutions, there are patterns whose locality remains competitive across orders. The evolution of the locality score according to the orders n = 1, 2, 3 of the 5-D curve built from \(\mathcal {P}_{1}^{\ast }\) and \(\mathcal {P}_{2}^{\ast }\) is reported in Table 5. This behaviour, also observed for higher dimensions, tends to confirm that selecting a better locality-preserving pattern contributes to designing a curve with a good overall locality.

Table 5 Locality preservation L, across orders n = 1, 2, 3, of 5-D curves built from some new patterns: cases of \(\mathcal {P}_{1}^{\ast },\mathcal {P}_{2}^{\ast }\) compared to RBG

5 Using new pattern-based space-filling curve for CBIR application

The contribution of the new patterns is experimented through a CBIR application. An overview of existing systems, questions of performance evaluation and future challenges can be found in [11, 26, 38, 54].

The curve generated from the most locality-preserving 20-D pattern (noted Max) is tested to quickly explore the image feature space. Through an efficient image descriptor, two visually similar images are described by two close vectors. With the new pattern-based space-filling curve, close vectors (feature space) are mapped to close indexes (index space); hence, similar images correspond to close indexes. After ordering the images in the database according to their respective indexes (off-line indexation), searching for images similar to the query one (on-line searching) consists in selecting the subset of images whose 1-D indexes are closest to the input one. The similarity of two images is the distance (Manhattan metric) between their respective indexes I on the curve used to scan the image feature space (Zernike moments). This key idea is used to set up the CBIR system synthesized in Fig. 8.
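The on-line search step then amounts to a nearest-index lookup on a sorted 1-D structure. A minimal sketch, using a sorted Python list in place of the B+Tree and hypothetical image identifiers:

```python
import bisect

def build_index(entries):
    """Off-line indexation: sort (curve_index, image_id) pairs by index."""
    return sorted(entries)

def nearest_images(index, query_idx, k):
    """On-line search: return the k image ids whose curve indexes are
    closest to the query index (two-pointer scan around the insertion point)."""
    keys = [idx for idx, _ in index]
    lo = bisect.bisect_left(keys, query_idx) - 1
    hi = lo + 1
    out = []
    while len(out) < k and (lo >= 0 or hi < len(index)):
        left = query_idx - keys[lo] if lo >= 0 else float("inf")
        right = keys[hi] - query_idx if hi < len(index) else float("inf")
        if left <= right:
            out.append(index[lo][1]); lo -= 1
        else:
            out.append(index[hi][1]); hi += 1
    return out

db = build_index([(2, "img_a"), (20, "img_e"), (5, "img_b"), (9, "img_c"), (14, "img_d")])
print(nearest_images(db, 10, 3))  # -> ['img_c', 'img_d', 'img_b']
```

Both the lookup and each neighbor expansion are logarithmic or constant time, which is why the 1-D index makes the on-line search fast regardless of the feature-space dimension.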

Fig. 8

Using new pattern-based space-filling curve for CBIR application: Overview

The following sections specify the conditions under which the experiments were conducted.

5.1 Image characterization via Zernike moments

Extracting the Zernike moments of an image is now a common operation for image description. It consists in projecting an image (or shape) onto a family of complex orthogonal functions (on the unit disk) called Zernike polynomials. They were introduced by Teague [49] and are considered an efficient image descriptor, as confirmed by many comparative studies [27, 34, 43]. Working with the Zernike decomposition involves answering the question: what is the highest moment order (noted q) required for an accurate image representation? q is determined according to the criterion in [25], based on the quality of the image reconstruction obtained from the decomposition results.

This approach leads to q = 7, which yields a 20-D feature vector for each image. This size achieves a trade-off between effectiveness and robustness to noise. In a noisy context (cf. the GREC database, Section 5.2), a compact description seems to be required because we know that the higher-order moments code the details, which are precisely those likely to be affected by degradations. For a fast calculation, the Zernike moments were implemented following the algorithm in [20].
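The value 20 is consistent with the usual count of Zernike moments up to order q = 7: one magnitude per pair (n, m) with 0 ≤ m ≤ n and n − m even. The counting convention below is our assumption; the paper only states the resulting dimension:

```python
def zernike_moment_count(q):
    """Number of Zernike moments Z_nm with order n <= q, repetition
    0 <= m <= n and n - m even (one magnitude kept per pair)."""
    return sum(1 for n in range(q + 1) for m in range(n + 1) if (n - m) % 2 == 0)

print(zernike_moment_count(7))  # -> 20, the feature vector size used here
```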

5.2 Image sources and dataset

The image dataset is built from a mixture of three image collections: GREC, MPEG-7 and LBID. It is composed of 19270 low-resolution binary images (∼300 dpi) covering 283 object classes. Details on the collections and the dataset constitution can be found in Tables 6 and 7. This choice leads to handling various kinds of images: MPEG-7 and LBID refer to natural scenes while GREC is clearly related to technical drawings (architectural and electrical symbols). Furthermore, degraded images also appear: more than 7000 noisy images are part of GREC (Kanungo noise model [24]).

Table 6 Image dataset : heterogeneous and noisy sources
Table 7 Details on the constitution of the image dataset

Building a database from such sources makes it possible to process, in the same system, several flows of various kinds of images. These are realistic experimental conditions to estimate the ability of the proposed curves to re-order images coming from heterogeneous and noisy sources.

5.3 Metrics for CBIR performance evaluation

The following criteria have been estimated :

  • Computing time:

    • the indexing time: the time required for the system (cf. Fig. 8) to index the images, including Zernike moment computation, mapping of points to indexes on the selected curve, and insertion into a B+Tree.

    • the image-searching time: the time needed to output a ranking of the 20 images whose indexes are the nearest neighbors of the query one (computation of moments, mapping, read access to the B+Tree).

  • Image searching performance :

    • precision: the average number of relevant images among a fixed number of outputs (20).

    • recall: the average ratio of the number of relevant images retrieved to the total number of relevant images available in the dataset.

6 Using new pattern-based space-filling curve for CBIR application : comparative results

The performance of the curve generated from the most locality-preserving 20-D pattern (noted Max) is compared to that of the Hilbert curve in large-scale image retrieval tests. The mapping of features onto the selected curve is performed thanks to the linear-time algorithm published by the authors in previous works [41]. The Hilbert and Max curves do not need to be pre-built and stored in hardware: only the mapping operation needs to be computed. Furthermore, for a fixed dimension and order, these practical considerations lead to reduced processing time.

6.1 Building the database: time for image indexing

The indexation of the 18870 images (cf. Table 7) was performed in less than 12 minutes (the 400 other images were kept for search-test purposes), including Zernike moment computation, mapping of features to indexes on the curve, and insertion into the B+Tree. Whatever the curve used, the time consumed is of the same order of magnitude: the average indexing time per image is approximately 36.771 ms with either the Max or the Hilbert curve.

Furthermore, in the framework of this experiment, the system runs in linear time \({\mathcal O}(N)\) with respect to the number of input images N.

6.2 Retrieval results: precision, recall and query run time

Precision and recall are estimated on 400 queries randomly selected from the dataset (cf. Table 7); this is the sample of images that did not undergo the indexing process. For a query image, the response of the system is composed of the 20 images whose indexes are the nearest neighbors of the input index, according to the curve used to scan the feature space. Examples of outputs are illustrated in Fig. 9, and large-scale scores are synthesized in Table 8.

Fig. 9

Responses of the system when the feature space is explored by the Max or Hilbert curve. The majority of the outputs are similar to the query. Even if not all the outputs belong to the same class, we can observe a relation with it. For example, with the query “triangle”, all the outputs refer to electrical symbols and they contain a triangle. This shows the advantage of the locality-preserving property of space-filling curves: similar images are ordered close together

Table 8 Comparative results: precision, recall and query run time over a sample of 400 queries randomly selected

6.3 Results analysis

Compared to the classic multidimensional Hilbert curve, the Max curve, generated from the optimised pattern, achieves a better overall performance in search precision (+9.2%) and recall (+5.7%), cf. Table 8. These results confirm the influence of locality preservation on image search performance. More broadly, the proposed CBIR system takes advantage of:

  • the capability of space-filling curves to cluster data: the clustering properties of the Hilbert curve were analysed in [37]. Locality between point groups (i.e. image clusters) in the multidimensional space (feature space) is preserved as much as possible in the linear space. On the other hand, we know that clustering techniques are useful in CBIR to partition visual data into groups, in order to organize the multidimensional feature space with the aim of designing a scalable system [9, 39, 40].

  • the dimensionality reduction: the 1-D value is a simple and efficient data structure used to quickly index clusters of images in order to accelerate query run time. This was confirmed, in the framework of these tests (fixed dimension, curves computed off-line, etc.), by a short response time: a query runs in less than 41 ms (cf. Table 8). On the other hand, we know that linear dimensionality reduction approaches are suitable for CBIR problems [10, 17, 23]. For example, the Locality Preserving Projection [17] is designed to preserve in low dimension the local structure of the data observed in high dimension.

  • the incremental mode: databases are dynamic, and new flows of images can be inserted without modifying the existing entities of the system. Furthermore, even though the decision-making is naive, it does not suffer from any dependence on a training process.

However, if the use of the Max curve increases image search performance, using a space-filling curve for a CBIR application also has limitations. From a theoretical point of view, in a multidimensional space there is no total point ordering that preserves the overall spatial locality [21]. This explains why the level of precision does not currently exceed 67.3%, even if it has been improved (+9.2%). Furthermore, retrieving all the images belonging to the same class as the query requires exploring the 1-D line deeply, because topology breaks remain (cf. Fig. 2), which consequently affects the recall score reached (42.5%).

In the field of CBIR, some interesting properties can be highlighted. Concerning the clustering stage (on image features), many state-of-the-art CBIR methods [1, 19, 39, 40] still, partially or totally, refer to the k-means algorithm [16] and its variants (fuzzy C-means [31], PCK-means [4], etc.). This observation also concerns recent models such as bag-of-words [50], where k-means is used to quantize multidimensional image descriptors into words to form a visual vocabulary. However, with such approaches, we know that the quality of the results suffers from a dependency on several parameters (number of classes, initialization of class centers, definition of the membership function, initial choice of weights). Here, even if the search performance is probably lower than that of a fine-tuned k-means-based system, the clustering properties of the proposed system are related to the curve itself and are independent of any a priori knowledge of the dataset. Consequently, the results of the partition do not change with the number of classes or other parameters related to the incoming data. Furthermore, no training process is needed and, coupled with the fast computational capability discussed above, our approach can yield a system adapted to managing a dynamic dataset. This is clearly an advantage given the amount and diversity of available images (development of the Internet, image capturing devices, etc.).

Another interesting characteristic is the possibility to connect the system to a decision-making loop driven by users. Compared with closed CBIR approaches, our system can be embedded within a user-centered schema inspired by interactive clustering (relevance feedback [8, 11, 26]). More precisely, users could select the most relevant images obtained from a preliminary response and move them on the 1-D line; the resulting new indexes would then be updated inside the system, and the retrieval performance should increase incrementally. So, the performance of the proposed system could benefit from user interactions. This is a recent track that has emerged in CBIR.

Improvements can be added at every stage of the proposed system. For example, even though Zernike moments were adopted in the MPEG-7 standard (region-based shape descriptors), it was shown in [55] that the generic Fourier descriptor can outperform them on image retrieval. A panorama of shape descriptors can be found in [56].

We emphasize that more intelligent strategies, such as the selection of discriminative features, could be used. Our objective was to measure - under realistic experimental conditions (including heterogeneous sources and noisy images) - the intrinsic capability of the proposed curve to rapidly explore a multidimensional space and support decision-making, without relying on an optimal image representation.

Currently, the decision-making is guided by a conventional rule: images are sorted according to their respective indexes on the curve. The response of the system can be upgraded by a post-processing refinement on the feature vectors of the previously filtered images. Complementary tests have shown that spending 3 ms more (7.3% of 41 ms) can lead to an average precision gain of about 5% (\(67.3\% \rightarrow 72.3\%\)). This result was obtained by sorting the distances (computed on moments, Euclidean metric) between the query and the subset of 40 images with the closest indexes on the curve. More sophisticated approaches such as re-ranking algorithms could also be investigated, see [13, 44].
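The two-stage rule above can be sketched as follows; this is a simplified illustration (the record layout and function names are assumptions, not the actual implementation), where candidates are first filtered by index proximity on the curve and then re-ranked by Euclidean distance on their moment features:

```python
import math

def retrieve(query_index, query_features, database, k=40):
    """Two-stage retrieval sketch.
    `database` is assumed to be a list of (curve_index, features, image_id).
    Stage 1: keep the k images whose curve indexes are closest to the query's.
    Stage 2: re-rank those k candidates by Euclidean distance on features."""
    # Stage 1: coarse filtering along the 1-D curve
    candidates = sorted(database, key=lambda rec: abs(rec[0] - query_index))[:k]

    # Stage 2: refinement with the Euclidean metric on moment features
    def dist(features):
        return math.sqrt(sum((a - b) ** 2
                             for a, b in zip(query_features, features)))
    return sorted(candidates, key=lambda rec: dist(rec[1]))
```

The extra cost is a single sort over k = 40 small feature vectors, which is consistent with the few-millisecond overhead reported above.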

7 Conclusion

This paper deals with new ways to fill a multidimensional space - alternatives to the famous Hilbert space-filling curve - while preserving locality as much as possible. A related effort can be found in [32]. The authors have verified in previous works, favourably received by the community, the utility of the Hilbert curve for managing multidimensional data [41, 42]. Here the focus is on the role of basic patterns in setting up a curve with an interesting level of locality preservation. We show that new patterns can satisfy the adjacency condition - the multidimensional generalization of the original 2D Hilbert curve condition. An algorithm is provided to find solutions, and examples of outputs are given, supplemented by illustrations. The number of new pattern solutions increases with the space dimension, and locality measures, estimated via the standard Faloutsos criterion, show that among all solutions only patterns independent up to isometry lead to a better level of locality preservation. Experiments have enabled the identification of patterns which preserve locality better than the classical RBG pattern, which has been used in every previous multidimensional extension of the Hilbert curve. Moreover, selecting the most locality-preserving pattern contributes to designing (through orders) a new curve with good overall locality. In the CBIR application, this translated into an image searching gain compared to the reference Hilbert curve (+9.2% on precision, +5.7% on recall). Even though the reached scores are not those of specialized CBIR systems, these are encouraging results obtained on large-scale tests, under realistic conditions including various classes of images issued from heterogeneous sources. The capability of a system to handle different types of images (natural / technical) is a trend observed in the multimedia domain due to the multiplicity of digital image acquisition devices now available.
Limitations have been identified and justified, and some paths to improvement have been indicated. Providing new ways to fill a multidimensional space while maintaining a level of locality comparable to the Hilbert curve, and sometimes better, can benefit these applications, and some studies can be revisited.

In the field of image processing, Yasser et al. [53] proposed to scan the image space via the Hilbert curve to capture the spatial distribution of pixels. Selecting a new pattern-based curve should lead to better precision on pixel localisation and consequently to better effectiveness in shape representation. Applied to image compression, improving locality preservation should reduce the number of runs composed of similar pixel intensity values, and good compression rates should result [5, 37].
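For reference, scanning an image along the 2 − D Hilbert curve amounts to mapping each pixel position (x, y) to its index on the curve; a minimal sketch using the standard bit-manipulation algorithm (for the classical RBG-based curve, not the new patterns proposed here):

```python
def xy2d(n, x, y):
    """Map pixel coordinates (x, y) of an n-by-n grid (n a power of two)
    to the pixel's index along the classical 2-D Hilbert curve.
    Scanning an image in order of these indexes visits pixels so that
    consecutive indexes are spatially adjacent (locality preservation)."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) > 0 else 0
        ry = 1 if (y & s) > 0 else 0
        d += s * s * ((3 * rx) ^ ry)  # contribution of this quadrant
        # Rotate the quadrant so the recursion pattern repeats
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        s //= 2
    return d
```

Sorting the pixels of an image by `xy2d` yields the Hilbert scan order, along which runs of similar intensity values tend to be long, which is what the compression argument above relies on.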

Secondly, the space could now be filled by several curves instead of - as usual - a single one. That opens the possibility to define and combine complementary views on the same multidimensional dataset, which has often led to better decision-making; this is our future work.