1 Introduction

The concept of a cluster of points is one of the most important concepts in point pattern analysis. Point cluster analysis judges whether a point pattern is clustered, dispersed (regular), or random and detects local point clusters. An objective is to reveal the underlying structure of point patterns, i.e., how and why point clusters are generated. Geography considers the clusters of retail stores and restaurants (Scott 1970; Dawson 2012). Epidemiology discusses the clusters of disease cases (Elliot et al. 2000; Lawson 2013). Criminology analyzes the clusters of crime spots (Brantingham and Brantingham 1981; Wortley et al. 2008). Point cluster analysis has drawn much attention in various academic fields related to spatial phenomena.

There are at least three important points that we need to consider in the analysis of point clusters. The first is spatial inhomogeneity, which refers to the inhomogeneity of locations where points can be located. Consider retail stores such as clothing and shoe stores. Zoning regulations restrict the locations of retail stores to commercial zones, and thus, the potential locations are inhomogeneous. Cuzick and Edwards (1990) consider the clusters of disease cases. The locations of cases are limited to the residences of individuals, whose distribution is also usually inhomogeneous.

The second point is what we call aspatial inhomogeneity, which indicates the inhomogeneity of point characteristics. Pubs and bars prefer small buildings in commercial areas. Home decor and sporting goods shops tend to be located at larger places along highways. Older people are more likely to contract heart disease and diabetes (Brown et al. 2011; Kirkman et al. 2012). The height and diameter of trees affect the selection of hole-nesting birds (Van Balen et al. 1982; Peterson and Gauthier 1985).

We cannot neglect these two inhomogeneities in point cluster analysis since neglecting them may lead to erroneous conclusions. Suppose a statistical analysis concludes that disease cases are clustered, suggesting an infectious disease. This, however, can happen by chance when the residences of individuals are clustered, even if the disease is not infectious. Birds' nests often form spatial clusters, but the clusters may be caused by the characteristics of trees, such as their height and diameter, rather than their spatial locations.

The third point we need to consider is the geographic scale of analysis. Geographic scale refers to the spatial extent and resolution of analysis (Dabiri and Blaschke 2019; Oshan et al. 2022). Consideration of geographic scale is critical since the analysis results heavily depend on the geographic scale. Ripley's K-function, for instance, explicitly considers the analytical scale in point cluster analysis, which is represented by the radius of circles.

Point cluster analysis has been discussed in various academic fields, and numerous methods have been developed for this purpose. Existing methods, however, do not fully cover the above three points, as discussed in the following section, which motivated us to develop a new analytical method. We focus on the case where the locations of points are discrete and limited, such as individuals and buildings mentioned earlier. Our question is whether a certain type of points, such as disease cases and retail stores, are spatially clustered in this setting. We consider both the global and local point clusters, i.e., the global tendency and spatial variation in point clusters. Section 2 discusses the advantages and disadvantages of existing methods. Section 3 describes our method in detail. Section 4 tests the method's validity by applying it to hypothetical and real datasets. Section 5 summarizes the conclusion and discusses the topics of future research.

2 Related works

2.1 Methods based on the complete spatial randomness

The nearest neighbor method is a simple but effective tool for classifying point patterns (Clark and Evans 1954; Clark and Evans 1955; Diggle 1975). It measures the average distance between points and their nearest neighbor points and compares it with the average distance obtained under complete spatial randomness. A drawback is that the nearest neighbor method does not explicitly consider the geographic scale of analysis (Upton and Fingleton 1985; Boots and Getis 1988; Quattrochi and Goodchild 1997; Zhang et al. 2014). Different point patterns can have the same average nearest neighbor distance, which implies that the nearest neighbor method cannot distinguish many different patterns.
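As a hedged illustration of the comparison described above, the following Python sketch computes the Clark and Evans ratio, i.e., the observed mean nearest neighbor distance divided by its expectation 1/(2√(n/A)) under complete spatial randomness. The function name `clark_evans_ratio`, the brute-force neighbor search, and the absence of edge correction are assumptions made for illustration only.

```python
import math

def clark_evans_ratio(points, area):
    """Observed mean nearest neighbor distance divided by the value
    expected under complete spatial randomness, 1 / (2 * sqrt(n / area)).
    No edge correction is applied (an assumption of this sketch)."""
    n = len(points)
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        # Brute-force search for the nearest other point.
        total += min(math.hypot(xi - xj, yi - yj)
                     for j, (xj, yj) in enumerate(points) if j != i)
    observed = total / n
    expected = 1.0 / (2.0 * math.sqrt(n / area))
    return observed / expected

# A regular 10 x 10 grid with spacing 0.1, treated as lying in a unit square.
grid = [(0.1 * i, 0.1 * j) for i in range(10) for j in range(10)]
ratio = clark_evans_ratio(grid, area=1.0)
```

Ratios below one suggest clustering and ratios above one suggest dispersion; the regular grid in the example gives a ratio of about two. The single summary value carries no scale parameter, which is the drawback noted in the text.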

Ripley’s K-function resolves this problem (Ripley 1976; Ripley 1979). It places circles around points and counts the number of other points inside the circles. The K-function then compares the count with that obtained under complete spatial randomness. While the K-function evaluates the global tendency of clustering, the scan statistic (Kulldorff and Nagarwalla 1995; Kulldorff 1997) focuses on local clusters of points. Placing circles of various sizes at various locations, the scan statistic compares the number of points inside the circles with that outside the circles. Unfortunately, the K-function and the scan statistic in their original forms do not consider the spatial inhomogeneity of points. The complete spatial randomness assumed as the null hypothesis is often too relaxed in the real world (Cuzick and Edwards 1990).
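The counting step of the K-function can be sketched as follows. This is a naive estimator without edge correction, scaled by A/(n(n − 1)); other normalizations exist, and the function names `k_function` and `csr_k` are hypothetical. Under complete spatial randomness the theoretical value is πr².

```python
import math

def k_function(points, r, area):
    """Naive K-function estimate: area * (ordered pairs within r) / (n * (n - 1)).
    Edge correction is omitted (an assumption of this sketch)."""
    n = len(points)
    pairs = 0
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            if i != j and math.hypot(xi - xj, yi - yj) <= r:
                pairs += 1
    return area * pairs / (n * (n - 1))

def csr_k(r):
    """Theoretical K-function under complete spatial randomness."""
    return math.pi * r * r

pts = [(0.0, 0.0), (0.1, 0.0)]
k_small = k_function(pts, 0.05, area=1.0)  # no pair within 0.05
k_large = k_function(pts, 0.20, area=1.0)  # both ordered pairs within 0.20
```

An estimate above `csr_k(r)` at radius r suggests clustering at that scale, and an estimate below it suggests dispersion.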

2.2 Methods considering the spatial inhomogeneity of points

A model-based approach is one option to control the spatial inhomogeneity of points. Spatial statistics have developed stochastic point processes that describe the spatial patterns of points (Cliff and Ord 1981; Diggle and Rowlingson 1994; Baddeley 2007). We can generate point patterns based on a spatial point process and compare them with an observed pattern. A difficulty lies in the choice of the point process. Appropriate choice requires us to have enough knowledge of point processes, which is not always satisfied, especially at an early stage of analysis.

An exploratory approach is another option, and many methods are available to treat spatial inhomogeneity (Kulldorff 2006 provides a comprehensive review). The k nearest neighbors (k-NN) test developed by Cuzick and Edwards (1990) is one of the most popular methods and is widely used, especially in epidemiology (Gatrell et al. 1996; Haining 2003; Diggle 2013). The test considers the location of disease cases and controls, and the null hypothesis randomizes individuals’ labels (case/control) without changing their locations to evaluate the degree of point clustering. Ripley's cross K-function is also applicable to evaluate point clusters under spatial inhomogeneity (Diggle 1983; Cressie 2015). Though it usually assumes complete spatial randomness as the null hypothesis, we can include spatial inhomogeneity by using random labeling (Lynch and Moorcroft 2008; Tao and Thill 2019). Cumulative and maximum χ2 tests are also often used to control spatial inhomogeneity (Hirotsu 1986; Lagazio et al. 1996; Rogerson 2006; Boulesteix and Strobl 2007). Though these χ2 tests were not originally developed for spatial analysis, they are applicable to treat spatial inhomogeneity.
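The random labeling idea behind the k-NN test can be sketched in a few lines. The statistic counts, for every case, how many of its k nearest neighbors are also cases, and the null distribution is obtained by shuffling the case/control labels over fixed locations. Function names, the brute-force neighbor search, and the (hits + 1)/(n_sim + 1) p-value convention are assumptions of this sketch rather than details taken from Cuzick and Edwards (1990).

```python
import math
import random

def k_nearest(points, k):
    """Indices of the k nearest neighbors of every point (brute force)."""
    result = []
    for i, (xi, yi) in enumerate(points):
        others = sorted((math.hypot(xi - xj, yi - yj), j)
                        for j, (xj, yj) in enumerate(points) if j != i)
        result.append([j for _, j in others[:k]])
    return result

def case_neighbor_count(neighbors, is_case):
    """Number of (case, neighbor) pairs in which the neighbor is also a case."""
    return sum(is_case[j]
               for i, nbrs in enumerate(neighbors) if is_case[i]
               for j in nbrs)

def random_labeling_test(points, is_case, k, n_sim=999, seed=0):
    """Monte Carlo p-value under random labeling with fixed locations."""
    rng = random.Random(seed)
    neighbors = k_nearest(points, k)
    observed = case_neighbor_count(neighbors, is_case)
    labels = list(is_case)
    hits = 0
    for _ in range(n_sim):
        rng.shuffle(labels)  # locations stay fixed, only labels move
        if case_neighbor_count(neighbors, labels) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_sim + 1)

# Hypothetical data: three cases close together, three controls far away.
points = [(0.0, 0.0), (0.1, 0.0), (0.25, 0.0),
          (10.0, 0.0), (10.1, 0.0), (10.25, 0.0)]
is_case = [True, True, True, False, False, False]
observed, p_value = random_labeling_test(points, is_case, k=1)
```

Because every point is equally likely to receive the case label in the shuffle, this sketch controls the spatial inhomogeneity of locations but not the inhomogeneity of point characteristics.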

A drawback of the above exploratory methods is that they do not consider the aspatial inhomogeneity, i.e., the inhomogeneity of point characteristics. These methods assume that all points have the same probability of being assigned a certain label, which is unrealistic in real-world situations and thus should be relaxed.

2.3 Methods considering the aspatial inhomogeneity of points

Matched case–control design is one solution to control the aspatial inhomogeneity, which is often used in experiment designs in medical and biological sciences (Chetwynd et al. 2001; Jacquez et al. 2005; Pearce 2016). The design considers characteristics of individuals, such as age or gender, and chooses the controls in such a way that the distribution of their characteristics is close to that of the cases. Though this method does not aim for spatial analysis, we can extend it into the spatial domain. A disadvantage is that it requires many individuals to be chosen as controls, especially when characteristics vary considerably among individuals.

Weighted random sampling is a procedure of selecting elements from a set according to a weighted probability distribution (Ahrens and Dieter 1985; Devroye 2006; Hübschle-Schneider and Sanders 2022). Unlike matched case–control design, weighted random sampling does not require many points. It is a candidate for controlling aspatial inhomogeneity in point cluster analysis.
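A minimal sketch of weighted random sampling without replacement is given below. It draws one element at a time with probability proportional to the weights of the elements not yet chosen. The function name `weighted_sample` is hypothetical, and the sequential scheme is one simple variant; it is not claimed to be the algorithm of any of the cited works.

```python
import random

def weighted_sample(weights, m, rng=None):
    """Draw m distinct indices, each draw proportional to the remaining weights.
    Assumes non-negative weights with at least m positive entries."""
    rng = rng or random.Random()
    candidates = list(range(len(weights)))
    chosen = []
    for _ in range(m):
        current = [weights[i] for i in candidates]
        pos = rng.choices(range(len(candidates)), weights=current, k=1)[0]
        chosen.append(candidates.pop(pos))  # remove so it cannot be drawn again
    return chosen

# Index 0 has zero weight and is therefore never selected.
sample = weighted_sample([0.0, 1.0, 2.0, 3.0], 3, random.Random(1))
```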

2.4 Methods considering the geographic scale of analysis

There are at least two approaches to representing the geographic scale of analysis. One is to use an absolute spatial measure, such as the distance between locations, as a scale parameter. The K-function, for instance, utilizes circles to count the number of points. The radius of circles works as a parameter of representing the analytical scale. Similarly, scan statistic uses circles to detect point clusters, where the circle radius is a scale parameter.

Another approach is to use a relative spatial measure. Cuzick and Edwards (1990) consider the kth nearest neighbor points, where k represents the analytical scale. Jacquez (1996) also considers the kth nearest neighbor point to analyze the space–time interaction in point distributions. The colocation quotient is defined based on the type of the kth nearest neighbor points (Leslie and Kronenfeld 2011).

The two approaches have both advantages and disadvantages. An advantage of absolute measures is that we can easily understand the role and effect of analytical scale since they are represented by real values measured on a concrete space (Rogerson 2006). Relative measures are not easily interpretable since the distance to the kth nearest neighbor point varies among locations, which yields difficulty in choosing appropriate k (Chetwynd et al. 2001; Song and Kulldorff 2003; Tango 2007). An advantage of relative measures is that they explicitly consider the spatial inhomogeneity in analysis (Leslie and Kronenfeld 2011). Absolute measures implicitly assume homogeneous space; thus, they are not directly applicable to point cluster analysis under spatial inhomogeneity.

As seen above, existing methods do not fully satisfy all three points of our demand, i.e., simultaneously considering spatial inhomogeneity, aspatial inhomogeneity, and analytical scale. However, they provide us with effective tools for challenging our problems. The randomization test is effective to control the spatial inhomogeneity. Extending weighted random sampling, we can treat the aspatial inhomogeneity of points. Concerning the representation of the geographic scale of analysis, we choose an absolute measure complemented by the randomization test to treat the spatial inhomogeneity. We will describe our method in detail in the following section.

3 Method

Suppose a region Ξ contains N points, denoted as Z1, Z2, …, ZN. Each point is labeled P or Q, which may represent cases of a disease or trees having birds’ nests mentioned in Sect. 1. NP and NQ denote the numbers of P and Q points, respectively. Our question is whether P points are clustered in the whole distribution. We assume a single characteristic of points considered closely related to the label, such as the age of individuals and the size of trees. We call this characteristic the attribute hereafter. The attribute plays a key role in controlling the aspatial inhomogeneity.

3.1 Relationship between the label and the attribute

This subsection discusses the relationship between the label and the attribute. There are two types of attributes: categorical variables and numerical variables. The following treats these cases successively.

We first assume that the attribute is a categorical variable. Suppose that N points represent buildings and that labels P and Q indicate buildings of fast food restaurants and other buildings, respectively. We classify these buildings into three categories, i.e., those in urban, suburban, and rural areas. The area category is the attribute of buildings. Fast food restaurants tend to be located in urban rather than suburban or rural areas, implying that buildings in urban areas are more likely to be labeled P. We calculate the ratio of the buildings of fast food restaurants in each of the three area categories, which indicates the tendency for a building to be labeled P. We use the ratio as the weight in the null hypothesis of the statistical test described in the next subsection. Buildings with larger weights are more likely to be labeled P.
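The categorical case can be illustrated with a short sketch that computes, for each category, the ratio of P points and assigns that ratio to every point in the category as its weight. The function name `category_weights` and the toy data are hypothetical.

```python
from collections import Counter

def category_weights(categories, labels):
    """Weight of each point = share of P-labeled points in its category."""
    totals = Counter(categories)
    p_counts = Counter(c for c, lab in zip(categories, labels) if lab == "P")
    ratio = {c: p_counts[c] / totals[c] for c in totals}
    return [ratio[c] for c in categories]

# Hypothetical buildings: two urban, four rural.
categories = ["urban", "urban", "rural", "rural", "rural", "rural"]
labels = ["P", "Q", "P", "Q", "Q", "Q"]
weights = category_weights(categories, labels)
```

In this toy example, urban buildings receive weight 0.5 and rural buildings receive weight 0.25, so urban buildings are twice as likely to be labeled P under the null hypothesis.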

We then consider the case where the attribute is a numerical variable. Again, we consider the labels P and Q, which indicate the type of building mentioned earlier. We take the floor size of buildings as the attribute. Assume that fast food restaurants avoid very small and very large buildings and prefer middle-sized buildings. The floor size distribution of fast food restaurants has a bell shape. We then fit a Gaussian distribution to the size distribution and estimate the probability distribution. The estimated distribution indicates the relationship between the type of building and floor size, i.e., the tendency for a building to be labeled P. Using the estimated distribution, we calculate the weight of each point. Log normal and beta distributions are alternative options if the size distribution is skewed. A logistic distribution is useful when the tendency of being labeled P or Q monotonically increases or decreases. This applies to the relationship between diabetes and body weight since overweight monotonically increases the risk of diabetes (Colditz et al. 1990; Feldman et al. 2017).
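For the numerical case, one possible reading of the procedure is sketched below: a Gaussian distribution is fitted to the attribute values of the P points, and the fitted density evaluated at each point's attribute value is used as its weight. Using the density value directly as the weight is our assumption, as are the function name and the toy data. `statistics.NormalDist.from_samples` estimates the standard deviation with the n − 1 denominator.

```python
from statistics import NormalDist

def gaussian_weights(sizes, labels):
    """Fit a Gaussian to the attribute of P points and use its density as the weight."""
    p_sizes = [s for s, lab in zip(sizes, labels) if lab == "P"]
    dist = NormalDist.from_samples(p_sizes)  # needs at least two P points
    return [dist.pdf(s) for s in sizes]

# Hypothetical floor sizes: P points occupy middle-sized buildings.
sizes = [10.0, 50.0, 55.0, 60.0, 200.0]
labels = ["Q", "P", "P", "P", "Q"]
weights = gaussian_weights(sizes, labels)
```

Middle-sized buildings receive the largest weights in this example, while very small and very large buildings receive weights close to zero.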

As above, we first clarify the relationship between the label and the attribute. The weight quantitatively measures this relationship and works as a control variable of aspatial inhomogeneity.

3.2 Evaluation of point clustering

This subsection evaluates the clusters of points labeled P. We first discuss local analysis and then move to the global analysis. The former aims to capture the spatial variation in point clusters, while the latter aims to understand the global tendency of point clusters.

The local analysis starts by drawing a circle of radius r at a location X, denoted by C(r, X). We count the points labeled P and Q in C(r, X), denoted by nP and nQ, respectively. The ratio of P points in C(r, X) is given by

$$\alpha \left( {r,X} \right) = \frac{{n_{P} }}{{n_{P} + n_{Q} }}.$$
(1)

We compare α(r, X) with the ratio of P points in Ξ, as done in scan statistics:

$$\alpha_{0} = \frac{{N_{P} }}{N}.$$
(2)

If P points are clustered in C(r, X), α(r, X) is larger than α0. We perform a Monte Carlo simulation to evaluate the statistical significance of α(r, X). The null hypothesis assumes that α(r, X) = α0, i.e., the probability that a point is labeled P is the same inside and outside C(r, X). The alternative hypothesis assumes that α(r, X) > α0, i.e., the probability that a point is labeled P is greater inside C(r, X) than outside it.
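Equations (1) and (2) translate directly into code. The sketch below is illustrative only; the function names are hypothetical, and returning `None` for an empty circle is our assumption because the text does not define α(r, X) when C(r, X) contains no points.

```python
import math

def local_ratio(points, labels, center, r):
    """alpha(r, X): share of P points among the points inside the circle C(r, X)."""
    n_p = n_q = 0
    for (x, y), lab in zip(points, labels):
        if math.hypot(x - center[0], y - center[1]) <= r:
            if lab == "P":
                n_p += 1
            else:
                n_q += 1
    if n_p + n_q == 0:
        return None  # undefined for an empty circle (assumption of this sketch)
    return n_p / (n_p + n_q)

def global_ratio(labels):
    """alpha_0: share of P points in the whole region."""
    return sum(1 for lab in labels if lab == "P") / len(labels)

points = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0)]
labels = ["P", "Q", "P", "P"]
alpha = local_ratio(points, labels, center=(0.0, 0.0), r=0.5)
alpha0 = global_ratio(labels)
```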

We extend the weighted random sampling as follows. We randomly label all the points without changing their locations in each simulation. A single simulation consists of N steps, which is equal to the total number of points. In each step, we choose a label, P or Q, and a point to be labeled following a statistical procedure. The probability that we choose a label is proportional to the number of points to be labeled. We denote the probabilities of choosing P and Q as sP and sQ, respectively. They are initially given by

$$s_{P} = \frac{{N_{P} }}{N}$$
(3)

and

$$s_{Q} = \frac{{N_{Q} }}{N},$$
(4)

respectively, and updated with a decrease in unlabeled points. The probability of choosing a point to be labeled is proportional to its weight. We denote the weight of Zi of labels P and Q as wPi and wQi, respectively. The probabilities of Zi being labeled P and Q are given by

$$t_{Pi} = \frac{{w_{Pi} }}{{\sum\limits_{j} {w_{Pj} } }}$$
(5)

and

$$t_{Qi} = \frac{{w_{Qi} }}{{\sum\limits_{j} {w_{Qj} } }},$$
(6)

respectively. We update these probabilities in the labeling process so that the summations of tPi and tQi are both equal to one. We repeat the above step until all the points are labeled. The following is the algorithm of the labeling process. Lines 5.4 and 6.4 update the probabilities of label choice, while lines 8 and 9 update the probabilities of point choice.

Algorithm 1 Algorithm PL (Point Labeling)

We call the above process the weighted random labeling hereafter. Points are labeled according to a probability distribution. We call ordinary random labeling the unweighted random labeling. All the points have the same weight and thus have the same probability of labeling. The weighted random labeling differs from the weighted random sampling in that the former assigns two labels in parallel while the latter assigns only one. Our approach is a generalized form of weighted random sampling and thus can be easily extended to treat more than two labels simultaneously.
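The labeling process described in the prose above can be sketched as follows. This is our reconstruction from the text, not a transcription of Algorithm PL: at each step a label is chosen with probability proportional to the number of points still to receive that label, and an unlabeled point is then chosen with probability proportional to its weight for that label. The function name, the argument names `w_p` and `w_q`, and the assumption that all weights are positive are ours.

```python
import random

def weighted_random_labeling(w_p, w_q, n_p, rng=None):
    """Assign label P to n_p points and Q to the rest, one point per step.
    w_p[i] and w_q[i] are the (positive) weights of point i for labels P and Q."""
    rng = rng or random.Random()
    n = len(w_p)
    remaining = {"P": n_p, "Q": n - n_p}
    unlabeled = list(range(n))
    labels = [None] * n
    while unlabeled:
        # Label choice: probability proportional to the labels still to assign.
        total = remaining["P"] + remaining["Q"]
        label = "P" if rng.random() < remaining["P"] / total else "Q"
        # Point choice: probability proportional to the weight among unlabeled points.
        source = w_p if label == "P" else w_q
        current = [source[i] for i in unlabeled]
        pos = rng.choices(range(len(unlabeled)), weights=current, k=1)[0]
        labels[unlabeled.pop(pos)] = label
        remaining[label] -= 1
    return labels

# Six points, three of which receive label P, as in the example of Fig. 1.
result = weighted_random_labeling([1, 2, 3, 4, 5, 6], [6, 5, 4, 3, 2, 1],
                                  n_p=3, rng=random.Random(0))
```

Renormalizing the weights over the unlabeled points at every step keeps the point-choice probabilities summing to one, and decrementing the remaining counts updates the label-choice probabilities. Setting all weights equal reduces the procedure to the unweighted random labeling.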

Figure 1 shows an example of the process of weighted random labeling. There are six points, three labeled P and the others labeled Q. Labeling progresses from top to bottom. The red indicates the point labeled at each step, while the blue represents the already labeled points. The second and third columns indicate the label and point chosen at each step.

Fig. 1 Weighted random labeling where Z1–Z6 denote six points. Three are labeled P, while the others are labeled Q. Labeling progresses from top to bottom. The red indicates the point labeled at each step, while the blue represents the already labeled points. The second and third columns indicate the label and point chosen at each step (color figure online)

We calculate the probability that α(r, X) or a larger value is obtained under the null hypothesis and denote it as β(r, X). We then define a measure

$$\gamma \left( {r,X} \right) = 1 - 2\beta \left( {r,X} \right).$$
(7)

The range of γ(r, X) is from −1 to 1. Positive values indicate that P points are clustered in C(r, X), while negative values indicate that P points are sparse in C(r, X).
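The Monte Carlo estimate of β(r, X) and the measure in Eq. (7) can be sketched as below. The labeling routine is passed in as the argument `relabel`, so that any null model, including a weighted one, can be used; the example uses a plain shuffle, which corresponds to the unweighted random labeling. Estimating β as the plain proportion of simulations reaching the observed count is our assumption, as are all names.

```python
import math
import random

def gamma_local(points, labels, center, r, relabel, n_sim=999, rng=None):
    """gamma(r, X) = 1 - 2 * beta(r, X), with beta estimated by simulation."""
    rng = rng or random.Random(0)
    inside = [i for i, (x, y) in enumerate(points)
              if math.hypot(x - center[0], y - center[1]) <= r]
    if not inside:
        return None  # undefined for an empty circle (assumption of this sketch)
    # The circle holds a fixed set of points, so comparing counts of P
    # is equivalent to comparing the ratios alpha(r, X).
    observed = sum(1 for i in inside if labels[i] == "P")
    hits = 0
    for _ in range(n_sim):
        simulated = relabel(labels, rng)
        if sum(1 for i in inside if simulated[i] == "P") >= observed:
            hits += 1
    beta = hits / n_sim
    return 1.0 - 2.0 * beta

def shuffle_labels(labels, rng):
    """Unweighted random labeling: permute labels over fixed locations."""
    simulated = list(labels)
    rng.shuffle(simulated)
    return simulated

# Hypothetical data: ten P points near the origin, ten Q points far away.
points = ([(0.01 * i, 0.0) for i in range(10)]
          + [(5.0 + 0.01 * i, 5.0) for i in range(10)])
labels = ["P"] * 10 + ["Q"] * 10
g_cluster = gamma_local(points, labels, (0.0, 0.0), 0.5, shuffle_labels)
g_sparse = gamma_local(points, labels, (5.0, 5.0), 0.5, shuffle_labels)
```

A circle containing only P points yields a value close to 1, while a circle containing no P points yields −1, since every simulation reaches an observed count of zero.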

Figure 2 shows point patterns where the weighted random labeling is expected to lead to the correct judgment of point clusters. Numbers indicate the weight of points to be labeled P. Circles indicate the local studied area C(r, X). Red and black points represent P and Q points, respectively. The red points in Figure 2a look spatially clustered, but this is because of their large weights. It is a pseudo cluster, and judging it as clustered corresponds to a Type I error in statistical tests. The red points in Figure 2b are weakly clustered and may not be regarded as a clustered pattern. However, their weights are very small, implying that these points are less likely to be labeled P. We should regard Figure 2b as a clustered pattern; overlooking it corresponds to a Type II error. We can similarly discuss the dispersed point patterns shown in Figure 2c and 2d. We should judge Figure 2c as not dispersed and Figure 2d as dispersed.

Fig. 2 Examples of point patterns where the weighted random labeling is expected to lead to correct judgment of point clusters. Numbers indicate the weight of points to be labeled P. Circles indicate the local studied area C(r, X). Red and black points are P and Q points, respectively. a Red points look spatially clustered, but it is because of their large weights, b red points are weakly clustered, but their weights are small, c red points are dispersed due to large weights, d red points are weakly dispersed, but their weights are small (color figure online)

We place a lattice on Ξ and calculate γ(r, X) at every lattice point. By visualizing the obtained γ(r, X) as a map, we can discuss the spatial variation in the clusters of P points. Like Ripley’s K-function, the radius r works as a parameter representing the geographic scale of analysis (Lam and Quattrochi 1992; Ruddell and Wentz 2009). A large value gives us a macroscale perspective, while a small value permits us to analyze the local spatial pattern in detail.
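The lattice step can be sketched as follows. The square lattice, its spacing parameter `step`, and the helper names are assumptions; the text does not specify the lattice geometry. Any local measure, such as γ(r, X) for a fixed r, can be passed as the argument `measure`.

```python
import math

def lattice_points(xmin, ymin, xmax, ymax, step):
    """Square lattice covering the bounding box of the region."""
    nx = int(math.floor((xmax - xmin) / step + 1e-9)) + 1
    ny = int(math.floor((ymax - ymin) / step + 1e-9)) + 1
    return [(xmin + i * step, ymin + j * step)
            for j in range(ny) for i in range(nx)]

def evaluate_on_lattice(lattice, measure):
    """Evaluate a local measure at every lattice point, ready for mapping."""
    return [(x, y, measure((x, y))) for x, y in lattice]

lattice = lattice_points(0.0, 0.0, 1.0, 1.0, 0.25)
# Placeholder measure for demonstration: distance from the origin.
values = evaluate_on_lattice(lattice, lambda c: math.hypot(c[0], c[1]))
```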

We then move to the global analysis. Our question is whether P points are clustered across the region Ξ. If P points are clustered, γ(r, X) varies across locations, while γ(r, X) is uniform when points are dispersed. We thus consider the variance of γ(r, X):

$$\lambda \left( r \right) = \sum\limits_{X} {\left\{ {\gamma \left( {r,X} \right) - \overline{\gamma }\left( {r,X} \right)} \right\}^{2} } .$$
(8)

A large λ(r) indicates that P points are clustered, while a small value indicates a dispersed pattern. We randomize the labels using the earlier method to evaluate the statistical significance of λ(r). We denote Λ(r) as the probability that λ(r) or a larger value is obtained under the null hypothesis. We then define a measure

$$\varphi \left( r \right) = 1 - 2\Lambda \left( r \right).$$
(9)

The measure φ(r) ranges from − 1 to 1. Like λ(r), a large φ(r) indicates a clustered pattern of points, while a small value indicates a dispersed pattern.
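Equations (8) and (9) can be sketched as below. We read the mean in Eq. (8) as the average of γ(r, X) over all lattice points, which is an assumption, and we estimate Λ(r) as the proportion of simulated λ(r) values that are at least as large as the observed one. The function names and the toy numbers are hypothetical.

```python
def lambda_stat(gammas):
    """lambda(r): sum of squared deviations of gamma(r, X) from its lattice mean."""
    mean = sum(gammas) / len(gammas)
    return sum((g - mean) ** 2 for g in gammas)

def phi_stat(lambda_observed, lambda_simulated):
    """phi(r) = 1 - 2 * Lambda(r), with Lambda(r) estimated from simulated values."""
    hits = sum(1 for value in lambda_simulated if value >= lambda_observed)
    return 1.0 - 2.0 * hits / len(lambda_simulated)

lam = lambda_stat([1.0, -1.0])
phi = phi_stat(2.0, [0.5, 1.0, 3.0, 0.1])
```

Obtaining the simulated λ(r) values requires recomputing γ(r, X) at every lattice point for each randomized labeling, which is the computationally expensive part of the global analysis.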

4 Applications

To test the validity of the proposed method, we perform two applications. One uses a hypothetical dataset, while the other uses a real dataset. We wrote two programs in C++ and ran them on an i9-12900U CPU 2.40 GHz, RAM 128 GB computer running Windows 10 Professional.

4.1 Application to hypothetical dataset

This subsection evaluates the proposed method using point distributions, each of which consists of 1000 points in a square of side 1.0. We generated 1000 distributions and evaluated their degree of clustering by the nearest neighbor method (Clark and Evans 1954; Diggle 1983). We chose five distributions whose degree of spatial clustering ranked at the 10th, 30th, 50th, 70th, and 90th percentiles, denoted by D10, D30, D50, D70, and D90. Concerning r, we tried five values, r = 0.02, 0.04, 0.06, 0.08, and 0.10, which led to 5 × 5 = 25 settings. We generated ten sets of weights from the Gaussian distribution of mean 0 and variance 1 for each setting and obtained 1000 labeling patterns according to the weights. We chose five significant and five insignificant clustering label patterns at a five percent level based on φ(r). To evaluate the statistical significance of these patterns, we performed the Monte Carlo simulation at a five percent level based on the unweighted and weighted random labeling.

Table 1 shows the numbers of type I (false positive) and type II (false negative) errors in 10,000 experiments in each setting. Acceptable levels of type I and II errors are often said to be 5 and 20 percent, respectively (Swinscow and Campbell 2002; Suresh and Chandrashekara 2012). Experiments generally satisfy these requirements except for the type I error of the unweighted random labeling in Table 1a. The result clearly shows that the weighted random labeling reduces statistical errors. Type I errors were reduced in all 25 settings in Table 1a. Type II errors were reduced in 17 settings in Table 1b, which is statistically significant by the binomial test (p-value 0.022).

Table 1 The number of errors in 10,000 experiments in each setting. (a) Type I errors, (b) Type II errors

4.2 Application to a real dataset

This subsection analyzes the spatial pattern of pubs in Shinjuku-ku, Tokyo. Our aim was to evaluate whether pubs are clustered among all the restaurants. We used telephone directory data provided by the NTT TownPage Corporation and building footprint data provided by the Zenrin Corporation. Figure 3 shows the restaurant distribution in Shinjuku-ku. This area contains 4187 restaurants, and 1382 of them are pubs.

Fig. 3 The distribution of restaurants in Shinjuku-ku

Pubs prefer small buildings. We thus considered the floor size as the weight for evaluating pub clusters. Figure 4 shows the histogram of the floor size of pubs. We fitted the lognormal distribution to these data by the maximum likelihood method and obtained the distribution represented by the red line in the figure, where (µ, σ2) = (2.474, 0.462). We defined the probability that the ith building is assigned to other types of restaurants by

$$t_{Qi} = 1 - \frac{{w_{Pi} }}{{\sum\limits_{j} {w_{Pj} } }}.$$

Fig. 4 Histogram of floor sizes of pubs and the lognormal distribution fitted to the floor size distribution
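The maximum likelihood fit of a lognormal distribution has a closed form: µ is the mean of the log-transformed values and σ² is their variance with the denominator n. The sketch below illustrates this together with the lognormal density. The function names and the toy data are hypothetical, and using the density at a building's floor size as its weight wPi is our reading of the text.

```python
import math

def fit_lognormal(values):
    """Maximum likelihood estimates (mu, sigma^2) of a lognormal distribution."""
    logs = [math.log(v) for v in values]
    mu = sum(logs) / len(logs)
    sigma2 = sum((v - mu) ** 2 for v in logs) / len(logs)  # MLE divides by n
    return mu, sigma2

def lognormal_pdf(x, mu, sigma2):
    """Density of the lognormal distribution with log-scale parameters mu, sigma^2."""
    return (math.exp(-(math.log(x) - mu) ** 2 / (2.0 * sigma2))
            / (x * math.sqrt(2.0 * math.pi * sigma2)))

# Toy data chosen so that the estimates are mu = 2 and sigma^2 = 1.
mu, sigma2 = fit_lognormal([math.exp(1.0), math.exp(3.0)])
weight_small = lognormal_pdf(math.exp(1.0), mu, sigma2)
weight_large = lognormal_pdf(math.exp(3.0), mu, sigma2)
```

The fitted density is right-skewed, so smaller floor sizes near the mode receive larger weights than equally distant larger ones on the log scale.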

We first performed the local analysis. We performed the Monte Carlo simulation 10,000 times to obtain γ(r, X) at 6173 lattice points. The calculations were completed within 100 min in all the cases. The following shows the results when r = 500, 250, and 125 m.

Figure 5 shows the distribution of γ(r, X) where r = 500 m. The two figures show the unweighted and weighted random labeling results, respectively. Red colors indicate pub clusters, while blue colors are sparse areas. Both figures show that pubs are clustered around the Shinjuku and Yotsuya stations. In contrast, pubs are clustered around the Takadanobaba station only in Fig. 5a and the Iidabashi station only in Fig. 5b. Figure 5a does not consider the floor size of buildings, while Fig. 5b uses the floor size distribution as the weight. Pubs tend to be located in small buildings, as shown in Fig. 4. Figure 5 suggests that small buildings are clustered around the Takadanobaba station, while few are clustered around the Iidabashi station. The red color around Takadanobaba station in Fig. 5a appears because of the clusters of small buildings rather than because of the pubs. They are pseudo clusters.

Fig. 5 The distribution of γ(r, X) where r = 500 m. a Unweighted random labeling, b weighted random labeling

Figure 6 shows the distribution of γ(r, X) where r = 250 m. The geographic scale of analysis is smaller; thus, the figures provide detailed patterns of pub clusters. Red colors exist around the Takadanobaba station in Fig. 6a and the Iidabashi station in Fig. 6b. This is consistent with Fig. 5. One difference lies in the area around the Takadanobaba station, as shown in Fig. 6b. The figure indicates that pubs are clustered west of the Takadanobaba station, which is unclear in Fig. 5b. Another difference is the blue colors around the Shinjuku station in Fig. 6b. The pubs are not clustered close to the Shinjuku station.

Fig. 6 The distribution of γ(r, X) where r = 250 m. a Unweighted random labeling, b weighted random labeling

Figure 7 shows the distribution of γ(r, X) where r = 125 m. Figure 7b shows a more detailed spatial pattern of pub clusters. Pub clusters around the Shinjuku station exhibit more complicated shapes. Pub clusters appear at the center of Shinjuku-ku, which could not be detected in Figs. 5 and 6. Two clusters west of the Takadanobaba station are divided into three clusters, as shown in Fig. 7b.

Fig. 7 The distribution of γ(r, X) where r = 125 m. a Unweighted random labeling, b weighted random labeling

Table 2 shows φ(r), which represents the clustering tendency at the global scale in Shinjuku-ku. Large positive values indicate that the pubs are highly clustered at these scales. The values are different between the unweighted and weighted random labelings. This finding supports the importance of considering floor size when evaluating pub clusters.

Table 2 The measure φ(r) where r = 500, 250, and 125 m

5 Conclusion

This paper proposed a new method for evaluating point clusters. The measure γ(r, X) is useful for discussing the spatial variation in point clusters, while φ(r) reflects the global tendency of point clusters. To test the validity of the method, we first applied it to a hypothetical dataset. The result statistically supports the advantage of the weighted random labeling. We then applied the method to the analysis of the spatial pattern of pubs in Shinjuku-ku, Tokyo. Empirical findings are useful and support the effectiveness of the proposed method.

An advantage of our method is that it considers all the three important points discussed in Sect. 1, i.e., spatial inhomogeneity, aspatial inhomogeneity, and analytical scale. The method, however, is not free of limitations. We discuss them and extensions for future research.

Firstly, this paper considers a numerical variable as the point attribute. Sect. 3.1, on the other hand, also mentions categorical variables as the attribute. Categorical attributes of buildings include their structure, availability of parking lots, surrounding land use, and so forth. Weight calculation is easier for categorical variables than for numerical variables. This, however, does not assure that the proposed method works successfully for categorical variables. Further applications are required to test the effectiveness of our method.

Secondly, this paper adopts an absolute measure to represent the geographical scale of analysis. As discussed in Sect. 2.4, however, relative measures have their advantages. One relative approach is to replace the number of points in circle C(r, X) with that within the kth nearest neighboring points. This approach does not require us to modify the proposed method substantially. It is worth trying relative measures while resolving the difficult problem of choosing an appropriate k.

Thirdly, we should extend the proposed method to the spatiotemporal domain. Spatiotemporal point clusters have long been discussed in the literature (Diggle et al. 1995; Kulldorff et al. 1998; Alvarez et al. 2016). It may seem easily achievable by replacing the circle C(r, X) with a cylinder. This approach, however, has two problems. Firstly, the scale of analysis depends on two variables, i.e., the radius and height of the cylinders. We will obtain various results, and the comparisons and interpretations of these results may be difficult. Secondly, the computing time will increase. An efficient algorithm is again necessary.

Fourthly, this paper considers the clusters of two labels represented as P and Q. Clusters, however, can occur where more than two labels exist. The colocation quotient developed by Leslie and Kronenfeld (2011) considers the colocation of more than three types of points. We can extend our approach to treat more than two labels, as mentioned in Sect. 3.2. An extension in this direction seems fruitful and interesting.

Fifthly, this paper assumes categorical labels. Consideration of numerical labels is a useful extension. A question is whether points of similar numerical values are clustered, which is equivalent to the question of spatial autocorrelation analysis. Existing spatial autocorrelation measures use unweighted randomization in statistical tests. Extending our method, we may be able to introduce weighted randomization in spatial autocorrelation analysis.