1 The Voronoi Tessellation in a Spatial Galaxy Distribution: First Works and Basic Approach

The geometrical methods based on the Voronoi diagram deal with a partitioning of space into regions in a specific subset of generators. It was named after Georgy F. Voronoi (April 28, 1868, Zhuravka village, Chernihiv region, Ukraine—Nov 20, 1908, Warsaw, Poland), the outstanding Ukrainian mathematician [76, 89], who studied the general n-dimensional case of these diagrams [101, 102].

In 1984, Matsuda and Shima advanced the idea to apply the Voronoi tessellation method for describing the cellular structure of the Local Universe [63], finding a topological tendency of galaxies “to cluster at the vertices, edges and faces of polyhedral shaped voids”. In 1987, Ling demonstrated that the Voronoi tessellation and the Minimal Spanning Tree being applied to the CfA Redshift Survey of galaxies (the first survey to map the large-scale structure of the Universe) are able to detach filamentary structures and voids [60]. In 1989, Yoshioka and Ikeuchi proposed three-dimensional Voronoi tessellation as a model of the evolution of the negative density perturbations regions, which resulted in the overlapping of shells while the modeled skeleton can be compared with real observed structures and with mass distribution correlation functions [107].

For the first time, the Voronoi tessellation was considered in detail as a pattern of matter distribution in the Universe in work by Icke and Weygaert [48] and series of their following works [49, 50, 104]. These authors concluded that the regions of lower density become more spherical with evolution and matter floods away from expansion centers and accrues at the borders of packing of spheres. This leads to the partition of space on the Voronoi tessellation with nuclei in the centers of low-density regions called the voids. High-density regions—clusters of galaxies—lie at the crossing of vertexes of adjacent cells, filaments at the edge of cells, and pancakes of large-scale structure (LSS) are faces of cells (Fig. 1, right). Sheth et al. [83] have developed its idea and considered the model of a void created in the frame of the Voronoi tessellation paradigm.

The Voronoi tessellation can be constructed as follows. Let us consider a Voronoi cell of finite size in N-dimensional space (usually N \(=\) 2 or N \(=\) 3), where a fixed number of points is distributed according to some statistical law (for example, the Poisson law). Suppose that each point is the center of a spherical expanding bubble structure. If all these structures begin to expand at the same moment with the same rate, the bubbles will be touched in planes that perpendicularly bisect the lines connecting the centers of expansion. These bisecting planes, in turn, intersect each other. As a result of this process, new lines will be generated, which in turn intersect each other and form a network. Using an adopted terminology, we will call such a center of the cell as a nucleus. So, each nucleus will be enclosed by a set of (N −1)—dimensional planes forming a convex cell. Distribution of nuclei forms the Voronoi tessellation.

The realization of Voronoi tessellations for a certain number of expanding nuclei, which is known as the Voronoi foam, can be found in [48, 104]. In the case of two-dimensional realization, the construction of a Voronoi cell consists of the search for all the Delaunay triangles having three nuclei (the center of the circumscribing circle is a vertex of the Voronoi foam). The program proposed by the authors [48], allows one to find all the Delaunay triangles having \(N_{1}\) as a corner and construct the Voronoi cell belonging to \(N_{1}\) by joining the circumcentres of the Delaunay triangles. Having applied this procedure to all nuclei, we obtain the Voronoi tessellation.

The process of forming the Voronoi tessellation is shown in Fig. 1 (left). The points \(N_{0}\), \(N_{1}\), \(N_{2}\) form a Delaunay triangle obtained in a previous search; corresponding Voronoi vertex V is shown within the (dashed) circumcircle of \(N_{0}\), \(N_{1}\), \(N_{2}\) as well as stubs of the Voronoi cell walls. On the left hand side of the diagram, the T are a sequence of trial points, the third of which produces a circle that encompasses two nuclei, \(P_{1}\) and \(P_{2}\). The radius of the circumcircle of (\(N_{1}\), \(N_{2}\), \(P_{1}\)) being smaller than that of (\(N_{1}\), \(N_{2}\), \(P_{2}\)), the point \(P_{2}\) is \(N_{3}\), i.e. the third corner of the Delaunay triangle. Thus, the circumcenter of (\(N_{1}\), \(N_{2}\), \(P_{1}\)) is the next Voronoi vertex which, if connected with V, produces a complete Voronoi cell wall [48].

Fig. 1
figure 1

(Left) The construction of a new Delaunay triangle from two known nuclei \(N_{3}\) such that (\(N_{1}\), \(N_{2}\), \(N_{3}\)) forms a triangle whose circumsphere does not contain any other nucleus in the Voronoi tessellation. (Right) Identification of the four quantities which were calculated in each Voronoi cell: \(l_{i}\), the length of wall i; \(\alpha \), the angle between two walls meeting at vertex; \(d_{w}\), the distance between the nucleus and a wall, where the projection of the nucleus doesn’t necessarily lie on the wall ([48], open astronomy)

The obtained results could explain the heuristic models that supposing Voronoi tessellations as 3D templates for the galaxy distribution as well as could reproduce a variety of galaxy clustering properties. In an ideal scenario, the LSS is organized by equal spherical voids expanding at the same rate. The walls and filaments would be found precisely between expanding voids, and the resulting LSS web skeleton would the Voronoi tessellation.

The Voronoi tessellation method was picked up and also thrived in our research on a spatial galaxy distribution since 1990-is [92] that allowed us to obtain several priority results. Namely, we elaborated three main approaches in Voronoi tessellation application: (1) to describe a cosmic web skeleton in matter distribution as a Voronoi tessellation with nuclei at low-density regions; (2) to use Voronoi tessellation as a tool for direct measurement of galaxy local concentration and environmental description of low-populated galaxy systems such as triplets, pairs, and isolated galaxies; (3) to apply Voronoi diagrams altogether with machine learning methods for 3D mapping of the Zone of Avoidance of our Galaxy [97, 99], where Generative Adversarial Network (GAN) algorithms are very useful [1, 36]. In particular, Coutinho et al. [19] performed verification of various algorithms that can reproduce the cellular structure of the Universe. By comparing the simulated distributions with real observational data, these authors showed that the best algorithm uses the nearest neighbour parameter between galaxies, and that network algorithms can be improved to reproduce the large-scale structure of the Universe.

We give examples in Sect. 2, how manner our developed approach is working. We briefly overview in Sect. 3 various astronomical research with the Voronoi diagrams, accentuating the papers related to the large-scale structure of the Universe, as well as we highlight in Sect. 4 several works and software, where the Voronoi tessellation and machine learning get along well with each other.

2 Voronoi Tessellation of the First, Second and Third Orders: Identification of the Low-Populated Galaxy Groups, Environment Effect, and Dark Matter Content

Because of Voronoi tessellation is a geometrical method based only on galaxy positions, it allows detaching overdensity regions of galaxies in comparison with the background [93]. We tested it with various samples of galaxies. First of all, we used the Local Supercluster of galaxies, which is well studied among other galaxy superclusters, for identifying galaxy groups of various populations. It was revealed that Voronoi’s tessellation method depends weakly on the richness-parameter of groups, and the number of galaxies in the rich structures is growing rather than in the weak structures with an increase of this parameter [64].

In the first-order Voronoi tessellation, the critical parameter is the volume of the galaxy’s Voronoi cell V. This parameter characterizes an environmental galaxy density. The condition of cluster/group membership of a particular galaxy is the relatively small V. This condition is actual when close neighbouring galaxies surround the galaxy. That is why the first order Voronoi tessellation is not corrected for the identification of small isolated galaxy systems [64].

We used the second-order Voronoi tessellation for the identification of pairs and single galaxies. Each galaxy i from set S forms the common cells with a certain number of neighbouring galaxies (Fig. 2). So, under neighbouring galaxies of galaxy i, we understand only galaxies that create common cells with this galaxy. For example, galaxy 1 creates only 4 common cells (\(V_{1,2}\) , \(V_{1,3}\) , \(V_{1,4}\) , \(V_{1,5}\)) with neighbouring galaxies 2, 3, 4, and 5, respectively. Each pair of galaxies i, j is characterized by the dimensionless parameters \(p_{i,j}\):

$$\begin{aligned} p_{i,j} = \frac{\root D \of {V_{i,j}}}{m_{i,j}}, \end{aligned}$$
(1)

where D—space dimension, \(V_{i,j}\)—the area (for 2D) or volume (for 3D) of cell, \(m_{i,j}\)—distance between galaxies i and j. So, contrary to the first-order tessellation, the second-order tessellation for set S distribution of nuclei is the partition of the space which associates a region \(V_{1,2}\) with each pair of nuclei 1 and 2 from S in such a way that all points in \(V_{1,2}\) are closer to 1 and 2 than other nuclei from S. Region \(V_{1,2}\) is a common cell for nuclei 1 and 2. However, these nuclei do not need to lie in the common cell. For example, nuclei 1 and 5 create the common cell \(V_{1,5}\), and they do not lie in this cell. In such a way, the second-order Voronoi tessellation is available for the identification of single galaxies and pairs (Fig. 2b).

Let us introduce the parameter \(p_{e}\), which describes only pair environment and does not depend on the distance between pair members directly. We define it as the mean value of \(p_{j}(1)\) and \(p_l\)(2) parameters of the first and second galaxy, excepting p from both sets:

$$\begin{aligned} p_{e} = \frac{\sum _{j = 2}^k p_j (1) + \sum _{l = 2}^n p_l (2)}{k+n-2}, \end{aligned}$$
(2)

where k and n—number of neighbouring galaxies for 1 and 2 galaxies of geometric pair, respectively. We started sums from j \(=\) 2 and l \(=\) 2 for excepting 2p, because the first galaxy is neighbour for the second galaxy and vice versa. Therefore (\(k + n-2\)) is sum of neighbouring galaxies of pair members excepting of pair galaxies as neighbouring for each other. Parameter \(p_{e}\) depends on the distribution of neighbouring galaxies. A small value of \(p_{e}\) points out that such a pair is located in a loose environment. In such case the average volume of common cells of pair components with neighbouring galaxies is relatively small, and distance between them is significant, see formula (1) and Fig. 3a.

Fig. 2
figure 2

2D Voronoi tessellation of the first- (a), second- (b) and third- (c) order for the same distribution of the random nuclei ([34], open astronomy)

Fig. 3
figure 3

Different configurations of the galaxies: isolated close pair (a) and isolated single galaxy (b) in the second-order tessellation; isolated close triplet in the third-order tessellation (c) ([34], open astronomy)

A single galaxy is a galaxy, which is not a member of any geometric pair. The single galaxies are field galaxies in the environment of geometric pairs. Every single galaxy has the own neighbours; single galaxies and geometric pair members can be among them. According to the second-order Voronoi tessellation, the larger is the degree of galaxy isolation, the larger is the number of neighbours (see Fig. 2b in comparison with Fig. 3b), but these neighbours locate farther. The best parameter that describes the isolation degree of the single galaxy, s, is the mean value of all parameters \(p_{j}\) of this galaxy:

$$\begin{aligned} s = \frac{\sum _{j = 1}^k p_j}{k} \end{aligned}$$
(3)

The third-order Voronoi tessellation is appropriate for the identification of galaxy triplets. It is the partition of the space which associates a region \(V_{1,2,3}\) with each triplet of nuclei 1, 2, 3 in such a way that all points in \(V_{1,2,3}\) are closer to nuclei 1, 2, 3 than other nuclei from S [59]. All points of the common triplet’s cell are closer to galaxies of this triplet than to other galaxies. Similarly to the parameter \(p_{i,j}\) for pairs, we can set up the parameter \(t_{i,j,u}\) for triplets:

$$\begin{aligned} t_{i,j,u} = \frac{\root D \of {V_{i,j,u}}}{max(m_{i,j}, m_{i,u}, m_{j,u}),} \end{aligned}$$
(4)

where D is the space dimension, \(V_{i,j,u}\) is the area (for 2D) or volume (for 3D) of the cell, and \(m_{i,j}\), \(m_{i,u}\), \(m_{j,u}\) are the distances between galaxies in the triplet. A geometric triplet in the third-order Voronoi tessellation contains three galaxies that have a common cell and the same maximal parameters \(t_{max}(1) = t_{max}(2) = t_{max}(3) = t\). The parameter t characterizes a degree of geometric triplet isolation. We can define the parameter of triplet environment \(t_{e}\) as the mean value of parameters \(t_{i}(1)\), \(t_{j}(2)\), and \(t_{u}(3)\), except t from three sets:

$$\begin{aligned} t_{e} = \frac{\sum _{i = 2}^k t_i (1) + \sum _{j = 2}^n t_j (2) + \sum _{u = 2}^q t_u (3)}{k+n+q-3} \end{aligned}$$
(5)

here in the case of the third-order Voronoi tessellation, k, n, and q denote the number of neighbouring triplets which contain galaxies 1, 2, and 3, respectively. Therefore, (\(k + n + q -\) 3) is the number of neighbouring triplets for a certain triplet that contain at least one galaxy from this triplet (see, Fig. 3).

Parameters p, s, and t are the basic ones and define the isolation degree of a galaxy pair, single galaxy, or triplet compared to the background, respectively. Parameters \(p_e\) and \(t_e\) are additional ones and contain information about the distribution of the neighbouring galaxies (environment). Similar to the second- and third-order Voronoi tessellations, it is possible to apply more high-order Voronoi tessellations to identify galaxy quartets and quintets, etc.

So, one can use galaxies as the nuclei of the Voronoi tessellation taking into account the equatorial coordinates \(\alpha \), \(\delta \) and radial velocities of galaxies \(V_h\) only. For the construction of the 3D Voronoi tessellations, it is necessary to determine the distances in 3D space. The spatial distance between two galaxies can be decomposed into projected (tangential) distance r and radial component v (difference of the radial velocities). We can determine the projected distance with a relatively high accuracy. Simultaneously, the radial component has errors due to the inaccuracy of radial velocity measurement of each galaxy and existing strong peculiar velocities (due to virial motions of galaxies in groups and clusters). As a result, the galaxy distribution in the radial velocities space is extended along the radial component, the so-called fingers-of-God effect. This is attributed to the random velocity dispersion in a galaxy volume-limited sample that cause a galaxy’s velocity to deviate from pure Hubble flow, stretching out a group of galaxies in redshift space [54, 66]. Various authors take into account this effect in their way, depending on the specifics of their problem. For example, Marinoni et al. [62] chose some cylindrical window of clustering, which is extended along the radial component. We introduced the weight for a radial component [34], avoiding the problem of tangential and radial distance in equivalence to apply the high-order 3D Voronoi tessellation method.

An efficient way to show Voronoi tessellation advantages was to apply it to the galaxy samples from the Local Supercluster [64, 94, 96] and the Sloan Digital Sky Survey (SDSS), where at the first time we examined it for spectroscopic aims [34, 65, 67, 70, 77, 95]. We did not consider galaxies that located within \(2^{o}\) near borders, because the correct estimation of Voronoi cell volume is not possible in this case. Selecting single galaxies and pairs by the second-order Voronoi tessellation, as well as triplets by the third-order Voronoi tessellation method, we obtained 2196 geometric pairs, 1182 triplets, and 2394 single galaxies. We did not make a clear division between physical gravitationally bound systems and non-physical ones, following the supposition that the more isolated a system is, the higher probability that it is physical (compact pairs are with \(R_h<\) 150 kpc and triplets are with \(R_h<\) 200 kpc).

Fig. 4
figure 4

Mass-to-luminosity ratio diagram for galaxy systems of different population (star clusters, galaxies, galaxy groups, clusters, and superclusters), where the result for the low-populated groups [66] is pointed ([96], open astronomy)

Estimating the dark matter content in the low-populated groups, we obtained the median values of mass-to-luminosity ratio [34]: \(12M_{Sun}/L_{Sun}\) for the isolated pairs and \(44M_{Sun}/L_{Sun}\) for the isolated triplets. Note that for the most compact pairs and triplets (with R < 50 (100) kpc, respectively) there is not a very large difference in dark matter content for pairs and triplets: \(7~M_{Sun}/L_{Sun}\) and \(8~M_{Sun}/L_{Sun}\). The mass-to-luminosity ratio diagram for galaxy systems of different population (star clusters, galaxies, galaxy groups, including the low-populated ones, clusters, and superclusters) is presented in Fig. 4. Several examples of isolated triplets of galaxies are given in Fig. 5. We conclude about the dark matter distribution that for the dynamically younger sparsely groups (triplets), dark matter is more likely associated with the individual galaxy halos, for the interacting and late sparsely groups the dark matter lies in a common halo of galaxy groups.

Using an inverse volume of Voronoi cell (1/V) as a parameter describing the local environmental density of a galaxy, we considered the volume-limited SDSS (DR5 and DR9) galaxy samples (\(0.02<z<0.1\), \(-24<M_r<-19.4\)) [26, 27, 30, 67] and found that

  • the early type galaxies prefer to reside in the Voronoi cells of smaller volumes (i.e., dense environments) than the late type galaxies, which are located in the larger Voronoi cells (i.e., sparse environments);

  • the relationships between the morphological types and the \(u-r\), \(g-i\), and \(r-z\) color indices of pairs of galaxies with radial velocities \(3000<V<9500\) km/s evident that the Holmberg effect is not revealing, by the other words, it can be considered only in historical aspect [28];

  • properties of such small groups as pairs and triplets, where segregation by luminosity was clearly observed, are fit well to Dressler effect: galaxies in isolated pairs and triplets are on average two times more luminous than isolated galaxies;

  • the dependence of the color indices and stellar magnitudes is effective for the automated morphological classification of the galaxies (E—early types, L—late types).

Fig. 5
figure 5

The interacting (VV894), most compact (KTG39), and wide triplets of galaxies, where \(S_{v}\) is the rms velocity of galaxies with respect to the triplet centre, \(R_{h}\)—harmonic mean radii of the triplet, \(\tau = 2H_{0}R_{h}/S_{v}\)—its dimensionless crossing time ([96], open astronomy)

The morphological types of the galaxies were divided into two classes: Early—E (from elliptical to lenticular) and Late—L (from spiral Sa to irregular Irr types). The absolute magnitude

$$\begin{aligned} M_{r} = m_{r} - 5log(V/H_{0})-25-K(z)-ext \end{aligned}$$
(6)

could be corrected for Galactic absorption ext in accordance with [81] and K—correction K(z) according to [16]. Here we used the CDM model of the Universe with the WMPS7 cosmological parameters (\(\Omega _{M} =\) 0.27, \(\Omega _{\Lambda } =\) 0.73, \(\Omega _{k} =\) 0, \(H_{0} =\) 0.71). In order to apply the Voronoi tessellation method we should done transition from equatorial coordinates and velocities to the comoving x, y, z coordinates for each central galaxy in the sample (\(M_r < -20.7\)). To do this we can transform the redshift z to the corresponding distance \(\chi (z)\) for each galaxy by integrating as follows

$$\begin{aligned} \chi (z) = D_{H}\int _{0}^{z} \frac{dz'}{E(z')} \end{aligned}$$
(7)

where \(D_{H}=c/H_{0}\) is the Hubble distance and \(E(z')\) is the Hubble parameter defined as follows

$$\begin{aligned} E(z') = \sqrt{\Omega _{M}(1+z)^3 + \Omega _{k}(1+z)^2 + \Omega _{\Lambda }} \end{aligned}$$
(8)

The coordinates x, y, z of the galaxies in the comoving space are determined as follows

$$\begin{aligned} x = \chi (z)cos(\theta )cos(\phi ) \end{aligned}$$
(9)

where (\(\theta \)) is the declination of each galaxy, (\(\phi \)) is the right ascension, and (\(\chi (z)\)) is the corresponding distance for redshift z. After getting the three-dimensional Cartesian coordinates of the galaxies, we divided the geometrical space occupied by the galaxy sample in mosaic cells (volumes V in the 3D case). Each cell has a galaxy as a nucleus and consists of elementary volumes of space closer to this galaxy than to any other galaxy [63]. The use of the Voronoi tessellation to isolate groups of galaxies in three dimensions has been described in detail by Melnyk et al. [64]. Figure 2a shows an example of the Voronoi tessellation in a two dimensional case. Let us use the value of inverse volume (1/V) of the Voronoi cells to describe the density of galaxy environments; when 1/V is higher, a galaxy is less isolated.

Fig. 6
figure 6

The distribution of the number of galaxies versus inverse volume of the Voronoi cell (local density parameter), with early morphological type E indicated by red lines and late type L indicated by blue lines, for different ranges of redshift; absolute stellar magnitude of galaxies selected from the SDSS at z < 0.1 is \(-22.5<M_{r}<-21.5\). The number of galaxies in each bin is normalized to the total number of \(E \div L\) within the given subsample. The number of central bright E and L galaxies is as follows: E = 1636, L = 459 for \(0.02<z<0.04\), E = 3609, L = 1247 for \(0.04<z<0.06\), E = 9432, L = 3596 for \(0.06<z<0.08\) [27]

Examples of the distributions of E and L galaxies versus inverse volume of the Voronoi cells that contain them are shown in Fig. 6. In work [27] we grouped galaxies from the SDSS sample at z < 0.1 into 4 logarithmic intervals \(1/V<0.001\), \(0.001<1/V<0.01\), \(0.01<1/V<0.1\), and \(1/V>0.1\) for four ranges of the redshift \(0.02<z<0.04\), \(0.04<z<0.06\), \(0.06<z<0.08\), and \(0.08<z<0.1\) (in the rows) and for different ranges of absolute stellar magnitude, \(-21.5<M_{r}<-20.7\), \(-22.5<M_{r}<-21.5\), and \(M_{r}<-22.5\). The number of galaxies in each bin for the E and L types is normalized to the total number of \(E \div L\) galaxies within the given subsample. Figure 6 shows that the fraction of galaxies of spiral and late types becomes larger while redshift increasing, while the fraction of early types, on the contrary, is smaller. It explains the well known evolutionary trend of a reduction in the number of galaxies with suppression of star formation for increasing redshift [21, 90], even at comparatively low redshifts down to \(z<0.1\). Also, for the brighter galaxies in the sample, the fraction of galaxies of earlier types is larger since, on the average, earlier types have higher luminosities (the well-known morphological type versus colour indices/luminosity relation) [8, 44, 73]. The brightest galaxies of earlier types with \(M_{r}<-22.5\) appear preferentially in denser environments: the peak of the distribution of the inverse volumes of the Voronoi cells for the E types lie within the interval \(0.01<1/V<0.1\), while in other intervals of \(M_{r}\), for the L types the peak of the distribution always is within \(0.001<1/V<0.01\) (the morphology-density relation [8, 28, 32, 44]).

We can also determine the density of galaxies in a Voronoi cell, including their faint satellites, i.e., galaxies with \(M_{r}>-20.7\): \((n+1)/V\), where n is the number of faint galaxies in the Voronoi cell, and V is the volume of the Voronoi cell. We also constructed distributions of early E and late L types galaxies in dependence on the parameter \((n+1)/V\) in four intervals: \((n+1)/V<0.01\), \(0.01<(n+1)/V<0.1\), \(0.1<(n+1)/V<1\), and \((n+1)/V>1\). The number of galaxies is normalized to the number of \(E \div L\) galaxies within the given range of \((n+1)/V\). We examined the density of galaxies only in the first two redshift intervals, since we cannot evaluate the evolution of their properties at a higher z because there are not enough faint galaxies. However, we can compare the galaxies’ environmental density as a function of the absolute magnitude and morphological type of the bright central galaxy. Thus, the fraction of early types of central galaxies increases with increasing environmental density, while, on the other hand, the fraction of late types decreases; that is, the earlier types are in a denser environment than the late types. When the central galaxy is brighter, the fraction of early types in a subsample will be larger [27, 100].

3 The Voronoi Tessellation in Astrophysical Research

Ebeling and Wiedenmann [33] were the first to apply the Voronoi tessellation for finding galaxy groups and clusters. Later such an approach was used by Ramella et al. [78], Kim et al. [56], Lopes et al. [61], Barrena et al. [6], Melnyk et al. [65], Panko and Flin [71]. Doroshkevich [31] introduced its for filaments and walls (1D and 2D LSS structures) as well as Neyrinck [68] for the search of voids in a spatial galaxy distribution.

We note some important earlier works as concerns with other applications of Voronoi diagrams to the large-scale galaxy distribution: for revealing the quasi-periodicities in a pencil-beam survey [51, 87], for a description of constraints on the Voronoi model when applied to the isotropic cosmic microwave background [17]. A significant contribution for Voronoi tessellation application to various astronomical tasks was made by Zanninetti, who considered two- and three-dimensional cases of the explosion scenario likely supernova events and developed a dynamical method allowing to describe the explosion phases [108, 109].

Ramella et al. [78] created a Voronoi Galaxy Cluster Finder, which uses positions and magnitudes of galaxies to define galaxy clusters and extract its parameters: size, richness, central density, etc. The 3D Voronoi tessellation for galaxy group identification was realized by Marinoni et al. [62] and Cooper et al. [18]. Weygaert et al. prepared a useful review of the spatial galaxy distribution with Delaunay and Voronoi tessellations [42, 105]. They discussed the Delaunay Tessellation Field Estimator (DTFE) and the concept of Alphashapes for matter distribution; the Multiscale Morphology Filter (MMF), which uses the DTFE for detachment of filaments, sheets, and clusters; the Watershed Voidfinder (WVF) to identify voids.

The era of big data surveys (see, for example, review in work by Vavilova et al. [98] accelerated the Voronoi diagrams application on a spatial galaxy distribution properties and environment influence: \(z = 0.1 - 3.0\), COSMOS survey [82]; z \(\le \) 0.5, Herschel-ATLAS/GAMA [9]; z < 0.1, Coma Supercluster [20]; z < 0.3, ALHAMBRA survey [79]. Söchting et al. used Voronoi tessellation within overlapping slices in the photometric redshift space (0.2\(<z<\)3.0). It allowed them to detach region \(z\sim 0.4\) with a slow emergence of virialized clusters accordingly to the hierarchical scenario and to detect new superclusters as the peaks of a matter distribution up to z \(=\) 2.9 [85]. As for the Voronoi tessellation cluster finder algorithms, we note the work by Soares et al., who developed it to produce reliable cluster catalogs up to \(z=1\) or beyond and down to \(10^{13.5}\) Msun. They built the Voronoi tessellation cluster finder in photometric redshift shells and used the two-point correlation function of the galaxies in the field to determine the density threshold for the detection of cluster candidates and to establish their significance [84].

A principal new galaxy cluster finder based on a 3D Voronoi Tessellation plus a maximum likelihood estimator, followed by gapping-filtering in radial velocity (\(VoML+G\)), was created. Pereira et al. [74, 75] applied it successfully to find optical clusters (\(R_{200}\)) in the Local Universe as well as Santiago-Batista et al. for the identification of continuous filaments in the environment of superclusters [80]. Grokhovskaya et al. developed filtering algorithms of multiparameter analysis of the large-scale distribution of galaxies (identification of galaxy systems and voids) in narrow slices in the entire range of redshifts of HS 47.5-22 constructing density contrast maps, namely with adaptive kernel and Voronoi tessellation [40]. The 3D Voronoi tessellation application to the DEEP2 survey was first introduced by Gerke et al. [38]. Meanwhile, Shen Ying et al. [106] proposed an algorithm which computes the cluster of 3D points by applying a set of 3D Voronoi cells and allows a 3D point cluster pattern can be highlighted and easily recognized.

Hung et al. have demonstrated that Voronoi tessellation Monte-Carlo mapping is beneficial for studying the environment effect on galaxy evolution in high-redshift large-scale structures (\(z\) \(\sim \)1) in the ORELSE survey (Observations of Redshift Evolution in Large Scale Environments) [47]. An exciting application of Voronoi tessellation was proposed by Lam et al. [57]: for constructing the white dwarfs luminosity functions they used parameters of proper motion and colours from the Pan-STARRS 1 3\(\pi \) Steradian Survey Processing Version 2; for improving the accuracy of the maximum volume method they used Voronoi tessellation space binning to recalculate photometric/astrometric uncertainties. It helped to estimate disk-to-halo dark matter ratio as 100. Another a non-parametric method for estimating halo concentration using Voronoi tessellation, TesseRACT, was proposed by Lang et al. [58], who showed that it fits well with non-spherical halos and more accurate at recovering intermediate concentrations for N-body halos than techniques that assume spherical symmetry.

The very interesting algorithm, Void Finder ZOBOV (ZOnes Bordering On Voidness), based on Voronoi tessellation, was proposed by Neyrinck et al. [68]. This algorithm finds density depressions galaxy distribution without free parameters. To estimate local density, it uses the Voronoi tessellation. One of the output of this algorithm is the probability that each void arises from Poisson fluctuations. However, Elyiv et al. [35] have demonstrated a weak spot for ZOBOV void finder. Voids are the lowest density regions, so any method that uses the positions of galaxies directly to measure density for identifying the voids is then prone to shot noise error since voids are the regions with a very low concentration of galaxies by definition (Fig. 7). The Void IDentification and Examination toolkit (VIDE) developed by Sutter et al. [88] includes the parameter-free void finder ZOBOV, where “Voronoi tessellation of the tracer particles is used to estimate the density field followed by a watershed algorithm to group Voronoi cells into zones and subsequently voids”.

Fig. 7
figure 7

The reconstructed displacement field (top panels) and its divergence (bottom panels) obtained with the two void finders, the Uncorrelating Void Finder (left-hand panels) and the Lagrangian Zel’dovich Void Finder (right-hand panels). The displayed region’s size is 80 \(\times \) 80 Mpc \(h^{-1}\), with a thickness of 5 Mpc \(h^{-1}\). Black dots represent dark matter haloes. The amplitude of the vector field components (red arrows) is reduced by a factor of 0.75, for visual clarity ([35], open access)

Zaninetti in series of works [110, 112] developed a practical statistics for the voids between galaxies with two new survival functions and considered the 3D distribution of the volumes of Poissonian Voronoi Diagrams to their 2D cross-sections in the assumption of gamma-function for the 3D statistics of the volumes of the voids in the Local Universe. He also conducted simulations [111] of a spatial galaxy distribution using the Poissonian Voronoi polyhedra and the 2dF Galaxy Redshift Survey and the Third Reference Catalog of Bright Galaxies; Zaninetti gives a brief overview of a current status of the research on the statistics of the Voronoi Diagrams in [113].

Among other astronomical tasks, the Voronoi diagrams have been used for image processing, adaptive smoothing, segmentation, for signal-to-noise ratio balancing [14], for spatial structure of the solar wind and solar-terrestrial connections [7], for spectrography data analysis in different electromagnetic regions [12, 13, 25], in the moving-mesh cosmology simulation [86] and [103] (AREPO Public Code), chemical evolution in the early Universe [15], star formation simulation [46], spatial distribution of lunar craters [45].

For example, Cabrera et al. [11] applied the Voronoi diagram for image reconstruction technique in the interferometric data based on the Bayesian approach. Cadha et al. proposed Voronoi compact image descriptors and showed that Voronoi partitioning improves the geometric invariance and performance of image retrieval [14] as well as they developed a Voronoi-based machine learning method (deep convolution neural network). As for the cosmological simulation, Busch and White [10] used Voronoi tessellation for a hierarchical tree structure that allowed them to associate local density peaks with disjoint subsets of particles and to analyze mass distribution at different levels of threshold. Similar to our work [27], when we introduced parameter of the volume of Voronoi cell to study environment influence on galaxies from the SDSS, Paranjape and Alam [72] also applied the inverse local number density parameter. They studied physical effects for such properties as halo (subhalo) mass, large-scale environment, etc. in various cosmological dark matter models and concluded that the Voronoi volume function gives a new mathematical instrument for galaxy evolution physics and dark sector study. Nightingale et al. developed the PyAutoLens software (https://github.com/Jammy2211/PyAutoLens), where the Voronoi tessellation is applied for reconstruction of strongly lensed galaxies.

Neyrinck developed the sectional-Voronoi algorithms in Python for cosmic-web research, because the Voronoi/Delaunay duals and origami tessellation give a wide class of spiderwebs. “Voronoi edges are perpendicular bisectors of their corresponding Delaunay edges; the ‘bisector’ part can be relaxed. Each Voronoi edge may be slid along its Delaunay edge, closer to one of the generators. They may not be slid entirely independently, though, since the Voronoi edges must still join vertices. There turns out to be one extra degree of freedom per generator, causing its cell to expand or contract. The result is a sectional-Voronoi diagram, a section through a higher-dimensional Voronoi tessellation. A generator’s extra degree of freedom in a sectional-Voronoi diagram can be thought of as its distance from the space being tessellated. A sectional-Voronoi diagram can also be thought of as a Voronoi tessellation in which each generator may have a different additive ‘power’ in the distance function used to determine which points are closest to the generator (thus an alternative term, power diagram). Ash and Bolker [2] showed that 2D spiderwebs and sectional Voronoi tessellations are equivalent” (cited by [69]). The package is available at https://github.com/neyrinck/sectional-tess, https://mybinder.org/v2/gh/neyrinck/sectional-tess/master.

In the present day, the Voronoi diagrams methods have many applications in various fields of science and technology as well as in social sciences and visual art [3, 4]. They are commonly used in computational fluid dynamics, computational geometry, geolocation and logistics, game dev programming, cartography, engineering, liquid crystal electronic technology, etc. For the first time, the Voronoi tessellation was utilized by Debnath et al. [22] for the discoveries in the particle physics beyond the Standard Model at the Large Hadron Collider at CERN. “Since such tessellations capture the underlying probability distributions of the phase space, interesting features in the data can be detected by studying the geometrical aspects of the ensemble of Voronoi cells” (cited by [23]). These methods allow identifying kinematic edges in two dimensions and generalize the technique for robust detection of phase space boundaries, which could be applied to discover new physics. An interesting library of “Voronoi Diagrams: Applications from Archaeology to Zoology” is collected by Scot Drysdale on the website https://www.ics.uci.edu/~eppstein/gina/scot.drysdale.html.

4 The Voronoi Tessellation and Machine Learning

Straight application of classical Voronoi diagram in Machine Learning is the k-nearest neighbors (k-NN) algorithm at the number of neighbors \(k=1\). In the case of the classifier, the output class is choosing among its k the closest neighbors. Each of them gives a contribution to the class with some weight. Normally weight is inverse to the distance between target object and neighbor (closer neighbors will have a stronger influence than further neighbors) or uniform (all points in neighborhood are weighted equally). If \(k=1\), then the object is just linked to the class of the nearest neighbor. From the other side, it could be interpreted as the building of the Voronoi diagram by training objects as nuclei of the diagram. The target object will have a class depending on which Voronoi cell it resides. Bring your data to life.

A set of programming codes for 1-NN visualization (\(k=1\)) with examples (Hover Voronoi, a demonstration of d3-Delaunay, Voronoi Labels, Voronoi neighbors, Voronoi update) are available on the website https://observablehq.com/@d3/ by Mike Bostock (2018). For the color image segmentation problem in computer vision, an adaptive and unsupervised clustering approach with Voronoi diagrams was introduced, which outperforms the existing algorithms [43]. A Python library “Pycobra” contains several ensemble machine learning algorithms and visualization tools based on the Voronoi tessellations [41]. It can be downloaded from the Python Package Index (PyPi) and Machine Learning Open Source Software (MLOSS) at https://github.com/bhargavvader/pycobra.

In the case of \(k>1\), we should use the concept of high order Voronoi diagram, where a cell represents the set of points in space closer to a given k nuclei that to all others (see, Sect. 2 and works by Elyiv et al. [34, 99]). In this case, k–order Voronoi space dividing can help us to find k-near neighbors directly. The crossing of high-order Voronoi diagram borders represent the set of k near neighbors. In k–NN regression, the output value for the target object is the average of the values of k nearest neighbors. If each neighbor has equal weight, it means that for each cell could be assigned pre-calculated averaged value. Next, if the target object resides in this cell, automatic assigned could be done. In all these cases, creating a Voronoi diagram on the training sample could make a faster k–NN algorithm application.

For example, Inkulu and Kapoor [53] presented an algorithm covering the Voronoi diagram with hyperboxes, which provides ANN queries. Another parallel spatial range query algorithm based on Voronoi diagrams and MR-tree, which is benefiting from the k-NN, is developed by Fu and Liu [37].

Voronoi diagram also has a wide application in deep learning. In work [5], the authors studied the geometry of Deep Artificial Neural Networks with piecewise affine and convex nonlinearities. The authors demonstrated that each layer’s input space partition corresponds to the Voronoi diagram with several regions that grow exponentially with increasing neurons. Numerical experiments support these theoretical results, which are expressed by the Deep ANN decision boundary in the input space, a measure of its curvature that depends on the network architecture, activation functions, and weights.

In work [52] the authors presented a Deep Convolution Neural Network (CNN) constructed on a Voronoi tessellation of 3D molecular structures of proteins (VoroCNN model). Both convolution and pooling operations were used as a part of network architecture to predict local qualities of 3D protein folds. They computed Voronoi tessellation of molecular 3D structures and converted them into a protein interaction graph. The graph’s critical property is that it implicitly keeps the information about the spatial relationship between the atoms of the protein model. The authors claim that for presently available amounts of data and computational resources, Voronoi tessellation is the best representation of 3D protein structure than raw volumetric data.

5 Instead of Conclusion

Today, hierarchical clustering is a common scenario for the evolution of galaxies. The fact that galaxies are observed mostly at redshifts to \(z \sim 5\), while the most distant observed galaxy clusters are at \(z \sim 2\), suggests that galaxies and sparsely populated groups were formed first, and galaxy clusters later by subcluster merging and/or via capturing galaxies and galaxy groups. The hierarchical clustering scenario is in good agreement with the cosmological \(\Lambda \)CDM model. Having great success in explaining the formation of the large-scale structure of the Universe as a whole, this model faces potentially severe problems on the scales of individual dark halo of galaxies and galaxy clusters, with statistics of the distribution of galaxies with different morphological types in a wide range of redshifts, with evolutionary properties of sparsely populated groups and galaxy clusters, with the lack of data on the large-scale structure of the Universe behind the Zone of Avoidance of the Galaxy.

In this context, we have demonstrated the perfection and elegance of the Voronoi tessellation in solving many astronomical problems, focusing on its effectiveness for describing the web of large-scale structures of the Universe and data mining of its properties at various redshifts from early epochs to the scales of the Local Universe.

Fig. 8
figure 8

(Left) Vortex theory applied to the Solar system (R. Descartes, 1644) ([4], open access). (Right) Illustration of the Voronoi tessellation for galaxy 2D web distribution

“God first partitioned the plenum into equal-sized portions, and then placed these bodies into various circular motions that, ultimately, formed the three elements of matter and the vortex systems” (cited by Descartes [24], vol. III, article 46, in [4]). “The modern view shoves baryogenesis, leptogenesis, WIMP–genesis, and all very far back in time, but builds up structure continuously, using not-very-special initial conditions and gravity (plus perhaps other forces) to develop what we see today. In between come some remarkable constructs, including Thomas Wright’s hierarchy, Descartes’s Voronoi tessellation of whirlpools in the ether, Alfred Russel Wallace’s (yes, the evolution guy) “Goldilocks” location for the Solar system, Cornelis Easton’s off-center spiral arms, and the Kapteyn Universe” (cited by  Trimble [91]). We have combined this representation, which is consonant with ours, in Fig. 8 as an illustration of partitioning the space into cells for the subsequent extraction of the physical essence of the phenomena: one of them displays classical physics, Vortex theory applied to the Solar system (Descartes, 1644), the other gives a visualization of galaxy distribution through the 2D-Voronoi tessellation.