1 Introduction

Particle shape profoundly affects the engineering behavior of coarse-grained soils. Experimental studies have shown that angular and elongated particles exhibit larger values of index void ratios, internal friction, dilatancy, constant volume friction angle, compressibility, and small-strain modulus than rounded and spherical soils [1,2,3,4,5,6,7,8,9,10,11,12,13,14]. With the development of computer modeling in geotechnical engineering, the discrete element method (DEM) has become a popular method to investigate the relationship between micro-level particle morphology and macro-mechanical behavior of granular soils. However, the typical DEM uses spherical approximations rather than the actual particle shapes to simulate individual particles, which cannot provide adequately accurate insight into the mechanical behavior of granular soils consisting of non-spherical particles [15, 16].

Three-dimensional (3D) imaging techniques have advanced considerably in the last two decades, and geotechnical engineers now use them to scan realistic 3D particle geometries for DEM simulations and other analytical research. The techniques used in geotechnical engineering include X-ray computed tomography (X-ray CT) [17,18,19,20,21,22,23], laser scanning [24,25,26], optical interferometry [27, 28], stereophotography [29,30,31], and structured light [32].

X-ray CT is an ideal technique for scanning 3D particle geometries. However, soil specimens for X-ray CT scans are typically only about 12 mm in diameter and 24 mm in height [17,18,19,20,21,22]. Scanning enough soil particles to simulate a triaxial test (diameter = 50 mm, height = 100 mm) therefore requires approximately 70 scans. In addition, analyzing X-ray CT images to separate air and solid particles requires extensive image processing skills and demanding computational effort, so performing many X-ray CT scans is not efficient. A more feasible approach is to perform one or a few X-ray CT scans to obtain shape characteristics, and then use these characteristics to generate as many particles as necessary for DEM simulations [33].
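The estimate of approximately 70 scans can be reproduced from the ratio of specimen volumes. The short sketch below assumes both the CT specimen and the triaxial specimen are cylinders and ignores differences in packing; the function names are illustrative only.

```python
import math


def cylinder_volume(diameter_mm, height_mm):
    """Volume of a cylinder in cubic millimetres."""
    return math.pi * (diameter_mm / 2.0) ** 2 * height_mm


def scans_needed(target, scan):
    """Ratio of target specimen volume to the volume covered by one scan."""
    return cylinder_volume(*target) / cylinder_volume(*scan)


# Triaxial specimen (50 mm x 100 mm) versus one CT specimen (12 mm x 24 mm).
ratio = scans_needed((50.0, 100.0), (12.0, 24.0))
print(round(ratio, 1))  # roughly 72, consistent with "approximately 70 scans"
```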

Many algorithms have been developed to generate realistic particles. Most of these algorithms are based on spherical harmonics techniques. For example, Grigoriu et al. [34] made early attempts to use spherical harmonics to generate realistic aggregates for concrete. Liu et al. [23] combined spherical harmonics with random field theory for sand particle generation. Zhou et al. [35] combined spherical harmonics and principal component analysis to generate realistic particles. Wei et al. [36] combined spherical harmonics with fractal dimension to generate realistic particles. Su and Yan [37] combined spherical harmonics with multivariate random vector techniques to generate realistic particles. These works enabled computers to generate realistic particles for DEM simulations and other mechanical analyses.

These existing studies aimed to generate random particles with morphology similar to that of a target soil. The spherical harmonics technique was used to analyze a large number of particles from a granular soil and extract their morphological properties, which were then used to generate as many particles as necessary. In contrast, this research aimed to clone a single particle. This was challenging because only the limited morphological information of a single particle was available. This research addressed the issue by developing a novel probability-based spherical harmonics technique. The spherical harmonics coefficients were extracted from the particle geometry to identify its morphological properties, which are analogous to the “morphological gene” of the particle. Then, a probability function was used to add variance to the spherical harmonics coefficients, creating a “gene mutation” of the morphological gene. This enabled a computer to introduce random morphological variations and thus generate different particle shapes. A dimensionless factor was defined to control the degree of gene mutation. Users can tune this controlling factor to set how much the generated particles vary from the original particle.

The proposed probability-based spherical harmonics technique is simple, effective, and versatile. It generates particles based on the morphological properties of a single particle and eliminates the need to scan many particles for particle generation.

This paper starts with an introduction to two-dimensional (2D) curve representation by the Fourier series as a simple version of 3D surface representation by spherical harmonics. Then, spherical harmonics and probability functions are integrated to develop the probability-based spherical harmonics technique for generating realistic particles. A series of computational geometry techniques are introduced to determine particle shape descriptors as a measure of morphological variance. Finally, the effectiveness and accuracy of the proposed technique are validated by comparing the shapes of generated particles and original particles.

2 Fourier series for representing a 2D curve

A Fourier series can be used to represent 2D curves [38, 39]. Therefore, the perimeter of a 2D particle can be represented by a Fourier series f(t) built from sine and cosine functions:

$$ f_{n} (t) = \frac{{a_{0} }}{2} + \sum\limits_{n = 1}^{N} {\left( {a_{n} \cos \left( {\frac{2\pi nt}{T}} \right) + b_{n} \sin \left( {\frac{2\pi nt}{T}} \right)} \right)} $$
(1)

where T is the period of the function; t is time; N is the highest harmonic retained in the sum; and \( a_{n} \) and \( b_{n} \) are the Fourier coefficients,

$$ \left\{ {\begin{array}{*{20}c} {a_{n} = \frac{2}{T}\int\limits_{T} {f(t)\cos \left( {2\pi t\frac{n}{T}} \right)} {\text{d}}t} \\ {b_{n} = \frac{2}{T}\int\limits_{T} {f(t)\sin \left( {2\pi t\frac{n}{T}} \right)} {\text{d}}t}. \\ \end{array} } \right. $$
(2)

For example, Fig. 1 illustrates different representations of a rectangle by Fourier series:

$$ \begin{aligned} f_{n} (t) &= \frac{4}{\pi }\left[ \cos (\omega_{0} t) - \frac{1}{3}\cos (3\omega_{0} t) + \frac{1}{5}\cos (5\omega_{0} t) + \cdots \right.\\ &\left.\qquad+\, ( - 1)^{{\frac{n - 1}{2}}} \frac{1}{n}\cos (n \cdot \omega_{0} t) + \cdots \right]\end{aligned}$$
(3)

where \( \omega_{0} = 2\pi /T \) is the fundamental angular frequency and n is the harmonic number. The function contains no sine terms because the rectangle is symmetric about the y axis (amplitude direction). In addition, the terms with even n values equal zero. As n increases, higher-frequency terms are included in f(t), so the rectangle reconstructed by f(t) becomes closer to the original rectangle, as shown in Fig. 1.
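The convergence described above can be checked numerically. The following is a minimal sketch of the partial sum of Eq. (3), in which the nth odd harmonic has amplitude 1/n (matching the listed terms 1, 1/3, 1/5); the function name and the unit period are assumptions made for illustration.

```python
import math


def rectangle_partial_sum(t, period, n_max):
    """Partial sum of Eq. (3) using the odd harmonics n = 1, 3, ..., n_max."""
    omega0 = 2.0 * math.pi / period
    total = 0.0
    for n in range(1, n_max + 1, 2):
        sign = (-1) ** ((n - 1) // 2)  # alternating sign of the odd harmonics
        total += sign * math.cos(n * omega0 * t) / n
    return 4.0 / math.pi * total


# The value at t = 0 approaches the plateau of the rectangle as terms are added.
for n_max in (1, 5, 51, 501):
    print(n_max, round(rectangle_partial_sum(0.0, 1.0, n_max), 4))
```

Evaluating the same function over one full period reproduces the progressively sharper rectangle profiles of the kind shown in Fig. 1.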

Fig. 1 Fourier series for representing a rectangle

Figure 2 shows the Fourier expansion of Eq. (3) in the time domain, frequency domain, and phase spectrum. Each term in Eq. (3) represents a cosine curve with a different frequency n and a different amplitude, as shown in Fig. 2. The superposition of all the cosine curves in the time domain represents the original rectangle shape. The shape in the time domain can be projected onto the frequency domain, which defines the relationship between frequency n and amplitude. The time domain image can also be projected onto the phase plane. Each red dot on the phase spectrum indicates the position of the first wave crest at a given n value. In this example, the bars at even n values in the phase spectrum have zero length, while the bars at odd n values have a length of \( \pi \) (the phase spectrum is defined over the range \( ( - \pi ,\pi ] \)).

Fig. 2 Fourier expansion in time domain, frequency domain, and phase spectrum

In summary, the time domain, frequency domain, and phase spectrum uniquely determine a curve, which ensures the uniqueness of the Fourier series. Therefore, a 2D shape, such as a soil particle perimeter, can be expressed either by a time domain image or by a frequency domain image with its phase spectrum, as shown in Fig. 2.
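As a numerical illustration of moving from the time domain to the frequency domain, the sketch below approximates the coefficients of Eq. (2) with a midpoint rule for an even rectangle wave of unit amplitude. The wave definition, the sample count, and the function names are assumptions for illustration, not part of the original method.

```python
import math


def rectangle_wave(t, period):
    """Even rectangle wave: +1 where cos(2*pi*t/T) > 0, otherwise -1."""
    return 1.0 if math.cos(2.0 * math.pi * t / period) > 0.0 else -1.0


def fourier_coefficients(f, period, n, samples=4000):
    """Approximate a_n and b_n of Eq. (2) with the midpoint rule."""
    a_n = 0.0
    b_n = 0.0
    for k in range(samples):
        t = (k + 0.5) * period / samples
        angle = 2.0 * math.pi * n * t / period
        a_n += f(t, period) * math.cos(angle)
        b_n += f(t, period) * math.sin(angle)
    return 2.0 * a_n / samples, 2.0 * b_n / samples


# Amplitude spectrum: sqrt(a_n^2 + b_n^2) for the first few harmonics.
for n in range(1, 6):
    a_n, b_n = fourier_coefficients(rectangle_wave, 1.0, n)
    print(n, round(a_n, 4), round(b_n, 4), round(math.hypot(a_n, b_n), 4))
```

The sine coefficients vanish and the even harmonics are zero, which is the frequency domain counterpart of the symmetry arguments made for Eq. (3).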

3 Spherical harmonics for representing a 3D surface

Using the same concept as the Fourier series for representing 2D curves, spherical harmonics can be used to represent 3D surfaces. Fourier series uses a set of sine and cosine functions to represent 2D curves, while the spherical harmonics use a set of orthogonal spherical harmonics functions \( Y_{n}^{m} \) to represent a closed 3D geometry. A soil particle with a closed 3D surface can be represented by the spherical harmonics coefficients \( c_{n}^{m} \) and spherical harmonics functions \( Y_{n}^{m} (\theta ,\varphi ) \):

$$ r(\theta ,\varphi ) = \sum\limits_{n = 0}^{\infty } {\sum\limits_{m = - n}^{n} {c_{n}^{m} } Y_{n}^{m} (\theta ,\varphi )} $$
(4)

where \( r(\theta ,\varphi ) \) (\( \theta \in [0,\pi ] \), \( \varphi \in [0,2\pi ] \)) is the radial distance from the origin to the point on the particle surface in the direction \( (\theta ,\varphi ) \) of the spherical coordinate system. The n and m are the degree and order of the spherical harmonics, respectively. The basis functions \( Y_{n}^{m} (\theta ,\varphi ) \) can be determined as:

$$ Y_{n}^{m} (\theta ,\varphi ) = \sqrt {\frac{(2n + 1)(n - m)!}{4\pi (n + m)!}} P_{n}^{m} (\cos \theta )e^{im\varphi } $$
(5)

where \( P_{n}^{m} \) is the associated Legendre function, which can be expanded by Rodrigues' formula:

$$ P_{n}^{m} (x) = (1 - x^{2} )^{\left| m \right|/2} \cdot \frac{{{\text{d}}^{\left| m \right|} }}{{{\text{d}}x^{\left| m \right|} }}\left[ {\frac{1}{{2^{n} n!}} \cdot \frac{{{\text{d}}^{n} }}{{{\text{d}}x^{n} }}(x^{2} - 1)^{n} } \right]. $$
(6)
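Equations (5) and (6) can be evaluated directly. The sketch below is one possible implementation under stated assumptions: the Legendre polynomial is built with the standard three-term recurrence (equivalent to the inner Rodrigues derivative), no additional sign convention is applied beyond what Eq. (6) shows, and only orders \( 0 \le m \le n \) are handled. All function names are illustrative.

```python
import cmath
import math


def legendre_coeffs(n):
    """Coefficients (lowest power first) of the Legendre polynomial P_n(x)."""
    prev, curr = [1.0], [0.0, 1.0]
    if n == 0:
        return prev
    for k in range(1, n):
        # Recurrence: (k + 1) P_{k+1} = (2k + 1) x P_k - k P_{k-1}
        nxt = [0.0] * (k + 2)
        for i, c in enumerate(curr):
            nxt[i + 1] += (2 * k + 1) * c / (k + 1)
        for i, c in enumerate(prev):
            nxt[i] -= k * c / (k + 1)
        prev, curr = curr, nxt
    return curr


def assoc_legendre(n, m, x):
    """P_n^m(x) following Eq. (6): (1 - x^2)^(|m|/2) times the |m|-th derivative of P_n."""
    m = abs(m)
    coeffs = legendre_coeffs(n)
    for _ in range(m):
        coeffs = [i * c for i, c in enumerate(coeffs)][1:]  # differentiate once
    poly = sum(c * x ** i for i, c in enumerate(coeffs))
    return (1.0 - x * x) ** (m / 2.0) * poly


def sph_harm_y(n, m, theta, phi):
    """Y_n^m(theta, phi) following Eq. (5), restricted to 0 <= m <= n in this sketch."""
    if not 0 <= m <= n:
        raise ValueError("this sketch only handles 0 <= m <= n")
    norm = math.sqrt((2 * n + 1) * math.factorial(n - m)
                     / (4.0 * math.pi * math.factorial(n + m)))
    return norm * assoc_legendre(n, m, math.cos(theta)) * cmath.exp(1j * m * phi)


print(sph_harm_y(0, 0, 0.3, 1.2))  # constant basis function, 1 / sqrt(4 * pi)
print(sph_harm_y(2, 1, 0.3, 1.2))
```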

Figure 3a illustrates \( Y_{n}^{m} (\theta ,\varphi ) \) for n = 0, 1, and 2. Figure 3b illustrates \( c_{n}^{m} \) for n = 0, 1, and 2. The spherical harmonics coefficients \( c_{n}^{m} \) are unique for a particle. As shown in Fig. 3b, the zero-degree coefficient \( c_{0}^{0} \) determines the volume of the particle; the first-degree coefficients (n = 1), \( c_{1}^{ - 1} \), \( c_{1}^{0} \), and \( c_{1}^{1} \), determine the spatial displacement of the particle relative to the origin; and the second-degree coefficients (n = 2), \( c_{2}^{ - 2} \), \( c_{2}^{ - 1} \), \( c_{2}^{0} \), \( c_{2}^{1} \), and \( c_{2}^{2} \), store morphological properties of the particle. Although not displayed in Fig. 3b, the higher-degree coefficients (n > 2) also store morphological properties of the particle. Naturally, increasing n captures more detailed morphological properties, so the reconstructed particle becomes closer to the original particle. However, high degrees significantly increase the computational load. Researchers [36, 40, 41] have found that n = 15 provides satisfactory accuracy for particle representation and generation. Therefore, n = 15 was also used in this study.

Fig. 3 Expansion of spherical harmonics for the first two degrees

Each spherical harmonics coefficient \( c_{n}^{m} \) is a complex number:

$$ c_{n}^{m} = a_{n}^{m} + b_{n}^{m} \cdot i $$
(7)

where \( a_{n}^{m} \) and \( b_{n}^{m} \) are the real and imaginary parts, respectively. Therefore, \( c_{n}^{m} \) can be represented as a vector in the complex plane defined by the real and imaginary axes. For example, the nine \( c_{n}^{m} \) values for n = 0, 1, and 2 in Fig. 3b are plotted in the complex plane in Fig. 4a. The spatial displacement of the particle is not useful for characterizing particle shape. Therefore, the coefficients \( c_{1}^{ - 1} \), \( c_{1}^{0} \), and \( c_{1}^{1} \) are set to zero in this study for simplicity. Because \( c_{n}^{ - m} = ( - 1)^{m} \cdot (c_{n}^{m} )^{*} \), where “*” denotes the complex conjugate, \( c_{2}^{ - 2} \) and \( c_{2}^{2} \) are symmetric about the real axis; \( c_{2}^{ - 1} \) and \( c_{2}^{1} \) are symmetric about the imaginary axis; and \( c_{2}^{0} \) is on the real axis, as shown in Fig. 4a.
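The relation \( c_{n}^{ - m} = ( - 1)^{m} \cdot (c_{n}^{m} )^{*} \) means that only the coefficients with \( m \ge 0 \) need to be stored. A small sketch with hypothetical coefficient values (not taken from any scanned particle) shows how the negative-order coefficients follow from the positive-order ones.

```python
def negative_order(c_nm, m):
    """Return c_n^{-m} from c_n^{m} using c_n^{-m} = (-1)^m * conj(c_n^{m})."""
    return (-1) ** m * c_nm.conjugate()


c_2_1 = complex(0.3, -0.2)   # hypothetical value of c_2^1
c_2_2 = complex(-0.1, 0.4)   # hypothetical value of c_2^2

print(negative_order(c_2_1, 1))  # c_2^{-1}: real part changes sign
print(negative_order(c_2_2, 2))  # c_2^{-2}: imaginary part changes sign
```

For m = 0 the relation reduces to \( c_{n}^{0} = (c_{n}^{0} )^{*} \), so the zero-order coefficients are real.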

Fig. 4 The first two degrees of spherical harmonic coefficients in the complex plane

The L2 norm of the \( c_{n}^{m} \) values over all orders m gives the amplitude of the spherical harmonics at each degree n, denoted \( L_{n} \):

$$ L_{n} = \sqrt {\sum\limits_{m = - n}^{n} {\left\| {c_{n}^{m} } \right\|}^{2} } \quad (n = 0, \, 1, \, 2, \ldots , \, 15). $$
(8)

For example, \( L_{0} \) and \( L_{2} \) can be expanded as:

$$ L_{0} = \sqrt {\left\| {c_{0}^{0} } \right\|^{2} } { = }\left| {a_{0}^{0} } \right| $$
(9)
$$ \begin{aligned} L_{2} & = \sqrt {\left\| {c_{2}^{ - 2} } \right\|^{2} + \left\| {c_{2}^{ - 1} } \right\|^{2} + \left\| {c_{2}^{0} } \right\|^{2} + \left\| {c_{2}^{1} } \right\|^{2} + \left\| {c_{2}^{2} } \right\|^{2} } \\ & { = }\sqrt {(a_{2}^{ - 2} )^{2} + (b_{2}^{ - 2} )^{2} + (a_{2}^{ - 1} )^{2} + (b_{2}^{ - 1} )^{2} + (a_{2}^{0} )^{2} + (a_{2}^{1} )^{2} + (b_{2}^{1} )^{2} + (a_{2}^{2} )^{2} + (b_{2}^{2} )^{2} .} \\ \end{aligned} $$
(10)

\( L_{0} \) represents the volume of the particle. To remove the influence of particle volume, all \( L_{n} \) values were divided by \( L_{0} \):

$$ \frac{{L_{n} }}{{L_{0} }} = \frac{{\sqrt {\sum\nolimits_{m = - n}^{n} {\left\| {c_{n}^{m} } \right\|}^{2} } }}{{L_{0} }} = \frac{{\sqrt {\sum\nolimits_{m = - n}^{n} {[(a_{n}^{m} )^{2} + (b_{n}^{m} )^{2} ]} } }}{{L_{0} }} = \sqrt {\sum\nolimits_{m = - n}^{n} {\left[ {\left( {\frac{{a_{n}^{m} }}{{L_{0} }}} \right)^{2} + \left( {\frac{{b_{n}^{m} }}{{L_{0} }}} \right)^{2} } \right]} } = \sqrt {\sum\nolimits_{m = - n}^{n} {\left\| {\widehat{{c_{n}^{m} }}} \right\|}^{2} } . $$
(11)

Then, the normalized spherical harmonics coefficients, \( \widehat{{c_{n}^{m} }} \), were defined in this study to eliminate the effects of particle volume based on Eq. (11):

$$ \widehat{{c_{n}^{m} }} = \widehat{{a_{n}^{m} }} + \widehat{{b_{n}^{m} }} \cdot i $$
(12)

where \( \widehat{{a_{n}^{m} }} \) and \( \widehat{{b_{n}^{m} }} \) are normalized real and imaginary parts as shown in Fig. 4b:

$$ \widehat{{a_{n}^{m} }} = \frac{{a_{n}^{m} }}{{L_{0} }} $$
(13)
$$ \widehat{{b_{n}^{m} }} = \frac{{b_{n}^{m} }}{{L_{0} }}. $$
(14)
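Equations (8) to (14) amount to a per-degree norm followed by a division by \( L_{0} \). The sketch below illustrates this with a small set of hypothetical coefficients stored in a dictionary keyed by (n, m); the data layout and the function names are assumptions for illustration.

```python
import math


def degree_amplitude(coeffs, n):
    """L_n of Eq. (8): L2 norm of the coefficients c_n^m over m = -n..n."""
    return math.sqrt(sum(abs(coeffs[(n, m)]) ** 2 for m in range(-n, n + 1)))


def normalize(coeffs):
    """Divide every coefficient by L_0, following Eqs. (12) to (14)."""
    l0 = degree_amplitude(coeffs, 0)
    return {key: value / l0 for key, value in coeffs.items()}


# Hypothetical coefficients up to degree 2; degree 1 is set to zero as in the text.
coeffs = {
    (0, 0): complex(8.8, 0.0),
    (1, -1): 0j, (1, 0): 0j, (1, 1): 0j,
    (2, -2): complex(0.0, -1.2), (2, -1): complex(-0.3, 0.4),
    (2, 0): complex(0.6, 0.0),
    (2, 1): complex(0.3, 0.4), (2, 2): complex(0.0, 1.2),
}
normalized = normalize(coeffs)
for n in range(3):
    print(n, degree_amplitude(coeffs, n), degree_amplitude(normalized, n))
```

After normalization the zero-degree amplitude equals one, and each higher-degree amplitude equals \( L_{n} /L_{0} \), as in Eq. (11).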

A soil particle is shown in the inset of Fig. 5a. The spherical harmonics coefficients \( c_{n}^{m} \) of this particle were determined based on Eqs. (4), (5), and (6). The maximum degree n was set to 15. Therefore, a total of \( (15 + 1)^{2} = 256 \) spherical harmonics coefficients \( c_{n}^{m} \) were computed. These \( c_{n}^{m} \) values are complex numbers based on Eq. (7). The 256 real parts \( a_{n}^{m} \) are plotted in Fig. 5a, and the 256 imaginary parts \( b_{n}^{m} \) are plotted in Fig. 5b.

Fig. 5 The spherical harmonic coefficients and normalized spherical harmonic coefficients for a soil particle

Then, the volume term \( L_{0} \) of the particle was computed as 8.8 based on Eq. (9) and was used to normalize the spherical harmonics coefficients \( c_{n}^{m} \) to eliminate the effects of volume. The normalized real and imaginary parts \( \widehat{{a_{n}^{m} }} \) and \( \widehat{{b_{n}^{m} }} \) were determined based on Eqs. (13) and (14), as shown in Fig. 5c, d. The \( \widehat{{a_{n}^{m} }} \) and \( \widehat{{b_{n}^{m} }} \) values store the morphological properties of the particle in the inset of Fig. 5a, and they are independent of each other. Therefore, \( \widehat{{a_{n}^{m} }} \) and \( \widehat{{b_{n}^{m} }} \) essentially constitute the morphological gene of this particle.

4 Integrate spherical harmonics and probability density function for particle generation

The morphological information of a particle was preserved in the \( \widehat{{a_{n}^{m} }} \) and \( \widehat{{b_{n}^{m} }} \) values. These values were used to fit probability functions \( \varepsilon_{m}^{n} \). Then, the \( \varepsilon_{m}^{n} \) functions were used to generate new \( \widehat{{a_{n}^{m} }} \) and \( \widehat{{b_{n}^{m} }} \) values, which essentially created the morphological gene mutation. The new \( \widehat{{a_{n}^{m} }} \) and \( \widehat{{b_{n}^{m} }} \) values can be input into Eq. (12) to generate normalized spherical harmonics coefficients \( \widehat{{c_{n}^{m} }} \). Then, the \( \widehat{{c_{n}^{m} }} \) values were used in Eqs. (4), (5), and (6) to generate new particles. The new particles had morphological characteristics similar to those of the original particle, which will be validated by shape descriptors. It should be noted that the volume of all the generated particles is one. Users can scale the generated particles up or down based on the actual particle sizes.

Many probability distributions can be used to generate particles, and different probability functions affect the morphological gene mutations and therefore the shapes of the generated particles. This study uses two types of probability distributions for illustration: the Gaussian distribution and the uniform distribution. Both distributions have simple parameters, which are easy to use and control. In particular, we found that when the Gaussian distribution was used, the shape descriptors of the generated particles also followed Gaussian distributions, as will be shown shortly (Figs. 11, 12, 13, 14, 15).

To control the degree of gene mutation, a dimensionless factor \( \eta \) was introduced. The new \( \widehat{{a_{n}^{m} }} \) and \( \widehat{{b_{n}^{m} }} \) values were determined as \( \eta \times \varepsilon_{m}^{n} \). Five scanned particles in Fig. 6 are used to illustrate the idea.

Fig. 6 Integrated spherical harmonics and probability distributions for particle generation

First, the uniform distributions U(\( - \eta \) \( \widehat{{a_{n}^{m} }} \), \( \eta \) \( \widehat{{a_{n}^{m} }} \)) and U(\( - \eta \) \( \widehat{{b_{n}^{m} }} \), \( \eta \) \( \widehat{{b_{n}^{m} }} \)) were used to create gene mutation. The new \( \widehat{{a_{n}^{m} }} \) was randomly selected from the range of \( - \eta \) \( \widehat{{a_{n}^{m} }} \) to \( \eta \) \( \widehat{{a_{n}^{m} }} \), and the new \( \widehat{{b_{n}^{m} }} \) was randomly selected from the range of \( - \eta \) \( \widehat{{b_{n}^{m} }} \) to \( \eta \) \( \widehat{{b_{n}^{m} }} \). The \( \eta \) values were set as 0.1, 0.3, 0.5, 0.7, and 1.0. The generated particle shapes are shown in Fig. 6. Then, the Gaussian distributions N(\( \widehat{{a_{n}^{m} }} \), \( \eta \) \( \widehat{{a_{n}^{m} }} \)) and N(\( \widehat{{b_{n}^{m} }} \), \( \eta \) \( \widehat{{b_{n}^{m} }} \)) were used to generate new \( \widehat{{a_{n}^{m} }} \) and \( \widehat{{b_{n}^{m} }} \) values. The expectations were \( \widehat{{a_{n}^{m} }} \) and \( \widehat{{b_{n}^{m} }} \), respectively, and the standard deviations were \( \eta \left| {\widehat{{a_{n}^{m} }}} \right| \) and \( \eta \left| {\widehat{{b_{n}^{m} }}} \right| \). The \( \eta \) values were again set as 0.1, 0.3, 0.5, 0.7, and 1.0. The generated particle shapes are also shown in Fig. 6.

As \( \eta \) increases, the probability functions produce larger variations in the \( \widehat{{a_{n}^{m} }} \) and \( \widehat{{b_{n}^{m} }} \) values, leading to a larger morphological gene mutation. Therefore, particles generated with larger \( \eta \) values show larger morphological diversity relative to the original particles. The next question is how to measure the morphological similarity or diversity between the original particle and the generated particles. Particle shape descriptors are introduced as a measure.
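The Gaussian gene mutation can be sketched in a few lines. The example below reads N(\( \widehat{{a_{n}^{m} }} \), \( \eta \) \( \widehat{{a_{n}^{m} }} \)) as a normal distribution whose mean is the original value and whose standard deviation is \( \eta \) times its absolute value. The dictionary layout, the function name, the gene values, and the choice to mutate only orders \( m \ge 0 \) (leaving negative orders to the conjugate relation) are assumptions for illustration, not the authors' implementation.

```python
import random


def mutate_gaussian(gene, eta, rng):
    """Draw new (a, b) pairs from N(a, eta*|a|) and N(b, eta*|b|)."""
    mutated = {}
    for key, (a, b) in gene.items():
        new_a = rng.gauss(a, eta * abs(a))
        new_b = rng.gauss(b, eta * abs(b))
        mutated[key] = (new_a, new_b)
    return mutated


# Hypothetical normalized gene: (n, m) -> (a_hat, b_hat), orders m >= 0 only.
gene = {(0, 0): (1.0, 0.0), (2, 0): (0.07, 0.0), (2, 1): (0.03, 0.05), (2, 2): (0.0, 0.14)}

rng = random.Random(42)  # fixed seed so the sketch is repeatable
for eta in (0.1, 0.5, 1.0):
    print(eta, mutate_gaussian(gene, eta, rng)[(2, 1)])
```

With \( \eta = 0 \) the clone is identical to the original particle, and larger \( \eta \) values spread the new coefficients further from the original ones. A uniform distribution could be substituted for the Gaussian draw in the same loop.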

5 Particle shape characterizations based on computational geometry

In this study, the six commonly used shape descriptors in Table 1 were used to measure the morphological similarity or divergence between the generated particles and the original particles. Computing these shape descriptors requires the principal dimensions (d1, d2, and d3), volume (V), surface area (As), minimum circumscribed sphere, maximum inscribed sphere, and 3D convex hull. A series of computational geometry techniques were developed in this study to analyze 3D particle geometries and determine these parameters.

Table 1 Definitions of six shape descriptors

The 3D particle geometries are represented as triangular face tessellations in computer graphics, as shown in Fig. 7a, b. The surface area of a given particle can be determined as the sum of the areas of all the triangular faces. A small tetrahedron is formed by connecting the three vertices of each triangular face to the particle’s centroid (O), as shown in Fig. 7b, and the volume of this tetrahedron is computed. The volume of the 3D particle (V) can then be determined as the sum of the volumes of all such tetrahedrons.
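The surface area and volume computation can be sketched as follows. The example assumes a closed triangular mesh whose faces are consistently oriented, given as a vertex list and a list of index triples; it sums signed tetrahedron volumes about the vertex centroid. The function names and the test tetrahedron are illustrative.

```python
import math


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def surface_area_and_volume(vertices, faces):
    """Sum triangle areas and centroid-based tetrahedron volumes of a closed mesh."""
    count = len(vertices)
    centroid = tuple(sum(v[i] for v in vertices) / count for i in range(3))
    area = 0.0
    volume = 0.0
    for i, j, k in faces:
        a, b, c = vertices[i], vertices[j], vertices[k]
        normal = cross(sub(b, a), sub(c, a))
        area += 0.5 * math.sqrt(dot(normal, normal))
        # Signed volume of the tetrahedron (centroid, a, b, c).
        volume += dot(sub(a, centroid), cross(sub(b, centroid), sub(c, centroid))) / 6.0
    return area, abs(volume)


# Unit right tetrahedron with outward-facing triangles.
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
faces = [(0, 2, 1), (0, 1, 3), (0, 3, 2), (1, 2, 3)]
print(surface_area_and_volume(vertices, faces))
```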

Fig. 7 Computational geometry techniques for determining surface area, volume, length, width, and thickness of a 3D particle

The length (d1), width (d2), and thickness (d3) of a given particle geometry can be determined by principal component analysis (PCA) [20]. For a 3D image consisting of a point cloud, PCA identifies the direction of largest variance of the point cloud in 3D space, which is called the first principal component. The extent of the particle along the first principal component is the length (d1) of the 3D particle. Subsequently, PCA identifies the direction of second largest variance, the second principal component, which is perpendicular to the first; the extent along it is the width (d2) of the particle. The third principal component is perpendicular to both the first and second principal components and identifies the thickness (d3) of the particle. Figure 7c illustrates the results of a PCA on a particle and shows the identified d1, d2, and d3.
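A dependency-free sketch of the PCA step is given below. It assumes that d1, d2, and d3 are the extents of the point cloud along the three principal axes, and it finds the axes by power iteration with deflation rather than a library eigen-solver. These choices, the start vector, and the function names are assumptions for illustration.

```python
import math


def covariance(points):
    """Return the centered points and their 3x3 covariance matrix."""
    n = len(points)
    mean = [sum(p[i] for p in points) / n for i in range(3)]
    centered = [[p[i] - mean[i] for i in range(3)] for p in points]
    cov = [[sum(c[i] * c[j] for c in centered) / n for j in range(3)] for i in range(3)]
    return centered, cov


def dominant_axis(matrix, iterations=200):
    """Power iteration for the eigenvector with the largest eigenvalue."""
    v = [1.0, 0.7, 0.3]  # arbitrary start, assumed not orthogonal to the answer
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(3)) for i in range(3)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def principal_dimensions(points):
    """Extents (d1, d2, d3) of a point cloud along its principal axes."""
    centered, cov = covariance(points)
    axis1 = dominant_axis(cov)
    lam1 = sum(axis1[i] * cov[i][j] * axis1[j] for i in range(3) for j in range(3))
    # Deflation removes the first component so the second one becomes dominant.
    deflated = [[cov[i][j] - lam1 * axis1[i] * axis1[j] for j in range(3)] for i in range(3)]
    axis2 = dominant_axis(deflated)
    axis3 = [axis1[1] * axis2[2] - axis1[2] * axis2[1],
             axis1[2] * axis2[0] - axis1[0] * axis2[2],
             axis1[0] * axis2[1] - axis1[1] * axis2[0]]
    dims = []
    for axis in (axis1, axis2, axis3):
        proj = [sum(c[i] * axis[i] for i in range(3)) for c in centered]
        dims.append(max(proj) - min(proj))
    return dims


# Corners of a 4 x 2 x 1 box rotated by 30 degrees about the z axis.
angle = math.radians(30.0)
box = [(sx * 2.0, sy * 1.0, sz * 0.5) for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)]
cloud = [(x * math.cos(angle) - y * math.sin(angle),
          x * math.sin(angle) + y * math.cos(angle), z) for x, y, z in box]
print([round(d, 6) for d in principal_dimensions(cloud)])
```

The sketch does not handle degenerate clouds (for example, perfectly flat or spherical point sets), for which the principal axes are not uniquely defined.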

The computational processes for determining the minimum circumscribed sphere and the 3D convex hull of a particle are illustrated in Fig. 8. A 2D particle is used to illustrate the concept. Points on the 2D particle boundary are shown in Fig. 8a. The smallest subset of boundary points whose polygon encloses all the points of the particle boundary in Fig. 8a is found, as shown in Fig. 8b. This is essentially the convex hull of the particle. The same concept was used to determine the convex hull of a 3D particle, as shown in Fig. 8e.
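For the 2D case, the convex hull can be computed with Andrew's monotone chain algorithm. The source does not state which hull algorithm was used, so the sketch below is only one standard option, with illustrative function names and test points.

```python
def cross(o, a, b):
    """Z component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Convex hull of 2D points in counterclockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()  # drop points that would make a clockwise or straight turn
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


# Square boundary with one interior point and one point on an edge.
boundary = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (1, 0)]
print(convex_hull(boundary))
```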

Fig. 8 Computational geometry techniques for determining convex hull and minimum circumscribed sphere of a 3D particle

In Fig. 8b, the distance between Point 1 and Point 5 is the longest connection among the points constructing the convex hull. In the first step, a trial circle is constructed using the segment between Point 1 and Point 5 as its diameter, as shown in Fig. 8c. However, in this case, Point 4 is not enclosed by the trial circle. In the second step, Points 1, 5, and 4 are used to fit a trial circle. If all the other points are within this trial circle, it is the minimum circumscribing circle. If not, the point that lies furthest outside the trial circle is added, and a new trial circle is found using any two or three of the four points. The procedure is repeated until no point lies outside the trial circle. This yields the minimum circumscribing circle for the original set of points, as shown in Fig. 8d. The same computational process can also be applied to a 3D point cloud to identify the minimum circumscribed sphere, as shown in Fig. 8f.
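The result of the trial-circle procedure can be cross-checked with a brute-force search. The sketch below is not the iterative procedure described above: it simply tries every circle defined by two points (as a diameter) or three points (as a circumcircle) and keeps the smallest one that encloses all points. It is intended for small point sets such as convex hull vertices, and all names are illustrative.

```python
import itertools
import math


def circle_from_two(p, q):
    """Circle with the segment pq as its diameter."""
    center = ((p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0)
    return center, math.dist(p, q) / 2.0


def circle_from_three(a, b, c):
    """Circumcircle of three points, or None if they are collinear."""
    d = 2.0 * (a[0] * (b[1] - c[1]) + b[0] * (c[1] - a[1]) + c[0] * (a[1] - b[1]))
    if abs(d) < 1e-12:
        return None
    sa, sb, sc = (a[0] ** 2 + a[1] ** 2, b[0] ** 2 + b[1] ** 2, c[0] ** 2 + c[1] ** 2)
    ux = (sa * (b[1] - c[1]) + sb * (c[1] - a[1]) + sc * (a[1] - b[1])) / d
    uy = (sa * (c[0] - b[0]) + sb * (a[0] - c[0]) + sc * (b[0] - a[0])) / d
    return (ux, uy), math.dist((ux, uy), a)


def contains_all(circle, points, tol=1e-9):
    center, radius = circle
    return all(math.dist(center, p) <= radius + tol for p in points)


def minimum_circumscribed_circle(points):
    """Smallest circle through two or three of the points that encloses them all."""
    candidates = [circle_from_two(p, q) for p, q in itertools.combinations(points, 2)]
    for triple in itertools.combinations(points, 3):
        circle = circle_from_three(*triple)
        if circle is not None:
            candidates.append(circle)
    valid = [c for c in candidates if contains_all(c, points)]
    return min(valid, key=lambda c: c[1])


hull_points = [(-1.0, -1.0), (1.0, -1.0), (1.0, 1.0), (-1.0, 1.0), (0.2, 0.1)]
print(minimum_circumscribed_circle(hull_points))
```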

The maximum inscribed sphere can be determined using a 3D Euclidean distance transform. For each point inside the particle in Fig. 9a, the minimum distance to the particle surface is computed, which forms a 3D Euclidean distance map, as shown in Fig. 9b. The maximum distance value in the 3D Euclidean distance map identifies the radius of the maximum inscribed sphere of the particle. The coordinates of the maximum distance value identify the center of the maximum inscribed sphere of the particle. The computed maximum inscribed sphere is superimposed within the particle in Fig. 9c.
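The distance-map idea can be illustrated in 2D on a small binary image. The sketch below uses a brute-force search instead of an efficient distance transform and measures, for each inside cell, the distance to the nearest outside cell; the mask, the function name, and the cell-center distance convention are assumptions for illustration.

```python
import math


def max_inscribed_circle(mask):
    """Return (radius, (row, col)) from a brute-force Euclidean distance map."""
    inside = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    outside = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if not v]
    best = (0.0, None)
    for cell in inside:
        # Distance-map value: distance to the nearest background cell.
        distance = min(math.dist(cell, other) for other in outside)
        if distance > best[0]:
            best = (distance, cell)
    return best


# 7 x 7 image containing a 5 x 5 solid square surrounded by background.
mask = [[1 if 1 <= r <= 5 and 1 <= c <= 5 else 0 for c in range(7)] for r in range(7)]
radius, center = max_inscribed_circle(mask)
print(radius, center)
```

The same search extends to 3D voxel images by adding a third index, although an efficient distance transform is preferable for CT-sized volumes.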

Fig. 9 Computational geometry techniques for determining the maximum inscribed sphere of a 3D particle

6 Generating realistic particles based on limited morphological information

Many natural sands consist of particles with similar morphological properties because these particles share the same geological formation process. Therefore, the proposed technique can potentially be used to reproduce a soil specimen by analyzing one particle. For example, 4000 particles were randomly selected from Ottawa sand. These particles were placed in a cylinder and scanned by high-resolution X-ray CT with a spatial resolution of 12 μm/voxel. The improved watershed analysis technique developed by Sun et al. [20] was used to process the X-ray CT volumetric images and identify individual particles. The result is shown in Fig. 10a. Six of the 4000 particles are shown enlarged in Fig. 10b.

Fig. 10 Comparisons between original Ottawa sand particles scanned by X-ray CT and the generated particles by probability-based spherical harmonics

We chose particle #5 in Fig. 10b as the base particle to generate new particles. Then, this selected particle was analyzed by spherical harmonics to determine its morphological gene (i.e., \( \widehat{{a_{n}^{m} }} \) and \( \widehat{{b_{n}^{m} }} \) values). Based on the morphological gene, the Gaussian distribution with \( \eta = 0.5 \) was used to generate new \( \widehat{{a_{n}^{m} }} \) and \( \widehat{{b_{n}^{m} }} \) values to create gene mutation. The new \( \widehat{{a_{n}^{m} }} \) and \( \widehat{{b_{n}^{m} }} \) values were used to generate 4000 particles based on Eqs. (4), (5), (6), and (14). For example, 50 generated particles are shown in Fig. 10c. The newly generated particles are visually close to the original Ottawa sand particles in Fig. 10a, b.

The computational geometry algorithms were used to determine the shape descriptors of the original and generated particles, as shown in Fig. 11. The shape distributions of the original and generated particles generally agree with each other. However, as expected, they do not overlap exactly because, in natural soils, particle shapes have some variation that may not be fully captured by the morphological gene of a single particle.

Fig. 11 Comparison between particle shape distributions of original and generated Ottawa sand particles. The original particle shape distributions are computed by analyzing 4000 Ottawa sand particles scanned by X-ray CT. Then, one Ottawa sand particle is used to generate 4000 new particles using the proposed particle clone method. The divergence of particle shape distributions of generated and original particles is evaluated by the T test

To evaluate the divergence of particle shape distributions of original and generated particles, a statistical approach, T test [46], is introduced. The T test computes a Z value based on standard deviations (σ) and means (μ) of two distributions:

$$ Z = \frac{{(\mu_{1} - \mu_{2} )}}{{\sqrt {\sigma_{1}^{2} + \sigma_{2}^{2} } }} $$
(15)

where μ1 and μ2 are means of two distributions and σ1 and σ2 are standard deviations of two distributions. For example, based on the SA distribution of the original particles in Fig. 11d, the μ1 and σ1 can be computed as 0.8046 and 0.0820, respectively. Based on the SA distribution of the generated particles in Fig. 11d, the μ2 and σ2 can be computed as 0.8422 and 0.0913, respectively. Therefore, the Z is computed as 0.307 as shown in Fig. 11d. The same procedure is used to compute the Z values of the remaining five shape descriptors as shown in Fig. 11. If the Z value is smaller than 1.96, the two distributions are sufficiently close with a confidence level of 95% [46]. All the computed Z values are within 1.96 in Fig. 11, so the proposed particle generation technique can effectively reproduce the particle shape characteristics of original soils.
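The Z value of Eq. (15) can be recomputed from the means and standard deviations quoted above. In the sketch below, the function name is illustrative, and the magnitude of Z is compared with 1.96 because the sign only reflects which distribution is listed first.

```python
import math


def z_value(mu1, sigma1, mu2, sigma2):
    """Z of Eq. (15) from the means and standard deviations of two distributions."""
    return (mu1 - mu2) / math.sqrt(sigma1 ** 2 + sigma2 ** 2)


# SA distributions of the original and generated particles (values from the text).
z = z_value(0.8046, 0.0820, 0.8422, 0.0913)
print(round(abs(z), 2), abs(z) < 1.96)
```

With these rounded inputs the magnitude is about 0.31, in line with the value reported in Fig. 11d and well below the 1.96 threshold.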

7 Analysis of the controlling factor \( \eta \)

The controlling factor \( \eta \) is key for governing variability and accuracy of generated particles. Large \( \eta \) values generate large morphological variation in the generated particles. This section investigates effects of \( \eta \) values on the morphological diversities in generated particles.

A total of 214 particles were selected from ten soils with disparate origins, including river alluvium, volcanic sands, colluvium, slags, crushed limestone, crushed concrete, and glass spheres. Each soil contributes around 21 particles with sizes between 1.00 mm (#18 sieve) and 2.83 mm (#7 sieve).

To capture 3D particle geometries, high-resolution X-ray CT was used because X-rays can penetrate soil particles and capture the 3D geometries of all the particles at once. Other techniques, such as 3D laser scanning, can also be used to scan 3D particle geometries. However, a 3D laser scanner must scan particles one by one, so significant effort would be required to perform 214 scans.

All 214 particles were placed in a cylinder and scanned together by high-resolution X-ray CT with a spatial resolution of 12 μm/voxel. At this resolution, a 1.00 mm particle spans approximately 83 voxels, which is sufficient for delineating particle geometries. The improved watershed analysis technique developed by Sun et al. [20] was used to process the X-ray CT volumetric images and identify individual particles. The result is shown in Fig. 12a.

Fig. 12 Comparisons between original 214 sand particles scanned by X-ray CT and the generated particles by probability-based spherical harmonics

Each of the scanned particles was analyzed to determine its morphological gene (i.e., \( \widehat{{a_{n}^{m} }} \) and \( \widehat{{b_{n}^{m} }} \) values). Based on the morphological gene, the Gaussian distribution with different \( \eta \) values was used to generate new \( \widehat{{a_{n}^{m} }} \) and \( \widehat{{b_{n}^{m} }} \) values and generate new particles based on Eqs. (4), (5), (6), and (14). Five \( \eta \) values of 0.1, 0.3, 0.5, 0.7, and 1.0 were used. A total of 500 particles are generated for each \( \eta \) value for each particle. Some of the original particles and generated particles are compared in Fig. 12b, c. They are visually close to each other.

The computational geometry techniques were used to analyze the particle geometries and determine the shape descriptors of the original and generated particles. For example, Fig. 13a shows an original particle. A total of 500 particles were generated using \( \eta = 0.3 \), and seven of them are shown in Fig. 13b–h. The distributions of the shape descriptors of the 500 generated particles are shown in Fig. 13i–n. For example, Fig. 13i shows the convexity distribution of the 500 generated particles, which follows a Gaussian distribution with a mean convexity μ of 0.975 and a standard deviation σ of 0.007. The convexity of the original particle is 0.978. The same comparisons were made for three other particles (\( \eta = 0.3 \)), as shown in Figs. 14, 15 and 16. The computed shape descriptors of the generated particles all follow Gaussian distributions. For each shape descriptor, the mean value (μ) of the generated particles agrees with the value of the original particle.

Fig. 13 Comparisons between original particle #1 and the generated particles by spherical harmonics

Fig. 14 Comparisons between original particle #2 and the generated particles by spherical harmonics

Fig. 15 Comparisons between original particle #3 and the generated particles by spherical harmonics

Fig. 16 Comparisons between original particle #4 and the generated particles by spherical harmonics

Particle shape descriptors of generated particles follow Gaussian distributions, so the standard deviation \( \sigma \) of particle shape descriptors could be used to quantify the morphological variation in the generated particles. It is expected that large \( \eta \) values would provide large \( \sigma \) values and therefore large morphological variations in the generated particles.

As discussed before, 500 particles were generated by cloning each particle at each \( \eta \) value. Therefore, for each \( \eta \) value, a total of 107,000 particles (214 × 500) were generated by cloning the 214 scanned particles. For each shape descriptor, the standard deviations of these 107,000 particles were analyzed and the average standard deviation \( \sigma \) was determined. The relationship between the average \( \sigma \) values and the \( \eta \) values is shown in Fig. 17a–f. As \( \eta \) increases, larger \( \sigma \) values are observed in the generated particles, indicating larger morphological variances.
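One way to read the averaging step is sketched below: the standard deviation of a shape descriptor is computed within each group of clones generated from one scanned particle, and the group values are then averaged. This reading, the use of the population standard deviation, and the function name are assumptions, and the descriptor values are made up for illustration.

```python
import statistics


def average_sigma(descriptor_by_particle):
    """Mean of the per-particle standard deviations of one shape descriptor."""
    sigmas = [statistics.pstdev(values) for values in descriptor_by_particle]
    return statistics.fmean(sigmas)


# Hypothetical convexity values for the clones of three scanned particles.
convexity_groups = [
    [0.97, 0.98, 0.96, 0.98],
    [0.91, 0.93, 0.92, 0.90],
    [0.88, 0.88, 0.89, 0.87],
]
print(average_sigma(convexity_groups))
print(214 * 500)  # number of generated particles per eta value
```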

Fig. 17

Effects of \( \eta \) values on morphological variation ranges of generated particles

Users can use Fig. 17 to select appropriate \( \eta \) values for specific problems. For example, manufactured sands, such as Ottawa sands, crushed limestone, and slag, typically contain particles with a narrow range of shapes, so small \( \eta \) values can be used to generate particles for these sands. In contrast, typical natural sands contain particles with a wide range of shapes, so large \( \eta \) values can be used to generate particles for these sands.

To further validate the proposed algorithm, the particle shape distributions of the 107,000 particles generated using \( \eta = 0.3 \) are determined, as shown in Fig. 18. The particle shape distributions of the 214 original particles scanned by X-ray CT are also shown in Fig. 18. The T test is used to evaluate the divergence between the particle shape distributions of the original and generated particles following Eq. 15. The computed Z values are also shown in Fig. 18. As discussed before, if the Z value is smaller than 1.96, the two distributions are sufficiently close at a confidence level of 95%. All the computed Z values in Fig. 18 are smaller than 1.96, so the proposed particle generation technique can effectively reproduce the particle shape characteristics of the original soils.
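The acceptance check can be sketched as below. Eq. 15 is not reproduced in this section, so the sketch assumes the common two-sample form \( Z = |\mu_{1} - \mu_{2}| / \sqrt{\sigma_{1}^{2}/n_{1} + \sigma_{2}^{2}/n_{2}} \); if Eq. 15 differs, the function body should be changed accordingly. The names `two_sample_z` and `distributions_agree` and the input values are hypothetical.

```python
import math
import statistics


def two_sample_z(sample_a, sample_b):
    # Assumed form of the statistic: absolute difference of the sample means
    # divided by the standard error of that difference.
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    std_err = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return abs(mean_a - mean_b) / std_err


def distributions_agree(sample_a, sample_b, critical=1.96):
    # 1.96 is the two-sided critical value at the 95% confidence level.
    return two_sample_z(sample_a, sample_b) < critical


# Toy inputs for illustration only; real inputs would be one shape descriptor
# evaluated on the 214 original and the 107,000 generated particles.
original_values = [0.95, 0.96, 0.97, 0.98, 0.99]
generated_values = [0.94, 0.96, 0.97, 0.98, 1.00]
print(distributions_agree(original_values, generated_values))
```

In practice the check would be applied once per shape descriptor, which corresponds to the one Z value reported per panel.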

Fig. 18

Comparison between particle shape distributions of original and generated particles. The original particle shape distributions are computed by analyzing 214 particles scanned by X-ray CT. The 107,000 particles generated at \( \eta = 0.3 \) were analyzed to determine the particle shape distributions of the generated particles. The divergence between the particle shape distributions of the generated and original particles is evaluated by the T test

8 Conclusions

In this paper, a probability-based spherical harmonics technique was developed to generate realistic particles from limited morphological information. The technique analyzed a single particle and extracted its spherical harmonics coefficients (\( \widehat{{a_{n}^{m} }} \) and \( \widehat{{b_{n}^{m} }} \) values), which are analogous to the morphological gene of the particle. These \( \widehat{{a_{n}^{m} }} \) and \( \widehat{{b_{n}^{m} }} \) values were used to define probability distributions, such as Gaussian and uniform distributions. New \( \widehat{{a_{n}^{m} }} \) and \( \widehat{{b_{n}^{m} }} \) values were then drawn from these distributions, which is analogous to gene mutation. The new \( \widehat{{a_{n}^{m} }} \) and \( \widehat{{b_{n}^{m} }} \) values were used to generate 3D particle geometries with the same morphological properties as the original particle. A controlling factor \( \eta \) was developed to tune the degree of gene mutation. A larger \( \eta \) generated particles with a larger morphological variance relative to the original particle.

The morphological variances between generated and original particles were quantified by six commonly used particle shape descriptors: convexity (or solidity), circularity, aspect ratio, area sphericity, diameter sphericity, and perimeter sphericity. A series of computational geometry algorithms was developed in this research to analyze 3D particle geometries and determine these shape descriptors.

This study used X-ray CT to scan 4000 Ottawa sand particles. Then, one of the scanned particles was randomly selected to generate 4000 particles. The particle shape distributions of the original and generated particles agreed well with each other. This validates the effectiveness of the proposed probability-based spherical harmonics technique.

A total of 214 particles with various shapes were scanned by X-ray CT. For each particle, the morphological gene was extracted. Based on the morphological gene, the Gaussian distribution with different \( \eta \) values was used to generate new particles. Five \( \eta \) values of 0.1, 0.3, 0.5, 0.7, and 1.0 were used. The original and generated particles were analyzed by computational geometry techniques to determine their shape descriptors. The dimensionless factor \( \eta \) controls the morphological variances of the generated particles. When the Gaussian probability distribution is used, the shape descriptors of the generated particles also follow Gaussian distributions. Therefore, the standard deviation \( \sigma \) was used to quantify the morphological variation of the generated particles, and the relationship between \( \sigma \) and \( \eta \) was explored. This study may facilitate the generation of realistic particle geometries for discrete element method simulations and geo-mechanical analysis aimed at understanding the macro-engineering behavior of granular soils.