1 Introduction

Broadly speaking, meshfree particle methods discretise the continuum domain into a set of particles (or nodal points) in order to obtain a computational solution of Partial Differential Equations (PDEs) [1]. This class of methods includes Smoothed Particle Hydrodynamics (SPH), Moving Particle Semi-Implicit (MPS), Moving Least Squares (MLS), Reproducing Kernel Particle (RKPM), Finite Point (FPM), Particle-in-Cell (PIC), Particle Finite Element (PFEM) and Molecular Dynamics (MD) methods, among others. Different techniques can be employed to search for neighbouring particles (or nodes), and for atoms in the specific case of MD [1,2,3,4,5,6,7,8,9,10,11,12,13,14].

The neighbour search must be performed at each numerical iteration in dynamic cases, which has a direct influence on the simulation time. In two-dimensional analyses, neighbour lists (linked and Verlet) are commonly employed in meshfree particle methods [15,16,17,18].

The cell-linked list and the Verlet list both divide the physical domain into a regular grid of congruent rectangles (2-D domain) or rectangular parallelepipeds (3-D) whose sides measure a cutoff radius. Particles are simply assigned to cells according to their spatial coordinates. Each cell contains a number of particles that can vary during the numerical simulation.

Currently, the octree technique [19, 20] has been applied in conjunction with parallelization, or CUDA GPU processing, in 3-D problems.

In particular, SPH and MPS methods are currently being applied to diverse areas, such as automotive, aeronautics and oil industries, engineering, biomechanics and medicine, geosciences, environmental sciences, oceanography, tribology, applied mathematics, physics and astronomy. The applications of those meshfree particle methods are increasing (some of them are listed below) and the computational efficiency—related to the neighbouring search technique employed—is an aspect that deserves attention.

  • Aerospace and aeronautics, automotive and energy production industries, marine and coastal engineering, environmental and geophysical problems [21,22,23,24].

  • Casting [25,26,27].

  • Medicine and biomechanics [28,29,30,31,32].

  • Free surface flows [33,34,35].

  • Wave–structure interactions [36, 37].

  • Floods and tsunamis [38,39,40].

  • Oil flow in porous rocks [41].

  • Multiphase flow and reactive transport in porous media [42].

  • Lubrication and tribology [43,44,45].

Abdelrazek et al. [46] presented a brief comparison between MPS and SPH characteristics.

This paper presents an evaluation of the computational efficiency of the neighbour lists (especially the linked list) in SPH applications. Using results obtained in previous studies by our research group [47, 48], which were validated by comparison with experimental and literature data, it was possible to draw conclusions about the efficiency of the linked list technique in two benchmark engineering problems: lid-driven cavity flow and dam breaking.

The contributions of this work lie in the presentation of the numerical implementation of a linked list, in the computational efficiency results obtained with and without the storage of neighbours in pairs, and in the discussion of the linked list optimization using the Verlet list.

2 Presentation of the Neighbour Search Techniques

2.1 Direct Search

The direct search is the simplest search technique employed in particle methods. The distance between two particles (a fixed reference particle and any other particle in the domain) is compared with the cutoff radius (kh): if the distance is smaller, the second particle is a neighbour of the first and is stored in a matrix. Figure 1 presents the flowchart of the direct search algorithm and the neighbourhood of a fixed particle.

Fig. 1 The direct search algorithm and the neighbourhood of a fixed particle
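As a hedged illustration of the direct search described above, the following Python sketch (not the authors' FORTRAN code) compares every particle with every other particle. The function name `direct_search`, the list-of-lists output and the exclusion of the reference particle from its own neighbour list are assumptions made for this example.

```python
import math


def direct_search(positions, kh):
    """Return, for each particle, the indices of all other particles closer than kh."""
    n = len(positions)
    neighbours = [[] for _ in range(n)]
    for a in range(n):              # fixed (reference) particle
        xa, ya = positions[a]
        for b in range(n):          # any other particle in the domain
            if a == b:
                continue            # assumption: a particle is not its own neighbour
            xb, yb = positions[b]
            if math.hypot(xa - xb, ya - yb) < kh:
                neighbours[a].append(b)
    return neighbours


# Small usage example with four particles (0-based indices).
example_positions = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.15), (1.0, 1.0)]
print(direct_search(example_positions, kh=0.2))
```

Every particle is tested against all the others, so the number of distance evaluations grows with the square of the number of particles.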

The particles and their neighbours are stored in a matrix, basically in one of two forms. Figure 2 shows the simplest form: a matrix whose number of rows is equal to the number of particles used in the discretization of the domain and whose number of columns is equal to the largest number of neighbours that any particle can have at a numerical iteration. At the beginning of the simulation the whole matrix is initialised with zeros and, as neighbours are found, their numbers are written into the matrix positions. When the conservation laws are solved by the meshfree particle method, the first zero in a matrix row indicates that there are no more neighbours for the corresponding reference particle (identified by the number in the first column of that row).

Fig. 2 The simplest form of storing the neighbour particles
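The zero-padded storage can be sketched in Python as follows. This is only an illustration under stated assumptions: particle numbers are 1-based (so that 0 can mark an empty position, as in the text), the first column holds the reference particle number, and the names `build_neighbour_matrix` and `neighbours_of` are hypothetical.

```python
def build_neighbour_matrix(neighbours, max_neighbours):
    """Build rows of the form [particle, neighbour_1, ..., 0, 0] with 1-based numbers."""
    matrix = []
    for particle, found in enumerate(neighbours, start=1):
        if len(found) > max_neighbours:
            raise ValueError("max_neighbours is too small for particle %d" % particle)
        padding = [0] * (max_neighbours - len(found))   # unused positions stay at zero
        matrix.append([particle] + list(found) + padding)
    return matrix


def neighbours_of(matrix, particle):
    """Read the neighbours of a particle, stopping at the first zero in its row."""
    result = []
    for b in matrix[particle - 1][1:]:
        if b == 0:
            break
        result.append(b)
    return result


# Particle 1 has neighbours 2 and 3; particles 2 and 3 each have neighbour 1.
example_matrix = build_neighbour_matrix([[2, 3], [1], [1]], max_neighbours=3)
print(example_matrix)
print(neighbours_of(example_matrix, 1))
```

The fixed row width makes the memory footprint depend on the largest neighbour count, not on the average one.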

Figure 3 shows the storage of neighbour particles in pairs [18]. This storage method optimises the interpolation process performed by the meshfree particle method (in the solution of physical conservation equations). The information about a pair of neighbouring particles is used in the interpolations performed for both particles, reducing the number of computational operations.

Fig. 3 The neighbour particles stored in pairs
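A minimal Python sketch of the pair storage is given below. It assumes a brute-force construction of the pair list and uses a simple neighbour count as a stand-in for an SPH summation; the names `build_pairs` and `count_neighbours` are hypothetical. The point it illustrates is that each pair is visited once and updates both particles.

```python
import math


def build_pairs(positions, kh):
    """Store each interacting pair (a, b) once, with a < b, together with its distance."""
    pairs = []
    n = len(positions)
    for a in range(n - 1):
        for b in range(a + 1, n):       # b > a, so no pair is stored twice
            r = math.dist(positions[a], positions[b])
            if r < kh:
                pairs.append((a, b, r))
    return pairs


def count_neighbours(n, pairs):
    """Single sweep over the pairs; each pair contributes to both of its particles."""
    counts = [0] * n
    for a, b, _ in pairs:
        counts[a] += 1
        counts[b] += 1
    return counts


example_positions = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.15), (1.0, 1.0)]
example_pairs = build_pairs(example_positions, kh=0.2)
print(count_neighbours(len(example_positions), example_pairs))
```

In an SPH code the body of the sweep would evaluate the kernel once per pair and add the corresponding contribution to both particles.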

2.2 Linked List

In the linked list technique, the domain is divided into a grid containing a certain number of cells. Each cell contains a number of particles that can vary during the numerical simulation. The neighbour search is limited to the cell that contains the reference particle and the cells in its vicinity (8 in 2-D and 26 in 3-D). A cell-linked list is created with information on the cell that contains the reference particle and on the cells in which the neighbouring particles can be located. This procedure is performed for each particle in the domain. The calculation of the distances between the reference particle and the other particles in the domain is thereby restricted to the region of linked cells (Fig. 4). Figure 5 presents the linked list algorithm implemented for solving a science/engineering problem. The cell-linked list needs to be updated at each numerical iteration, due to the migration of particles to other cells.

Fig. 4 The domain and the grid cells defined (with cutoff radius equal to kh). The reduced search region is shaded in light blue

Fig. 5 The linked list algorithm implemented

2.3 Verlet List

Domínguez et al. [15] and Viccione et al. [16] performed the optimization of the linked list using the Verlet list, whose creation requires the calculations needed to generate the cell-linked list.

In this technique, the neighbour list is built with the probable neighbouring particles of each reference particle in the domain, located inside a cutoff radius (kh + L) slightly larger than the support radius (kh) used in the linked list.

The definition of the cutoff radius takes into account the maximum possible displacement of the reference particle during a simulation time (related to a certain number of numerical iterations) in which the neighbour list will not be updated [16].

In the Verlet list technique, all particles within the cutoff radius (kh + L), as shown in Fig. 6, are added to a list of potential neighbour particles, but only those whose distances to the reference particle are less than or equal to kh are used in the interpolations of the particle method. The neighbour list is not updated at every numerical iteration, which saves processing time.

Fig. 6 The cutoff radius used in the Verlet list (kh + L) and the probable neighbours of the reference particle (in the centre of the circle)
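The idea can be sketched in Python as follows. Several simplifications are assumed here and are not taken from the source: the candidate list is built by brute force (not from the cell-linked list), the parameter `skin` plays the role of L, and the rebuild test uses the half-skin displacement criterion common in molecular dynamics, whereas the text describes keeping the list for a fixed number of iterations. All function names are hypothetical.

```python
import math


def build_verlet_list(positions, kh, skin):
    """List, for each particle, the candidates within the enlarged radius kh + skin."""
    cutoff = kh + skin
    n = len(positions)
    candidates = [[] for _ in range(n)]
    for a in range(n - 1):
        for b in range(a + 1, n):
            if math.dist(positions[a], positions[b]) <= cutoff:
                candidates[a].append(b)
                candidates[b].append(a)
    return candidates


def active_neighbours(positions, candidates, a, kh):
    """Keep only the candidates of particle a that currently lie within kh."""
    return [b for b in candidates[a]
            if math.dist(positions[a], positions[b]) <= kh]


def needs_rebuild(positions, reference_positions, skin):
    """Assumed criterion: rebuild once any particle has moved more than skin / 2."""
    return any(math.dist(p, p0) > 0.5 * skin
               for p, p0 in zip(positions, reference_positions))


start = [(0.0, 0.0), (0.18, 0.0), (0.23, 0.0), (1.0, 1.0)]
verlet = build_verlet_list(start, kh=0.2, skin=0.1)
later = [(0.0, 0.0), (0.18, 0.0), (0.195, 0.0), (1.0, 1.0)]  # particle 2 moved closer
print(active_neighbours(start, verlet, 0, kh=0.2))
print(active_neighbours(later, verlet, 0, kh=0.2))
print(needs_rebuild(later, start, skin=0.1))
```

Particle 2 enters the support radius of particle 0 without the list being rebuilt, because it was already stored as a candidate.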

In a first analysis, there are no losses in the accuracy in the location of neighbour particles. However, the choice of the cell size must be done carefully in order to guarantee the accuracy and the advantages of the technique.

3 Algorithmic Implementations

3.1 Direct Search

As explained earlier, the direct search is simple to implement. The distance between two particles (a fixed reference particle and any other particle in the domain) is compared with the cutoff radius, and the neighbour particles found are stored in a matrix, as shown in Figs. 2 and 3.

3.2 Linked List

The linked list involves more complex operations. A concise description of the algorithm implementation, in the FORTRAN programming language, is presented below.

Step 1. Firstly, the limits of the spatial region that will receive particles during the simulation must be provided and the coordinates of each cell must be defined.

figure a
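A possible reading of Step 1 is sketched below in Python (the original FORTRAN listing is not reproduced in this text). The sketch assumes square cells of side kh, cells numbered row by row from the lower-left corner, and a hypothetical function name `make_grid`.

```python
import math


def make_grid(x_min, x_max, y_min, y_max, kh):
    """Return the number of cells per direction and the lower-left corner of each cell."""
    nx = max(1, math.ceil((x_max - x_min) / kh))   # cells along x
    ny = max(1, math.ceil((y_max - y_min) / kh))   # cells along y
    corners = []
    for j in range(ny):
        for i in range(nx):
            corners.append((x_min + i * kh, y_min + j * kh))
    return nx, ny, corners


# A 1.0 x 0.5 region covered by cells of side 0.25.
nx, ny, corners = make_grid(0.0, 1.0, 0.0, 0.5, kh=0.25)
print(nx, ny, corners[5])
```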

Step 2. A routine that identifies the cells containing particles at each numerical iteration needs to be implemented. This procedure reduces the computational processing time, since the neighbouring particle search is performed in a reduced domain. In problems where the domain is large, there may be a considerable number of empty cells at each numerical iteration.

figure b

Step 3. The particles inside each cell must be identified by comparing their coordinates with the coordinates of each cell in the grid.

figure c
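Steps 2 and 3 can be illustrated together with the Python sketch below. It swaps in a different technique from the one described: the cell of a particle is computed directly by integer division of its coordinates, not by comparison with the coordinates of every cell. Particles are assumed to lie inside the gridded region, and the names `cell_index` and `assign_particles` are hypothetical.

```python
def cell_index(x, y, x_min, y_min, kh, nx, ny):
    """Map a position to the row-by-row index of the cell that contains it."""
    i = min(int((x - x_min) / kh), nx - 1)   # clamp points lying on the upper boundary
    j = min(int((y - y_min) / kh), ny - 1)
    return j * nx + i


def assign_particles(positions, x_min, y_min, kh, nx, ny):
    """Return {cell index: [particle indices]}; only non-empty cells appear as keys."""
    cells = {}
    for p, (x, y) in enumerate(positions):
        c = cell_index(x, y, x_min, y_min, kh, nx, ny)
        cells.setdefault(c, []).append(p)
    return cells


example_positions = [(0.05, 0.05), (0.30, 0.05), (0.95, 0.45), (0.06, 0.07)]
occupied = assign_particles(example_positions, 0.0, 0.0, 0.25, nx=4, ny=2)
print(sorted(occupied))   # indices of the non-empty cells (Step 2)
print(occupied)           # particles inside each of those cells (Step 3)
```

Storing only the occupied cells in a dictionary means that empty cells are never visited during the search.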

Step 4. The neighbour cells of each cell in the grid are identified. This allows the neighbouring particle search to be performed only in the neighbouring cells of a given cell (containing the fixed particle) and not in the whole domain, as in the direct search. During the simulations, the grid remained static. Thus, these operations are run only once, at the first iteration.

figure d
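The identification of neighbour cells might look like the following Python sketch. The name `cell_v` is borrowed from the text, but its layout here is an assumption: each entry lists the cell itself followed by its adjacent cells (up to 8 in 2-D), with cells numbered row by row.

```python
def neighbour_cells(nx, ny):
    """For every cell, list the cell itself and the adjacent cells that exist in the grid."""
    cell_v = []
    for j in range(ny):
        for i in range(nx):
            adjacent = []
            for dj in (-1, 0, 1):
                for di in (-1, 0, 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < nx and 0 <= jj < ny:   # skip cells outside the grid
                        adjacent.append(jj * nx + ii)
            cell_v.append(adjacent)
    return cell_v


cell_v = neighbour_cells(nx=4, ny=2)
print(cell_v[0])   # corner cell
print(cell_v[5])   # cell in the upper row
```

Because the grid is static, this table only has to be built once.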

Step 5. It is necessary to determine the cell in which the reference particle is located. The non-empty cells are scanned until the fixed particle is found, at which point the procedure finishes. The algorithm below presents the operations performed.

figure e

Step 6. After the cell A, in which the reference particle is located, has been found, the neighbour search is finally carried out. The neighbour cells of cell A were already identified during the grid generation, and the result is stored in the matrix cell_v. Thus, it is enough to check which particles (inside the neighbour cells of A) are neighbours of the reference particle. This check is performed by calculating the interparticle distances (which must be lower than kh). An excerpt of the algorithm is shown below.

figure f

As the final result, the neighbours of each particle of the domain are stored in the neighbours_matrix.
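Putting the steps together, a compact cell-based search could be sketched in Python as below. This is not the authors' implementation: it assumes square cells of side kh, particles inside the gridded region, and a loop over occupied cells that makes the scan of Step 5 unnecessary, since the cell of each particle is known from the binning. The function name `linked_list_search` is hypothetical.

```python
import math


def linked_list_search(positions, kh, x_min, y_min, nx, ny):
    """Find all neighbours closer than kh by visiting only adjacent occupied cells."""
    # Bin the particles: key (i, j) -> particle indices in that cell.
    cells = {}
    for p, (x, y) in enumerate(positions):
        i = min(max(int((x - x_min) / kh), 0), nx - 1)
        j = min(max(int((y - y_min) / kh), 0), ny - 1)
        cells.setdefault((i, j), []).append(p)

    neighbours = [[] for _ in positions]
    for (i, j), members in cells.items():
        for dj in (-1, 0, 1):
            for di in (-1, 0, 1):
                others = cells.get((i + di, j + dj))   # None for empty or outside cells
                if not others:
                    continue
                for a in members:
                    for b in others:
                        if a != b and math.dist(positions[a], positions[b]) < kh:
                            neighbours[a].append(b)
    for row in neighbours:
        row.sort()
    return neighbours


example_positions = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.15), (1.0, 1.0)]
print(linked_list_search(example_positions, kh=0.2, x_min=0.0, y_min=0.0, nx=6, ny=6))
```

For the same particle set and cutoff, the result should match that of a direct search, while the distance checks are confined to at most nine cells per particle in 2-D.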

4 Lagrangian Particle Modelling

The essence of particle methods is the discretization of the continuum domain into a finite number of particles and the use of interpolations to obtain approximations of the physical quantities at each particle of the domain. Only the particles in the vicinity of the reference particle (those within its domain of influence, that is, within the region of radius kh, in which k is a scaling factor that depends on the kernel used) contribute to the prediction of the physical properties of the reference particle. The contribution of each neighbouring particle to the value of the physical quantity decreases with its distance to the reference particle. Smoothing functions (kernels) are used in the interpolations of the particle method. The kernels used in this work can be found in [18], and are presented below.

Cubic spline:

$$ W(q,h) = \frac{15}{{7\pi h^{2} }}\left\{ {\begin{array}{*{20}l} {\left( {\frac{2}{3} - q^{2} + \frac{1}{2}q^{3} } \right),} \hfill & {0 \le q \le 1} \hfill \\ {\left( {\frac{1}{6}\left( {2 - q} \right)^{3} } \right),} \hfill & {1 < q \le 2} \hfill \\ {0,} \hfill & {{\text{in the other case}} .} \hfill \\ \end{array} } \right. $$
(1)

Gaussian:

$$ W(q,h) = \frac{1}{{\pi h^{2} }}\left\{ {\begin{array}{*{20}l} {e^{{ - q^{2} }} ,} \hfill & {0 \le q \le 3} \hfill \\ {0,} \hfill & {{\text{in the other case}} .} \hfill \\ \end{array} } \right. $$
(2)

New quartic:

$$ W(q,h) = \frac{15}{{7\pi h^{2} }}\left\{ {\begin{array}{*{20}l} {\left( {\frac{2}{3} - \frac{9}{8}q^{2} + \frac{19}{24}q^{3} - \frac{5}{32}q^{4} } \right),} \hfill & {0 \le q \le 2} \hfill \\ {0,} \hfill & {{\text{in the other case}} .} \hfill \\ \end{array} } \right. $$
(3)

Quintic spline:

$$ W(q,h) = \frac{7}{{478\pi h^{2} }}\left\{ {\begin{array}{*{20}l} {\left( {3 - q} \right)^{5} - 6\left( {2 - q} \right)^{5} + 15\left( {1 - q} \right)^{5} ,} \hfill & {0 \le q \le 1} \hfill \\ {\left( {3 - q} \right)^{5} - 6\left( {2 - q} \right)^{5} ,} \hfill & {1 < q \le 2} \hfill \\ {\left( {3 - q} \right)^{5} ,} \hfill & {2 < q \le 3} \hfill \\ {0,} \hfill & {{\text{in the other case}} .} \hfill \\ \end{array} } \right. $$
(4)

where

  • \( W \) is the interpolation function (kernel).

  • \( r \) is the position occupied by the particle in the domain.

  • \( h \) is the smoothing length.

  • \( a \) and \( b \) are subscripts referring to the reference particle and the neighbouring particle respectively.

$$ q = \frac{\left| r_{a} - r_{b} \right|}{h}. $$
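For illustration, Eqs. (1) to (4) can be transcribed into Python as below. The function names and the helper `integral_2d`, which estimates the integral of a kernel over its 2-D support with a midpoint rule, are assumptions of this sketch; q is assumed to be non-negative.

```python
import math


def cubic_spline(q, h):
    alpha = 15.0 / (7.0 * math.pi * h ** 2)
    if q <= 1.0:
        return alpha * (2.0 / 3.0 - q ** 2 + 0.5 * q ** 3)
    if q <= 2.0:
        return alpha * (2.0 - q) ** 3 / 6.0
    return 0.0


def gaussian(q, h):
    return math.exp(-q ** 2) / (math.pi * h ** 2) if q <= 3.0 else 0.0


def new_quartic(q, h):
    if q > 2.0:
        return 0.0
    alpha = 15.0 / (7.0 * math.pi * h ** 2)
    return alpha * (2.0 / 3.0 - 9.0 / 8.0 * q ** 2
                    + 19.0 / 24.0 * q ** 3 - 5.0 / 32.0 * q ** 4)


def quintic_spline(q, h):
    if q > 3.0:
        return 0.0
    alpha = 7.0 / (478.0 * math.pi * h ** 2)
    value = (3.0 - q) ** 5
    if q <= 2.0:
        value -= 6.0 * (2.0 - q) ** 5
    if q <= 1.0:
        value += 15.0 * (1.0 - q) ** 5
    return alpha * value


def integral_2d(kernel, h, q_max, steps=20000):
    """Midpoint-rule estimate of the integral of W over a disc of radius q_max * h."""
    dq = q_max / steps
    total = 0.0
    for k in range(steps):
        q = (k + 0.5) * dq
        total += kernel(q, h) * 2.0 * math.pi * (q * h) * (dq * h)
    return total


for name, kernel, q_max in [("cubic", cubic_spline, 2.0), ("gaussian", gaussian, 3.0),
                            ("quartic", new_quartic, 2.0), ("quintic", quintic_spline, 3.0)]:
    print(name, integral_2d(kernel, h=0.5, q_max=q_max))
```

The printed integrals should be close to one for the three polynomial kernels, which is the unity condition expected of a smoothing function; the truncated Gaussian falls slightly short of one because its tail beyond q = 3 is discarded.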

Figure 7 shows a continuous smoothing function and its utilisation in the interpolations at the domain discretised by particles.

Fig. 7 a The continuous smoothing function (W), defined within the domain of influence of radius kh. b W guarantees the greatest contribution of the nearest neighbouring particles (\( b \)) to the value of the physical quantity in the reference particle (\( a \))

In general, the modelling of fluid flow and energy transport is based on the physical conservation laws of mass, momentum and energy. In this work, two hydrodynamic cases were studied in a 2-D domain: dam breaking and lid-driven cavity flow. The fluid was considered Newtonian, incompressible and isothermal. The mass conservation and Navier–Stokes equations were discretised and solved by the Smoothed Particle Hydrodynamics method in both problems (Eqs. (5) and (6)):

$$ \frac{{{\text{d}}\rho_{a} }}{\text{dt}} = \sum\limits_{b = 1}^{n} {m_{b} } \left( {{\mathbf{v}}_{a} - {\mathbf{v}}_{b} } \right) \cdot \nabla W(\varvec{X}_{a} - \varvec{X}_{b} ,h) $$
(5)
$$ \begin{aligned} \frac{{{\text{d}}{\mathbf{v}}_{a} }}{\text{dt}} & = - \sum\limits_{b = 1}^{n} {m_{b} \left( {\frac{{P_{a} }}{{\rho_{a}^{2} }} + \frac{{P_{b} }}{{\rho_{b}^{2} }}} \right)\nabla W(\varvec{X}_{a} - \varvec{X}_{b} ,h)} + \\ & \quad 2\upsilon_{a} \sum\limits_{b = 1}^{n} {\frac{{m_{b} }}{{\rho_{b} }}} ({\mathbf{v}}_{a} - {\mathbf{v}}_{b} )\frac{{\varvec{X}_{a} - \varvec{X}_{b} }}{{\left| {\varvec{X}_{a} - \varvec{X}_{b} } \right|^{2} }} \cdot \nabla W(\varvec{X}_{a} - \varvec{X}_{b} ,h) + {\mathbf{g}} \\ \end{aligned} $$
(6)

where d/dt is the material derivative, \( m \) is the mass, \( n \) is the number of neighbouring particles within the influence domain, \( {\mathbf{v}} \) is the velocity, t is the time, \( \nabla \) is the nabla operator, \( W \) is the kernel, \( \varvec{X} \) is the particle position, \( h \) is the smoothing length, \( P \) is the absolute pressure, \( \upsilon \) is the kinematic viscosity, \( \rho \) is the density, \( {\mathbf{g}} \) is the gravitational acceleration, and the subscripts \( a \) and \( b \) refer to the reference particle and the neighbouring particle, respectively.
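To make the role of the neighbour list in Eq. (5) concrete, the Python sketch below evaluates the density rate of one particle from a given list of neighbours. It assumes the cubic spline kernel of Eq. (1), whose gradient is obtained here by differentiating that expression with respect to q; the function names and the small lattice used as input are hypothetical.

```python
import math


def grad_cubic_spline(dx, dy, h):
    """Gradient of the 2-D cubic spline with respect to the reference particle position."""
    r = math.hypot(dx, dy)
    q = r / h
    if r == 0.0 or q > 2.0:
        return (0.0, 0.0)
    alpha = 15.0 / (7.0 * math.pi * h ** 2)
    if q <= 1.0:
        dw_dq = alpha * (-2.0 * q + 1.5 * q ** 2)
    else:
        dw_dq = -0.5 * alpha * (2.0 - q) ** 2
    factor = dw_dq / (h * r)          # chain rule: dW/dr = (dW/dq) / h, direction (dx, dy) / r
    return (factor * dx, factor * dy)


def density_rate(a, neighbours, positions, velocities, masses, h):
    """Right-hand side of Eq. (5) for particle a, summed over its neighbours b."""
    xa, ya = positions[a]
    ua, va = velocities[a]
    rate = 0.0
    for b in neighbours:
        xb, yb = positions[b]
        ub, vb = velocities[b]
        gx, gy = grad_cubic_spline(xa - xb, ya - yb, h)
        rate += masses[b] * ((ua - ub) * gx + (va - vb) * gy)
    return rate


# 3 x 3 lattice with spacing 0.1; the centre particle (index 4) is the reference particle.
lattice = [(0.1 * i, 0.1 * j) for j in range(3) for i in range(3)]
others = [0, 1, 2, 3, 5, 6, 7, 8]
unit_masses = [1.0] * 9
expanding = list(lattice)             # velocity equal to position: a uniform expansion
print(density_rate(4, others, lattice, expanding, unit_masses, h=0.1))
```

For the expanding velocity field the computed rate is negative (the density decreases), while a uniform velocity field gives a zero rate.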

For an incompressible and isothermal flow, the rate of change of the internal energy in time is null. Due to this, the energy conservation equation was not solved in the cases presented in this work.

4.1 Dam Breaking

The studies of dam breaks are of great importance in the prevention of accidents, which can cause serious environmental consequences, besides being a risk to the resident populations in the vicinity of the dams.

The simulated 2-D geometry was a tank with a length of 0.420 m and a height of 0.440 m, as shown in Fig. 8. The dammed water, located at the right side of the reservoir, had a height of 0.114 m and a width of 0.114 m (aspect ratio equal to 1).

Fig. 8 The simulated geometry and the initial particle setup of the dammed water

A static grid of 52 cells along the length of the tank and 15 cells arranged in the y direction (up to the dammed water line) was defined. At each numerical iteration, the cells containing particles were identified. Thus, the neighbour search operations were performed in a reduced domain (see Step 2 in Sect. 3.2). Cubic spline, new quartic and quintic spline kernels [16] have been used in the SPH interpolations. The timestep was \( 1.0 \times 10^{-5} \) s, and 30,000 numerical iterations were performed in the simulations. Figure 8 shows the initial setup of the particles used in the discretization of the dammed water.

Figure 9 presents the evolution of the dam breaking flow in different instants of time. Complete simulation results and validation of the collapse of dammed water can be found in [47].

Fig. 9 Evolution of the dam breaking flow until the collision of the wave against the right wall of the reservoir

The simulations (using the direct search and the linked list techniques) finished when the wave reached the left wall of the reservoir (t = 0.30 s). The neighbouring particles were stored in the simplest possible form, using a matrix such as that shown in Fig. 2. The simulation processing times for different combinations of interpolation function/number of particles are shown in Table 1.

Table 1 Processing times for different combinations of kernel/number of particles in the dam breaking simulations

Considering the surprising results found in the first simulations, with lower performance of the linked list, more studies were carried out aiming to optimise the first algorithm implemented.

The literature [18] presents another form of storing neighbouring particles (in pairs), as shown in Fig. 3. With this technique, fewer computational operations are performed and a decrease in CPU time is expected. New simulations were performed for the 2-D lid-driven cavity flow, as presented in the next subsection.

4.2 2-D Lid-Driven Cavity Flow

Cavities are found in environmental and atmospheric engineering problems. Depressions and valleys, dynamics of lakes, fuselage and wings of airplanes, boat hulls and car bodies, sports stadiums, systems for the continuous deposition of photosensitive films on substrates and photographic papers, material processing, metal casting and galvanizing are situations in which fluid flows in cavities are found [48].

The sides of the simulated square cavity measured \( 1.0 \times 10^{-3} \) m. Grids of 12 × 12, 16 × 16 and 20 × 20 cells, disposed in the x and y directions, were used in the simulations (when 30 × 30, 40 × 40 and 50 × 50 particles per side of the cavity were employed, respectively). The constant velocity of the box cover (lid) was \( 1.0 \times 10^{-4} \) m/s. Cubic and quintic spline kernels have been used in the SPH interpolations. The timestep was \( 5.0 \times 10^{-5} \) s and 30,000 numerical iterations were performed.

Figure 10 shows streamlines inside the square cavity and the formation of the primary vortex when the steady state has been reached. A more complete analysis of this problem can be found in [48]. The simulation processing times are shown in Table 2 and Figs. 11, 12, and 13.

Fig. 10 Streamlines in certain instants of time and the formation of the primary vortex at the steady state (the quintic spline kernel and 40 × 40 particles per side of cavity were employed in the simulation)

Table 2 Processing times for different combinations of kernel/number of particles in the lid-driven cavity flow simulations

Fig. 11 CPU processing times for simulations using the cubic spline kernel

Fig. 12 CPU processing times for simulations using the Gaussian kernel

Storing the neighbouring particles in pairs improved the computational efficiency of the linked list. The differences between the simulation times provided by this technique and the direct search decreased significantly (which is in accordance with the conclusions presented in [18]). Even so, the analysis of the CPU times showed that the computational efficiency of the linked list technique was only similar to that of the direct search.

Fig. 13 CPU processing times for simulations using the quintic spline kernel

5 Concluding Remarks

From the analysis of the results it is possible to conclude that the simple use of the linked list technique is not sufficient for the best efficiency of the neighbouring particle searching. The use of the storage of neighbouring particles in pairs was very important for the reduction of computational time, but it was not enough for a great improvement in the linked list simulation time. It was verified that a linked list optimisation method could improve performance.

The Verlet list is a proposed optimisation technique that reuses the calculations required in the generation of the cell-linked list. Despite the advantages that not updating the list of neighbours at each numerical iteration can provide, the choice of an appropriate cutoff radius is fundamental to ensure the good performance of this technique.

The literature shows, however, that the choice of the appropriate neighbour list technique (linked or Verlet) is not so simple. The comparative advantage of the cell-linked list increases as the cell size is reduced, to the point at which the Verlet list approach is practically useless [16].

CPU simulations of a 3-D dam break and its interaction with a rectangular structure, using the SPH particle method, are presented in [15]. In terms of memory requirements, the Verlet list was less efficient than the linked list. The performance of the Verlet list depended on the number of steps (ns) during which the list was kept unaltered. For low and high values of ns, the linked list method was faster than the Verlet list. Only in an intermediate region was the Verlet list faster, with a maximum improvement close to 6%, for ns equal to 7 timesteps. The runtime improvement is rather moderate, especially considering that the memory requirements of the Verlet list are higher than those of the linked list. An improvement in runtime higher than 8% was obtained when a change in the formulation of the SPH method (using a kernel gradient correction) was used.

In summary, the Verlet list is an attempt to optimise the linked list whose results are not the best in all cases. From the literature results [15], it is concluded that a wise choice of some parameters, such as the cell size and the number of timesteps during which the neighbour list remains unchanged, is necessary to reach a reasonable improvement in the processing time (below 10%). In addition, the memory requirement of the Verlet list was higher than that of the linked list in CPU simulations.

Recently, the literature [49] presented a parallel fast neighbour search method and communication strategy for meshless particle methods. An algorithm with a multi-resolution-based hierarchical data structure is employed to construct ghost buffer particles (neighbour particles on remote processor nodes). Cell-linked and Verlet lists were applied in this search technique. Two applications of the parallel algorithm in fluid dynamics simulations using the SPH method were performed: (a) with an adaptive smoothing length and (b) employing a new physics-driven partitioning method [50]. The results demonstrated the accuracy and efficiency of the construction of the ghost buffer particles and showed that the new partitioning method is promising.