Introduction

In recent years, the fields of satellite and mathematical geodesy have benefited greatly from advances in the adjustment and computation of survey field data. These techniques have been instrumental in evaluating error magnitudes and establishing tolerance thresholds (Yakubu et al. 2018). Furthermore, because the collection of survey field data often involves redundant observations, these data must be adjusted to ensure consistency (Ghilani and Wolf 2006). The pursuit of accurate survey data adjustment remains a focal point for geodesists, geophysicists, surveyors, topographers, and other scholars (Asenso-Gyambibi et al. 2022). GPS observations, a vital component of surveying, are not immune to error. Factors such as satellite orbital errors, signal-timing errors caused by atmospheric conditions, receiver errors, multipath errors due to reflection from nearby buildings or other surfaces, miscentering of the reception antenna over the ground station, and receiver height-measurement errors all contribute to inaccuracies (Ghilani and Wolf 2006). These signal errors degrade the accuracy of point positions. To mitigate them and improve positional precision, duplicate observations are made, which in turn necessitates the adjustment of the field data (Ghilani and Wolf 2006).

The field of geodesy demands rigorous adjustment and computation methods. Over the years, various techniques have been employed, including the Kalman filter (KF) (Bashirov et al. 2008; Wang 2014), least squares collocation (LSC) (Ruffhead 2012), total least squares (TLS) (Schaffrin 2006), and kriging methods (Erol and Celik 2005; Kleijnen 2009). The ordinary least squares (OLS) method has conventionally served as the go-to technique for adjusting surveying networks (Okwuashi and Eyoh 2012). OLS treats the observation equations as stochastic and minimizes the sum of squared residuals, addressing errors solely in the observation matrix. Many geoscientific researchers have successfully applied OLS to tackle various scientific challenges (Annan et al. 2016). To improve on OLS, the total least squares (TLS) method was developed (Annan et al. 2016). However, the singular value decomposition (SVD) used in TLS does not preserve the structure of the extended data matrix in typical scenarios, potentially affecting its statistical optimality (Golub and Van Loan 1989; Peprah and Mensah 2017). Additionally, least squares collocation (LSC) requires changes in the cross-covariance function to maintain its modeling effectiveness (Ophaug and Gerlach 2017). These classical methods cannot meet current precision requirements, calling for more advanced techniques (Lee et al. 2020).

In parallel to these developments, the field of artificial intelligence (AI) has made significant strides. The ability to simulate human functions, traits, and features has been a hallmark of AI's success over traditional methods (Maidaniuk et al. 2022). Akcin and Celik (2013) conducted a comparative test, pitting local geoid heights derived from trained artificial neural network (ANN) models against heights predicted using kriging methods in the Bursa Metropolitan Area, Turkey. The results favored the trained ANN models, highlighting their potential in geodesy and survey data adjustment. ANN, particularly the back-propagation artificial neural network (BPANN), generalized regression artificial neural network (GRANN), and radial basis function artificial neural network (RBFANN), has gained traction in various mathematical and satellite geodesy applications (Ziggah et al. 2017). These applications encompass adjusting differential Global Positioning System (DGPS) data, coordinate transformation, modeling DGPS data uncertainties, tidal height prediction, road damage detection after earthquakes, orthometric height prediction in mining, air overpressure prediction, soil nutrient forecasting, rainfall prediction, and more. The present study aims to fill the knowledge gap of establishing which is the more accurate option, artificial intelligence techniques (ANN) or classical methods, for adjusting real-time kinematic (RTK) GPS field data.

Despite these advancements, it is evident from the existing literature that ANN approaches have seen limited use in Ghana, and their suitability for adjusting RTK GPS data remains underexplored owing to a lack of technical know-how and poor broadband connectivity. A comprehensive comparison of classical regression techniques and AI model training methods for rectifying RTK GPS data in Ghana is yet to be undertaken. This study presents the ANN approach as an effective technique for enhancing RTK GPS survey field data. The study aims to investigate the application of BPANN, GRANN, and RBFANN as alternative technologies to existing classical approaches. It evaluates the effectiveness of these strategies in adjusting survey field data, employing various performance criteria, including horizontal displacement (HD), arithmetic mean error (AME), arithmetic mean square error (AMSE), minimum and maximum residual values, and arithmetic standard deviation (ASD). These indices assess the strength of each model in adjustment by measuring how close the predictions are to the actual values. Thus, the study offers Ghanaian researchers and practitioners a valuable opportunity to assess the potential of soft computing techniques in addressing some of the country’s geodetic challenges and in advancing models for survey data adjustment. Beyond its impact on Ghanaian geodesists, this study contributes substantively to the scholarly discourse surrounding the application of soft computing techniques to challenges within surveying and related fields. By advancing the understanding of these techniques and their efficacy, it paves the way for more precise and efficient surveying practices in a rapidly evolving technological landscape.

Study area

The Greater Kumasi Metropolitan Area (GKMA), which includes inner Kumasi and nearby municipalities and districts such as Asokore Mampong, Ejisu, Juaben, Kwabre East, Afigya Kwabre, Atwima Kwanwoma, and Atwima Nwabiagya, is located in the Ashanti Region of Ghana. It lies between latitudes 6° 35′ N and 6° 40′ N and longitudes 1° 30′ W and 1° 35′ W, at an elevation of between 250 and 350 m above sea level (Ghana Statistical Service 2014). A significant river (Owabi) and streams including the Subin, Wiwi, Sisai, Aboabo, and Nsuben cross the undulating ground (Ghana Statistical Service 2014; Caleb et al. 2018). Hence, the intervisibility of observation points in geodetic surveys is affected by the undulating terrain and the network of streams. With a total population of 3,190,473, it has a land area of 2603 km2 (Oduro et al. 2014). The area experiences rapid population growth and an associated rise in built-up areas, resulting in increasing multipath errors over the years. The War Office 1926 ellipsoid serves as the research area’s horizontal geodetic datum, and mean sea level (MSL), which approximates the geoid, serves as the study area’s vertical datum (Peprah and Kumi 2017). The Ghana projected grid, derived from the Transverse Mercator with 1° W central meridian, and the World Geodetic System 1984 (WGS84) (UTM Zone 30N) are the coordinate systems utilized in the research region (Yakubu et al. 2018). The climate of the Metropolis is classified as wet sub-equatorial. The average maximum temperature is around 30.7 °C, while the average minimum temperature is about 21.5 °C. At sunrise and sunset, the average humidity is about 84.16% and 60%, respectively (Caleb et al. 2018). A clear or unobstructed view of the sky should be sought for best results in RTK GPS surveys. There is a double-maximum rainfall regime in the research area, with roughly 214.3 mm in June and 165.2 mm in September (Ghana Statistical Service 2014).
The area experiences a considerable amount of rain, which tends to scatter or absorb GPS signals, reducing positioning accuracy through the bending of signals by water droplets. The Metropolis is located in the moist semi-deciduous South-East Ecological Zone, which is a transitional forest zone (Ghana Statistical Service 2014). The forest canopy can introduce errors such as receiver and multipath errors into GPS surveys; hence, duplicate observations are made, motivating the search for the most accurate technique for adjusting GPS field data. Middle Precambrian rock predominates in the study area, which also contains two major lithostratigraphic and lithotectonic complexes, namely the Paleoproterozoic supracrustal and intrusive rocks and the Neoproterozoic to early Cambrian lithologically diverse platform sediments. The study area’s distinctive geological structure has contributed to the growth of the construction industry in the Metropolis, with a few small-scale mining operations and the proliferation of stone quarrying and mining (Osei-Nuamah and Appiah-Adjei 2017). Geodetic data is employed by the construction and mining industries in activities such as building and road set-outs, ore mark-outs, topo pick-ups, and the demarcation of mine concessions. A diagram of the placement of control sites within the research region is shown in Fig. 1.

Fig. 1
figure 1

Map of the study area

Resources and methods used

Resources used

Topographic data from a survey conducted by the authors for the Ghana urban water supply project in the Greater Kumasi Metropolitan Area (GKMA) served as the study’s primary source of data. The sample data comprise a total of 1000 control points acquired from September 5, 2022 to October 15, 2022 using real-time kinematic (RTK) GPS equipment. This was to provide enough data to assess model performance and identify patterns, trends, and any anomalies. For the chosen controls of the study area, the data consist of three-dimensional coordinates, namely eastings, northings, and ellipsoidal heights, designated as (E, N, h), which were recorded using the RTK GPS device. These data form the actual observation points; hence, the best model for adjustment should be able to predict these values accurately. Table 1 shows a sample of the dataset collected from the field.

Table 1 Sample of data used for the study (unit in meters)

Methods used

Radial basis function artificial neural network (RBFANN)

The RBFANN model is a learning algorithm built on functional approximation. It is made up of an input layer, a hidden layer, and an output layer, each of which serves a different purpose. Sensory elements in the input layer link the network to its surroundings. The network’s sole hidden layer applies a nonlinear transformation from the input space to the hidden space. The linear output layer provides the network’s response to the activation pattern presented at the input. In this study, the input variables were the 2D horizontal coordinates, denoted as \(\left({N}_{i,j},{E}_{i,j}\right)\), and the output variables were the adjusted \(\left({E}_{i,j}\right)\) and \(\left({N}_{i,j}\right)\), respectively. These variables were chosen in the model formulation because they were obtained directly from the field. The dataset used to create the model was split between training and testing data, with training data making up 70% of the overall dataset. A linear function is utilized in the input neurons because RBFANN is an exact interpolator (Erdogan 2009), and the connections between the input and hidden layers are not weighted (Kaloop et al. 2017). This study applies the Gaussian function, which provides an optimal approximation, and the weighted hidden-layer output from Eq. (1) is passed to the output neuron.

$$y\left(x\right)=\sum_{j=1}^{n}{K}_{j}{\chi }_{j}(x)$$
(1)

where n is the number of hidden neurons, \(x \in {R}^{M}\) is the input, \({K}_{j}\) are the output layer weights of the radial basis function network, and \({\chi }_{j}(x)\) is the Gaussian radial basis function given by Eq. (2) (Idri et al. 2010; Srichandan 2012):

$${\chi }_{j}\left(x\right)=e\left(\frac{{-\parallel {x}_{i}-{c}_{j}\parallel }^{2}}{{{\sigma }_{j}}^{2}}\right)$$
(2)

where \({c}_{j}\in {R}^{M}\) and \({\sigma }_{j}\) are the center and width of the jth hidden neuron, respectively, and \(\Vert \cdot \Vert\) denotes the Euclidean norm.
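As a concrete illustration of Eqs. (1) and (2), the following minimal Python sketch evaluates a trained RBF network at a single input point. The centers, widths, and output weights here are made-up values chosen for illustration (in practice they result from training), not values from this study.

```python
import numpy as np

def rbf_predict(x, centers, widths, weights):
    """Evaluate Eq. (1): y(x) = sum_j K_j * chi_j(x), with Gaussian
    basis functions chi_j(x) from Eq. (2)."""
    # Squared Euclidean distances ||x - c_j||^2 to each hidden-neuron center
    d2 = np.sum((centers - x) ** 2, axis=1)
    chi = np.exp(-d2 / widths ** 2)   # Gaussian activations, Eq. (2)
    return weights @ chi              # weighted linear output layer

# Toy example: 3 hidden neurons in a 2-D input space
centers = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])  # c_j
widths = np.array([0.8, 0.8, 0.8])    # sigma_j (cf. the spread parameter)
weights = np.array([1.0, -0.5, 0.25]) # K_j, normally found by training
y = rbf_predict(np.array([0.5, 0.5]), centers, widths, weights)
```

Because the output layer is linear, training reduces to solving for the weights \(K_j\) by least squares once the centers and widths are fixed.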

Back-propagation artificial neural network (BPANN)

BPANN is an efficient and popular multilayer perceptron (MLP) model owing to its straightforward implementation (Yilmaz et al. 2017; Ziggah et al. 2016). A BPANN consists of an input layer with M inputs, a hidden layer with q units, and an output layer with n outputs. In this investigation, the two-dimensional (2D) horizontal coordinates \(\left({N}_{i,j},{E}_{i,j}\right)\) served as the M inputs, the q units were determined by trial-and-error training in which the number of hidden neurons was varied, and the n outputs were the estimated \(\left({N}_{i,j}\right)\) and \(\left({E}_{i,j}\right)\) produced by the BPANN model. The output of the model (\({y}_{i}\)) with a single output neuron is represented by Eq. (3) (Mihalache 2012):

$$y\left(i\right)=f\left(\sum_{j=1}^{q}{W}_{j}f\left(\sum_{i=1}^{M}{w}_{j,i}{x}_{i}\right)\right)$$
(3)

where \({W}_{j}\) is the weight between the hidden layer and the output layer, \({w}_{j,i}\) is the weight between the input layer and the hidden layer, and \({x}_{i}\) is the input parameter. In this study, the selected input and output variables were normalized into the interval [− 1, 1] using Eq. (4) (Mueller and Hemond 2013) given as

$$Z\left(i\right)={y}_{\text{min}}+\frac{({y}_{\text{max}}-{y}_{\text{min}})\times ({x}_{i}-{x}_{\text{min}})}{({x}_{\text{max}}-{x}_{\text{min}})}$$
(4)

where \(Z(i)\) represents the normalized data, \({x}_{i}\) is the measured coordinate values, while \({x}_{min}\) and \({x}_{max}\) represent the minimum and maximum values of the measured coordinates, with \({y}_{max}\) and \({y}_{min}\) set at 1 and − 1, respectively. The best model was selected based on the lowest arithmetic mean error (AME), arithmetic mean square error (AMSE), minimum residual error (\({r}_{min}\)), maximum residual error (\({r}_{max}\)), and arithmetic standard deviation (ASD). Their mathematical formulation is described in the section on model performance assessment. In keeping with the approach in the literature, the current investigation used one hidden layer in the BPANN (Hornik et al. 1989). Moreover, the hyperbolic tangent activation function was chosen for the hidden units to incorporate non-linearity into the network, whereas a linear function was used for the output units. Equation (5) gives the definition of the hyperbolic tangent function (Yonaba et al. 2010):

$$Z\left(x\right)=\text{tanh}\left(\text{x}\right)=\frac{2}{1+{e}^{-2x}}-1$$
(5)

where x is the total weighted input. The systematic methodology used in this work, comprising ANN and traditional regression techniques, is depicted in the flowchart in Fig. 2. The dataset was trained independently using the Levenberg–Marquardt algorithm (trainlm) and the Bayesian regularization training technique (trainbr). A sequential trial-and-error approach was adopted to achieve the optimal results by varying the number of hidden neurons from 1 to 50.
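The normalization of Eq. (4) and the single-hidden-layer forward pass of Eqs. (3) and (5) can be sketched as follows; the weight matrices and input values are invented toy quantities, not trained values from this study.

```python
import numpy as np

def normalize(x, y_min=-1.0, y_max=1.0):
    """Min-max scaling into [-1, 1], Eq. (4)."""
    return y_min + (y_max - y_min) * (x - x.min()) / (x.max() - x.min())

def bpann_forward(x, W_hidden, W_out):
    """Forward pass of Eq. (3): tanh hidden units (Eq. (5)),
    linear output layer."""
    h = np.tanh(W_hidden @ x)  # hidden-layer activations
    return W_out @ h           # linear output

x = normalize(np.array([651200.0, 651350.0, 651500.0]))  # e.g. eastings
W_hidden = np.full((4, 3), 0.1)  # 4 hidden neurons, 3 inputs (toy sizes)
W_out = np.full((1, 4), 0.25)
y = bpann_forward(x, W_hidden, W_out)
```

Normalizing inputs into [− 1, 1] keeps the tanh units away from their flat saturation regions, which is why this step precedes training.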

Fig. 2
figure 2

Flow chart of methodology

Generalized regression artificial neural network (GRANN)

GRANN, first introduced by Specht (1991), is a single-pass neural network based on a general regression theorem. It is a variant of the radial basis function neural network (RBFANN), built on a kernel regression network (Hannan et al. 2010) with a one-pass learning algorithm and a highly parallel structure (Dudek 2011). Four layers make up the GRANN: the input layer, the pattern layer (also known as the radial basis layer), the summation layer, and the output layer. In this investigation, the northings and eastings, designated as \(({N}_{i,j}, {E}_{i,j})\), were the input variables (independent datasets), and the output variables (dependent datasets) were the adjusted northings and eastings, denoted as \(\left({N}_{i,j}\right)\) and \(({E}_{i,j})\), respectively. The total number of observational parameters determines how many input units there are in the first layer. A training pattern and its output are delivered to each neuron in the pattern layer, which is coupled to the first layer. The summation layer and the pattern layer are interconnected. The division unit and the summation unit are the two types of units that make up the summation layer (Hannan et al. 2010). Together, the summation and output layers normalize the output datasets. Radial basis and linear activation functions are utilized in the hidden and output layers of the network during training. Two neurons in the summation layer are coupled to each unit in the pattern layer. One neuron unit computes the unweighted outputs of the pattern neurons, while the other computes the weighted response of the pattern. The estimated output variables are produced in the output layer by dividing the output of one neuron unit by the other. As in other network architectures, the input layer receives the inputs from the datasets, the pattern layer computes the Euclidean distances, and the summation layer comprises the numerator and denominator parts.
The output is estimated using the weighted average of the training dataset outputs in which the weight is computed using the Euclidean distance between the training and test data, respectively. In GRANN model formulation, the spread constant constitutes the adjustable parameter that needs to be varied until the optimal results are achieved. The weighted hidden output layer from Eq. (6) is added to the output neuron in the current study, which applies the Gaussian function (Erdogan 2009) as

$$y\left(x\right)=\sum_{j=1}^{n}{K}_{j}{\chi }_{j}(x)$$
(6)

where n is the number of hidden neurons, \(x \in {R}^{M}\) is the input, \({K}_{j}\) are the output layer weights of the radial basis function network, and \({\chi }_{j}\left(x\right)\) is the Gaussian radial basis function given by Eq. (7) (Idri et al. 2010; Srichandan 2012):

$${\chi }_{j}\left(x\right)=e\left(\frac{{-\parallel {x}_{i}-{c}_{j}\parallel }^{2}}{{{\sigma }_{j}}^{2}}\right)$$
(7)

where \({c}_{j}\in {R}^{M}\) and \({\sigma }_{j}\) are the center and width of the jth hidden neuron, respectively, and \(\Vert \cdot \Vert\) denotes the Euclidean norm.
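The ratio computed by the GRANN summation and output layers (kernel-weighted sum of training outputs divided by the sum of kernels) can be sketched in a few lines; the training points and the query below are invented for illustration, and the spread value is arbitrary rather than the tuned value from this study.

```python
import numpy as np

def grnn_predict(x, X_train, Y_train, spread):
    """GRANN estimate at x: the pattern layer computes Euclidean
    distances, the summation layer forms a kernel-weighted sum of
    training outputs (numerator) and a sum of kernels (denominator),
    and the output layer takes their ratio."""
    d2 = np.sum((X_train - x) ** 2, axis=1)   # pattern-layer distances
    k = np.exp(-d2 / (2.0 * spread ** 2))     # Gaussian kernel weights
    return (k @ Y_train) / k.sum()            # division unit

X_train = np.array([[0.0], [1.0], [2.0]])
Y_train = np.array([0.0, 1.0, 2.0])
y = grnn_predict(np.array([1.0]), X_train, Y_train, spread=0.5)
```

Because the prediction is a weighted average of the stored training outputs, the only quantity to tune is the spread constant, which is why the study varies it by trial and error.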

Least square regression model (LSR)

The least squares approach is a statistical method that identifies the line of best fit for a model by minimizing the sum of squared residuals. The technique is frequently employed in regression analysis and estimation (Miller 2006; Peprah and Mensah 2017). Consider a set of equations whose least squares solution is indicated by Eq. (8):

$$DZ\approx L$$
(8)

where \(D\in {R}^{m\times n}, Z\in {R}^{n\times d}, L\in {R}^{m\times d}, m\ge n\) (Annan et al. 2016; Schaffrin 2006). \(D\) is the design matrix, \(Z\) is the matrix of the unknown parameters, and \(L\) is the observation matrix.

Equation (9) indicates that the unknown parameters matrix \(Z\) can be solved using the ordinary least squares method:

$$Z={\left[{D}^{T}D\right]}^{-1}\left[{D}^{T}L\right]$$
(9)

Equation (10) can be used to obtain the relevant error vector:

$$V=DZ-L$$
(10)

Kumi-Boateng and Ziggah (2020) improved the estimation performance of classical least squares using artificial neural network techniques for coordinate transformation in Ghana. The findings of their study revealed that integrating least squares with a neural network yielded better transformation precision than either approach on its own. Moreover, Yakubu et al. (2018) explored the efficiency and performance of ANN relative to classical least squares in adjusting DGPS survey field data. The results of their investigations showed that both ANN and least squares adjusted the data to good precision, and that ANN provides an alternative approach to adjusting survey field data.

Polynomial regression model (PRM)

In this study, the northings and eastings were modeled and predicted using a polynomial mathematical model. The independent variables were the horizontal coordinates (N, E). Equation (11) provides the general expression of an m-degree polynomial interpolation (Yilmaz et al. 2017):

$$Z\left(N,E\right)={\sum }_{i=0}^{m}{\sum }_{j=0}^{m-i}{a}_{ij}{N}^{i}{E}^{j}$$
(11)

where \({Z}_{(N, E)}\) is the ellipsoidal height information of the point with known horizontal coordinates \((N,E)\) and \({a}_{i,j}\) are the unknown polynomial coefficients to be estimated \((i, j=0, \dots , m)\). The effectiveness of the simple planar (SP) polynomial model in calculating local heights was demonstrated by Dawod et al. (2022) and Peprah and Kumi (2017). The general SP polynomial model is denoted by Eq. (12) given as

$${Z}_{ij}={a}_{o}+{a}_{1}N+{a}_{2}E$$
(12)

where \({Z}_{ij}\) is the estimated ellipsoidal height, \((N,E)\) are the horizontal coordinates of the stationary positions, and \({a}_{i,j}\) are the unknown parameters that can be determined using the least squares approach. The PRM model has performed well in a range of geodetic problems, notably estimating orthometric heights in a mine (Peprah and Kumi 2017) and geoid modeling (Dawod et al. 2022).
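Fitting the SP model of Eq. (12) is itself a least squares problem; a minimal sketch follows, using synthetic coordinates and heights invented for illustration (the true heights are generated from a known plane so that the fit can be checked).

```python
import numpy as np

# Fit Eq. (12), Z = a0 + a1*N + a2*E, by least squares.
N = np.array([700100.0, 700200.0, 700300.0, 700150.0])  # northings
E = np.array([651200.0, 651250.0, 651300.0, 651350.0])  # eastings
Z = 250.0 + 0.001 * (N - 700000.0) - 0.002 * (E - 651000.0)  # synthetic plane

D = np.column_stack([np.ones_like(N), N, E])  # design matrix [1, N, E]
a, *_ = np.linalg.lstsq(D, Z, rcond=None)     # unknown parameters a0, a1, a2
Z_fit = D @ a                                 # estimated heights
```

With real data the residuals \(Z - Z_{fit}\) would be non-zero and would feed the performance indices described below; here the data lie exactly on a plane, so the fit reproduces them.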

Model performance assessment

Statistical error analysis was done to evaluate the accuracy of the models being used. To produce a comprehensive model performance analysis consistent with the training and testing results, the dimensioned error statistics used were the horizontal displacement (HE), arithmetic mean error (AME), arithmetic mean square error (AMSE), minimum and maximum residuals (rmin and rmax), and arithmetic standard deviation (ASD). Their mathematical expressions are provided by Eqs. (13) through (18), respectively.

$$HE=\sqrt{{\left({E}_{2}-{E}_{1}\right)}^{2}+{\left({N}_{2}-{N}_{1}\right)}^{2}}$$
(13)
$$AME=\frac{1}{n}{\sum }_{i=1}^{n}\left({\alpha }_{i}-{\beta }_{i}\right)$$
(14)
$$AMSE=\frac{1}{n}{\sum }_{i=1}^{n}{\left({\alpha }_{i}-{\beta }_{i}\right)}^{2}$$
(15)
$${r}_{\text{max}}=\underset{i}{\text{max}}\left({\alpha }_{i}-{\beta }_{i}\right)$$
(16)
$${r}_{\text{min}}=\underset{i}{\text{min}}\left({\alpha }_{i}-{\beta }_{i}\right)$$
(17)
$$ASD=\sqrt{\frac{1}{n-1}{\sum }_{i=1}^{n}{\left(\mu -\overline{\mu }\right)}^{2}}$$
(18)

where n is the total number of observations, \({\alpha }_{i}\) and \({\beta }_{i}\) are the measured and adjusted 2D horizontal coordinates from the various techniques, \(\mu\) denotes the residual between the measured and adjusted field data, \(\overline{\mu }\) is the mean of the residuals, and i is an integer varying from 1 to n.
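The performance indices of Eqs. (13) through (18) can be computed directly; the measured and adjusted values below are invented sample numbers, used only to show the calculation.

```python
import numpy as np

def metrics(alpha, beta):
    """Eqs. (14)-(18); alpha = measured, beta = adjusted coordinates."""
    r = alpha - beta                 # residuals
    return {
        "AME": r.mean(),             # Eq. (14)
        "AMSE": np.mean(r ** 2),     # Eq. (15)
        "r_max": r.max(),            # Eq. (16)
        "r_min": r.min(),            # Eq. (17)
        "ASD": r.std(ddof=1),        # Eq. (18), sample standard deviation
    }

def horizontal_displacement(E1, N1, E2, N2):
    """Eq. (13)."""
    return np.hypot(E2 - E1, N2 - N1)

alpha = np.array([10.0, 12.0, 11.5])  # measured (toy values)
beta = np.array([10.1, 11.8, 11.5])   # adjusted (toy values)
m = metrics(alpha, beta)
hd = horizontal_displacement(0.0, 0.0, 3.0, 4.0)
```

Note that `ddof=1` gives the \(n-1\) denominator of Eq. (18), and that AME can mask compensating positive and negative residuals, which is why AMSE and ASD are reported alongside it.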

Results and discussions

Results

Defining the best RBFANN training model required tuning more than one hyperparameter to achieve the optimal results. The RBFANN optimal solution leading to a smooth function was reached after sequential training trials with a spread parameter value of 0.8 and a maximum number of hidden neurons set at 20. As shown in Table 2, there is a close-to-perfect relationship between the observed and predicted output datasets. This is further indicated by the low statistical measurements recorded for the training and testing datasets. Moreover, the error figures recorded in the training and testing results are relatively small, as revealed by the adopted statistical indicators. Hence, the values predicted by the RBFANN model are very close to the observed values.

Table 2 Model results for soft computing techniques (units in meters)

The authors conducted a comprehensive comparison between soft computing techniques (BPANN, RBFANN, and GRANN) and traditional methods (PRM and LSR) using the entire dataset, as summarized in Table 3. The statistical analysis presented in Table 3 demonstrates that the proposed ANN techniques achieved commendable results comparable to the classical techniques. This conclusion is supported by the statistical assessments provided. Figures 3, 4, and 5 depict graphical representations of the AME, AMSE, and horizontal displacement model comparisons for all utilized techniques.

Table 3 Statistical analysis of all the models (units in meters)
Fig. 3
figure 3

AME model graph of the models

Fig. 4
figure 4

AMSE model graph of the models

Fig. 5
figure 5

Horizontal displacement of models

Discussions

Development of artificial neural network models

The Bayesian regularization learning approach was employed to develop a single-layer BPANN model. The hyperbolic tangent activation was used in the hidden layer, while the linear activation functions were used in the output layer since the dataset observed satisfies a regression problem. During the training of the BPANN model using trainbr, both the hidden and output layers featured Tansig and Purelin functions. The ideal model structure, heavily reliant on the number of hidden neurons, was determined through a sequential trial-and-error method based on the lowest statistical indicators of AME, AMSE, rmax, rmin, and ASD. The model was trained with hidden neurons varying in number from 1 to 20 across iterative training sessions, each comprising 5000 epochs, a learning rate of 0.03, a minimum performance gradient of 0.0000001, a target of zero, a maximum number of validation failures of six, and a momentum coefficient of 0.9. The GRANN predictive model consisted of three inputs, a 0.9 width parameter, and two outputs. Validation concluded during neural network training upon reaching the minimum gradient and maximum epoch. The output of both the trained GRANN and RBFANN models was notably influenced by the spread constant’s value. A sequential trial-and-error approach was utilized to determine the ideal width parameter values for GRANN and RBFANN in each iterative training session. Gradient descent rule was employed to train the GRANN and RBFANN models.

The ANN models (BPANN, GRANN, and RBFANN) were implemented and programmed using MATLAB (R2018a). After multiple training sessions adjusting the 2D horizontal coordinates (northings and eastings), the BPANN model eventually converged to the optimal model structure, which was denoted as [3 10 2]. Statistical analysis revealed that the RBFANN structure provided the lowest values in terms of AME, AMSE, rmin, rmax, and ASD. The ideal RBFANN prediction model for correcting 2D horizontal data was achieved with a spread parameter of 0.1. This optimization was achieved by varying the spread parameter from 0 to 1 during each training iteration, resulting in the best model denoted as [3 20 0.1 2], comprising a spread constant of 0.1, a maximum of 20 hidden neurons, 3 input variables (independent variables), and 2 output variables (dependent variables). For modifying the eastings, the most effective RBFANN model was identified as [2 20 0.7 1], featuring a hidden layer with a maximum of 20 hidden neurons, two inputs (independent datasets), one output (dependent dataset), and a width parameter of 0.7. The summarized outcomes of training and testing using all soft computing methodologies are presented in Table 2. The statistical findings in Table 2 indicate that soft computing approaches have proven to be effective in modifying survey field data for the study area with significantly enhanced precision. Encouragingly, the residuals reached values of − 1.199E − 08 at the minimum and − 5.93718E − 09 at the maximum. Furthermore, the arithmetic mean error (AME), arithmetic mean square error (AMSE), and arithmetic standard deviation (ASD) for both training and testing were highly encouraging. Consequently, ANN emerged as a highly effective and practical alternative technique for accurately modifying RTK GPS data for the research area.

Developing the classical regression models

The Minitab 19 program was used to statistically describe the data and determine the correlation between the independent variables (input datasets) and the dependent variables (output datasets) before the PRM model was created. Equations (19) and (20) are the ideal PRM equations produced by the Minitab software for calculating the 2D coordinates (E(i), N(i)):

$$E(i)=308987+(0.4794N(i))-19.26Z\left(i\right)$$
(19)
$$N(i)=217458+(0.7838E(i))+26.18Z(i)$$
(20)

where E(i) and N(i) are the dependent datasets, Z(i) is the independent variable (ellipsoidal height), and {308,987, 0.4794, − 19.26} and {217,458, 0.7838, 26.18} are the unknown parameters generated by the Minitab software. The software was more direct to use and suited the authors’ preference compared with other statistical software. The final estimated E(i) and N(i) values, with the given equations and parameters, were coded and implemented in the MATLAB environment. The LSR model’s unknown parameters were estimated using the least squares method, which was likewise developed and implemented in MATLAB. The statistical performance results of the models are shown in Table 2.
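For readers who wish to reproduce point estimates, the fitted equations (19) and (20) can be evaluated directly. The sketch below uses the Minitab coefficients quoted in the text, while the sample N(i), E(i), and Z(i) values are invented for illustration and are not observations from this study.

```python
# Evaluate the fitted PRM equations (19) and (20) at one point.
def prm_easting(N_i, Z_i):
    """Eq. (19): E(i) = 308987 + 0.4794*N(i) - 19.26*Z(i)."""
    return 308987 + 0.4794 * N_i - 19.26 * Z_i

def prm_northing(E_i, Z_i):
    """Eq. (20): N(i) = 217458 + 0.7838*E(i) + 26.18*Z(i)."""
    return 217458 + 0.7838 * E_i + 26.18 * Z_i

# Hypothetical sample point (northing, easting, ellipsoidal height)
E_hat = prm_easting(N_i=700000.0, Z_i=280.0)
N_hat = prm_northing(E_i=651000.0, Z_i=280.0)
```

Note that the coefficients are specific to the study area's datum and dataset; applying them elsewhere would require refitting the model.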

Comparing the predictive performance of the ANN models with the classical models

The effectiveness of the applied models (BPANN, RBFANN, GRANN, LSR, and PRM) was ranked using the statistical indicators rmin, rmax, AME, AMSE, and ASD. Tables 2 and 3 provide the basis for evaluating the models’ predicted 2D point data for the RTK GPS survey field data. Tables 2 and 3 show that the model with the lowest AMSE recorded the best absolute fit to the data. Table 3 reveals that the RBFANN model yielded the best AMSE value after computing the horizontal displacement, with a value of 8.75363 E − 18 m, followed by BPANN and GRANN, which achieved 0.019055 m and 0.010969 m, respectively. These results are confirmed in Fig. 4.

The rmin and rmax statistical indicators show the accuracy of the predictive models by revealing the minimum and maximum differences between the actual dataset and the predicted values. Tables 2 and 3 present the rmin and rmax results for all the models adopted. In effect, RBFANN achieved the better rmin and rmax values of 1.79091 E − 08 m and 9.24311 E − 09 m, respectively. In comparison, the BPANN and GRANN models achieved results of 0.005732 m, 0.446649 m, 0.130091 m, and 0.263966 m, respectively. The worst-performing testing models in rmin and rmax were the LSR and PRM models, recording 149.8338 m, 256.1096 m, 149.7132 m, and 256.2342 m, respectively.

The AMSE is commonly used in model formulation assessment to represent the error loss function and indicates how closely the regression line fits the set of points. The lower the AMSE value, the better the model and output dataset. Table 3 shows a minimal AMSE value of 8.75363 E − 18 m, indicating the better predictive model. Accordingly, the Table 3 results show that the RBFANN model proved superior to the other models in enhancing the survey field data in terms of the AMSE evaluation, with LSR and PRM performing poorly in that regard.

The ASD is the degree of deviation of the variables from their true mean. It is observed from Table 3 that the RBFANN recorded 5.06513 E − 09 m, which implies that the predicted values of the dataset did not deviate much from their true mean, and therefore there is a good relationship between the observed and the predicted values. The RBFANN model thus recorded the best ASD value, and its predictions exhibit higher replication of the historical datasets.

A deeper examination of Table 3 reveals that the predictions generated by the proposed ANN models closely matched the observed 2D coordinates (eastings and northings), exhibiting superior prediction accuracy over the PRM and LSR models, which demonstrated lower accuracy. Specifically, the maximum residual values for the conventional techniques and the soft computing techniques were 149.7132 m and 0.130091 m, respectively. The results of the accuracy assessment indices attest to the fact that ANN models achieve optimum results when used for survey adjustment in comparison with traditional methods. The ANN models displayed superior performance owing to their ability to learn and model nonlinear relationships.

It is noteworthy that the PRM model displayed an ASD of 13.57158 m and a maximum residual (rmax) value of 256.2342 m, aligning with findings from prior studies (Chen and Hill 2005; Poku-Gyamfi 2009). This suggests that the PRM model's limitations were associated with increased polynomial order and distortions in estimated values arising from the least squares approach, reaffirming the recommendation to maintain a lower order.

In the context of adjusting observed 2D coordinates, soft computing techniques outperformed traditional methodologies. The comparatively less effective performance of conventional procedures may be attributed to unidentified parameters used in model estimation. The statistical analysis of both computing types consistently demonstrates the superiority of soft computing techniques in determining 2D coordinates for the research region.

Conclusions and recommendations

Enhancing survey field data has been a common practice in geodesy over the years in order to determine the magnitude of errors and the acceptable tolerance level for topographic, astronomic, geodetic, cartographic, navigation, and datum-related studies. This procedure is traditionally carried out using the least squares regression approach. However, existing literature shows that few alternative techniques have been tested as substitutes for traditional least squares. The main contribution of this study is to explore the capability of ANN as a realistic alternative technology for enhancing GPS survey field data. In this study, the BPANN and RBFANN, based on the supervised learning technique, as well as the GRANN, LSR, and PRM, have been presented. The findings reveal that the RBFANN offered satisfactory prediction of the 2D RTK survey data; moreover, compared with the BPANN, GRANN, LSR, and PRM, it showed superior stability and more accurate prediction results. Based on the results achieved in this study, it is therefore proposed that the RBFANN be used instead of the BPANN, GRANN, LSR, and PRM within the study area for enhancing survey field data.

On the basis of the results and analysis achieved, it has been revealed that northings and eastings fed into the RBFANN model can successfully produce accurate estimates of the enhanced 2D datasets. Therefore, this study is not only of localized significance but will also open up more scientific discourse on the application of ANN in enhancing GPS survey field data within the geoscientific community.

In the era of digital revolution marked by innovative discoveries in science and technology, the conventional (traditional) methods for identifying stationary and non-stationary ground truth locations continue to face challenges. In developing countries in the global south such as Ghana, the Global Positioning System (GPS), as an integral component of the Global Navigation Satellite Systems (GNSS), plays a pivotal role in applications spanning cadastral surveying, topographical surveys, and advanced engineering projects. However, the assumption that raw RTK GPS field data are inherently exact and accurate, fit for all engineering endeavors, warrants closer examination.

The critical domains of adjustment and computation in mathematical and satellite geodesy have garnered significant attention over the years, as they serve to ascertain whether data fall within acceptable tolerance levels. While past research has predominantly emphasized traditional methodologies for survey data adjustment, recent investigations have explored the integration of artificial intelligence (AI) to enhance field data adjustment in RTK GPS data collection.

This study represents an endeavor to bridge these methodologies, employing artificial neural networks (ANN) alongside conventional techniques to adjust field data from an RTK GPS survey. Through rigorous analysis, the study demonstrates that ANN approaches surpass traditional regression procedures in terms of statistical performance. Specifically, the radial basis function artificial neural network (RBFANN) stands out for its superior accuracy and precision when compared to the back-propagation artificial neural network (BPANN) and the generalized regression artificial neural network (GRANN), as evident in their respective statistical analyses and as asserted by Konakoglu and Cakir (2018).

As a result, the application of soft computing methods for RTK GPS data modification in the study area emerges as a more pragmatic approach than reliance on traditional least squares methods. This endeavor contributes significantly to the understanding of Ghanaian geospatial experts, offering valuable insights into the effectiveness of applying soft computing methods to precisely determine stationary points for geodetic purposes.

While this study provides valuable insights, further research in Ghana is warranted to harness various deep learning and soft computing methods. These encompass deep learning convolutional neural networks (CNN), least squares support vector machines, extreme learning machines, and various learning algorithms, including the Broyden–Fletcher–Goldfarb–Shanno quasi-Newton, resilient back-propagation, Fletcher–Reeves conjugate gradient, scaled conjugate gradient, Polak–Ribière conjugate gradient, conjugate gradient with Powell/Beale restarts, one-step secant back-propagation, gradient descent, gradient descent with adaptive learning rate, gradient descent with momentum, and gradient descent with momentum and adaptive learning rate algorithms. Such investigations will evaluate their performance in addressing more intricate engineering tasks, thus expanding the realm of computational possibilities.

Beyond its impact on Ghanaian geodesists, this study contributes substantively to the scholarly discourse surrounding the application of soft computing techniques in addressing challenges within the surveying and related fields. By advancing the understanding of these techniques and their efficacy, it paves the way for more precise and efficient surveying practices in a rapidly evolving technological landscape.