1 Introduction

High-speed railways (HSR) and highway networks have developed rapidly over the past decade to meet travel demand. This rapid growth in high-speed transportation places higher demands on wireless communication systems, including the train-ground communication (TGC) system [1], the communication-based train control (CBTC) system, the vehicle ad hoc network (VAN) [2], and vehicle-to-vehicle (V2V) communication.

However, trains can reach 350 km/h and road vehicles up to 120 km/h, so users cannot enjoy the smooth, high-quality wireless services available in low-speed environments. In high-mobility scenarios, large Doppler frequency shifts, fast-fading channels and the need for fast handover seriously degrade communication performance [3, 4].

The wireless channel plays a key role in the transmission rate and quality of mobile communication. Only after the channel characteristics of a communication system have been thoroughly studied can physical-layer technologies be selected or adapted, such as the best modulation, coding and interleaving scheme, the equalizer design, or the antenna configuration and subcarrier allocation in a MIMO-OFDM system. Propagation prediction and channel estimation have been studied extensively for three purposes: (1) to provide theoretical performance bounds, using information-theoretic tools, for a new physical-layer technology [5]; (2) to assess candidate schemes in transmission system design [6]; and (3) to estimate or predict channel parameters when deploying a new wireless system, and then to optimize the deployment [7].

According to the theoretical analysis method used in modeling, wireless channel models can be divided into deterministic, stochastic and semi-deterministic models [8]. Among them, well-known models such as COST 207, COST 231 and WINNER II, obtained from field measurements [9, 10], are widely used in channel estimation. An appropriate channel model is selected for a particular scenario, and its specific propagation parameters are then set. Channel estimation in mobile propagation usually relies on one of two techniques to obtain these parameters: blind estimation and pilot estimation [6, 11]. Pilot estimation is typically achieved with pilot symbols strategically placed at frame heads or on subcarriers. In blind estimation, channel coefficients are predicted from statistical features of the received signals. Once the time-frequency characteristics of a channel have been estimated, the relevant parameters are used to update the pre-set model. As in any estimation application, wireless channel estimation aims to quantify the best performance of a wireless system. However, because the number of received signals is unbounded, extracting optimal channel coefficients is a challenge.

Feedforward neural networks (FNN) are extensively used to model natural or artificial phenomena that are difficult to handle with classical parametric techniques [12]. Simsir et al. [13] demonstrated that channel estimation based on a neural network performs better than the conventional Least Squares (LS) algorithm without requiring any knowledge of channel statistics or noise. The studies in [14] and [15] also showed that FNN can be used for channel estimation in various wireless environments. Unfortunately, the learning speed of FNN has been a major bottleneck in many applications, and the fast-fading channel caused by high mobility makes this method unsuitable for channel estimation as well. Unlike traditional FNN implementations, a simple learning algorithm called extreme learning machine (ELM), which has good generalization performance [12, 16, 17], can learn thousands of times faster.

In this paper, we propose a channel estimation scheme based on the ELM algorithm for high-speed environments. Since research on wireless channels has concentrated on large-scale and small-scale models [18], we choose the path loss exponent and fading classification as estimation objects. Compared with the back-propagation (BP) algorithm, ELM shows its potential in channel estimation, especially for scenarios with high mobility.

The outline of the paper is as follows. In Sect. 2, the ELM learning algorithm is briefly reviewed. In Sect. 3, path loss estimation of the wireless channel using ELM for high-speed environments is proposed, and simulation results are analyzed. In Sect. 4, fading classification in the COST 207 model based on the ELM algorithm is presented. The conclusion is given in Sect. 5.

The performance of channel estimation based on ELM is compared with that of BP (the Levenberg-Marquardt algorithm), a popular FNN training algorithm. All simulations are carried out in MATLAB 7.12.0. The Levenberg-Marquardt algorithm is provided by the MATLAB package, while the ELM algorithm is downloaded from [19].

2 Review of ELM

Traditional FNN training iteratively adjusts all of the network parameters to minimize the cost function using gradient-based algorithms. Although BP’s gradients can be computed efficiently, an inappropriate learning rate may cause several problems, such as slow convergence, divergence, or stopping at a local minimum.

The steps of the ELM algorithm are as follows:

  1. Assign a training set \(\aleph = \left\{ {\left. {\left( {{\mathbf{{x}}_i},{\mathbf{{t}}_i}} \right) } \right| {\mathbf{{x}}_i} \in {\mathbf{{R}}^n},{\mathbf{{t}}_i} \in {\mathbf{{R}}^m},i = 1,2, \ldots ,N} \right\} \), an activation function g(x) and the number of hidden neurons \({\tilde{N}}\).

  2. Randomly assign the input weight vectors \({\mathbf{{w}}_i},i = 1,2, \ldots ,\tilde{N}\) and the bias values \({b_i},i = 1,2, \ldots ,\tilde{N}\).

  3. Calculate the hidden layer output matrix \(\mathbf{{H}}\) and its Moore-Penrose generalized inverse matrix \({\mathbf{{H}}^\dag }\).

  4. Calculate the output weight \(\hat{\beta }= {\mathbf{{H}}^\dag }{} \mathbf{{T}}\), which is the least-squares solution, where \(\mathbf{{T}} = {\left[ {{\mathbf{{t}}_1},{\mathbf{{t}}_2}, \ldots ,{\mathbf{{t}}_N}} \right] ^\mathrm{{T}}}\).
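
The four steps above can be sketched in a few lines of NumPy. This is a minimal illustration and not the code downloaded from [19]: the function names `elm_train` and `elm_predict`, the uniform range (-1, 1) for the random weights and biases, and the toy target sin(x) are all assumptions made for the example.

```python
import numpy as np


def sigmoid(z):
    """Sigmoidal activation function g(x)."""
    return 1.0 / (1.0 + np.exp(-z))


def elm_train(X, T, n_hidden, rng):
    """Train a single-hidden-layer network with ELM.

    X: (N, n) inputs, T: (N, m) targets, n_hidden: number of hidden neurons.
    """
    n_features = X.shape[1]
    # Step 2: random input weights and biases (the range is an assumption).
    W = rng.uniform(-1.0, 1.0, size=(n_features, n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    # Step 3: hidden layer output matrix H, shape (N, n_hidden).
    H = sigmoid(X @ W + b)
    # Step 4: output weights from the Moore-Penrose generalized inverse of H.
    beta = np.linalg.pinv(H) @ T
    return W, b, beta


def elm_predict(X, W, b, beta):
    """Evaluate the trained network on new inputs."""
    return sigmoid(X @ W + b) @ beta


# Toy usage (assumed data): approximate sin(x) with 20 hidden neurons.
rng = np.random.default_rng(0)
X = np.linspace(-3.0, 3.0, 200).reshape(-1, 1)
T = np.sin(X)
W, b, beta = elm_train(X, T, n_hidden=20, rng=rng)
rmse = float(np.sqrt(np.mean((elm_predict(X, W, b, beta) - T) ** 2)))
print(f"training RMSE: {rmse:.2e}")
```

No iteration is involved: the only trained quantity is `beta`, and it comes from a single pseudo-inverse computation.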

In short, ELM treats training as a linear system \(\mathbf{{H}}\beta = \mathbf{{T}}\) and finds its least-squares solution \(\hat{\beta }\) directly instead of adjusting parameters iteratively. As the steps show, the learning time of ELM is spent mainly on calculating \({\mathbf{{H}}^\dag }\), so ELM saves considerable time in most applications. The performance evaluation in [12, 16] shows that ELM produces good generalization performance in most cases and can learn hundreds of times faster than BP.

3 Large-Scale/Path Loss Channel Estimation

3.1 Large-Scale Channel Model

Large-scale/path loss channel models predict the mean signal strength at a large transmitter-receiver distance (several hundreds or thousands of meters) in order to estimate the radio coverage area of a transmitter. Since the estimation of large-scale channel coefficients uses statistical features of the received signals, a blind estimation solution might work.

Both theoretical and measurement-based propagation channel models (such as the free-space model, two-ray model, Okumura model and Hata model) [20] indicate that the average received signal power \(P_r\) decreases logarithmically with distance [18]. Assuming that the shadowing component \(\psi \) follows a log-normal distribution, a statistical path loss model [21] is

$$\begin{aligned} P_\mathrm{r}\,[\mathrm{dBm}] &= P_\mathrm{t}\,[\mathrm{dBm}] + K\,[\mathrm{dB}] - 10\gamma \log _{10}\left[ \frac{d}{d_0}\right] - \psi \\ &= P_\mathrm{t}\,[\mathrm{dBm}] + 20\log _{10}\frac{\lambda }{4\pi d_0} - 10\gamma \log _{10}\left[ \frac{d}{d_0}\right] - \psi \end{aligned}$$
(1)

where \(P_t\) is the transmit power, \(K = 20\log _{10}\frac{\lambda }{4\pi d_0}\) is the free-space path gain at the reference distance (in dB), \(\lambda \) is the carrier wavelength, \(\gamma \) is the path loss exponent indicating the rate at which path loss increases with distance, the reference distance \(d_0\) for practical systems is typically chosen to be 1 m, d is the transmitter-receiver distance, and the shadowing component \(\psi \) is a zero-mean Gaussian distributed random variable with standard deviation \(\sigma _{\psi }\) (in dB).
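
As a quick numerical check of Eq. (1), the following sketch evaluates the model for the carrier frequency and transmit power used later in Sect. 3.2. The function names and the choice \(\gamma = 2\) are illustrative assumptions, not part of the original simulation.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def free_space_gain_db(fc_hz, d0_m=1.0):
    """K = 20*log10(lambda / (4*pi*d0)), the second form of Eq. (1)."""
    wavelength = C / fc_hz
    return 20.0 * math.log10(wavelength / (4.0 * math.pi * d0_m))


def received_power_dbm(pt_dbm, fc_hz, d_m, gamma, psi_db=0.0, d0_m=1.0):
    """Received power from Eq. (1); psi_db is one shadowing realization."""
    k_db = free_space_gain_db(fc_hz, d0_m)
    return pt_dbm + k_db - 10.0 * gamma * math.log10(d_m / d0_m) - psi_db


# fc = 2.35 GHz and Pt = 39.5 dBm are taken from Sect. 3.2; gamma = 2 is assumed.
K = free_space_gain_db(2.35e9)
pr_100 = received_power_dbm(39.5, 2.35e9, 100.0, gamma=2.0)
pr_1000 = received_power_dbm(39.5, 2.35e9, 1000.0, gamma=2.0)
print(f"K = {K:.2f} dB")
print(f"Pr(100 m) = {pr_100:.2f} dBm, Pr(1000 m) = {pr_1000:.2f} dBm")
```

Without shadowing, each tenfold increase in distance lowers the received power by \(10\gamma \) dB, which is 20 dB in this example.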

The exponent \(\gamma \) is obtained by a minimum mean square error (MMSE) fit to the measurements

$$\begin{aligned} F_\mathrm{MMSE}(\gamma )=\min \limits _{\gamma }\sum \limits _{i=1}^{n}\left[ M_\mathrm{measured}(d_i)-M_\mathrm{model}(d_i)\right] ^2 \end{aligned}$$
(2)

where \(M=P_t/P_r\) is the path loss in dB. The variance \(\sigma _{\psi }^2\) is given by

$$\begin{aligned} \sigma _{\psi }^2=\frac{1}{n}\sum \limits _{i=1}^{n}\left[ M_\mathrm{measured}(d_i)-M_\mathrm{model}(d_i)\right] ^2 \end{aligned}$$
(3)
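
Because the model is linear in \(\gamma \), the minimization in Eq. (2) has a closed-form solution. The sketch below assumes \(M_\mathrm{model}(d) = -K + 10\gamma \log _{10}(d/d_0)\), which follows from Eq. (1) with a known K; the function name and the noise-free sample distances are hypothetical.

```python
import math


def fit_path_loss_exponent(distances_m, loss_db, k_db, d0_m=1.0):
    """Least-squares gamma from Eq. (2) and residual variance from Eq. (3).

    Assumes M_model(d) = -K + 10*gamma*log10(d/d0) with K known.
    """
    x = [math.log10(d / d0_m) for d in distances_m]
    y = [m + k_db for m in loss_db]  # remove the constant -K offset
    # Setting d/d(gamma) of sum (y_i - 10*gamma*x_i)^2 to zero gives:
    gamma = sum(xi * yi for xi, yi in zip(x, y)) / (10.0 * sum(xi * xi for xi in x))
    residuals = [m - (-k_db + 10.0 * gamma * xi) for m, xi in zip(loss_db, x)]
    sigma2 = sum(r * r for r in residuals) / len(residuals)
    return gamma, sigma2


# Hypothetical noise-free measurements generated with gamma = 3.
k_db = -39.87
distances = [100.0, 200.0, 400.0, 800.0, 1600.0]
losses = [-k_db + 10.0 * 3.0 * math.log10(d) for d in distances]
gamma_hat, sigma2_hat = fit_path_loss_exponent(distances, losses, k_db)
print(f"gamma = {gamma_hat:.3f}, sigma^2 = {sigma2_hat:.3e}")
```

With real measurements the residual variance would be nonzero and would serve as the estimate of \(\sigma _{\psi }^2\).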

3.2 Approximation of Path Loss Exponent

In Eq. (1), the path loss exponent \(\gamma \) and the shadowing component \(\psi \) are determined by the carrier frequency and the propagation terrain. Typical values of \(\gamma \) lie between 1 and 4. The smaller \(\gamma \) is, the less signal energy is lost over the transmitter-receiver distance. For example, in an HSR environment, \(\gamma \) is slightly larger than 2 in a rural area (within 250–3200 m) with a narrowband communication system, while it is close to 4 in hilly terrain (within 800–2500 m) with a broadband system.

If \(\gamma \) is recalculated every 1500 m of travel and the vehicle’s velocity is 120 km/h, \(\gamma \) must be recalculated every 45 s; if the velocity is 350 km/h, \(\gamma \) must be recalculated every 15.4 s. According to Eqs. (2) and (3), calculating \(\gamma \) requires hundreds or thousands of received-signal measurements, so introducing a learning algorithm into \(\gamma \) estimation might simplify the data processing.
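
The two intervals follow directly from distance divided by speed. A short check, with a hypothetical helper name:

```python
def traversal_time_s(distance_m, speed_kmh):
    """Time in seconds needed to travel distance_m at speed_kmh."""
    return distance_m / (speed_kmh / 3.6)  # km/h -> m/s


for speed in (120, 350):
    print(f"{speed} km/h: {traversal_time_s(1500, speed):.1f} s per 1500 m")
```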

We use the ELM and BP algorithms to approximate the path loss exponent \(\gamma \). Without loss of generality, we set the velocity \(v=120\) km/h, the carrier frequency \({f_\mathrm{{c}}}=2.35\) GHz and the transmit power \({P_\mathrm{{t}}}=39.5\) dBm, and the distance d is obtained by means of GPS [22]. A training set \(\left( {{P_\mathrm{{r}}}_i,{\gamma _i}} \right) \) and a testing set \(\left( {{P_\mathrm{{r}}}_i,{\gamma _i}} \right) \) of 1000 samples each are created, where \({{P_\mathrm{{r}}}_i}\) is uniformly distributed on the interval \(\left( { - 105, - 25} \right) \) dBm [23]. The shadowing component \(\psi \) has been added to all training samples, while the testing data are shadowing-free.
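
One possible way to build such data sets is sketched below. The text does not state how d is chosen, what value \(\sigma _{\psi }\) takes, or whether shadowing perturbs the input or the label, so the fixed distance of 1500 m, the 3 dB standard deviation and the perturbation of \(P_\mathrm{r}\) are purely illustrative assumptions, as are all names in the code.

```python
import math
import random

C = 299_792_458.0     # speed of light in m/s
PT_DBM = 39.5         # transmit power from the text
FC_HZ = 2.35e9        # carrier frequency from the text
D0_M = 1.0            # reference distance
D_M = 1500.0          # ASSUMPTION: fixed transmitter-receiver distance
SIGMA_PSI_DB = 3.0    # ASSUMPTION: shadowing standard deviation

K_DB = 20.0 * math.log10((C / FC_HZ) / (4.0 * math.pi * D0_M))


def gamma_from_pr(pr_dbm):
    """Invert Eq. (1) for gamma at distance D_M without shadowing."""
    return (PT_DBM + K_DB - pr_dbm) / (10.0 * math.log10(D_M / D0_M))


def make_set(n, rng, shadowing):
    """Return n pairs (Pr_i, gamma_i) with Pr drawn uniformly from (-105, -25) dBm."""
    samples = []
    for _ in range(n):
        pr = rng.uniform(-105.0, -25.0)
        gamma = gamma_from_pr(pr)  # label computed from the shadowing-free power
        if shadowing:
            pr -= rng.gauss(0.0, SIGMA_PSI_DB)  # observed power includes psi
        samples.append((pr, gamma))
    return samples


rng = random.Random(0)
train_set = make_set(1000, rng, shadowing=True)   # noisy training samples
test_set = make_set(1000, rng, shadowing=False)   # shadowing-free testing samples
print(train_set[0], test_set[0])
```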

3.3 Simulation Results

The number of hidden neurons of ELM is initially set to 20 and the activation function is sigmoidal. The simulation result is shown in Fig. 1. The training accuracy, measured as root mean square error (RMSE), is 0.27734 because of the shadowing effect, whereas the testing RMSE is 0.012445. Figure 1 confirms that the estimates of \(\gamma \) are accurate, with a visible margin of error only when \({{P_\mathrm{{r}}}}>-30\) dBm.

Fig. 1 The estimation of path loss exponent \(\gamma \) by ELM learning algorithm

Both the ELM and BP algorithms have been run for 200 trials, and the averaged results are shown in Table 1. The ELM learning algorithm spends 6.6 ms of CPU time on training and 6.8 ms on testing, whereas the BP algorithm takes 53.6 s on training and 67.1 ms on testing. In training, ELM runs about 8000 times faster than BP. In a high-speed environment, a vehicle travelling at 120 km/h through cells of radius 1 km performs a handover every 60 s, and a train travelling at 350 km/h through the same cells performs one every 20.6 s. Therefore, BP is too time-consuming to be used in wireless systems with high mobility. Although ELM has a much higher testing error (0.0475) than BP (0.0028), this estimation error is acceptable in our environments. In addition, assuming a network transmission rate of 1 Mbps, collecting 1000 test data takes only 1 ms, so a packet of 125 bytes is enough for ELM to estimate the path loss exponent \(\gamma \) with an accuracy of about 95 % within a time interval of less than 8 ms.
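
The figures quoted above can be reproduced with simple arithmetic. The sketch assumes that a handover occurs once per cell diameter and that each of the 1000 test data occupies one bit, which is how 1000 data map onto a 125-byte packet; both are readings of the text rather than stated facts, and the helper name is hypothetical.

```python
def crossing_time_s(cell_radius_m, speed_kmh):
    """Time to cross a cell along its diameter (assumed handover interval)."""
    return 2.0 * cell_radius_m / (speed_kmh / 3.6)


# CPU times reported in the text.
elm_train_s, elm_test_s = 6.6e-3, 6.8e-3
bp_train_s = 53.6

speedup = bp_train_s / elm_train_s          # training-time ratio BP / ELM
packet_bits = 125 * 8                       # 125-byte packet
airtime_s = packet_bits / 1e6               # transmission time at 1 Mbps
total_ms = (airtime_s + elm_test_s) * 1e3   # collection plus ELM testing time

print(f"speedup: {speedup:.0f}x")
print(f"handover every {crossing_time_s(1000, 120):.1f} s at 120 km/h")
print(f"handover every {crossing_time_s(1000, 350):.1f} s at 350 km/h")
print(f"collection + estimation: {total_ms:.1f} ms")
```

The BP training time of 53.6 s is comparable to the 60 s handover interval at 120 km/h and exceeds the 20.6 s interval at 350 km/h, which is why the text rules BP out.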

Table 1 Performance comparison for learning algorithms in large-scale channel estimation

Figure 2 shows the relationship between the generalization performance of ELM and the number of hidden neurons n for \(\gamma \) estimation. Each value of n is simulated 50 times. The generalization performance of ELM is stable when \(n\ge 12\), so setting n to 20 for the simulation in Fig. 1 is reasonable.

Fig. 2 The generalization performance of ELM in estimation of path loss exponent \(\gamma \)

Figure 3a shows the relationship between the RMSE of ELM and the number of training/testing data, and Fig. 3b shows the impact of this number on the time consumed. The training RMSE is almost constant (slightly less than 0.3) regardless of the number of data, because ELM uses the Moore-Penrose inverse to find the smallest-norm least-squares output weights. Unlike the training RMSE, the testing RMSE decreases as the number of test data increases. The simulation confirms the conclusion in [12] that ELM does not suffer from over-training. Both training and testing time increase with the number of data, although testing time grows more slowly than training time. Even when the number of data reaches \(10^4\), the time consumed by ELM remains acceptable at less than 70 ms.

Fig. 3 Number of train/test data of ELM in estimation of path loss exponent \(\gamma \)

4 Small-Scale/Fading Estimation

4.1 Small-Scale Channel Model

Small-scale/fading models characterize the rapid fluctuations of the received signal strength over very short distances (a few wavelengths) or short durations (on the order of seconds) in order to estimate the influence of multi-path propagation and the speed of a mobile terminal.

The COST 207 model [9] for mobile radio specifies the power delay profiles and Doppler spread for four typical environments: rural area (RA), typical urban area (TU), bad urban area (BU) and hilly terrain (HT). The RA case consists of two distinct channel models, while each of the other cases comprises four channel models. Thus, COST 207 has a total of 14 channel models: RAx4, RAx6, TUx6, TUx6alt, TUx12, TUx12alt, BUx6, BUx6alt, BUx12, BUx12alt, HTx6, HTx6alt, HTx12 and HTx12alt.

Because radio waves are reflected and refracted, propagation between the transmitter and the receiver follows several paths, so each channel model has multiple taps. For example, RAx4 is short for the rural area environment with 4 taps, and HTx6alt stands for hilly terrain with 6 alternative taps. Each tap is characterized by a relative delay (with respect to the first path delay), a relative power and a Doppler spectrum category.

The ELM and BP algorithms are used to estimate the COST 207 channel model \({C_\mathrm{{T}}}\) from the modulated received signals \({P_\mathrm{{M}}}\). To facilitate channel estimation, each channel model is assigned an integer value [24], such as \({C_\mathrm{{T}}}=1\) for RAx4, \({C_\mathrm{{T}}}=2\) for RAx6, and so on. We again set \(v=120\) km/h and \({f_\mathrm{{c}}}=2.35\) GHz. The transmission rate is 1 Mbps and the sampling factor is 4, so the simulation sampling rate is \(4 \times {10^6}\) samples per second. PSK modulation and a bi-Gaussian Doppler spectrum are used in this simulation. The training set \(\left( {{C_\mathrm{{T}}}_i,{P_\mathrm{{M}}}_i} \right) \) has 1000 samples, whereas the testing set has 300.
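
The class labels and sampling rate described above can be written down as follows. Only \({C_\mathrm{{T}}}=1\) and \({C_\mathrm{{T}}}=2\) are given explicitly in the text; continuing the numbering in the order the 14 models are listed in this section is an assumption, and the variable names are hypothetical.

```python
# COST 207 channel models in the order listed in this section.
COST207_MODELS = [
    "RAx4", "RAx6",
    "TUx6", "TUx6alt", "TUx12", "TUx12alt",
    "BUx6", "BUx6alt", "BUx12", "BUx12alt",
    "HTx6", "HTx6alt", "HTx12", "HTx12alt",
]

# ASSUMPTION: integer labels C_T follow the listing order, starting at 1.
CHANNEL_LABEL = {name: index for index, name in enumerate(COST207_MODELS, start=1)}

bit_rate_bps = 1e6        # transmission rate of 1 Mbps
sampling_factor = 4       # samples per bit
sample_rate_hz = bit_rate_bps * sampling_factor

print(CHANNEL_LABEL["RAx4"], CHANNEL_LABEL["RAx6"], len(CHANNEL_LABEL))
print(f"sampling rate: {sample_rate_hz:.0f} samples/s")
```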

4.2 Simulation Results

The number of hidden neurons of ELM is initially set to 20 and the activation function is sigmoidal. Both the ELM and BP algorithms have been run for 50 trials, and the averaged results are shown in Table 2. As before, ELM learns up to hundreds of times faster than BP. Although BP reaches a training accuracy of 90.60 %, its testing accuracy drops to 31.08 %. In contrast, the ELM training accuracy of 86.46 % is slightly lower than that of BP, but ELM achieves an average testing accuracy of 72.73 %. This is mainly because the MATLAB BP function newff does not support complex data, so the modulated received signals \({P_\mathrm{{M}}}\) must be converted to real values.

Table 2 Performance comparison for learning algorithms in small-scale channel estimation

5 Conclusion

In this paper, channel estimation based on ELM is proposed for high-speed environments. For the large-scale model, estimation of the path loss exponent is evaluated, and the experimental results show that ELM runs about 8000 times faster than the BP learning algorithm while its testing error remains acceptable. For the small-scale model, fading classification is presented, which shows that ELM is an effective tool for classifying the channel type. Unlike BP, ELM still works when the elements of the training or testing set are complex. Therefore, ELM is an effective tool for channel estimation.