1 Introduction

Real-time image-sensing platforms using multiple cameras need very fast data transmission, e.g., over millimeter-wave (mmW) links, as well as video-level QoS adaptation.

Millimeter-wave (mmW) wireless data transmission (i.e., transmission with carrier frequencies between 30 and 300 GHz) has recently been considered one of the most important approaches for increasing network capacity in next-generation wireless networks [8, 17]. In particular, the 28, 38, and 60 GHz mmW channels are considered for designing next-generation wireless networking systems [1, 16, 23], and the feasibility of mmW wireless communications has recently been demonstrated [17]. The benefit of mmW wireless communication links is the ultra-wide bandwidth (UWB) available at these mmW frequencies (i.e., from 30 to 300 GHz). This makes them particularly suitable for wireless UWB transmission of video streams, which is already the major contributor to wireless network traffic [4]. For the coming 5G networks, the mmW frequencies (e.g., 28, 38, and 60 GHz) are actively considered because they can provide high-bandwidth communication links that meet multi-gigabit-per-second (multi-Gbps) target bitrates. For example, Samsung Electronics and Intel Corporation are developing 5G systems using the 28 and 38 GHz frequencies, respectively.

Nowadays, distributed image-sensing platforms using cameras have started to attract considerable attention [9], and the network forwards data from wireless camera/video platforms to a final destination in a multi-hop routing manner. For this multi-hop routing, mmW links are considered in this paper. However, mmW wireless propagation channels suffer from high attenuation, since (1) the free-space loss is high and (2) the additional path loss in non-line-of-sight (NLOS) situations is even higher. Thus, cameras cannot directly communicate with other cameras over longer distances, and consequently, multi-hop routing/relaying is required.

As illustrated in Fig. 1, if we use multi-hop-routing-capable mobile cameras for data forwarding from the end-hop cameras (those performing visual data perception) to a final destination, we can deal with two main issues: (1) extending the communication range via intermediate cameras and (2) finding a route through intermediate cameras to overcome NLOS conditions. The key question is then how to find appropriate routes from the transmitters to the receivers. To increase the probability of finding a suitable route, we place dedicated relays in the network; such preplanned relays are common in cellular and mobile access networks [24].

Fig. 1
figure 1

Reference real-time image-sensing and streaming network model: suppose that the cameras and relays have 28, 38, or 60 GHz mmW antennas. Since (1) the range of mmW transmission is short and (2) line of sight should be guaranteed, relaying, i.e., multi-hop routing, is desired. In this figure, the session from \(v_{1}\) to \(v_{3}\) needs multi-hop routing due to the short range of mmW signals, and the session from \(v_{4}\) to \(v_{6}\) requires multi-hop routing via \(v_{5}\) to deal with non-line-of-sight conditions

Most existing routing techniques concentrate on either maximizing the sum of network capacities or on max–min flow routing, i.e., maximizing the minimum capacity among the flow links [15]. However, max–min flow routing is not always good for QoS-sensitive video streaming applications when differentiated quality metrics and weights are considered for the individual flows. In other words, the criterion does not reflect the quality criterion of video streams, for which there is a monotonic but nonlinear and saturating relationship between the achieved data rates of the sessions and the perceived qualities. As we have shown for a special case (two-hop relaying), taking video quality as the criterion leads to significantly different routes and significantly improved overall quality [10, 12]. Since max–min flow routing shows better performance than throughput maximization algorithms in video-related applications [15], a quality optimization (as we propose) that outperforms max–min flow routing will also outperform throughput maximization.

Our objective is as follows: when K source cameras want to transmit to K destination cameras over single-hop or multi-hop device-to-device (D2D) mmW links (i.e., K video sessions), find single-hop or multi-hop flow paths for the video sessions that maximize the sum of the differentiated video QoS values.

2 Related and previous work

This section reviews two main topics: (1) our previous work related to real-time video streaming over wireless sensor platforms and (2) the video coding standard HEVC and scalable HEVC (SHVC) with parallel processing tools for real-time image processing.

2.1 Our previous work

In [15], the fundamental concepts and novel features of max–min flow routing are well studied. The algorithm proposed in [8] also addresses per-flow quality-maximizing multi-hop routing, but it differs from the algorithm proposed in this paper in several ways:

  • The algorithm proposed in [8] does not consider 60 GHz radio technology, which is the most popular among mmW radio access technologies. When 60 GHz radios are considered, their own path loss models and the attenuation due to oxygen absorption and rain must be taken into account. Therefore, many formulations must be updated to cover 60 GHz radios.

  • In the definition of the quality functions, there is no concept of a minimum flow amount \(f_{\min }\) in [8]. Therefore, the quality formulation in [8] assumes that there is a quality gain even for a very small amount of information flow. However, in real-time video streaming using scalable video coding (SVC) or SHVC with dynamic adaptive streaming over HTTP (DASH), at least a certain amount of bandwidth is required for base-layer transmission; otherwise, no bitstream can be sent. This baseline bandwidth for transmitting the base layer is the \(f_{\min }\) in this paper.

  • This paper contains a real-world video streaming-based performance evaluation. We can observe the performance gain on actual video frames in this paper, whereas [8] does not contain any video- or image-based performance evaluation.

2.2 Video coding standard HEVC for real-time video processing

The proposed system uses the next-generation video coding standard HEVC and its scalable extension, which were standardized in 2013 and 2014, respectively. HEVC was developed with the goal of providing twice the compression efficiency of the previous standard, H.264/AVC (Advanced Video Coding) [20]. After successfully standardizing H.264/AVC, ISO/IEC MPEG and ITU-T VCEG jointly developed the next-generation video standard HEVC. This new standard targets next-generation HDTV displays and IPTV services, addressing the concern of error-resilient streaming in HEVC-based IPTV. As shown in Table 1, compared to H.264/AVC, HEVC includes new features such as extended prediction block sizes with the coding tree unit (CTU) (up to 64×64), large transform block sizes (up to 32×32), tile and slice picture segmentation for loss resilience and parallelism, sample adaptive offset (SAO), and so on [21] (Fig. 2).

Table 1 AVC and HEVC
Fig. 2
figure 2

HEVC parallel processing tools; a tiles and b WPP (wavefront parallel processing) for real-time image processing

The HEVC parallel processing tools support different picture partitioning strategies, namely tiles and wavefront parallel processing (WPP). Tiles partition a picture along horizontal and vertical boundaries and provide better coding gains than multiple slices. In WPP, a slice is divided into rows of CTUs; the first row is processed normally, while each additional row depends on decisions made in the previous row. WPP lets the entropy encoder use information from the preceding row of CTUs and enables a form of parallel processing that may achieve better compression than tiles [21].
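To make the WPP dependency pattern concrete, the row-lag rule above can be sketched as follows. This is an illustrative sketch only, assuming the common two-CTU lag between consecutive CTU rows; `wpp_wavefronts` is a hypothetical helper, not part of any HEVC reference software.

```python
# Sketch of the WPP dependency rule (assumed two-CTU lag between rows):
# CTU (r, c) may be processed once the CTU above-right, (r-1, c+1), is
# done, so row r starts two CTUs behind row r-1.

def wpp_wavefronts(rows, cols, lag=2):
    """Group CTU coordinates into wavefronts that can run in parallel."""
    fronts = {}
    for r in range(rows):
        for c in range(cols):
            step = c + lag * r          # diagonal index of CTU (r, c)
            fronts.setdefault(step, []).append((r, c))
    return [fronts[s] for s in sorted(fronts)]

# With a 3x4 CTU grid, print the first few wavefronts:
for i, front in enumerate(wpp_wavefronts(3, 4)[:4]):
    print(i, front)
```

Each wavefront lists CTUs that have no mutual dependencies, so the maximum parallelism equals the number of CTU rows once the wavefront has ramped up.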

SHVC, as an extension of HEVC, can support multiple resolutions, frame rates, video qualities, and coding standards within a single bitstream because the bitstream consists of multiple layers. It is designed to have low complexity for enhancement layers (ELs) by adding the reconstructed base layer (BL) picture to the reference picture lists of the EL [22]. In addition, SHVC uses multi-loop decoding to keep the decoder chipset simple, whereas scalable video coding (SVC) uses single-loop decoding. SHVC also provides standards scalability by supporting an AVC BL together with HEVC ELs. Thus, UHD TV services that must support legacy HDTVs, as well as simple bitstream-level rate control (layer switching), need SHVC.

3 A reference system architecture

The considered network consists of \(\left| {V}_{s} \cup {V}_{d} \cup {V}_{r} \cup {V}_{i}\right|\) cameras and relays, where \({V}_{s}\), \({V}_{d}\), \({V}_{r}\), and \({V}_{i}\) stand for the sets of source cameras, destination cameras, relays, and intermediate cameras, respectively. Notice that

$$\begin{aligned} {V} = {V}_{s} \cup {V}_{d} \cup {V}_{r} \cup {V}_{i} \end{aligned}$$
(1)

where V is a set of image-sensing cameras and relays in the given network.

The deployed cameras and relays are equipped with 28, 38, or 60 GHz antennas and RF front-ends, as shown in Fig. 3. The real-time image-sensing platform consists of an image sensor with an HEVC encoder (i.e., a camera), a millimeter-wave transmitter, and a QoS manager with a rate adaptation model and the proposed SQM scheme. Although video rate adaptation over WLAN has been studied in [18, 19], adaptation over mmW networks must consider additional limitations and options. Since the given cameras have space limitations (especially mobile cell phones), suppose that they are equipped with a single mmW antenna; therefore, a camera can transmit or receive only one video stream at a time. The relays have fewer space restrictions, and the use of multiple mmW antennas is allowed, i.e., multiple video streams can be handled simultaneously.

Fig. 3
figure 3

Conceptual system architecture of proposed QoS optimal real-time video streaming system with distributed wireless image-sensing platforms

4 Millimeter-wave multi-hop routing with individual QoS consideration

4.1 Problem formulation

The objective function under consideration is as follows:

$$\begin{aligned} \max{:} \sum _{s_{k}\in {V}_{s}} q_{k}\left( f_{s_{k}\rightarrow v}^{k}\right) \end{aligned}$$
(2)

where \(f_{s_{k}\rightarrow v}^{k}\) is the amount of bits of the outgoing flow from source camera \(s_{k}\) to its next-hop camera v attributed to the session \(\left( s_{k}, d_{k}\right)\), where \(s_{k}\in {V}_{s}, v \in {V}\), and \(d_{k}\) is the destination camera of the flow. Therefore, the sum of the qualities of all given flows is maximized by (2). According to the flow conservation constraints in Sect. 4.1.3, the amount of outgoing traffic from a source camera is eventually equal to the amount of incoming traffic at the destination camera. Moreover, \(q_{k}\left( \cdot \right)\) denotes the function representing the QoS as a function of the flow amount of session \(\left( s_{k}, d_{k}\right)\). The details of this function \(q_{k}\left( \cdot \right)\) are explained in Sect. 4.3.

4.1.1 Formulation for cameras

The variable \({L}_{v_{i}\rightarrow v_{j}}\) represents the link connection between \(v_{i}\) and \(v_{j}\) where \(v_{i}, v_{j} \in {V}\), i.e.,

$$\begin{aligned} {L}_{v_{i}\rightarrow v_{j}} \triangleq \left\{ \begin{array}{ll} 1, & \hbox {if the link between } v_{i} \hbox { and } v_{j} \hbox { is used},\\ 0, & \text {otherwise}. \end{array} \right. \end{aligned}$$
(3)

Each source camera has a video stream in order to transmit toward next-hop camera, and each destination camera has a video stream to receive from a preceding camera, i.e.,

$$\begin{aligned} \sum _{v\in {V}, s_{i}\ne v} {L}_{s_{i}\rightarrow v} = 1, \quad \forall s_{i}\in {V}_{s} \end{aligned}$$
(4)
$$\begin{aligned} \sum _{v\in {V}, d_{i}\ne v} {L}_{v\rightarrow d_{i}} = 1, \quad \forall d_{i}\in {V}_{d} \end{aligned}$$
(5)

Each intermediate camera is able to receive a video stream from one camera (because it has only one mmW antenna) and transmit data to one camera (because it also has only one mmW antenna):

$$\begin{aligned} \sum _{v_{j}\in {V}, v_{j}\ne v_{i}} {L}_{v_{j}\rightarrow v_{i}} \le 1, \end{aligned}$$
(6)
$$\begin{aligned} \sum _{v_{k}\in {V}, v_{k}\ne v_{i}} {L}_{v_{i}\rightarrow v_{k}} \le 1 \end{aligned}$$
(7)

where \(\forall v_{i}\in {V}\).

If an intermediate camera receives a video stream, it will transmit the video stream and vice versa, i.e.,

$$\begin{aligned} \sum _{v_{j}\in {V}, v_{j}\ne p_{i}} {L}_{v_{j}\rightarrow p_{i}} = \sum _{v_{k}\in {V}, v_{k}\ne p_{i}} {L}_{p_{i}\rightarrow v_{k}} \end{aligned}$$
(8)

where \(\forall p_{i}\in {V}_{i}\) and \(v_{j},v_{k}\in {V}\).

4.1.2 Formulation for relays

Suppose that each relay has \(N_{\text {antenna}}\) antennas. Similar to the formulations in (6) and (7), the numbers of incoming and outgoing flows at each relay are limited by \(N_{\text {antenna}}\), i.e.,

$$\begin{aligned} \sum _{v_{j}\in {V}, v_{j}\ne r_{i}} {L}_{v_{j}\rightarrow r_{i}} \le N_{\text {antenna}}, \end{aligned}$$
(9)
$$\begin{aligned} \sum _{v_{k}\in {V}, v_{k}\ne r_{i}} {L}_{r_{i}\rightarrow v_{k}} \le N_{\text {antenna}} \end{aligned}$$
(10)

where \(r_{i}\in {V}_{r}\) and \(v_{j}, v_{k}\in {V}\).

4.1.3 Formulation for information flows

The amount of incoming traffic should be equal to the amount of outgoing traffic for each session at the given intermediate cameras and relays, i.e.,

$$\begin{aligned} \sum _{v_{j}\in {V}, v_{j}\ne p_{i}} f_{v_{j}\rightarrow p_{i}}^{k} = \sum _{v_{l}\in {V}, v_{l}\ne p_{i}} f_{p_{i}\rightarrow v_{l}}^{k}, \end{aligned}$$
(11)
$$\begin{aligned} \sum _{v_{j}\in {V}, v_{j}\ne r_{i}} f_{v_{j}\rightarrow r_{i}}^{k} = \sum _{v_{l}\in {V}, v_{l}\ne r_{i}} f_{r_{i}\rightarrow v_{l}}^{k} \end{aligned}$$
(12)

where \(p_{i}\in {V}_{i}\), \(r_{i}\in {V}_{r}\), \(v_{l}\in {V}\), and \(s_{k}\in {V}_{s}\). Moreover, the total flow amount on each link, summed over all sessions, is limited by the wireless channel capacity, i.e.,

$$\begin{aligned} \sum _{s_{k}\in {V}_{s}} f_{v_{i}\rightarrow v_{j}}^{k} \le {C}_{(v_{i},v_{j})}\cdot {L}_{v_{i}\rightarrow v_{j}} \end{aligned}$$
(13)

where \(v_{i}, v_{j}\in {V}, s_{k}\in {V}_{s}\), and \({C}_{(v_{i},v_{j})}\) denotes the channel capacity of the wireless link between \(v_{i}\) and \(v_{j}\). The channel capacity can be computed from the link budget. For the link budget calculation, Shannon's equation is used, assuming optimal coding and modulation schemes [14]:

$$\begin{aligned} {C}_{(v_{i},v_{j})} = {BW}\cdot \log _{2}\left( 1+\frac{P_{\text {signal}}}{P_{\text {noise}}}\right) \end{aligned}$$
(14)

where \(P_{\text {signal}}\) and \(P_{\text {noise}}\) stand for the signal power and noise power in a linear scale and BW stands for the wireless channel bandwidth which is 800 MHz at 38 GHz and 2160 MHz at 60 GHz [11, 16].

The signal power, \(P_{\text {signal}}\), is obtained as:

$$\begin{aligned} P_{\text {signal}} = 10^{\frac{{EIRP} + G_{\text {Rx}} - PL\left( d_{\left( v_{i},v_{j}\right) }\right) - O\left( d_{\left( v_{i},v_{j}\right) }\right) - R\left( d_{\left( v_{i},v_{j}\right) }\right) }{10}} \end{aligned}$$
(15)

where EIRP is limited to 47 dBm in 38 GHz and 43 dBm in 60 GHz bands [11, 16]. \(G_{\text {Rx}}\) stands for the receiver antenna gain, and the values are in Table 2. \(PL\left( d_{\left( v_{i},v_{j}\right) }\right)\) is path loss depending on the distance of the wireless link between \(v_{i}\) and \(v_{j}\); the models for mmW channels are obtained in [16] for 38 GHz and in [1] for 28 GHz as follows:

$$\begin{aligned} 20 \log _{10}\left( \frac{4 \pi d_{0}}{\lambda }\right) + 10 n \log _{10}\left( \frac{d_{\left( v_{i},v_{j}\right) }}{d_{0}}\right) +{X}_{\sigma } \end{aligned}$$
(16)

where \(d_{\left( v_{i},v_{j}\right) }\), \(d_{0}\), \(\lambda\), n, and \({X}_{\sigma }\) stand for the distance between \(v_{i}\) and \(v_{j}\), the close-in free-space reference distance (5.0 m in [1, 16]), the wavelength (7.78 mm at 38 GHz [16] and 10.71 mm at 28 GHz [1]), the average path loss exponent over distance and all pointing angles, and the shadowing random variable, which is modeled as a Gaussian random variable with zero mean and standard deviation \(\sigma\) [14]. The n and \(\sigma\) values at 38 and 28 GHz are measured in [1, 16] and summarized in Table 2.
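As an illustration, the mean path loss of (16) (ignoring the shadowing term \(X_{\sigma}\)) can be evaluated as follows. The reference distance and wavelength follow the text for 38 GHz; the exponent n = 2.3 is only an assumed placeholder, since the measured exponents are listed in Table 2.

```python
import math

# Sketch of the close-in reference path loss model (16), without the
# shadowing term X_sigma. d0 = 5.0 m and lambda = 7.78 mm follow the
# text for 38 GHz; n = 2.3 is an assumed illustrative exponent.

def path_loss_db(d, d0=5.0, lam=7.78e-3, n=2.3):
    """Mean path loss in dB at distance d (meters), for d >= d0."""
    fspl_d0 = 20.0 * math.log10(4.0 * math.pi * d0 / lam)  # loss at d0
    return fspl_d0 + 10.0 * n * math.log10(d / d0)

print(round(path_loss_db(5.0), 1))    # at d0: just the free-space term
print(round(path_loss_db(100.0), 1))  # grows with 10*n*log10(d/d0)
```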

For 60 GHz path loss, a standardized 60 GHz IEEE 802.11ad path loss model is used as follows [13]:

$$\begin{aligned} A + 20\log _{10}(f) + 10n\log _{10}(d) \end{aligned}$$
(17)

in a dB scale, where \(A=32.5\) dB is a constant specific to the selected antenna type and beamforming algorithm; it depends on the antenna beamwidth, but over the considered beamwidth range from \(60^{\circ }\) down to \(10^{\circ }\) the variation is very small (less than 0.1 dB). In (17), n is the path loss exponent, set to 2, and f is the carrier frequency in GHz, set to 60. Note that there is no shadowing effect in the LOS path loss model presented in [13].

In (15), \(O\left( d_{\left( v_{i},v_{j}\right) }\right)\) and \(R\left( d_{\left( v_{i},v_{j}\right) }\right)\) stand for the oxygen and rain attenuation factors, which are relevant only at 60 GHz (their effects are negligible in the 28 and 38 GHz mmW propagation channels).

In (15), \(O\left( d_{\left( v_{i},v_{j}\right) }\right)\) captures the oxygen absorption loss, which can be as high as 15 dB/km [7]. For an oxygen absorption of 15 dB/km,

$$\begin{aligned} O\left( d_{\left( v_{i},v_{j}\right) }\right) = 15 \cdot \frac{d_{\left( v_{i},v_{j}\right) }}{1000}. \end{aligned}$$
(18)

In (15), \(R\left( d_{\left( v_{i},v_{j}\right) }\right)\) represents the rain attenuation, which differs depending on the rain rates of individual regions. According to the region segmentation by the International Telecommunication Union (ITU) based on rain rates, Europe belongs to region E, and the US west coast (i.e., California, Oregon, and Washington) belongs to region D [6]. The rain rate is about 6 mm/h in region E (Europe) for 0.1 % of the year (i.e., 99.9 % availability), and about 8 mm/h in region D (California, Oregon, and Washington) for 0.1 % of the year. Hence, the rain attenuation factor is around 2.8 dB/km in Europe and around 3 dB/km in the USA, as can be derived from [5]. For rain attenuation of 2.8 dB/km in Europe and 3 dB/km in the USA,

$$\begin{aligned} R\left( d_{\left( v_{i},v_{j}\right) }\right) = \alpha _{r} \cdot \frac{d_{\left( v_{i},v_{j}\right) }}{1000} \end{aligned}$$
(19)

where \(\alpha _{r}=2.8\) in Europe and \(\alpha _{r}=3\) in USA.
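The two attenuation terms (18) and (19) can be sketched together as below; the 200 m link distance is an arbitrary illustrative value.

```python
# Sketch of the 60 GHz atmospheric attenuation terms (18) and (19):
# oxygen absorption at 15 dB/km and rain attenuation at alpha_r dB/km,
# both scaled by the link distance given in meters.

def oxygen_attenuation_db(d_m, loss_per_km=15.0):
    return loss_per_km * d_m / 1000.0

def rain_attenuation_db(d_m, alpha_r=2.8):   # 2.8 dB/km (Europe), 3 (USA)
    return alpha_r * d_m / 1000.0

# Extra loss over an illustrative 200 m 60 GHz link in Europe:
d = 200.0
print(oxygen_attenuation_db(d) + rain_attenuation_db(d))  # in dB
```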

The noise power, \(P_{\text {noise}}\) (in linear scale), is computed as:

$$\begin{aligned} P_{\text {noise}} = 10^{\frac{k_{B}T_{e}+10\log _{10}\left( BW\right) + F_{N}}{10}} \end{aligned}$$
(20)

where \(k_{B}T_{e}\) is noise power spectral density, i.e.,

$$\begin{aligned} k_{B}T_{e} = -174 \text {dBm/Hz} \end{aligned}$$
(21)

as explained in [14] and \(F_{N}\) is the receiver noise figure, set to 6 dB.
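Putting (14), (15), (20), and (21) together, the capacity of a single link can be sketched as follows. The EIRP limit, bandwidth, and noise figure follow the text; the receiver gain of 25 dB and total propagation loss of 110 dB are assumed illustrative inputs, not measured values.

```python
import math

# Sketch of the link-budget capacity computation (14)-(21) for an
# illustrative 60 GHz link.

def capacity_bps(eirp_dbm, g_rx_db, loss_db, bw_hz, noise_figure_db=6.0):
    p_signal_dbm = eirp_dbm + g_rx_db - loss_db                # (15), dBm
    p_noise_dbm = -174.0 + 10.0 * math.log10(bw_hz) + noise_figure_db  # (20)
    snr = 10.0 ** ((p_signal_dbm - p_noise_dbm) / 10.0)        # linear SNR
    return bw_hz * math.log2(1.0 + snr)                        # (14)

# 60 GHz: EIRP limit 43 dBm, BW 2160 MHz; assumed G_Rx = 25 dB and a
# total propagation loss (PL + O + R) of 110 dB.
c = capacity_bps(43.0, 25.0, 110.0, 2160e6)
print(round(c / 1e9, 2), "Gbps")
```

This illustrates why mmW links can reach multi-Gbps capacities despite the high propagation losses: the very large bandwidth dominates the capacity expression.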

4.2 Mathematical formulation

As discussed in Sect. 4.1, the objective function under consideration is (2), i.e.,

$$\begin{aligned} \max : \sum _{s_{k}\in {V}_{s}} q_{k}\left( f_{s_{k}\rightarrow v}^{k}\right) \end{aligned}$$
(22)

and the constraints under consideration are (3)–(13), i.e.,

$$\begin{aligned} \sum _{v\in {V}, s_{i}\ne v} {L}_{s_{i}\rightarrow v} = 1, \quad \forall s_{i}\in {V}_{s}, \end{aligned}$$
(23)
$$\begin{aligned} \sum _{v\in {V}, d_{i}\ne v} {L}_{v\rightarrow d_{i}} = 1, \quad \forall d_{i}\in {V}_{d}, \end{aligned}$$
(24)
$$\begin{aligned} \sum _{v_{j}\in {V}, v_{j}\ne v_{i}} {L}_{v_{j}\rightarrow v_{i}} \le 1, \end{aligned}$$
(25)
$$\begin{aligned} \sum _{v_{k}\in {V}, v_{k}\ne v_{i}} {L}_{v_{i}\rightarrow v_{k}} \le 1, \end{aligned}$$
(26)
$$\begin{aligned} \sum _{v_{j}\in {V}, v_{j}\ne p_{i}} {L}_{v_{j}\rightarrow p_{i}} = \sum _{v_{k}\in {V}, v_{k}\ne p_{i}} {L}_{p_{i}\rightarrow v_{k}}, \end{aligned}$$
(27)
$$\begin{aligned} \sum _{v_{j}\in {V}, v_{j}\ne r_{i}} {L}_{v_{j}\rightarrow r_{i}} \le N_{\text {antenna}}, \end{aligned}$$
(28)
$$\begin{aligned} \sum _{v_{k}\in {V}, v_{k}\ne r_{i}} {L}_{r_{i}\rightarrow v_{k}} \le N_{\text {antenna}}, \end{aligned}$$
(29)
$$\begin{aligned} \sum _{v_{j}\in {V}, v_{j}\ne p_{i}} f_{v_{j}\rightarrow p_{i}}^{k} = \sum _{v_{l}\in {V}, v_{l}\ne p_{i}} f_{p_{i}\rightarrow v_{l}}^{k}, \end{aligned}$$
(30)
$$\begin{aligned} \sum _{v_{j}\in {V}, v_{j}\ne r_{i}} f_{v_{j}\rightarrow r_{i}}^{k} = \sum _{v_{l}\in {V}, v_{l}\ne r_{i}} f_{r_{i}\rightarrow v_{l}}^{k}, \end{aligned}$$
(31)
$$\begin{aligned} \sum _{s_{k}\in {V}_{s}} f_{v_{i}\rightarrow v_{j}}^{k} \le {C}_{(v_{i},v_{j})}\cdot {L}_{v_{i}\rightarrow v_{j}}. \end{aligned}$$
(32)

Now, the set of \({L}_{v_{i}\rightarrow v_{j}}\in \{0,1\}, \forall i, \forall j\) that optimizes (2) should be obtained. For this purpose, we first note that this formulation is a mixed-integer disciplined convex program in which the integer variables are 0–1 binary; hence, branch-and-bound is used to obtain optimal solutions [3].
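For intuition, the optimization can be illustrated on a toy instance. The sketch below replaces the branch-and-bound solver of [3] with exhaustive search over per-session candidate paths; the topology, capacities, and candidate paths are invented toy values, not the evaluated network.

```python
import math
from itertools import product

# Toy illustration of objective (22): choose one candidate path per
# session to maximize the sum of weighted log-qualities (33), subject to
# each single-antenna intermediate node serving at most one route.

def quality(f, w, f_max):
    f = min(f, f_max)                      # quality saturates at f_max
    return w * math.log(f + 1.0) / math.log(f_max + 1.0)

# Candidate paths per session: (bottleneck capacity in bit/s, nodes used).
candidates = {
    1: [(2e9, ("s1", "p1", "d1")), (5e8, ("s1", "p2", "d1"))],
    2: [(1e9, ("s2", "p1", "d2")), (3e9, ("s2", "p3", "d2"))],
}
weights = {1: 2.0, 2: 6.0}
f_max = {1: 1.5e9, 2: 6e9}

best, best_choice = -1.0, None
for choice in product(candidates[1], candidates[2]):
    used = [n for (_, path) in choice for n in path[1:-1]]
    if len(used) != len(set(used)):        # an intermediate node is reused
        continue
    total = sum(quality(cap, weights[k], f_max[k])
                for k, (cap, _) in zip((1, 2), choice))
    if total > best:
        best, best_choice = total, choice

print(best_choice, round(best, 3))
```

Here the search avoids routing both sessions through the shared node p1, preferring the assignment that maximizes the weighted sum quality rather than any single session's rate.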

4.3 Individual QoS support (IQS)

The QoS of each flow is formulated as a function of the amount of flow traffic. If the QoS level of the traffic in \(\left( s_{k},d_{k}\right)\) increases logarithmically and monotonically with the flow amount up to \(f_{\max }^{k}\), the corresponding quality function is defined as follows:

$$\begin{aligned} q_{k}\left( f_{v_{i}\rightarrow v_{j}}^{k}\right) \triangleq w_{k}\frac{\ln \left( f_{v_{i}\rightarrow v_{j}}^{k}+1\right) }{\ln \left( f_{\max }^{k}+1\right) } \end{aligned}$$
(33)

as represented in Fig. 4 and [11]. Each information flow has its own \(f_{\max }^{k}\), and the traffic in the flow attains full quality at this flow amount in (33). Moreover, when \(f_{v_{i}\rightarrow v_{j}}^{k} \ge f_{\max }^{k}\), the QoS function returns \(q_{k}\left( f_{\max }^{k}\right)\); thus, allocating a larger flow for \(\left( s_{k},d_{k}\right)\) does not increase the video quality any further. Finally, \(w_{k}\) in (33) is the weight of session flow \(\left( s_{k},d_{k}\right)\), i.e., differentiated weights can be assigned to individual flows.
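A minimal sketch of the quality function (33) follows, including the minimum flow amount \(f_{\min}\) described in Sect. 2.1 and Fig. 4 (below \(f_{\min}\) the flow cannot carry even the base layer, so the quality is zero); the numeric thresholds are assumed illustrative values.

```python
import math

# Sketch of the quality function (33) with the f_min cutoff from Fig. 4:
# zero quality below f_min, logarithmic growth up to f_max, then full
# quality (saturation) beyond f_max.

def q(f, w_k, f_min, f_max):
    if f <= f_min:
        return 0.0                          # not enough for the base layer
    f = min(f, f_max)                       # saturates at full quality
    return w_k * math.log(f + 1.0) / math.log(f_max + 1.0)

w, f_min, f_max = 6.0, 1e6, 1e9             # assumed illustrative values
print(q(5e5, w, f_min, f_max))   # below f_min: 0.0
print(q(2e9, w, f_min, f_max))   # above f_max: full quality w = 6.0
```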

Table 2 Path loss exponents (n) and standard deviations of shadowing random variables (\(\sigma\)) [1, 16]
Table 3 Achieved QoS values for each session when IQS and MaxMin are used [the maximum QoS of session 1, session 3, and session 5 is 2 (because \(w_{1}=w_{3}=w_{5}=2\)) and the maximum QoS of session 2 and session 4 is 6 (because \(w_{2}=w_{4}=6\))]
Fig. 4
figure 4

Nonlinear (concave) quality function: as shown in this figure, if the given flow amount is less than \(f_{\max }^{k}\), the quality value, which is a function of the flow amount, increases logarithmically. If the given flow amount is equal to or greater than \(f_{\max }^{k}\), the quality value converges to full quality. On the other hand, if the given flow amount is equal to or less than \(f_{\min }\), the quality value becomes zero because the flow is not sufficient for video transmission at all. Finally, note that the quality function of each flow can have its own weight \(w_{k}\)

5 Performance evaluation

This section describes the experimental environment of the mmW-based multi-hop simulations as well as the HEVC-based video performance comparison.

5.1 Simulation results with numerical data

For performance evaluation, the link failure probability \(p_{k}\) is defined, and this section reports our performance results as a function of \(p_{k}\) [8]. Since the additional path loss in mmW channels is higher than that in non-mmW channels, especially in NLOS and deep-fading situations, a link failure probability is introduced for realistic performance evaluation [8]. As the performance evaluation metric, the average throughput as a function of the link failure probability is used. For this purpose, we compute the throughput for all possible combinations of active/inactive links, weighted by the probability of each combination occurring [8]. As a comparison benchmark, the average throughput of max–min flow (MaxMin) multi-hop routing is computed by averaging over the same link failures.

For the simulation-based performance evaluation, the total number of cameras and relays is set to 20, of which 4 are relays; the number of antennas per relay is set to \(N_{\text {antenna}} = 4\); the number of sessions is set to 5; the receiver antenna gain of relays is set to 25 dBi; and the receiver antenna gain of cameras is set to 13.3 dBi.

To evaluate our protocol under link failures, we set the same \(p_{k}\) value for all given mmW wireless links. Our simulation observes the average throughput while \(p_{k}\) increases from 0.1 to 0.8 with step size 0.1 (at \(p_{k}=0\), all links are fully connected; at \(p_{k}=1.0\), every link is broken).
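The averaging procedure described above can be sketched as follows; the three-link topology and the `route` function are assumed toy examples, not the simulated 20-node network.

```python
from itertools import product

# Sketch of the evaluation metric in Sect. 5.1: average throughput
# computed by enumerating every active/inactive link combination and
# weighting each combination by its occurrence probability.

def expected_throughput(links, p_fail, throughput_fn):
    """Average throughput over all 2^n link-failure combinations."""
    total = 0.0
    for state in product([True, False], repeat=len(links)):
        prob = 1.0
        for up in state:
            prob *= (1.0 - p_fail) if up else p_fail
        active = {l for l, up in zip(links, state) if up}
        total += prob * throughput_fn(active)
    return total

# Toy: s->d works via link "a" (1 Gbps) or, failing that, "b"+"c" (0.5 Gbps).
def route(active):
    if "a" in active:
        return 1.0e9
    if {"b", "c"} <= active:
        return 0.5e9
    return 0.0

print(expected_throughput(["a", "b", "c"], 0.2, route))  # bit/s
```

The exponential number of combinations makes this exact enumeration feasible only for small topologies, which is why the evaluation fixes a single \(p_{k}\) per sweep point.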

In both cases, the five given sessions have the following settings:

  • Session \(\left( s_{1},d_{1}\right)\): \(w_{1}=2\), nonlinear quality function in (33), illustrated in Fig. 4; \(f_{\min }^{s_{1}}= 0\) and \(f_{\max }^{s_{1}}=1.5 \times 10^{3}\) bit/s.

  • Session \(\left( s_{2},d_{2}\right)\): \(w_{2}=6\), nonlinear quality function in (33), illustrated in Fig. 4; \(f_{\min }^{s_{2}}= 0\) and \(f_{\max }^{s_{2}}=1 \times 10^{9}\) bit/s.

  • Session \(\left( s_{3},d_{3}\right)\): \(w_{3}=2\), nonlinear quality function in (33), illustrated in Fig. 4; \(f_{\min }^{s_{3}}= 0\) and \(f_{\max }^{s_{3}}=1.5 \times 10^{6}\) bit/s.

  • Session \(\left( s_{4},d_{4}\right)\): \(w_{4}=6\), nonlinear quality function in (33), illustrated in Fig. 4; \(f_{\min }^{s_{4}}= 0\) and \(f_{\max }^{s_{4}}= 6 \times 10^{9}\) bit/s.

  • Session \(\left( s_{5},d_{5}\right)\): \(w_{5}=2\), nonlinear quality function in (33), illustrated in Fig. 4; \(f_{\min }^{s_{5}}= 0\) and \(f_{\max }^{s_{5}}= 3 \times 10^{9}\) bit/s.

Now, the simulation results are presented in Table 3 along with those of the benchmark, max–min flow routing (MaxMin). As shown in Table 3, the proposed IQS generally shows better performance than MaxMin. We also note that session 1 and session 3 achieve full quality under both IQS and MaxMin when \(p_{k}\le 0.7\) because of their low \(f_{\max }^{s_{1}}\) and \(f_{\max }^{s_{3}}\) settings. Under IQS, session 2 and session 4 achieve full quality when \(p_{k}\le 0.2\), whereas MaxMin cannot achieve full quality for these sessions even when \(p_{k} = 0\). Finally, the average improvement in the achieved QoS value is 5.79875 out of a maximum of \(2+6+2+6+2=18\), i.e., 32.22 %.

5.2 Visual quality improvement under the video coding standard common test condition

In a video streaming system, the numerical gain of the proposed sum quality maximization (SQM) scheme influences the video quality in both objective and subjective quality metrics. To verify the benefit of the proposed scheme, this study evaluates the visual quality gain with the reference software (HM ver. 15.0) of the next-generation video coding standard HEVC [21].

Following the video coding standard common test condition (CTC) [2], our encoding and decoding experiments are conducted with a subset of the CTC testing points on a multi-core Windows 64-bit workstation. The test sequence PeopleOnStreet (3840\(\times\)2160 pixels) from the Joint Collaborative Team on Video Coding (JCT-VC) is used, and two coding structures, random access (RA) and low-delay P (LDP), are used with four quantization parameters (22, 27, 32, 37), as shown in Table 4.

Table 4 Characteristics of test sequences defined by JCTVC and experimental environments

Figure 5 illustrates the coding structures of RA and LDP. In the example, RA has an intrapicture (I picture) every 16 pictures, and the number in each rectangle is the picture order count (POC), which represents the frame number. The RA coding structure provides very high compression performance, and the decoder can begin decoding without processing the entire preceding bitstream; thus, RA is generally used for seeking operations in video streaming systems. The LDP structure has a first I picture followed by P pictures. It avoids the picture-reordering delay within the group of pictures (GOP) that causes decoding delays, and is therefore used in video conferencing systems.

Fig. 5
figure 5

Examples of the HEVC coding structures; RA (random access) and LDP (low-delay P)

Table 5 shows the experimental results: the gains in objective video quality vary from 1.49 to 2.21 dB in Y-PSNR (the maximum BD-rate gain is 4.2 %).

Table 5 Average Y-PSNR gain in dB and BD-rate gain with two coding structures; RA and LDP

Figure 6 shows the subjective visual quality comparison with reconstructed and enlarged image sections of PeopleOnStreet (frame number 150). As shown in the figure, there are several noticeable visual differences between the proposed SQM and the reference max–min scheme, e.g., in the areas of the legs, shadows, text, and heads. Thus, the advantage of the proposed SQM scheme is verified in subjective and objective video quality metrics as well as in the QoS values measured in the previous section.

Fig. 6
figure 6

Reconstructed image comparison with enlarged noticeable sections: (left) proposed scheme (SQM), (right) reference method (max–min)

6 Conclusions

To deal with short-range wireless communication and non-line-of-sight situations in mmW wireless propagation channels, a QoS-aware multi-hop camera-to-camera routing algorithm with intermediate cameras and relays is desirable for designing next-generation mmW cellular and mobile access networks. This paper proposes a novel real-time video streaming method for distributed wireless image-sensing platforms, which uses a multi-hop routing protocol assisted by multi-antenna relays. Moreover, QoS awareness over mmW links is considered for supporting real-time wireless video streaming applications.

Our proposed method (SQM) takes into account differentiated QoS metrics for individual video stream flows; consequently, it achieves better performance than max–min routing, which is generally used for QoS-sensitive streaming applications. The simulation-based performance evaluation shows an improvement of around 32.22 % over max–min routing, and the resulting channel gain yields an average video quality improvement of 1.84 dB in Y-PSNR.