1 Introduction

With the development of wireless access and multimedia compression technologies, great attention has been devoted to wireless video communications. Two issues in this field remain to be resolved.

The first problem is how to optimize video communications over wireless links with limited bandwidth. Because video streams demand high bandwidth, packet loss is inevitable on such links. It is therefore essential to identify the importance of video packets so that the transmission of high priority packets can be ensured. Consequently, the ideas of cross layer design [13, 27] and unequal protection [1, 24], which are promising for video communications, have been put forward. At the application layer, packet importance is calculated and packet priority is marked. At the network or MAC layer, unequal packet/frame scheduling is performed to reduce video transmission distortion.

The second problem is how to adapt to variations in video data rate, coding structure or network load. Solving this problem requires cooperation between the network scheduling algorithm and the video encoding algorithm, i.e., cross layer optimization. For example, Luo et al. [20] proposed a mechanism that controls the load offered by video streams at the transport layer based on the degree of medium contention observed at the MAC layer. Liao et al. [17] proposed to estimate the frame loss probability from standard ad hoc routing messages and network parameters, and to dynamically select reference frames in order to alleviate error propagation caused by packet loss. The second problem is more complicated because it is difficult to detect these variations and to respond to them quickly.

As for the wireless MAC standard, IEEE 802.11 has been widely used because of its maturity. However, it does not provide differentiated guarantees for different kinds of services. To satisfy the distinct QoS requirements of multimedia and data services, the IEEE 802.11e standard [9] was proposed. QoS support in IEEE 802.11e is achieved with the introduction of four access categories (ACs), among which AC2 is defined for video service. Each AC has a transmission queue and a set of parameters to contend for transmission opportunity. However, this standard only provides a static mapping between service types and ACs. Several studies have proposed enhancements to improve its flexibility, but their performance still leaves room for improvement.

In this paper, based on a problem formulation of video transmission over IEEE 802.11e networks, we propose an adaptive unequal protection scheme for video communications. The characteristics and contributions of this work are summarized as follows.

(1) We provide an AC selection mechanism based on relative queuing delay (D R ). D R is an approximation of the actual queuing delay. Inserting video frames into the AC with the minimum D R reduces the transmission delay of each frame as well as the overall distortion of the video stream.

(2) We integrate the D R based AC selection mechanism with a dynamic frame assignment algorithm (DFAA). DFAA takes video frame priority, D R and queue length (the number of packets in a queue) of each AC as inputs to differentiate frames with different priorities and to provide efficient and dynamic protection of video frames according to the real-time network load. Simulation results show that DFAA reduces video distortion significantly, compared with other schemes.

(3) A fuzzy logic controller (abbreviated as “FL controller”) is designed to adjust the DFAA parameter appropriately so that the scheme adapts to changing environments. The FL controller decides the parameter adjustment according to the queue length of a certain AC and the frame loss rates of certain frame priorities. Experiments show that DFAA with the FL controller achieves near optimal performance even when the DFAA parameter is initialized with an arbitrary value.

To validate the improvement achieved by the proposed adaptive scheme, in-depth analysis and comprehensive evaluations are performed under distinct network environments, with different video sequences and various traffic modes of data streams.

The rest of the paper is organized as follows. Section 2 introduces background and some related work. Section 3 gives problem formulation. DFAA details are presented in Section 4. Performance comparison between DFAA and related schemes is shown in Section 5. Then Section 6 describes the fuzzy logic controller. Section 7 and Section 8 provide simulation results in WLAN and multihop wireless network, respectively. Finally, Section 9 concludes the paper and points out the future work.

2 Background and related work

Several existing studies aim to improve video transmission performance over IEEE 802.11e networks. On the one hand, many papers proposed enhanced scheduling mechanisms to reduce the video frame dropping probability. On the other hand, researchers explored adaptive algorithms that adjust EDCA parameters so as to improve system throughput and/or reduce video transmission distortion.

In this section we start with the introduction of IEEE 802.11e, and then classify and discuss related studies.

2.1 IEEE 802.11e

IEEE 802.11e standard [9] enhances the traditional IEEE 802.11 MAC standard to support applications with QoS requirements. A channel access function named Hybrid Coordination Function (HCF) is provided in IEEE 802.11e. The HCF is composed of a contention-based channel access method, called Enhanced Distributed Channel Access (EDCA) and a centrally controlled channel access method, known as HCF Controlled Channel Access (HCCA).

EDCA provides differentiated and distributed access to the wireless medium. QoS support in EDCA is achieved with the introduction of access categories (ACs). Each AC has a transmission queue and a set of channel access parameters to contend for transmission opportunity. If an AC has a smaller AIFS, CW min , or CW max , traffic of this AC has a better chance to access the channel earlier. Generally, AC3 and AC2 are reserved for voice and video applications respectively, while AC1 and AC0 are for best effort and background traffic. Streams that fall in the same AC are given identical priority to access the channel. Another parameter called TXOP limit is defined as an interval of time during which a node has the right to initiate transmissions. Depending on TXOP limit , a node may transmit one or more frames.

2.2 Modifications on IEEE 802.11e scheduling mechanism

Scheduling enhancements can be further classified into two categories. The first category focuses on priority based scheduling. Chen et al. [4] proposed a cross-layer scheme to ensure the transmission of H.264 video, consisting of slice classification at the application layer, dynamic selective packet transmission at the MAC layer, and channel condition prediction at the physical layer. Similarly, Li et al. [16] formulated the QoS guarantee problem as a joint optimization of AC assignment and interface queue control, and solved it by packet prioritization at the application level and service differentiation at the MAC layer. Ksentini et al. [14] and Lin et al. [19] introduced a static and a dynamic scheduling algorithm respectively, which are discussed further in Section 2.5. Wang et al. [29] proposed to decide whether to send or drop a video packet according to its significance and its estimated delay. Ideas and preliminary evaluations of this paper were presented in [28].

The second category combined rate allocation at application level and priority based scheduling at MAC level. For example, Lin et al. [18] proposed to assign a different number of redundant packets for each frame according to its video coding significance. In MAC layer, an adaptive cross-layer mapping algorithm was applied to map the original and redundant video packets to appropriate ACs based on their coding significance and network load. Lai et al. [15] designed a dispersive video frame importance scheme in the application layer and a comb-shaped quadratic mapping algorithm in the MAC layer to perform unequal mapping between packets and ACs based on the instant congestion level of the reserved AC queue.

Since video data rate, video coding structure and network load may vary continuously, most priority based scheduling algorithms adapt poorly to changing environments. The video transmission distortion of these algorithms may depend on video content or network congestion level, especially for static mapping algorithms. Dynamic mapping algorithms, compared to EDCA and static mapping ones, reduce distortion and improve the flexibility of IEEE 802.11e to a certain extent. These algorithms often use one or more parameters, but do not provide rules for selecting the optimal parameter values.

2.3 Algorithms for tuning EDCA parameters

Two parameters are often reconfigured to accommodate EDCA to the environments. One is the retry limit, the other is TXOP limit . Zhang et al. [30] investigated the packet loss behavior in the IEEE 802.11e networks under various retry limit settings and proposed a simple yet effective retry limit based unequal loss protection scheme, which adaptively adjusts the retry limit setting of IEEE 802.11e to maintain a strong loss protection for critical video traffic transmission. Hsu et al. [8] introduced a cross layer scheme for multiple users to set their retry limits properly. The authors analytically modeled how the retry limit setting adopted by one wireless user impacts distortion and delay of their competing wireless stations. Then a distributed, low-complexity, and scalable optimization algorithm was proposed to maximize the utility of video application.

Cranley et al. [5] studied how the distribution of video frame sizes can be used to dimension the IEEE 802.11e TXOP limit parameter so as to deal efficiently with video bursts and achieve maximum quality of video transmission. Aiming at serving burst packets rapidly, Jansang et al. [10] proposed to adaptively adjust the TXOP limit according to a finite state machine driven by queue size feedback, obtaining a reduced packet delay.

However, most proposals of this category did not consider the importance of video packet. Packets that have significant impact on video distortion would have the same dropping probability as the other ones, leading to poor decoding quality.

2.4 Other modifications

Apart from the above two categories, several papers are concerned with other aspects of video transmission over IEEE 802.11e networks. For example, MacKenzie et al. [21] investigated several mapping schemes for a variety of video content to see how the quality of the decoded video is affected. Results showed that different mapping schemes exhibit different loss patterns and that their impact on video decoding quality is content dependent. Gan et al. [7] performed a cross layer optimization of individual mobile terminals and inter-user resource allocation for multi-user video streaming to minimize the energy consumption of all users while achieving the desired video quality for each user. In order to improve the transmission performance of multimedia streams, Chen et al. [3] suggested using the beacon frame broadcast by the AP to determine which station can transmit its pending data. Fiandrotti et al. [6] estimated the perceptual impact of data losses in different types of enhancement layers for a large set of H.264/SVC videos and then proposed a content-adaptive traffic prioritization strategy that identifies the most important parts of the enhancement layers by means of a low complexity macroblock analysis process.

2.5 Further discussion of references [14] and [19]

Ksentini et al. [14] proposed a static mapping mechanism for video streams using the H.264 codec, in which the AC into which a video packet is inserted is determined by its slice type. That is: (1) frames carrying parameter set information are classified into AC3; (2) frames of IDR picture slices and partition A belong to AC2; (3) frames of partition B and partition C are inserted into AC1. If applied to the MPEG-4 codec, I, P and B frames could be assigned to AC2, AC1 and AC0, respectively.

Such an assignment not only improves the transmission quality of I frames but also makes use of the scheduling opportunities of AC1 and AC0. If the proportion of I frames is relatively high, good performance can be obtained. The deficiency is that P and B frames have to contend with best effort and background traffic, leading to higher loss rates, especially when the data traffic is heavy. In other words, the scheme sacrifices P and B frames to ensure the transmission of I frames. Whether it achieves better performance than EDCA depends on the coding structure and the network load.

To overcome the shortcomings of static mapping, Lin et al. [19] proposed a dynamic mapping algorithm for MPEG-4 streams, in which video frames are dynamically mapped to the appropriate AC based on the importance of the video frame and the network load. No matter which type a video frame belongs to, the algorithm always tries to assign it to AC2. When congestion occurs, a frame is assigned to a lower priority AC (called demotion) with a dynamic probability. The details of this algorithm are similar to the random early detection (RED) mechanism. There are two thresholds (threshold_low and threshold_high) for the average queue length of AC2, which are used to identify the congestion level and to calculate the demotion probability. The default demotion probabilities for the different frame types are denoted uniformly as Prob_TYPE (Prob_I<Prob_P<Prob_B and Prob_I is always equal to 0). Let qlen(AC2) denote the real-time average queue length of AC2. For each video packet, the dynamic probability is obtained by:

$$ Prob\_New=Prob\_TYPE\times \frac{{qlen\left( {AC2} \right)-threshold\_low}}{{threshold\_high-threshold\_low}} $$
(1)

From (1), we can see that if qlen(AC2) is less than threshold_low, an arriving video packet is always assigned to AC2. Otherwise, the demotion probability of the packet increases with qlen(AC2). Since Prob_I<Prob_P<Prob_B, a less important video frame always has a larger demotion probability.
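As a minimal sketch, Eq. (1) can be evaluated as below. The function name, the numeric defaults in `PROB_TYPE` and the cap applied above threshold_high are assumptions made for illustration; they are not taken from [19].

```python
def demotion_probability(prob_type, qlen_ac2, threshold_low, threshold_high):
    """Dynamic demotion probability of Eq. (1) for one video packet."""
    if threshold_high <= threshold_low:
        raise ValueError("threshold_high must be larger than threshold_low")
    if qlen_ac2 < threshold_low:
        # AC2 is not congested: the packet always stays in AC2.
        return 0.0
    ratio = (qlen_ac2 - threshold_low) / (threshold_high - threshold_low)
    # Assumption (not stated in the text): cap the ratio at 1 so the result
    # never exceeds Prob_TYPE when qlen(AC2) is above threshold_high.
    return prob_type * min(ratio, 1.0)


# Hypothetical defaults that respect Prob_I = 0 < Prob_P < Prob_B.
PROB_TYPE = {"I": 0.0, "P": 0.6, "B": 0.9}

# Queue halfway between the two thresholds: each type gets half its default.
halfway = {
    frame_type: demotion_probability(prob, qlen_ac2=30,
                                     threshold_low=20, threshold_high=40)
    for frame_type, prob in PROB_TYPE.items()
}
```

With these illustrative values, I frames are never demoted, while B frames are demoted more often than P frames at the same queue length.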

Compared to [14], this algorithm inserts some of the P and B frames into AC2 when AC2 is not congested. It can therefore be regarded as a combination of EDCA and [14]. However, no principles are provided for setting the two thresholds and Prob_TYPE. In practice, the optimal setting of these parameters depends on the changing environment.

3 Problem formulation

3.1 Distortion model of IEEE 802.11 link

In an IEEE 802.11 wireless network, a compressed video stream is transmitted from the sender to the receiver at a given rate. As described in [25, 31, 32], two factors influence the decoded video distortion (denoted as D dec ) at the receiver. One is the quantization error introduced at the encoder while compressing the media stream, and the other is packet loss, caused either by transmission errors or by late arrivals. Therefore, D dec can be computed as:

$$ {D_{dec }}={D_{enc }}+{D_{loss }} $$
(2)

where the distortion introduced by compression at the encoder is denoted by D enc , and the additional distortion caused by packet loss is denoted by D loss .

Due to compression, D enc is distributed across the encoded frames and can be formulated as a decreasing convex function of the encoding rate. According to [25], D enc can be approximated by:

$$ {D_{enc }}=\frac{\theta }{{R-{R_0}}}+{D_0} $$
(3)

where R is the output rate of the video encoder, and θ, R 0 and D 0 are the parameters of the distortion model, which depend on the encoded video sequence as well as on the encoding structure.

Since we focus on the transmission performance at the wireless MAC layer, the same encoder is employed for the different transmission schemes discussed in Section 5. That is, the values of D enc in the different schemes are identical.

The relationship between the packet loss rate (denoted as P loss ) and the resulting decoded video distortion was also analyzed in [25] and can be modeled by the linear function in Eq. (4):

$$ {D_{loss }}=\alpha {P_{loss }} $$
(4)

where α indicates the sensitivity of the video sequence to packet loss and depends on parameters related to the compressed video sequence, such as the proportion of intra-coded macroblocks and the effectiveness of error concealment at the decoder. In practice, this linear relation is an oversimplification, and a single value of α cannot be found for an arbitrary video sequence, mainly because different packets differ in importance and in their influence on video quality. An example for the MPEG-4 codec, which is also only an approximate model, is presented in Section 3.3.

The main purpose of introducing the above equation and the subsequent equations is to analyze the performance differences among scheduling schemes and to discuss the evaluation results. Admittedly, D loss cannot be calculated exactly from these equations. However, they can be used for performance comparison when a uniform importance is assumed for all packets (see the analysis in Section 5 and the discussion in other sections, in which packet losses of different packet priorities are identified).
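To make the roles of the model parameters concrete, the sketch below evaluates Eqs. (2) to (4) numerically. All parameter values (θ, R 0, D 0, α, the rates and the loss rate) are hypothetical and chosen only for easy arithmetic; they are not fitted to any video sequence.

```python
def encoder_distortion(rate, theta, r0, d0):
    """Eq. (3): D_enc = theta / (R - R0) + D0, valid only for R > R0."""
    if rate <= r0:
        raise ValueError("the encoding rate R must exceed R0")
    return theta / (rate - r0) + d0


def loss_distortion(alpha, p_loss):
    """Eq. (4): D_loss = alpha * P_loss."""
    return alpha * p_loss


def decoded_distortion(rate, p_loss, theta, r0, d0, alpha):
    """Eq. (2): D_dec = D_enc + D_loss."""
    return encoder_distortion(rate, theta, r0, d0) + loss_distortion(alpha, p_loss)


# Hypothetical model parameters (illustration only).
THETA, R0, D0, ALPHA = 2000.0, 50.0, 1.0, 500.0

d_enc_low_rate = encoder_distortion(450.0, THETA, R0, D0)    # 2000/400 + 1
d_enc_high_rate = encoder_distortion(850.0, THETA, R0, D0)   # 2000/800 + 1
d_dec = decoded_distortion(450.0, 0.02, THETA, R0, D0, ALPHA)
```

Raising the encoding rate lowers D enc , while any packet loss adds a term that grows linearly with P loss .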

Note that the end-to-end delay of a video packet should not exceed a few hundred milliseconds. When a packet does not arrive at the receiver by its playout deadline, the decoder conceals the missing information to avoid interruptions, and the playout continues at the cost of higher distortion. Therefore, the packet loss rate P loss combines the rate of random losses and late arrivals of video packets. This combined loss rate can be further modeled based on the M/M/1 queuing model. In this case, the delay distribution of packets over a single link is exponential [32]:

$$ P\left\{ {Delay>T} \right\}={e^{{-\lambda T}}} $$
(5)

where P{} denotes probability, T reflects the delay constraint and λ is the rate parameter of the exponential delay distribution (in the M/M/1 model, the service rate minus the arrival rate), which is given by:

$$ \lambda ={{{\left( {C-R} \right)}} \left/ {L} \right.} $$
(6)

where C is the capacity of the link, R is the traffic rate on that link, and L is the average packet size.
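A short numerical sketch of Eqs. (5) and (6) follows. The function name and the example values (a 1 Mbps link, 1000-byte packets, a 300 ms deadline) are assumptions for illustration, and the check that R < C reflects the stability condition of the M/M/1 model.

```python
import math


def late_arrival_probability(capacity_bps, rate_bps, packet_bits, deadline_s):
    """P{Delay > T} = exp(-lambda * T) with lambda = (C - R) / L."""
    if rate_bps >= capacity_bps:
        raise ValueError("the M/M/1 model requires R < C")
    lam = (capacity_bps - rate_bps) / packet_bits  # Eq. (6), in 1/s
    return math.exp(-lam * deadline_s)             # Eq. (5)


# Hypothetical link: C = 1 Mbps, L = 8000 bits (1000 bytes), T = 0.3 s.
p_moderate_load = late_arrival_probability(1e6, 8e5, 8000, 0.3)
p_heavy_load = late_arrival_probability(1e6, 9.5e5, 8000, 0.3)
```

The probability of missing the deadline grows quickly as the traffic rate R approaches the link capacity C.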

In practice, video packets are transmitted over the wireless link at regular intervals, whereas the M/M/1 model assumes that packet arrivals follow a Poisson process and that service times are exponentially distributed. However, in a bandwidth-limited wireless network the end-to-end packet delivery delay is dominated by the queuing delay at the bottleneck link, so the delay distribution for realistic traffic patterns can still be approximated by an exponential function.

3.2 Distortion model of IEEE 802.11e link

There are four access categories (ACs) in an IEEE 802.11e link, and each AC acts as a virtual link. The packet loss rate of each AC can be calculated by substituting Eq. (6) into Eq. (5):

$$ {P_i}={e^{{\frac{{{R_i}-{C_i}}}{L}T}}} $$
(7)

where P i denotes the packet loss rate of ACi, C i is the capacity of ACi, and R i is the traffic rate of ACi.

When IEEE 802.11e is employed as the MAC protocol, packets of the same video stream may be assigned to different ACs. Therefore, the packet loss rates of the four ACs (P i , i = 0,1,2,3) should be considered jointly to obtain the packet loss rate (P loss ) of the video stream.

$$ {P_{loss }}=\sum\limits_{i=0}^3 {{\beta_i}{P_i}} $$
(8)

where β i is the fraction of packets which are assigned to ACi.

3.3 Considering video frame priority

If there are several priorities among video frames/packets, each priority has its own value of α. Taking the MPEG-4 encoder as an example, we have three values of α: α I , α P and α B for I, P and B frames, respectively. Let P I , P P and P B denote the packet loss rates of I, P and B frames; they can be obtained as follows (x = I, P or B):

$$ {P_x}=\sum\limits_{i=0}^3 {{\beta_{ix }}{P_i}} $$
(9)

where β ix denotes the fraction of x type packets which are assigned to ACi. Then the distortion of the video stream caused by packet loss (D loss ) can be calculated as follows:

$$ {D_{loss }}=\sum\limits_x {{\alpha_x}{P_x}{R_x}} $$
(10)

Compared with Eq. (4), a new factor R x is added to the equation. Taking frame priority into account, the video stream is divided into three sub-streams, each with its own output rate R x . To obtain the overall decoded video distortion, R x should be considered, because for given α x and P x a larger R x means a larger contribution to the overall distortion. Notice that α x here is not equivalent to α in Eq. (4).

Since an I frame is more important than a P frame and a P frame is more important than a B frame, we get α I > α P > α B .
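The per-AC and per-priority loss model of this section can be sketched as follows, with the ACs indexed 0 to 3 to match the ACi naming. Every number below (per-AC loss rates, assignment fractions, sensitivities and sub-stream rates) is hypothetical, and capping the loss rate of Eq. (7) at 1 for an overloaded AC is an added assumption.

```python
import math


def ac_loss_rate(rate, capacity, packet_size, deadline):
    """Eq. (7), capped at 1 (assumption) when the AC is overloaded."""
    return min(1.0, math.exp((rate - capacity) / packet_size * deadline))


def frame_loss_rates(beta, ac_loss):
    """Eq. (9): beta[x][i] is the fraction of type-x packets sent via ACi."""
    return {
        x: sum(share * ac_loss[i] for i, share in shares.items())
        for x, shares in beta.items()
    }


def loss_distortion(alpha, frame_loss, frame_rate):
    """Eq. (10): sum over frame types of alpha_x * P_x * R_x."""
    return sum(alpha[x] * frame_loss[x] * frame_rate[x] for x in alpha)


# Hypothetical per-AC loss rates P_i for AC2, AC1 and AC0.
AC_LOSS = {2: 0.01, 1: 0.05, 0: 0.20}
# Hypothetical assignment fractions beta_ix (each row sums to 1).
BETA = {"I": {2: 1.0}, "P": {2: 0.5, 1: 0.5}, "B": {1: 0.5, 0: 0.5}}
# Hypothetical sensitivities (alpha_I > alpha_P > alpha_B) and rates in kbps.
ALPHA = {"I": 10.0, "P": 4.0, "B": 1.0}
RATE = {"I": 200.0, "P": 150.0, "B": 100.0}

p_x = frame_loss_rates(BETA, AC_LOSS)
d_loss = loss_distortion(ALPHA, p_x, RATE)
```

In this example the I sub-stream has the smallest loss rate but, because of its large α I and R I , still contributes the largest share of D loss .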

4 Adaptive video transmission scheme

The abbreviations and symbols frequently used in this paper are listed in Tables 1 and 2.

Table 1 Abbreviation description
Table 2 Symbol definitions

4.1 System framework

As shown in Fig. 1, the proposed scheme utilizes the capacities of AC1 and AC0 to improve video transmission performance. DFAA is the key component. The congestion level of each AC is inferred from its queue length. The video frame priority, the D R and queue length of each AC, and the parameter adjustment are fed to DFAA so that it is aware of the real-time network load when deciding into which AC a frame should be inserted. D R is calculated as an approximation of the actual queuing delay. The FL controller takes statistical information within a time cycle (such as loss rates of different video frame priorities) and queue length as inputs to determine the quantitative adjustment of the DFAA parameter for the next cycle.

Fig. 1 System framework of the proposed scheme

Generally we use the term “frame” below to refer to the frame of data link layer unless I, P, and B frames of MPEG-4 codec are discussed.

4.2 Relative queuing delay

To decrease transmission delay, a video frame should be assigned to the AC which has the shortest queuing delay. In the IEEE 802.11e multi-queue model, the frame queuing delay of an AC is proportional to the number of queued bytes and inversely proportional to the scheduling opportunity of the AC. Although it is impossible to calculate the exact scheduling opportunities, the average throughputs of the different ACs measured over a long period under saturation can be adopted as approximations. Let B i denote the number of queued bytes in ACi and let T 2:T 1:T 0 denote the ratio of the average throughputs of AC2, AC1 and AC0. Then D R can be calculated as in Eq. (11):

$$ {D_{Ri }}={{{{B_i}}} \left/ {{{T_i}}} \right.}, $$
(11)

where D Ri denotes D R of ACi. Notice that B i is a real-time parameter while T 2:T 1:T 0 can be determined in advance. Let AC min , AC mid and AC max denote the ACs with the minimum, the middle and the maximum D R among the three ACs, respectively. Notice that AC min , AC mid and AC max vary over time.
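As an illustration, Eq. (11) and the resulting ranking of the ACs can be computed as in the sketch below. The queue contents are hypothetical, the 9:3:1 ratio is the value the paper adopts in Section 4.4, and the tie-breaking rule (prefer the higher-numbered AC) is an assumption.

```python
def relative_queuing_delay(queued_bytes, throughput_ratio):
    """Eq. (11): D_Ri = B_i / T_i for every AC i."""
    return {ac: queued_bytes[ac] / throughput_ratio[ac] for ac in queued_bytes}


def rank_acs(queued_bytes, throughput_ratio):
    """Return [AC_min, AC_mid, AC_max], ordered by increasing D_R."""
    d_r = relative_queuing_delay(queued_bytes, throughput_ratio)
    # Assumption: on equal D_R, prefer the higher-numbered AC.
    return sorted(d_r, key=lambda ac: (d_r[ac], -ac))


# T2:T1:T0 = 9:3:1, keyed by AC index.
THROUGHPUT_RATIO = {2: 9, 1: 3, 0: 1}
# Hypothetical backlog in bytes: AC2 is heavily loaded.
QUEUED_BYTES = {2: 18000, 1: 3000, 0: 1500}

d_r = relative_queuing_delay(QUEUED_BYTES, THROUGHPUT_RATIO)
ac_min, ac_mid, ac_max = rank_acs(QUEUED_BYTES, THROUGHPUT_RATIO)
```

Here D R is 2000, 1000 and 1500 for AC2, AC1 and AC0, so AC1 is AC min even though AC2 has the largest share of throughput.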

4.3 Dynamic frame assignment

If we do not consider frame priority and try to insert each frame into AC min , different types of video frames will have the same dropping probability. Since distortion caused by the absence of high priority frames is more significant, DFAA performs unequal protection for different types of video frames to decrease the dropping probability of high priority frames. In DFAA the frame priority, D R and queue length of each AC are considered as scheduling parameters. The basic ideas are as follows:

(1) Let D R identify the priority of an AC. AC priority is inversely proportional to D R . If the congestion level is not high, try to insert each video frame into the AC with the highest priority.

(2) Let queue length denote the congestion level of an AC. Define several thresholds for the queue length of AC min and AC mid , and match each threshold with a video frame priority. If the real-time queue length is larger than the threshold corresponding to its frame priority, the video frame will be inserted into the AC with a lower priority.

DFAA scans AC min and AC mid in turn and determines whether a video frame can be inserted into one of them according to its priority and the real-time queue lengths of AC min and AC mid . If both ACs are congested, the frame is assigned to AC max . DFAA is suitable for video codecs with a limited number of priorities. To decrease the number of thresholds, AC min and AC mid can share thresholds. If a codec has n + 1 priorities, then n thresholds (denoted as k 1, k 2, …, k n , with k 1 > k 2 > … > k n ) are required. Algorithm I describes the details of DFAA.
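The scan described above can be sketched roughly as follows. This is not Algorithm I itself: the function name, the data layout and all numeric values are hypothetical, and the sketch simply takes one caller-supplied threshold per frame priority, because the exact mapping between the n thresholds and the n + 1 priorities is not spelled out in the text above.

```python
def dfaa_assign(priority, thresholds, queue_len, queued_bytes, throughput_ratio):
    """Pick an AC index for one video frame (illustrative sketch only).

    thresholds[priority] is the queue-length threshold matched to the
    frame's priority (hypothetical parameterisation).
    """
    # Rank the ACs by relative queuing delay D_Ri = B_i / T_i.
    d_r = {ac: queued_bytes[ac] / throughput_ratio[ac] for ac in queued_bytes}
    ac_min, ac_mid, ac_max = sorted(d_r, key=lambda ac: (d_r[ac], -ac))
    limit = thresholds[priority]
    # Scan AC_min, then AC_mid; both share the same threshold.
    for ac in (ac_min, ac_mid):
        if queue_len[ac] <= limit:
            return ac
    # Both candidates are congested for this priority.
    return ac_max


# Hypothetical state: queue lengths in packets, 1000-byte packets.
THROUGHPUT_RATIO = {2: 9, 1: 3, 0: 1}
QUEUE_LEN = {2: 40, 1: 20, 0: 15}
QUEUED_BYTES = {ac: 1000 * n for ac, n in QUEUE_LEN.items()}
# Hypothetical thresholds, larger for more important frames.
THRESHOLDS = {"I": 50, "P": 30, "B": 10}

choice = {
    x: dfaa_assign(x, THRESHOLDS, QUEUE_LEN, QUEUED_BYTES, THROUGHPUT_RATIO)
    for x in ("I", "P", "B")
}
```

In this state AC2, AC1 and AC0 are AC min , AC mid and AC max respectively, so the I frame stays in AC2, the P frame is demoted to AC1 and the B frame ends up in AC0.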

4.4 Determining T 2:T 1:T 0 by experiment

The predetermined parameters of DFAA include T 2 :T 1 :T 0 and the queue length thresholds. T 2 :T 1 :T 0 can be determined approximately through experiment. In this experiment, standard EDCA is employed. We set the bandwidth of the IEEE 802.11e wireless link to 1 Mbps. The data rate of voice is 64 kbps (a commonly used rate) and the data rates of the other three service types are 1 Mbps each, to produce a saturated link. The source, destination and packet length are identical for all service types. From Table 3 we can see that the results differ only slightly across packet length settings. We set T 2 :T 1 :T 0 to 9:3:1, a near optimal value.

Table 3 Experiment result of T 2:T 1:T 0

5 Performance analysis

In this section we analyze the performance of four schemes: (1) standard EDCA; (2) static mapping [14], denoted as ICM; (3) dynamic mapping proposed by Lin [19], denoted as Lin; and (4) DFAA without FL controller. MPEG-4 codec with three frame types (I, P and B) is employed.

In Section 4 we argued that AC2 is sometimes not the fastest queue, and used AC min to denote the AC with the smallest queuing delay. In this section, to simplify the description, we still use AC2 to denote the fastest AC when discussing the DFAA scheme.

5.1 EDCA vs. ICM

Since all video packets are inserted into AC2 in EDCA, we get R 2 = R I + R P + R B , R 1 = R be , and R 0 = R bg . Thus,

$$ {P_I}={P_P}={P_B}={P_2}={e^{{\frac{{{R_2}-{C_2}}}{L}T}}} $$
(12)

From (12) we can see that if R 2 < C 2 and the difference between them is significant, then P I , P P and P B will be close to zero, resulting in near-perfect performance. In such a circumstance, R be and R bg can be ignored.

On the other hand, I, P and B video packets are assigned to AC2, AC1 and AC0 respectively in ICM. That is, ICM provides the highest protection level for I frames and leaves the other video frames contending with best effort and background traffic. We get R 2 = R I , R 1 = R be + R P , and R 0 = R bg + R B . Thus,

$$ {P_I}={P_2}={e^{{\frac{{{R_I}-{C_2}}}{L}T}}},{P_P}={P_1}={e^{{\frac{{{R_P}+{R_{be }}-{C_1}}}{L}T}}},{P_B}={P_0}={e^{{\frac{{{R_B}+{R_{bg }}-{C_0}}}{L}T}}} $$
(13)

Comparing EDCA with ICM according to Eqs. (12) and (13), we can see that \( P_I^{ICM } \) is always smaller than \( P_I^{EDCA } \), since R I < R I + R P + R B . As for P P and P B , the comparison depends on R I , R P , R B , R be and R bg .

a. If R be and R P are small and R I is large, then \( P_P^{ICM } < P_P^{EDCA } \).

b. Likewise, if R bg and R B are small and R I is large, then \( P_B^{ICM } < P_B^{EDCA } \).

If both of the above cases hold, we get \( D_{loss}^{ICM}\ll D_{loss}^{EDCA } \).

Otherwise, P P and P B of ICM are larger than those of EDCA while P I of ICM is smaller than that of EDCA. From Eq. (10) we know that the difference between \( D_{loss}^{ICM } \) and \( D_{loss}^{EDCA } \) is determined by α x R x (x = I, P or B) and the difference between \( P_x^{ICM } \) and \( P_x^{EDCA } \) (x = I, P, B). Since α I > α P > α B , \( D_{loss}^{ICM } \) is smaller than \( D_{loss}^{EDCA } \) in most cases unless R I is fairly low.
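The comparison in this subsection can be reproduced numerically with the sketch below, which evaluates Eqs. (12), (13) and (10) for one hypothetical operating point. The capacities, rates, sensitivities, packet size and deadline are all invented for illustration, and capping the loss rate at 1 is an added assumption.

```python
import math


def loss(rate, capacity, packet_bits, deadline_s):
    """Per-AC loss rate of Eq. (7), capped at 1 (assumption)."""
    return min(1.0, math.exp((rate - capacity) / packet_bits * deadline_s))


def edca_losses(r, c, packet_bits, deadline_s):
    """Eq. (12): all video packets share AC2."""
    p2 = loss(r["I"] + r["P"] + r["B"], c[2], packet_bits, deadline_s)
    return {"I": p2, "P": p2, "B": p2}


def icm_losses(r, c, packet_bits, deadline_s):
    """Eq. (13): I, P and B packets go to AC2, AC1 and AC0."""
    return {
        "I": loss(r["I"], c[2], packet_bits, deadline_s),
        "P": loss(r["P"] + r["be"], c[1], packet_bits, deadline_s),
        "B": loss(r["B"] + r["bg"], c[0], packet_bits, deadline_s),
    }


def d_loss(alpha, p, r):
    """Eq. (10) for the three frame types."""
    return sum(alpha[x] * p[x] * r[x] for x in ("I", "P", "B"))


# Hypothetical rates and capacities in bit/s; light data traffic, large R_I.
RATES = {"I": 300e3, "P": 150e3, "B": 100e3, "be": 50e3, "bg": 20e3}
CAPACITY = {2: 600e3, 1: 300e3, 0: 200e3}
ALPHA = {"I": 10.0, "P": 4.0, "B": 1.0}  # hypothetical, alpha_I > alpha_P > alpha_B
L_BITS, T_SEC = 8000, 0.1

p_edca = edca_losses(RATES, CAPACITY, L_BITS, T_SEC)
p_icm = icm_losses(RATES, CAPACITY, L_BITS, T_SEC)
d_edca = d_loss(ALPHA, p_edca, RATES)
d_icm = d_loss(ALPHA, p_icm, RATES)
```

This operating point corresponds to cases a and b together: every frame type sees a lower loss rate under ICM, so D loss of ICM is lower as well. Raising R be or R bg in the example moves P and B frames toward the opposite outcome.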

5.2 Dynamic schemes vs. ICM

Both Lin and DFAA aim to improve the performance of ICM in case R I is much lower than C 2. These schemes perform packet assignment dynamically according to the environment, i.e. they insert part of the P and B frame packets into AC2 to reduce P P and P B when the usage of AC2 is low. If congestion of AC2 can always be avoided, P I of Lin or DFAA will be slightly increased (compared to ICM) while P P and P B will be significantly reduced. In that case, D loss of the dynamic scheme is much smaller than that of ICM. Otherwise, the increase of P I becomes dominant, leading to bad performance. Therefore, the parameters of Lin and DFAA should be chosen carefully, so that as many P and B frame packets as possible enter AC2 while AC2 is kept away from congestion.

5.3 Lin vs. DFAA

Both schemes try to assign P and B frame packets to AC2. When AC2 is determined to be congested, these packets are demoted to AC1 and AC0. The roles of parameters k 1 and k 2 in DFAA are similar to those of threshold_high and threshold_low in Lin. Setting aside the concept of relative queuing delay and the use of AC0 in DFAA, the main difference between DFAA and Lin is the demotion probability. When the queue length of AC2 is higher than threshold_low in Lin (meaning that AC2 is becoming congested), P and B frame packets are demoted to AC1 with a certain probability. In DFAA this probability is always equal to 1. Therefore, \( P_I^{DFAA } \) is much smaller than \( P_I^{Lin } \). As for P P and P B , \( P_P^{DFAA } > P_P^{Lin } \) and \( P_B^{DFAA } > P_B^{Lin } \) hold in most cases.

Compared to Lin, DFAA raises the protection level of I frames. Since α I is much higher than α P and α B , D loss of DFAA is lower than that of Lin when their parameter settings are reasonable and comparable.

If the parameter settings of both schemes are unreasonable, D loss of DFAA could be larger than that of Lin. For example, if k 2 and threshold_low are fairly low, most P and B frame packets are demoted to AC1 and AC0 in DFAA, leading to low AC2 usage and bad performance. In this situation, Lin could achieve better performance than DFAA because only part of the P and B frame packets are demoted to AC1. However, the performance of either scheme is then worse than that of EDCA or ICM because unreasonable parameter settings are employed.

Besides, a serious problem of Lin is that demoting with a probability causes consecutive P and B frame packets to be split between AC2 and AC1, resulting in increased delay jitter and decoded video distortion. Moreover, DFAA has fewer parameters, which makes it feasible to search for the optimal parameter setting.

6 Fuzzy logic controller

As mentioned in the previous section, reasonable parameters should be set to ensure the performance of dynamic schemes (Lin and DFAA). The questions are then how to determine whether a parameter setting is reasonable, and how to find the best setting to achieve optimal performance. In this section, we first analyze the relationship between parameter setting and performance in DFAA. Then an improvement of DFAA (named “Fuzzy Logic Controller”, abbreviated as “FL controller”), which adjusts the parameter setting by itself to achieve near optimal performance, is proposed. Finally, details of DFAA with the FL controller (denoted as DFAA-FL) are presented.

6.1 Analysis of DFAA parameters

In DFAA the protection level of a certain frame priority is mainly determined by its opportunity of entering AC min . Thus the protection level of the highest priority is inversely proportional to the difference between the upper bound of queue length and k 1, and the protection level of the jth priority is proportional to the difference between k j and k j+1. Therefore, k 1 should be set to the upper bound of queue length to ensure the transmission of video frames with the highest priority. The optimal setting of k j (j > 1) depends on the coding structure and the network load. For example, if the proportion of the jth priority frames is large, it is necessary to increase k j − k j+1 and to decrease k j−1 − k j . If the network load is very heavy, k 2, k 3, …, k n should be set to values near zero, because perhaps only the transmission of video frames with the highest priority can be ensured. Conversely, if the network load is light, k 2, k 3, …, k n should be set to values near the upper bound of queue length to take full advantage of the available bandwidth.

During the transmission of VBR video stream over IEEE 802.11e networks, video data rate, coding structure and network load may vary dramatically, i.e. the optimal settings of k 2, …, k n are dynamic. To find a near optimal setting, we propose a fuzzy logic based self-adjusting scheme.

6.2 Design of FL controller

Each queue length threshold can be equipped with an FL controller. Since k 1 is always set to the upper bound of queue length, n − 1 FL controllers (denoted as C 2, …, C n ) are required. C j is responsible for dynamic adjustment of k j . A large k j often indicates a large l j−1 and a small l j . Thus C j influences the transmission of video frames with the (j − 1)th and the jth priorities. Each controller C j has three inputs and one output.

  1. (1)

    Input 1: demotion rate of the j − 1th priority, l j−1.

  2. (2)

    Input 2: demotion rate of the jth priority, l j .

  3. (3)

    Input 3: average queue length of AC min during a statistical period, denoted as qlen.

  4. (4)

    Output: k j adjustment.

The former two inputs represent the transmission performance of the related frame priorities, and the third input indicates the congestion level of AC min . C j determines the real-time status of both network and video streams through these inputs and produces a suitable k j adjustment. A traditional FL controller has three components: fuzzifier, fuzzy inference and defuzzifier. The fuzzifier converts crisp inputs to fuzzy inputs by choosing a proper membership function; triangular and trapezoidal functions are often used. Fuzzy inference produces fuzzy outputs according to the fuzzy inputs and the predefined fuzzy rules. Finally, the defuzzifier converts fuzzy outputs to crisp outputs. In most FL controllers, the fuzzy rule set is the core component and should be designed carefully according to the developer’s experience. Principles for setting fuzzy rules in DFAA-FL are as follows:

  1. (1)

    Try to decrease l j−1 and then try to decrease l j under the circumstance of l j−1 = 0.

  2. (2)

    The k j reduction is proportional to l j−1 and the k j increment is proportional to l j .

  3. (3)

    The k j reduction is proportional to queue length of AC min and the k j increment is inversely proportional to queue length of AC min .

Since the bandwidth of wireless links is limited, each controller competes with the other controllers. To ensure the transmission of high priority frames, FL controllers also have priorities. C j will not be activated unless (C 2, …, C j−1) work well, i.e. the loss rates of the former j − 1 frame priorities are decreased to zero.

6.3 Implementation for MPEG-4 codec

The FL controller implemented in this sub-section is the one employed in the simulations.

Since MPEG-4 codec has three frame types and k 1 is set to the upper bound of queue length, only one FL controller (C 2, which is responsible for adjusting k 2) is required.

6.3.1 Inputs and output

Triangular membership functions are employed for inputs. The former two inputs of C 2 are demotion rates of I and P frame packets (denoted as l 1 and l 2). Both l 1 and l 2 have five levels (Low, Medium, High, Very high and Extremely high). Average queue length of AC min has three levels, called Low, Medium and High. Figure 2 shows the triangular membership functions of different inputs.

Fig. 2 Membership functions of FL controller inputs

Note that value ranges of l 1 and l 2 are divided unequally (level “Low” has the smallest range) to decrease demotion rate more quickly. Employing (0, 0.01) as the first range of l 1 leads to a significant reduction of k 2 when l 1 stays at a low level, so that transmission of I frame packets is firmly ensured. Upon receiving a video packet, the average queue length is updated. At the end of a statistical period, the final average queue length is used as one of the three inputs. The value range of average queue length is also divided unequally to alleviate congestion as quickly as possible.
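The fuzzification step and the per-packet queue length update can be sketched as follows. This is a minimal sketch, not the paper's implementation: the breakpoints passed to `triangular` are hypothetical (only the 0.01 boundary of the first l 1 range comes from the text), and the plain arithmetic mean in `QueueLengthAverager` is an assumption, since the averaging method is not specified.

```python
def triangular(x, left, peak, right):
    """Membership degree of x in a triangle with feet at left/right and apex at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


class QueueLengthAverager:
    """Running mean of the AC_min queue length, updated on every video packet.

    ASSUMPTION: a plain arithmetic mean over one statistical period.
    """

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, qlen):
        # Incremental mean: no need to store individual samples.
        self.count += 1
        self.mean += (qlen - self.mean) / self.count

    def close_period(self):
        # Return the final average of this period and reset for the next one.
        result = self.mean
        self.count, self.mean = 0, 0.0
        return result


# Hypothetical "Low" level of l1 whose support ends at 0.01.
low_l1 = triangular(0.005, -1.0, 0.0, 0.01)

avg = QueueLengthAverager()
for qlen in (4, 8, 12):  # queue lengths observed on three packet arrivals
    avg.update(qlen)
period_average = avg.close_period()
print(low_l1, period_average)
```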

The output uses a singleton membership function and has seven levels, including three positive, zero and three negative adjustments. To raise the protection level of I frame packets, the absolute value of a negative adjustment is larger than that of its corresponding positive adjustment, and the step size of negative adjustments is also larger than that of positive adjustments. The seven adjustments used in the simulation sections are (−16, −8, −4, 0, 3, 6, 12), so that a notable modification occurs when current performance is far from the optimal one and a minor modification occurs when current performance is close to the optimal one.
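A possible way to turn the seven singleton levels into one crisp adjustment is sketched below. The weighted-average defuzzification is an assumption (the text does not name the defuzzification method), and `defuzzify` is a hypothetical helper name; only the seven adjustment values are taken from the text.

```python
# Singleton output levels used in the simulations (from the text).
ADJUSTMENTS = (-16, -8, -4, 0, 3, 6, 12)


def defuzzify(strengths):
    """Weighted average of singleton outputs.

    strengths maps an adjustment level to its firing strength in [0, 1].
    ASSUMPTION: weighted-average defuzzification; returns 0.0 if nothing fires.
    """
    unknown = set(strengths) - set(ADJUSTMENTS)
    if unknown:
        raise ValueError(f"unknown adjustment levels: {sorted(unknown)}")
    total = sum(strengths.values())
    if total == 0:
        return 0.0
    return sum(level * w for level, w in strengths.items()) / total


# Each negative level outweighs its positive counterpart, as stated in the text.
asymmetric = all(abs(neg) > pos for neg, pos in ((-16, 12), (-8, 6), (-4, 3)))

# Example: "-4" fires with strength 0.25 and "0" with strength 0.75.
adjustment = defuzzify({-4: 0.25, 0: 0.75})
print(asymmetric, adjustment)
```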

6.3.2 Fuzzy inference and fuzzy rules

Each fuzzy rule must specify the value ranges of each input and the unique output respectively, with a weight of the association between the input set and the output. That means the same input set could be associated with more than one output, satisfying the limitation that the sum of all these weights is equal to 1. In each rule the inputs are combined by the AND operation. The final weight of a rule is calculated by multiplication of the AND operation result and the weight of the input-output association. The aggregation of all outputs of a rule uses the maximum operator. Table 4 shows example fuzzy rules.

Table 4 Example fuzzy rules

The first fuzzy rule means that when the value ranges of l 1, l 2 and average queue length are (−1,0.01), (0.3,1) and (−1,10) respectively, the weight of producing output 12 is equal to 1. This rule indicates that notable increase should be performed on k 2 when l 1 is very close to 0 and l 2 is fairly large and the average queue length is small.
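The rule evaluation described in this sub-section can be sketched as below. Several points are assumptions: the AND operation is taken to be the minimum of the membership degrees, the maximum operator is applied per output level across rules, and the level names and membership degrees are hypothetical. Only the output value 12 with weight 1 mirrors the first rule discussed above.

```python
def fire_rules(memberships, rules):
    """Evaluate weighted fuzzy rules.

    memberships : {"l1": {level: degree}, "l2": {...}, "qlen": {...}}
    rules       : iterable of (l1_level, l2_level, qlen_level, output, weight)
    Returns {output: strength}.
    ASSUMPTIONS: AND is min; strengths for the same output are merged with max.
    """
    aggregated = {}
    for l1_level, l2_level, q_level, output, weight in rules:
        and_result = min(
            memberships["l1"].get(l1_level, 0.0),
            memberships["l2"].get(l2_level, 0.0),
            memberships["qlen"].get(q_level, 0.0),
        )
        # Final weight = AND result times the input-output association weight.
        strength = and_result * weight
        aggregated[output] = max(aggregated.get(output, 0.0), strength)
    return aggregated


# Hypothetical fuzzified inputs.
memberships = {
    "l1": {"Low": 0.8, "Medium": 0.2},
    "l2": {"Extremely high": 0.6},
    "qlen": {"Low": 1.0},
}
# The same input set may map to several outputs; its weights sum to 1.
rules = [
    ("Low", "Extremely high", "Low", 12, 1.0),
    ("Medium", "Extremely high", "Low", -4, 0.5),
    ("Medium", "Extremely high", "Low", 0, 0.5),
]
fired = fire_rules(memberships, rules)
print(fired)
```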

6.3.3 k2 tuning procedure based on FL controller

Note that Algorithm II is the supplement of Algorithm I. Compared to DFAA, the incremental complexity of DFAA-FL is not significant because k 2 tuning is only performed at the end of each statistical period.

figure g
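The per-period update of k 2 can be illustrated with the following sketch. It is not Algorithm II itself: the clamping of k 2 to the range [0, queue limit], the rounding of the adjustment, the function names and the queue limit of 50 are all assumptions made for illustration.

```python
def tune_k2(k2, adjustment, queue_limit):
    """Apply one controller adjustment at the end of a statistical period.

    ASSUMPTION: k2 is an integer queue length kept within [0, queue_limit].
    """
    return max(0, min(queue_limit, k2 + round(adjustment)))


def run_periods(initial_k2, adjustments, queue_limit):
    """Return the value of k2 after each statistical period."""
    trace = []
    k2 = initial_k2
    for adjustment in adjustments:
        k2 = tune_k2(k2, adjustment, queue_limit)
        trace.append(k2)
    return trace


# Hypothetical controller outputs over four periods, starting from k2 = 25.
trace = run_periods(25, [12, 12, 12, -16], queue_limit=50)
print(trace)
```

Because the update runs once per statistical period rather than once per packet, the extra cost over plain DFAA stays small, in line with the remark above.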

7 Performance evaluation in WLAN

Simulations are based on the integrated platform of ns-2 [26] and Evalvid [12], implemented by C. H. Ke [11]. Figure 3 shows the WLAN topology, in which wireless nodes w0 ~ w3 are connected to the AP via IEEE 802.11e links. Foreman (300 frames) and football (176 frames) with MPEG-4 codec and CIF resolution are adopted as primary test sequences. Other sequences are akiyo and news (300 frames) with CIF resolution, and coastguard and hall (300 frames) with QCIF resolution. The frame rate is set to 30 frames per second. Table 5 shows the differences of data rates and coding structures among these sequences. In particular, foreman has more I frames and B frames than football. From Table 6, we can find that the data rates of the two primary sequences at each second are quite different. Since there are three types of video frames, two queue length thresholds and one FL controller are deployed accordingly. The FL controller is responsible for adjusting k 2 to reduce the dropping probability of both I frames and P frames.

Fig. 3 Simulation topology

Table 5 Total bytes of I/P/B frames
Table 6 Data rate at each second of two primary sequences

7.1 Performance comparison among four schemes

This sub-section is further divided into two parts. The first part presents details of simulation results and discussion, with foreman and football sequences. The second part gives a summary of evaluation results of the other sequences.

7.1.1 Results and discussions of two primary sequences

The bandwidth of the IEEE 802.11e link is set to 4Mbps. Parameters of DFAA and Lin are set as listed in Table 7.

Table 7 Parameter setting

To make a comprehensive comparison, video sequences are transmitted under the following three scenarios.

  1. (1)

    Streams of four service types are transmitted from w0 ~ w3 to B respectively.

  2. (2)

    Four streams are uniformly transmitted from w1 to B.

  3. (3)

    Streams are transmitted from B to w0 ~ w3 respectively.

Figure 4 shows the variations of the number of received packets (denoted as “pktNum”), average PSNR (avgPSNR), VQM [2, 22, 23] and average packet transmission delay (avgDelay) as data rates of best effort (R be ) and background traffic (R bg ) increase in scenario 1. When calculating VQM of decoded video, full reference calibration and general model are applied. Notice that the smaller the VQM is, the better the decoded quality is. Also notice that the ratio R be :R bg is kept at 2 in all simulations. From this figure we can find that pktNum and avgPSNR of the four schemes go down dramatically at first and then level off as R be and R bg increase. Video frames occupy scheduling opportunities of BE and BG streams, especially when R be and R bg are low. Thus the number of video frames entering AC1 and AC0 reduces as R be and R bg increase. However, the effect of further increasing R be and R bg is not notable once they reach the capacity of AC1 and AC0, because almost all of the additional packets are dropped.

Fig. 4 Performance comparison among four schemes in scenario 1

For pktNum performance, DFAA is better than Lin, no matter which video sequence is streamed. Both schemes are better than EDCA and ICM because they not only take full advantage of the capacity of AC2, but also borrow the capacities of AC1 and AC0. As discussed in section 5, the difference between EDCA and ICM depends on the video coding structure. In foreman sequence both R I and R P are relatively high, so \( P_P^{ICM }-P_P^{EDCA } \) is comparable to \( P_I^{EDCA }-P_I^{ICM } \). Therefore, pktNum results of these two schemes are close. However, since R I is fairly low in football sequence, \( P_I^{EDCA }-P_I^{ICM } \) is much lower than \( P_P^{ICM }-P_P^{EDCA } \). That is to say, \( P_{loss}^{EDCA } \) is much lower than \( P_{loss}^{ICM } \), verified by Fig. 4(e).

Next, let us focus on avgPSNR. From the figure, we know that avgPSNR of DFAA is much better than those of the other three schemes. There are two reasons: (1) DFAA receives more video packets; (2) packets received by DFAA are more important. Lin often has a moderate performance. Similarly, the difference between EDCA and ICM depends on the video coding structure. Although pktNum results of the two schemes are comparable in foreman sequence, ICM has a better avgPSNR performance because it receives more I frame packets. In football sequence, ICM has a very poor avgPSNR performance because R I is too low to take full advantage of the capacity of AC2.

Similar to the avgPSNR results, VQM results show that the performance of DFAA is much better than those of the other schemes, especially for foreman sequence. However, when considering the comparison among the other three schemes, VQM results are slightly different from avgPSNR results, indicating that the performance difference among these schemes is not distinct.

As for avgDelay, we get nearly opposite results. However, this is reasonable because only the delays of received packets are counted. Since packet transmission delay generally grows with the number of packets in the network and DFAA delivers the most packets, the avgDelay performance of DFAA is poor.

The above figures demonstrate that packet loss rate and average packet transmission delay cannot reflect video transmission distortion exactly. Although PSNR is considered less accurate than VQM, it is commonly used in existing video transmission studies. To control paper length, we use avgPSNR as the main metric when presenting figures in this paper and give a summary table of average VQM results of various video sequences in section 7.1.2.
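For reference, the standard PSNR definition for 8-bit video, PSNR = 10·log10(255² / MSE), can be sketched as follows. Taking avgPSNR as the arithmetic mean of per-frame PSNR values is an assumption about how the metric is averaged, and the function names and sample MSE values are hypothetical.

```python
import math


def psnr(mse, peak=255.0):
    """PSNR in dB for a frame with the given mean squared error."""
    if mse <= 0:
        return math.inf  # identical frames: distortion-free
    return 10.0 * math.log10(peak * peak / mse)


def avg_psnr(per_frame_mse):
    """ASSUMPTION: avgPSNR is the arithmetic mean of per-frame PSNR values."""
    values = [psnr(mse) for mse in per_frame_mse]
    return sum(values) / len(values)


# Hypothetical per-frame MSE values of a decoded sequence.
result = avg_psnr([650.25, 65.025])
print(round(result, 2))
```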

Figure 5 shows the results in scenario 2 and scenario 3. Table 8 gives the average avgPSNR improvements of DFAA compared to other schemes in each case, in which we can find that DFAA reduces the transmission distortions of both sequences greatly. Figures 4 and 5 and Table 8 verify the analysis that existing schemes are not adaptive to the variation of environments. EDCA has a good performance when R I is low (football) and ICM is suitable for the case in which R I is high (foreman). Also the performance of ICM with football sequence is not bad when R be and R bg are relatively low. On the other hand, it is difficult for Lin to achieve better performance than EDCA and ICM. Tables 9 and 10, which depict packet loss numbers of different frame types in scenario 2 when R be = 1Mbps, present the reason. Lin always has the lowest overall packet loss rate, but a significant P I increment degrades its performance. Although P I of ICM is 0 in both cases, too many P and B frame packets are dropped. DFAA achieves a good balance between P I and P P /P B .

Fig. 5 avgPSNR comparison in scenarios 2 and 3

Table 8 Average avgPSNR Improvement of DFAA
Table 9 Packet loss of foreman, scenario 2
Table 10 Packet loss of football, scenario 2

7.1.2 Results and discussions of the other sequences

Similar experiments are performed in this sub-section, using the other four sequences. Link bandwidths of news, hall, coastguard and akiyo are set to 2Mbps, 0.6Mbps, 1Mbps and 1Mbps respectively to cause congestion. Parameter settings of DFAA and Lin remain the same as those in Table 7. R be range and step size of news are [0.2Mbps, 2Mbps] and 0.2Mbps. And these two parameters of the other three sequences are [0.1Mbps, 1Mbps] and 0.1Mbps. Tables 11 and 12 show average avgPSNR results and average VQM results of different sequences. In these tables, “news-1” means that sequence news is used in scenario 1.

Table 11 Average avgPSNR results of other four sequences
Table 12 Average VQM results of other four sequences

From the above tables, we can find that the results of avgPSNR and VQM are consistent. We can also draw the following conclusions.

  1. (1)

    DFAA always has the best performance. However, the improvement depends on the sequence. The reason is that each sequence has its own coding structure but we use a uniform setting of k 2. Recall from the analysis in section 6.1 that the optimal setting of k 2 depends on the coding structure and the network load. Although the data rate of each sequence is distinct, different bandwidths are set to cause an equivalent congestion level. Therefore, the optimal k 2 is mainly determined by the coding structure. A moderate k 2 (25) here is suitable for news and hall because the amount of I frames is equivalent to that of P/B frames in both sequences. Evaluation results with different k 2 are presented in section 7.2.

  2. (2)

    Compared with EDCA and ICM, Lin always achieves a moderate performance. From section 5.3 we know that Lin inserts more P/B frame packets into AC2, which decreases the protection level of I frame packets. Furthermore, it is difficult to find the optimal parameter setting pattern because Lin has too many parameters to be considered. Evaluation results with different parameter setting patterns can also be found in section 7.2.

  3. (3)

    Performance comparison between EDCA and ICM also depends on the sequence. ICM shows its advantage when using news, hall and akiyo because I frames are dominant in these sequences and ICM provides the highest protection level of I frames. Thus the number of packets which have to compete with best effort and background packets is small. On the contrary, the performance of EDCA is better than that of ICM when using coastguard sequence.

7.2 Influence of parameter setting

Experiments in this and the next sub-section are performed in scenario 1. From Fig. 6 we can find that k 2 = 25 is not suitable for every case. To achieve better performance, k 2 should be set to 10 when foreman is transmitted in a 3Mbps wireless link while it should be set to 45 for football in a 4Mbps link. The results verify the analysis that a fixed parameter cannot adapt to the variation of data rate, coding structure and network load. Furthermore, regardless of which value k 2 should be set to, the performance difference when applying various k 2 is remarkable. Indeed, the differences between the best and the worst avgPSNR when R be = 2Mbps in the four cases are 2.25, 0.74, 4.25 and 4.23.

Fig. 6 Performance comparison of DFAA with different k 2

Figure 7 presents the avgPSNR comparison of Lin with different parameter settings, in which unmentioned parameters are set the same as in the first experiment. Figure 7(a) employs various threshold_low and Figure 7(b) uses different Prob_TYPE (Prob_I=0). The results show that parameter setting also affects the performance of Lin significantly.

Fig. 7 Performance comparison of Lin with different parameters

7.3 Performance of DFAA with FL controller

Finally, we focus on the performance of DFAA with FL controller (abbreviated as “DFAA-FL”). Since DFAA-FL can adjust k 2 according to the variation of environments, it is expected to provide the following advantages: (1) achieve a near optimal performance; (2) have a steady performance no matter which initial value is employed for k 2, so that k 2 can be initialized with an arbitrary value.

In this experiment, the adjustment cycle is set to 0.5 s while R be is set to 2Mbps. Table 13 shows the avgPSNR variation of DFAA-FL when k 2 is initialized with different values. Unlike the results of Fig. 6, the performance of DFAA-FL keeps relatively steady in each case as initial value of k 2 varies. It is notable that in some cases (case 1 and case 2) the performance of DFAA-FL is better than any solution with fixed k 2 in Fig. 6. Also the performance is close to the best one among DFAA-10, DFAA-25 and DFAA-45 in other cases (case 3 and case 4), indicating that DFAA-FL can achieve a near optimal performance.

Table 13 Average avgPSNR variation of DFAA-FL with different initial k 2

Finally, Fig. 8 shows the adjustment details of k 2 as the simulation goes on when deployed with different initial values. From the figure, we find that the variation of k 2 depends strongly on link congestion. When video sequences are transmitted in 3Mbps wireless links, the frame loss rate remains relatively high. The FL controller has to adjust k 2 continually to find an optimal value, leading to performance degradation. The values of k 2 in the three schemes with different initial values converge as the simulation goes on. On the contrary, most I and P frame packets are transmitted successfully when the link load is not heavy (4Mbps links), resulting in only slight adjustment of k 2. In Fig. 8(a) after the 15th cycle and in Fig. 8(b) before the 5th cycle, the adjustment of k 2 is remarkable because the data rate of the video sequence at that time produces heavy congestion.

Fig. 8 Adjustment details of k 2

8 Performance evaluation in multihop networks

In this section, evaluations are performed in a multihop wireless network with seven nodes. A video stream is issued from node 0 to node 2. If we limit the maximum number of available paths to 2, the two paths (N 0, N 5, N 6, N 2) and (N 0, N 1, N 3, N 4, N 2) presented in Fig. 9 will be discovered. Having two available paths with different hop counts, we can combine our adaptive scheduling scheme with different routing algorithms to make a comprehensive evaluation. The bandwidth of the IEEE 802.11e link here is set to 1.5Mbps.

Fig. 9 Available paths for video stream from node 0 to 2

In multihop wireless networks each node must compete for transmission opportunity, which reduces the network capacity greatly. Since high resolution sequence is not suitable for such an environment, Foreman (400 frames, 659 packets, 13.3 s) with MPEG-4 codec and QCIF resolution is used as test sequence in this section. Besides video stream, there is a background stream. To verify the flexibility of DFAA-FL to the environments, we vary the priority, data rate, and source-destination pair of this stream.

8.1 Performance comparison among four schemes

In this sub-section, AODV is used as routing algorithm. Three experiments were done, using different priorities. In each experiment, every possible node pair is adopted as the source and the destination of the stream. That is to say, we obtain 42 results which are divided into 7 groups according to the stream source. Average pktNum and avgPSNR of each group are calculated and presented in Fig. 10.
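The count of 42 results in 7 groups follows from enumerating all ordered (source, destination) pairs over seven nodes. A minimal sketch, using node indices 0 to 6 as illustrative labels:

```python
from collections import defaultdict
from itertools import permutations

NODES = range(7)  # seven nodes, labelled 0..6 for illustration

# Every ordered (source, destination) pair with distinct endpoints.
pairs = list(permutations(NODES, 2))

# Group the results by the source of the background stream.
groups = defaultdict(list)
for src, dst in pairs:
    groups[src].append(dst)

print(len(pairs), len(groups), {len(dsts) for dsts in groups.values()})
```

Each group therefore averages over the six possible destinations of one source node.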

Fig. 10 Multihop performance comparison among four schemes

First, let’s discuss the average pktNum results. From the figure we find that results of DFAA-FL and Lin are much better than those of EDCA and ICM when the stream priority is set to 0 or 1. Using EDCA and ICM, many background stream packets are scheduled in the multihop network because the priority of this stream is higher than or equal to that of video stream. Since DFAA-FL and Lin can take full advantage of several ACs, their average pktNum results do not degrade significantly. On the contrary, the background stream will not compete directly with video stream when its priority is set to 2 and EDCA is employed. Thus the performance of EDCA in most cases is close to that of DFAA-FL. However, the performance of ICM is fairly poor in this case because P frame packets must compete with background stream packets for scheduling. From the average pktNum results, we do not find any special influence of multihop network compared with WLAN.

We then turn to the average avgPSNR results. As the figure shows, although the performance of DFAA-FL is still better than that of the other three schemes, the improvement is not as remarkable as that in WLAN. Returning to the average pktNum results, we find that the difference between DFAA-FL and EDCA is significant. That is to say, more received packets do not bring a correspondingly higher avgPSNR. Analyzing the reason, we find that the multihop forwarding path increases the packet transmission delay greatly. Therefore, some duplicate packets are received because of retransmission. Moreover, some of the packets received in DFAA-FL are useless for decoding because their transmission delays are unacceptable. To solve this problem, packets with large delays can be dropped in advance at the intermediate nodes. However, such a mechanism would increase complexity.

Although the improvement is not as much as that in WLAN, DFAA-FL is still the best scheme and shows its flexibility to various conditions. As for the other three schemes, their performance depends on environments and parameter settings.

8.2 Performance evaluation combined with routing algorithm

Finally, we evaluate the performance of DFAA-FL and EDCA, combined with different routing algorithms. Three routing algorithms are employed:

  1. (1)

    Standard AODV.

  2. (2)

    Multipath AODV, denoted as MAODV. In this routing algorithm, the maximum number of available paths is limited to 2. If there are two available paths, I frames will be forwarded to the first path and P and B frames will be forwarded to the second path. As for background stream packets, they are all forwarded to the first path.

  3. (3)

    The last routing algorithm uses the bottleneck of AC2 length as routing metric. The bottleneck of AC2 length is the maximum AC2 length among all nodes in the forwarding path. The source node will choose the path with the minimum bottleneck of AC2 length to forward video and background streams. This algorithm is denoted as MBAL.
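The path selection rule of MBAL can be sketched as follows. The two candidate paths are the ones listed for the video stream from node 0 to node 2; the AC2 queue lengths are hypothetical values, and counting every node on the path (including source and destination) toward the bottleneck is an assumption.

```python
def bottleneck(path, ac2_len):
    """Maximum AC2 queue length among the nodes of a forwarding path."""
    return max(ac2_len[node] for node in path)


def choose_path(paths, ac2_len):
    """Pick the path whose AC2 bottleneck is smallest (MBAL metric)."""
    return min(paths, key=lambda path: bottleneck(path, ac2_len))


# The two paths discovered for the video stream from node 0 to node 2.
PATHS = [(0, 5, 6, 2), (0, 1, 3, 4, 2)]

# Hypothetical AC2 queue lengths per node.
ac2_len = {0: 4, 1: 2, 2: 0, 3: 3, 4: 1, 5: 9, 6: 2}

best = choose_path(PATHS, ac2_len)
print(best, bottleneck(best, ac2_len))
```

With these values the longer path is selected, because node 5 makes the shorter path the more congested one.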

In the simulation, the data rate of the background stream is set to 300 kbps and its priority is set to 1. Since average results are presented in the previous sub-section, we show some detailed results in this sub-section. Figure 11 shows the pktNum and the avgPSNR results when the background stream source is set to node 0 and node 5.

Fig. 11 Multihop performance, combined with routing algorithms

As shown in Fig. 11, no matter which routing algorithm is employed, it works better when combined with DFAA-FL. For the pktNum results, we notice that the difference among various routing algorithms in DFAA-FL is insignificant, compared with that in EDCA. In other words, DFAA-FL reduces the performance difference among various routing algorithms. Such a characteristic facilitates routing algorithm selection. Similar to the results of the previous sub-section, we find that the avgPSNR improvement (comparing DFAA-FL and EDCA) is not as large as the pktNum improvement.

We also notice that the improvement is fairly small when MAODV is used. The reason is that when performing video frame assignment, MAODV does not consider the relationship between video frame type and path quality. Such a simple assignment policy cannot work well with DFAA-FL. Exploring a suitable multipath routing algorithm together with DFAA-FL is part of our future work.

9 Conclusions

In this paper, after summarizing the related work of video transmission over IEEE 802.11e networks, we propose an adaptive video transmission scheme based on the idea of unequal protection, including the usage of relative queuing delay (D R ), a dynamic frame assignment algorithm (DFAA) and fuzzy logic controllers. Motivation of the scheme, idea of each component and details of several algorithms are presented. Especially, we formulate the problem of video transmission over IEEE 802.11e networks and make performance analysis based on the formulation. To validate effectiveness and flexibility of proposed scheme, we take the following actions in simulations. Firstly, both WLAN and wireless multihop networks are employed. Secondly, six video sequences are tested (foreman, football, news and akiyo with CIF resolution, and coastguard and hall sequences with QCIF resolution). Thirdly, various data rates of data streams are set to produce different congestion levels. Finally, both PSNR and VQM are calculated to verify the decoded video quality.

Simulation results show that the PSNR and VQM results of DFAA are much better than those of EDCA, ICM and Lin in different network and traffic conditions, using different video sequences. FL controller can produce appropriate adjustment of DFAA parameter so that a random initialization of DFAA parameter becomes possible. Moreover, DFAA-FL also shows its good performance and flexibility in multihop wireless network when combined with different routing algorithms.

From the evaluation results, we also find that the performance improvement of DFAA is not significant when using football sequence, showing that DFAA is not good enough for those video sequences whose number of I frame packets is much less than that of P/B frame packets. Another limitation of this paper is the absence of rigorous theoretical support, owing to the extreme complexity of video distortion models and wireless network conditions.

In future work, we plan to consider the region-of-interest coding method to improve subjective video quality and combine multipath routing with DFAA-FL to further enhance the performance of unequal protection.