1 Introduction

Due to the mass-market adoption of new multi-mode high-end devices (e.g., smartphones and tablets), together with the growing popularity of video-sharing websites such as YouTube, mobile TV, and gaming, mobile operators are being confronted with massive traffic growth. According to Cisco [1], global IP traffic has increased eightfold in the past 5 years and will increase a further fourfold by 2016. It is estimated that more than 110 Exabytes of data per month will be transferred in 2016, of which 61 % will be exchanged by wireless devices [2] and 55 % will be generated by rich media-based services [3]. Some of these services (e.g., High Definition TV, 3D TV) put considerable pressure on both content processing and delivery. Moreover, it is estimated that network-delivered digital media, especially media delivered to mobile customers over a heterogeneous wireless environment, will become one of the main economic driving forces in the coming years. However, this will only be possible if the necessary infrastructure is in place to accommodate the increasing number of mobile users and their expectations of high Quality of Experience (QoE) levels. To deal with this explosion of mobile broadband data, network operators have tried to supplement their bandwidth capabilities by deploying alternative radio access technologies in areas of high user traffic (e.g., city centers, shopping malls, sports stadiums, and business parks). Wireless-Fidelity (Wi-Fi) offload solutions have already been adopted by many service providers (e.g., Deutsche Telekom offers WiFi Mobilize). This solution enables the transfer of some traffic from the core cellular network to Wi-Fi hotspots at peak times. In this way, users can avail of wider service offerings. However, the overall experience is still far from optimal, as providing high-quality mobile video services with high Quality of Service (QoS) over resource-constrained wireless networks remains a challenge. In this context, the problem faced by network operators is ensuring a seamless multimedia experience at reasonable quality levels for the end-user.

An important user concern is the battery life of their mobile device, which has not evolved in line with processor and memory advances and has become a limiting factor. This shortfall in battery capacity, and the resulting need for reduced energy consumption, motivates the development of more energy-efficient solutions that still keep mobile users always best connected.

The ‘Always Best Connected’ vision emphasizes the scenario of a mobile user seamlessly roaming in a heterogeneous wireless environment, as illustrated in Fig. 1. Mobile users face a complex decision when selecting the best network to connect to (one that will satisfy their needs) because of the heterogeneity of the criteria: the application requirements (e.g., voice, video, data); multiple device types (e.g., smartphones, netbooks, laptops) with different capabilities; multiple overlapping network technologies [e.g., Wireless Local Area Networks (WLAN), Universal Mobile Telecommunications System (UMTS), Long Term Evolution (LTE)]; and different user preferences (e.g., personal or business use, or location-dependent preferences such as a crowded train vs. a quiet office). In this context, the main challenge for users is to have their device select the best available network considering their preferences, application requirements, and network conditions.

Fig. 1 Heterogeneous wireless network environment: example scenario

This paper provides a comprehensive performance evaluation of a number of widely used Multiple Attribute Decision Making (MADM) methods in the context of network selection. The evaluation is carried out in terms of energy efficiency and user-perceived quality levels for multimedia streaming over a heterogeneous wireless environment. The paper reports the results of a realistic study which uses real user data to model the user-perceived quality, and real energy consumption measurements taken from our test-bed. Additionally, a mathematical energy consumption equation, based on real energy measurements, is designed to model an Android mobile device’s energy consumption.

2 Related works

MADM methods are widely used in the research literature for solving multi-criteria decision problems, including the network selection problem. One of the most popular MADM methods is the Simple Additive Weighting (SAW) method [4]. The basic logic of SAW is to obtain a weighted sum of the normalized form of each parameter over all candidate networks. Depending on the formulation of the problem, the network with the highest or the lowest score is selected as the target network. Wang et al. [5] were the first researchers to apply the SAW method to network selection, back in 1999. The authors propose a policy-enabled handover system that selects the “best” wireless network at any moment. A score function is defined and used to translate the serviceability of each network into a score value for comparing the candidate networks. The score value is computed from several network parameters: the available network bandwidth, the network power consumption profile, and the monetary cost charged for the specific network. The score function is the weighted sum of the normalized form of these three parameters. The weights may be modified by the user or the system at run-time. The monetary cost is limited by the maximum sum of money a user is willing to spend for a period of time, and the power consumption is limited by the battery lifetime. The network that has the lowest value for the score function is chosen as the target network.

Since 1999, a number of other papers offering variations of the SAW method have been published, e.g., Adamopoulou et al. [6]. Tawil et al. [7] make use of SAW to propose a distributed vertical handoff decision scheme. The calculation of the target network is moved from the mobile user side to the network side to conserve the battery lifetime of the mobile device. The network quality is computed for each network based on the bandwidth, the call dropping probability, and the cost parameter.

In order to scale characteristics with different units to a comparable numerical representation, different normalization functions have been used, such as exponential, logarithmic, and piecewise linear functions [8]. One of the main drawbacks of SAW is that a poor value for one parameter can be heavily outweighed by a very good value for another parameter. For example, if a network has a low throughput but a very good price, it may be selected over a slightly more expensive network with a much better throughput rate.
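This compensation effect can be illustrated with a minimal sketch. The two candidate networks, their normalized ratings, and the weights are hypothetical values chosen for illustration; they are not taken from any of the cited studies.

```python
def saw_score(ratings, weights):
    """Weighted sum of normalized ratings (each in [0, 1], larger is better)."""
    return sum(weights[name] * ratings[name] for name in weights)


# Hypothetical weights: the user cares mostly about price.
weights = {"throughput": 0.2, "price": 0.8}

# Hypothetical normalized ratings (illustration only).
networks = {
    "cheap_slow": {"throughput": 0.10, "price": 1.00},
    "slightly_dearer_fast": {"throughput": 0.70, "price": 0.80},
}

scores = {name: saw_score(r, weights) for name, r in networks.items()}
best = max(scores, key=scores.get)
print(scores)
print("selected:", best)
```

With these assumed values the additive score selects the low-throughput network, because its price rating compensates for its poor throughput rating.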

Another popular MADM method is the Technique for Order Preference by Similarity to Ideal Solution (TOPSIS) [4], which is based on the idea that the selected candidate network should be the closest to the ideal solution and the farthest from the worst possible solution. The ideal and worst solutions are calculated with the best and worst possible values of each parameter, respectively. TOPSIS was used in [9, 10] to rank the candidate networks based on their closeness to the ideal solution. The parameters considered in the decision matrix are available bandwidth, QoS level, security level, and cost in [9], and cost per byte, total bandwidth, available bandwidth, utilization, delay, jitter, and loss in [10]. The results show that TOPSIS is sensitive to user preference and to the parameter values. In order to compensate for the ranking abnormality introduced by TOPSIS, Bari and Leung [11] propose the use of an Iterative TOPSIS. The authors argue that the new approach can improve the results by considering only the more likely network candidates in the decision process.
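The distance-based ranking can be sketched as follows. This is a simplified illustration, not the formulation of [9] or [10]: it assumes the ratings are already normalized so that larger is better, and it takes the ideal and worst solutions from the best and worst weighted values found among the candidates. The candidate names and numbers are hypothetical.

```python
import math


def topsis_closeness(ratings, weights):
    """Relative closeness to the ideal solution for each candidate.

    ratings: {candidate: [r1, r2, ...]}, normalized, larger is better.
    weights: [w1, w2, ...], one weight per parameter.
    """
    weighted = {c: [w * r for w, r in zip(weights, rs)] for c, rs in ratings.items()}
    columns = list(zip(*weighted.values()))
    ideal = [max(col) for col in columns]
    worst = [min(col) for col in columns]
    closeness = {}
    for candidate, values in weighted.items():
        d_ideal = math.dist(values, ideal)
        d_worst = math.dist(values, worst)
        total = d_ideal + d_worst
        # If all candidates are identical, both distances are zero.
        closeness[candidate] = d_worst / total if total > 0 else 0.0
    return closeness


# Hypothetical candidates rated on (bandwidth, cost).
ratings = {"net_a": [0.9, 0.2], "net_b": [0.5, 0.6], "net_c": [0.1, 0.9]}
closeness = topsis_closeness(ratings, weights=[0.5, 0.5])
for name in sorted(closeness, key=closeness.get, reverse=True):
    print(f"{name}: {closeness[name]:.3f}")
```

A closeness of 1 means the candidate coincides with the ideal solution and 0 means it coincides with the worst one.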

Nguyen-Vuong et al. [8] examine the disadvantages of previously proposed SAW algorithms and instead they propose the use of a Multiplicative Exponential Weighted (MEW) method in the decision making mechanism. In general, MEW [6] is a MADM method that uses multiplication for connecting network parameter ratings. The authors conducted a numerical analysis and the results show the inaccuracy of the SAW method and the benefits of using their proposed utility function together with a weighted multiplicative method. MEW was also used in [12] in order to propose a power-friendly access network selection mechanism in a multimedia-based heterogeneous wireless environment.
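A multiplicative score of this kind can be sketched as a product of ratings raised to their weights. The sketch below is a generic illustration rather than the utility function proposed in [8]; the network names, ratings, and weights are hypothetical.

```python
import math


def mew_score(ratings, weights):
    """Product of normalized ratings raised to the power of their weights."""
    return math.prod(ratings[name] ** weights[name] for name in weights)


# Hypothetical weights and normalized ratings (larger is better).
weights = {"throughput": 0.2, "price": 0.8}
networks = {
    "cheap_slow": {"throughput": 0.10, "price": 1.00},
    "slightly_dearer_fast": {"throughput": 0.70, "price": 0.80},
}

scores = {name: mew_score(r, weights) for name, r in networks.items()}
best = max(scores, key=scores.get)
print(scores)
print("selected:", best)
```

With these assumed values the multiplicative score favours the network with the better throughput, because a very poor rating pulls the whole product down even when its weight is small.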

The Elimination and Choice Expressing Reality (ELECTRE) [13] is another MADM method which is based on a pair-wise comparison among the parameters of the candidate networks. The concepts of concordance and discordance are used to measure the satisfaction and dissatisfaction of the decision maker when comparing the candidate networks. Bari and Leung [14] propose a modified version of ELECTRE in order to solve the network selection problem. They compute a concordance set (CSet) which consists of a list of parameters indicating that the current network is better than the other candidate networks. On the other hand a discordance set (DSet) is defined which provides a list of parameters for which the current network is worse than the other candidate networks. Two corresponding matrices are constructed using CSet and DSet. In order to indicate the preferred network, the elements of each matrix are compared against two thresholds: Cthreshold and Dthreshold.
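The pairwise construction of the two sets can be sketched as follows. This is a generic illustration of concordance and discordance sets for a single pair of networks, not the modified ELECTRE of [14]; the parameter names and ratings are hypothetical, all ratings are assumed to be normalized so that larger is better, and ties are placed in the concordance set.

```python
def concordance_discordance(current, other):
    """Split parameter names by whether `current` is at least as good as `other`.

    Both arguments map parameter name -> normalized rating (larger is better).
    """
    cset = [p for p in current if current[p] >= other[p]]
    dset = [p for p in current if current[p] < other[p]]
    return cset, dset


# Hypothetical normalized ratings for two candidate networks.
wlan = {"bandwidth": 0.9, "cost": 0.4, "security": 0.5}
umts = {"bandwidth": 0.3, "cost": 0.7, "security": 0.5}

cset, dset = concordance_discordance(wlan, umts)
print("CSet:", cset)
print("DSet:", dset)
```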

Two other popular MADM methods are the Analytic Hierarchy Process (AHP) and Grey Relational Analysis (GRA). The idea behind AHP is to decompose a complicated problem into a hierarchy of simple, easy-to-solve sub-problems, whereas the GRA method ranks the candidate networks and selects the one with the highest rank. Cui et al. [15] propose a Hierarchy Multiple Attribute Decision with Possibilities, referred to as HMADP. The authors use AHP to determine the weights for each criterion: bandwidth, delay, response time, jitter, bit error rate (BER), packet loss rate, security, and cost. After the weight for each criterion is computed, a SAW function is used to score the networks. The network with the highest score is selected as the target network. The AHP method in combination with a utility function is used by Pervaiz [16]. AHP and GRA are used together in [17–19]: the AHP method computes the relative weights of the various parameters used in the decision model, whereas GRA prioritizes the networks. The network with the largest Grey Relational Coefficient is considered to have the highest similarity to the ideal solution and is selected as the target network.
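How pairwise judgements turn into criterion weights can be sketched with the row geometric mean, a common approximation of the principal eigenvector used in AHP. The comparison matrix below is hypothetical and perfectly consistent; it is not taken from [15] or [16].

```python
import math


def ahp_weights(pairwise):
    """Approximate AHP weights by normalizing the geometric mean of each row."""
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


# Hypothetical judgements: entry [i][j] states how much more important
# criterion i is than criterion j (a reciprocal matrix).
criteria = ["bandwidth", "delay", "cost"]
pairwise = [
    [1.0, 2.0, 6.0],
    [1 / 2, 1.0, 3.0],
    [1 / 6, 1 / 3, 1.0],
]

weights = ahp_weights(pairwise)
for name, w in zip(criteria, weights):
    print(f"{name}: {w:.3f}")
```

For this consistent matrix the weights come out at approximately 0.6, 0.3, and 0.1. With inconsistent judgements the geometric mean is only an approximation, and a consistency check would normally accompany it.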

The authors in [20] and [21] propose the use of a combination of two MADM methods, namely AHP and TOPSIS. The AHP method is used to compute the weights for different criteria: throughput, delay, jitter, packet loss, cost, and security in [20], and cost per byte, total bandwidth, allowed bandwidth, utilization, packet delay, packet jitter, and packet loss in [21]. TOPSIS is then used to rank the candidate networks. The network with the highest score is selected as the target network. Several extensions of AHP have been proposed, such as the analytic network process (ANP) [22] and the fuzzy analytical hierarchy process (FAHP) [23]. The authors use the extensions to compute the weights for each criterion, and the TOPSIS method is then used to rank the candidate networks. However, along with the classic criteria such as cost, security, bandwidth, jitter, packet loss, and delay, the authors consider the handover history as a prime factor in the decision-making process. Although this reduces the number of handovers, the mobile station is forced to stay connected to the same network even when the current QoS drops below a predefined user threshold, which might decrease user satisfaction and increase the churn rate.

Bari and Leung [24] propose the use of GRA with a non-monotonic utility and argue that this solution is more efficient than the other MADM methods, which assume monotonically increasing or decreasing utilities for the attributes.

Several studies have proposed solutions that combine fuzzy logic with other approaches such as MADM, genetic algorithms, and utility functions [25–28]. Fuzzy logic is used when some of the criteria cannot be precisely obtained due to the complexity of the heterogeneous environment. In this context, the imprecise data is mapped to crisp numbers, followed by the MADM method for network selection. The authors in [25] argue that TOPSIS is more sensitive to user preferences while SAW provides more conservative ranking results.

Comparison studies of the MADM methods for network selection under various network conditions and for different service classes have been conducted in [29–33]. The findings of these studies are listed in Table 1.

Table 1 Performance studies—summary

Despite the amount of research done in the area of network selection, especially on the performance evaluation of MADM methods, little attention has been paid to the impact of the MADM methods on energy efficiency and user-perceived quality. Moreover, most of the existing works base their performance evaluation on simulation data and do not consider real user data. To this end, this paper makes the following contributions:

  • a comprehensive performance evaluation study on the impact of four widely used MADM methods based on real user and network data;

  • the results from a real experimental test-bed are used to model the energy consumption utility equation for multimedia streaming over a heterogeneous wireless environment;

  • energy consumption measurements and subjective video quality assessment test results are used to study the impact of four MADM methods on the energy versus quality trade-off.

3 Network selection mechanism

3.1 Network selection concept

Today’s multi-user multi-technology multi-application multi-provider environment requires the development of new technologies and standards that seek to provide dynamic automatic network selection decisions through seamless global roaming within this heterogeneous wireless environment. The network selection process is part of the Handover Management which consists of three major sub-services, as illustrated in Fig. 2: (1) Network Monitoring—monitors the current network conditions (network availability, signal strength, current call connection etc.) and provides the data gathered together with information related to the user preferences, current running applications on the user’s mobile device and their QoS requirements to the Handover Decision Module; (2) Handover Decision—handles the Network Selection process (which ranks the candidate networks and selects the best target) and is initiated either by an automatic trigger for a handover for an existing call connection or by a request for a new connection on the mobile device; and (3) Handover/Connection Execution—once a new target network is selected, the connection is set up on the target candidate network (and the old connection torn-down).

Fig. 2 Handover process: block diagram

Traditionally, the network selection decision was made by the network operators for mobility or load balancing reasons, and was mainly based on assessing the value of a single parameter: Received Signal Strength (RSS). However, network selection has become a more complex problem, and many static and dynamic, and sometimes conflicting, parameters influence the decision-making process. As these parameters have different ranges and units of measurement, they need to be normalized in order to make them comparable. Utility functions are used for normalization to map all the parameters into dimensionless units within the range [0,1]. This normalized information is then used in the decision-making process to compute (through the use of score functions) a ranked list of the best available network choices (e.g., best value networks in terms of quality-price trade-off). Different score function methods have been proposed for network selection, using different MADM methods including the Analytic Hierarchy Process (AHP) and Grey Relational Analysis (GRA) [17–19], or using Game Theory [34]. User or network operator preferences for the main trade-off criteria can be represented by different weights in weighted score functions. Different methods have been suggested for determining or gathering the weight combination that reflects the preferences (for example, high quality for a business call with reasonable power savings, or a reasonable cost-quality trade-off for a fully charged smartphone). The candidate network with the highest score is selected as the target network. If the target differs from the current network connection, a handover is executed; if the request is for a new connection, a new network connection is set up.

3.2 Utility functions

As previously mentioned, the utility functions are part of the overall score function of the decision-making process, and they are used to normalize the decision criteria/parameters into dimensionless units (e.g., within [0,1]) in order to make them comparable. The shape of the utility function describes the user’s perception of performance and satisfaction, and expresses the trade-off the user is willing to accept between acquiring more resources (e.g., bandwidth) and saving resources (e.g., money, energy). Previous studies have shown that, for rate-adaptive real-time applications, a sigmoid-shaped utility function can be used to describe user satisfaction as a function of bandwidth [35, 36], whereas for other parameters such as cost or energy, linear functions were used to map them to the user preferences [37, 38]. A common goal of all the approaches defined in the literature is to optimize the network performance by maximizing the utility function. In this work, three criteria are considered: energy consumption, quality of the multimedia stream, and monetary cost. A utility function is defined for each criterion (energy utility, quality utility, and cost utility), as in our previous work [39]. All the utility functions follow the principle ‘the larger the utility value the better’. In order to analyze the performance of the MADM-based methods fairly, the same utility functions are used for each of the four MADM methods. Table 2 presents a summary of the parameters used throughout the paper.

Table 2 Parameters summary

3.2.1 Energy utility: ue

The energy utility is defined in Eq. (1) and is computed based on the estimated energy consumption of the mobile device. The energy utility has values in the [0,1] interval, and no unit.

$$u_{e}(E) = \begin{cases} 1, & E < E_{\min} \\ \dfrac{E_{\max} - E}{E_{\max} - E_{\min}}, & E_{\min} \le E < E_{\max} \\ 0, & \text{otherwise} \end{cases}$$
(1)

where E_min is the minimum energy consumption (J), E_max is the maximum energy consumption (J), and E is the energy consumption for the current network (J). E_min and E_max are calculated for the throughputs Th_min and Th_max, respectively. The energy E is modeled using the real experimental test-bed results, as introduced in Sect. 4.4.
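Eq. (1) translates directly into a small function. The sketch below is illustrative only; the numeric bounds in the usage lines are hypothetical and are not the bounds derived in this study.

```python
def energy_utility(energy, e_min, e_max):
    """Eq. (1): 1 below e_min, linear decrease between the bounds, 0 otherwise.

    All arguments are energies in joules; the result is dimensionless.
    """
    if energy < e_min:
        return 1.0
    if energy < e_max:
        return (e_max - energy) / (e_max - e_min)
    return 0.0


# Hypothetical bounds (J), for illustration only.
E_MIN, E_MAX = 100.0, 500.0
for e in (50.0, 100.0, 300.0, 500.0, 800.0):
    print(e, energy_utility(e, E_MIN, E_MAX))
```

The lower the estimated energy consumption of a candidate network, the higher its utility, which is consistent with the principle that a larger utility value is better.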

3.2.2 Quality utility: uq

A zone-based sigmoid quality utility function is used to map the throughput to user satisfaction for multimedia streaming applications as defined in Eq. (2). Our previous studies in [40] have shown that in the case of real-time multimedia streaming applications, the zone-based sigmoid shape function best maps the throughput levels to the user satisfaction with the streamed video. Below a certain throughput value the quality of the streamed video is just unacceptable (Zone1). On the opposite end of the scale, once the throughput exceeds a certain level the user will not perceive any increased quality level on their handset screen with further increases in throughput (Zone3). Between Zone1 and Zone3 the quality experienced by the user increases with increase in throughput (Zone2).

The utility is computed from two thresholds. The minimum throughput (Th_min) is the throughput needed to maintain the multimedia service at a minimum acceptable quality; values below this threshold result in unacceptable quality levels, i.e., zero utility. The maximum throughput (Th_max) maps high user satisfaction with quality to the highest utility; values above Th_max result in quality levels higher than most human viewers can distinguish, so any throughput above this threshold is wasted. The quality utility takes values in the [0,1] interval and has no unit.

$$u_{q}(Th) = \begin{cases} 0, & Th < Th_{\min} \\ 1 - e^{\frac{-\alpha \cdot Th^{2}}{\beta + Th}}, & Th_{\min} \le Th < Th_{\max} \\ 1, & \text{otherwise} \end{cases}$$
(2)

where α and β are two positive parameters which determine the shape of the utility function with no unit, and Th is the predicted average throughput for each of the candidate networks (Mbps). The values for α and β used in this study are 5.72 and 2.66 [40], respectively.
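The three zones of Eq. (2) can be sketched as a function of throughput. The values of α and β are those stated above; the throughput thresholds in the usage lines are hypothetical, since the thresholds used in the study are not given in this section.

```python
import math

ALPHA = 5.72  # shape parameter from the text
BETA = 2.66   # shape parameter from the text


def quality_utility(th, th_min, th_max, alpha=ALPHA, beta=BETA):
    """Eq. (2): zone-based sigmoid utility of throughput (Mbps)."""
    if th < th_min:        # Zone 1: unacceptable quality
        return 0.0
    if th < th_max:        # Zone 2: quality grows with throughput
        return 1.0 - math.exp(-alpha * th ** 2 / (beta + th))
    return 1.0             # Zone 3: no further perceivable gain


# Hypothetical thresholds (Mbps), for illustration only.
TH_MIN, TH_MAX = 0.1, 2.0
for th in (0.05, 0.5, 1.0, 2.5):
    print(th, round(quality_utility(th, TH_MIN, TH_MAX), 4))
```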

3.2.3 Cost utility: uc

As there is a natural human tendency to want to reduce the monetary cost, the cost utility is very important and it is defined in Eq. (3):

$$u_{c}(C) = \begin{cases} 1, & C < C_{\min} \\ \dfrac{C_{\max} - C}{C_{\max} - C_{\min}}, & C_{\min} \le C < C_{\max} \\ 0, & \text{otherwise} \end{cases}$$
(3)

where C is the monetary cost for the current network, C_min is the minimum cost that the user is willing to pay, and C_max is the maximum cost that the user can afford to pay. The cost is considered to be a flat rate expressed in Euro/Kbyte, and it is assumed that the flat rate charged will not change during a user-network session. The cost utility takes values in the [0,1] interval and has no unit.

4 Experimental test-bed environment and results

This section presents the energy consumption measurements conducted for an Android mobile device in several scenarios while performing video delivery over an IEEE 802.11g network and UMTS cellular network as illustrated in Fig. 3. In our previous work [41] we presented an in-depth study on how the wireless link quality and the network load impact the energy consumption of an Android device while performing on-demand streaming over WLAN. In this paper, the results from the test-bed are used to validate the mathematical model of the energy consumption equation and to analyze the performance of various MADM-based methods under realistic conditions.

Fig. 3 Experimental test-bed setup

4.1 Experimental setup and test case scenarios

The energy consumption measurements were collected for a Google Nexus One Android device performing Video on Demand (VoD) over two types of radio access network, WLAN and UMTS, as illustrated in Fig. 3. The Multimedia Server consists of Adobe Flash Media Server 4, which uses the proprietary application-level streaming protocol referred to as Real-Time Media Flow Protocol (RTMFP), running over User Datagram Protocol (UDP).

The Blender Foundation’s 10 min long Big Buck Bunny animated clip was used for testing. The video clip was encoded at five different quality levels, following recommendations for encoding clips for multi-bitrate adaptive streaming, as illustrated in Table 3. The video play-out is scaled to the device screen resolution. The Power Consumption Monitor integrates an Arduino Duemilanove board connected to the Android mobile device and a laptop that stores the energy measurements. More details about the WLAN test-bed can be found in [41]. For the cellular network, the power measurements were run over UMTS provided by the eMobile service provider in Ireland. Relevant information about the cellular network [e.g., network type, maximum downlink rate, cell id (CID), location area code (LAC), mobile country code (MCC), mobile network code (MNC), signal strength (SS)] is listed in Table 4.

Table 3 Encoding settings for the multimedia levels
Table 4 Cellular network characteristics

The experimental test-bed measurements were collected under five test-case scenarios as illustrated in Fig. 4 and described below. The Multimedia Server stores the five ten-minute clips corresponding to different quality levels and streams them sequentially to the Android mobile device over UDP.

Fig. 4 Considered scenarios

Scenario 1 (No Load, Near AP): the mobile user is located near the AP (~1 m away), with no extra background traffic in the network, and the mobile device SS varies between −48 and −52 dBm.

Scenario 2 (No Load, Far AP): the mobile user is located in an area with poor SS, varying between −78 and −82 dBm. There is no extra background traffic in the network.

Scenario 3 (Load, Near AP): similar to Scenario 1, except that background traffic is added to load the network. A Candela LANforge traffic generator was used to create between 25 and 28 virtual wireless stations, each of them generating traffic. The volume and mix of the background traffic are based on the traffic forecast provided by Cisco [1]: 66 % video traffic, with 98 % downlink traffic and 2 % uplink traffic; and 34 % other traffic types (e.g., web-browsing/e-mail, file sharing), with 76 % downlink traffic and 24 % uplink traffic. The overall network traffic load was selected in the range of 20–21 Mbps, so that the network is maintained at high load without being overloaded or used at its maximum capacity. The stations generating background traffic were located near the AP, with the signal strength varying between −28 and −32 dBm, and generated a mix of UDP traffic with data rates between 0.25 and 2 Mbps and a packet size of 1514 b, and Transmission Control Protocol (TCP) traffic with data rates between 0.25 and 1 Mbps and packet sizes in the range of 300–1514 b. The overall video traffic load was maintained at 66 % of the total background traffic for all scenarios.

Scenario 4 (Load, Far AP): similar to Scenario 2, except that background traffic was added as in Scenario 3 (Load, Near AP).

Scenario 5 (Cellular): the mobile user is performing VoD over the cellular network. The UMTS network provided by the eMobile cellular network operator was used.

4.2 Experimental results

An in-depth study and a more detailed view of the results within the WLAN environment (Scenario 1 to Scenario 4) are presented in [41]. A summary of the results is presented in Table 5. The average energy consumption (Avg. Energy) of the mobile device was measured while performing VoD streaming over UDP for the five quality levels. The actual average throughput (Avg. Th.) received by the mobile device on the wireless network was captured with Wireshark. The results obtained over UMTS from eMobile are detailed in [41] and summarized in Table 6. Because cellular networks have lower transmission rates than WLAN (e.g., UMTS has a maximum theoretical data rate of 384 kbps, whereas IEEE 802.11g has a maximum theoretical data rate of 54 Mbps), only a subset of three of the five quality levels was considered for streaming over UMTS.

Table 5 Results summary for UDP VoD streaming in the wireless environment
Table 6 Scenario 5—UDP VoD streaming in the cellular environment

These results were further used to validate the energy consumption equation and to analyze the performance of various MADM-based methods in the network selection context.

4.3 Subjective video quality assessment results

The choice of the five quality levels for the multimedia streams was validated using two methods: an objective method based on Peak Signal-to-Noise Ratio (PSNR), and a subjective method based on a study in which the subjects individually rated the quality of each sequence on a 5-point scale (1-Bad, 2-Poor, 3-Fair, 4-Good, 5-Excellent) [40]. For each sequence, the Mean Opinion Score (MOS) was computed as the mean of the ratings. The results of both assessment methods are listed in Table 7 along with the perceived quality and impairment mapping.

Table 7 Objective and subjective results

The test sequences were played locally in full screen on the Android device and displayed in a random order (to minimize the order effect), maintaining similar testing conditions for all the participants. In the five considered scenarios, the wireless link was generally of good quality and had enough available bandwidth to support VoD, allowing smooth and uninterrupted playback which maintained the same user-perceived quality, and thus the same subjective MOS values, as for local playback. The only difference in MOS appears in Scenario 4, where the background traffic and the distance from the AP affect the MOS for QL1–QL3. In this case the estimated MOS would be less than 3 for QL1, 3.58 for QL2, and 3.43 for QL3, with QL4 and QL5 maintaining the same MOS as for local playback [39].

4.4 Modeling the energy consumption pattern

This section models the energy consumption pattern of an Android mobile device using real experimental energy measurements. The parameters r_t (the mobile device’s energy consumption per unit of time) and r_d (the energy consumption rate for data/received stream) are computed from the test-bed energy measurements for all test-case scenarios, namely (1) WLAN, no load, near AP; (2) WLAN, no load, far AP; (3) WLAN, load, near AP; (4) WLAN, load, far AP; and (5) UMTS, and are presented in Table 8. Using these results, the energy consumption pattern of the Google Nexus One can be modeled by Eq. (4):

$$E_{i} = t(r_{t} + Th_{i} \cdot r_{d} )$$
(4)

where E_i is the estimated energy consumption (J) for Radio Access Network (RAN) i; t represents the transaction time (s) taken from the experimental measurements for each of the test scenarios; r_t is the mobile device’s energy consumption per unit of time (W); Th_i is the throughput (kbps) provided by RAN i; and r_d is the energy consumption rate for data/received stream (J/Kb). The two parameters, r_t and r_d, are device specific and differ for each network interface (WLAN, UMTS, etc.). In this study, they were determined by running different simulations for various amounts of multimedia data (i.e., quality levels) while measuring the corresponding energy levels, and were then used to define the energy consumption pattern for each interface/scenario. Similar studies could be run on other mobile devices. Alternatively, these parameters could in future be provided by device manufacturers in their device specifications, which would allow the solution to be generalized across a wide range of devices.
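A minimal sketch of Eq. (4) makes the unit bookkeeping explicit: the time term contributes W × s and the data term contributes s × kbps × J/Kb, so both are in joules. The parameter values in the usage lines are hypothetical and are not the values reported in Table 8.

```python
def estimated_energy(t, r_t, th, r_d):
    """Eq. (4): E_i = t * (r_t + Th_i * r_d), returned in joules.

    t   -- transaction time (s)
    r_t -- energy consumption per unit of time (W)
    th  -- throughput of the candidate network (kbps)
    r_d -- energy consumption rate for received data (J/Kb)
    """
    return t * (r_t + th * r_d)


# Hypothetical values, for illustration only (not from Table 8).
energy = estimated_energy(t=600.0, r_t=1.0, th=1000.0, r_d=0.0005)
print(f"estimated energy: {energy:.1f} J")
```

Because r_t and r_d differ per interface and scenario, a caller would look them up for the candidate network before evaluating the equation.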

Table 8 RT and RD computed values

To validate the energy equation, the Wireshark trace files captured from the experimental test-bed were used to extract the throughput received by the Google Nexus One during the video delivery of each multimedia quality level in each considered scenario. Wireshark captured the network conditions every 10 s. The extracted throughput was then used in Eq. (4) to compute the energy consumption. During the test-bed experiments, the energy consumption of the Google Nexus One was measured with the Arduino board, which samples the energy consumption of the device every 1 s. The computed energy was then compared against the measured energy. Figures 5 and 6 illustrate the received Throughput (Wireshark), Measured Energy (Arduino board), and Computed Energy [Eq. (4)] for QL1 and QL5, respectively, in each considered scenario. Note that the throughput and the computed energy are represented by 60 points, while the measured energy is represented by 600 points. This difference in sampling, together with possible synchronization issues between the trace files generated by the two tools (Wireshark and Arduino), explains why the plots might present slight variations. Despite these issues, the energy equation provides a good approximation of the average energy consumption of the mobile device. The average values in all considered scenarios and for all the quality levels are presented in Table 9. Two-sample t tests assuming equal variances were performed on the Measured Energy and Computed Energy results for each multimedia quality level and for each considered scenario. The results listed in Table 10 show that in all cases the test statistic (t Stat) < critical value (t Critical) and the p value > significance level (α). The null hypothesis therefore cannot be rejected, which indicates that there is no statistically significant difference between the average results provided by the energy equation (Computed Energy) and the average values from the real test measurements (Measured Energy) at the 95 % confidence level (significance level α = 0.05).
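The comparison of t Stat against t Critical can be sketched in a few lines. This is an illustrative pure-Python computation of the pooled (equal-variance) two-sample t statistic; the sample values are made up and are not those behind Tables 9 and 10, and the critical value is the standard two-tailed t value for 8 degrees of freedom at α = 0.05.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(a, b):
    """Return (t, degrees of freedom) for a two-sample t test, equal variances."""
    n_a, n_b = len(a), len(b)
    dof = n_a + n_b - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / dof
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, dof


# Hypothetical measured vs. computed samples, for illustration only.
measured = [1.0, 2.0, 3.0, 4.0, 5.0]
computed = [2.0, 3.0, 4.0, 5.0, 6.0]

t_stat, dof = pooled_t_statistic(measured, computed)
T_CRITICAL = 2.306  # two-tailed critical value for dof = 8, alpha = 0.05
no_significant_difference = abs(t_stat) < T_CRITICAL
print(t_stat, dof, no_significant_difference)
```

When |t Stat| stays below t Critical, as in this toy input, the test gives no grounds to reject the hypothesis that the two means are equal.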

Fig. 5 Throughput versus measured energy versus computed energy for QL1 for each of the four scenarios. (a) No load, near AP. (b) No load, far AP. (c) Load, near AP. (d) Load, far AP

Fig. 6 Throughput versus measured energy versus computed energy for QL5 for each of the four scenarios. (a) No load, near AP. (b) No load, far AP. (c) Load, near AP. (d) Load, far AP

Table 9 Measured energy versus computed energy (J)
Table 10 t test results: two-sample assuming equal variances

The results show that the proposed energy equation provides a good approximation of the average energy consumption of the Google Nexus One device. The $r_t$ and $r_d$ values have been mapped to the corresponding quality levels and used in the comparison.

5 Evaluation of the ranking methods

This section evaluates four MADM methods (GRA, MEW, SAW, and TOPSIS) in order to analyze whether they produce similar results under different conditions. All the methods are analyzed in terms of the energy-quality trade-off. To this end, the candidate networks are those from the experimental test-bed: WLAN1 (No Load, Near AP); WLAN2 (No Load, Far AP); WLAN3 (Load, Near AP); WLAN4 (Load, Far AP); and UMTS (eMobile network). Because each network can deliver the video at five quality levels (except UMTS, which offers three), the network selection is performed over all combinations of network and quality level, giving 4 × 5 + 3 = 23 options in total. The outcome is the best value network, that is, the one that provides the best quality-energy trade-off. Each ranking method assigns a score to each network for each quality level. The network that has the highest score for a certain quality level is selected as the target network. In SAW [Eq. (5)] and MEW [Eq. (6)], the score for a given network $i$ is calculated using additive and multiplicative operations, respectively. GRA [Eq. (7)] uses the best reference network to describe its similarity to each of the candidate networks, and TOPSIS [Eq. (8)] scores the networks based on their distance from the best and worst reference networks. Here, the best and worst reference networks are defined by the best and worst values of each parameter. To analyze the efficiency of each ranking method, the same parameter utility functions were used for all of them.

$$SAW_{i} = w_{e} \cdot u_{{e_{i} }} + w_{q} \cdot u_{{q_{i} }} + w_{c} \cdot u_{{c_{i} }}$$
(5)
$$MEW_{i} = u_{{e_{i} }}^{{w_{e} }} \cdot u_{{q_{i} }}^{{w_{q} }} \cdot u_{{c_{i} }}^{{w_{c} }}$$
(6)
$$GRA_{i} = \frac{1}{{w_{e} \cdot |u_{{e_{i} }} - u_{e}^{b} | + w_{q} \cdot |u_{{q_{i} }} - u_{q}^{b} | + w_{c} \cdot |u_{{c_{i} }} - u_{c}^{b} | + 1}}$$
(7)
$$TOPSIS_{i} = \frac{{D_{w,i} }}{{D_{b,i} + D_{w,i} }}$$
(8)

where $w_e$, $w_q$, and $w_c$ are the weights of energy, quality, and cost; $u_{e_i}$, $u_{q_i}$, and $u_{c_i}$ are the energy utility, quality utility, and cost utility of network $i$; and $u_e^b$, $u_q^b$, and $u_c^b$ are the utility values for the best reference network. $D_{w,i}$ and $D_{b,i}$ are the Euclidean distances of network $i$ from the worst and the best reference networks, given by Eqs. (9) and (10), respectively:

$$D_{w,i} = \sqrt {w_{e}^{2} \cdot (u_{{e_{i} }} - u_{e}^{w} )^{2} + w_{q}^{2} \cdot (u_{{q_{i} }} - u_{q}^{w} )^{2} + w_{c}^{2} \cdot (u_{{c_{i} }} - u_{c}^{w} )^{2} }$$
(9)
$$D_{b,i} = \sqrt {w_{e}^{2} \cdot (u_{{e_{i} }} - u_{e}^{b} )^{2} + w_{q}^{2} \cdot (u_{{q_{i} }} - u_{q}^{b} )^{2} + w_{c}^{2} \cdot (u_{{c_{i} }} - u_{c}^{b} )^{2} }$$
(10)

where $u_e^w$, $u_q^w$, and $u_c^w$ are the utility values for the worst reference network.
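The four scoring rules of Eqs. (5) to (10) can be sketched as below. This is a minimal illustration, not the evaluation code used in the paper: it applies the quality weight to the quality term in GRA and in both TOPSIS distances, orders every tuple as (energy, quality, cost), and uses hypothetical utilities and reference networks rather than values from Table 11.

```python
from math import sqrt


def saw(u, w):
    """Simple additive weighting, Eq. (5): weighted sum of utilities."""
    return sum(wk * uk for wk, uk in zip(w, u))


def mew(u, w):
    """Multiplicative exponent weighting, Eq. (6): product of u_k ** w_k."""
    score = 1.0
    for wk, uk in zip(w, u):
        score *= uk ** wk  # Python evaluates 0.0 ** 0.0 as 1.0
    return score


def gra(u, w, best):
    """GRA score, Eq. (7): inverse of 1 + weighted distance to the best reference."""
    return 1.0 / (sum(wk * abs(uk - bk) for wk, uk, bk in zip(w, u, best)) + 1.0)


def weighted_distance(u, w, ref):
    """Weighted Euclidean distance to a reference network, Eqs. (9) and (10)."""
    return sqrt(sum((wk ** 2) * (uk - rk) ** 2 for wk, uk, rk in zip(w, u, ref)))


def topsis(u, w, best, worst):
    """TOPSIS score, Eq. (8): relative closeness to the best reference."""
    d_w = weighted_distance(u, w, worst)
    d_b = weighted_distance(u, w, best)
    return d_w / (d_b + d_w)


# Utilities and weights are ordered (energy, quality, cost).
weights = (0.5, 0.5, 0.0)
best_ref = (1.0, 1.0, 1.0)    # hypothetical best reference network
worst_ref = (0.0, 0.0, 0.0)   # hypothetical worst reference network
candidate = (0.8, 0.6, 1.0)   # hypothetical utilities, not from Table 11

print(saw(candidate, weights), mew(candidate, weights),
      gra(candidate, weights, best_ref),
      topsis(candidate, weights, best_ref, worst_ref))
```

Keeping the utilities in one tuple per candidate makes it straightforward to score all options with each method and pick the one with the highest score.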

The quality utility, cost utility, and energy utility were previously described. $E_{max}$ and $E_{min}$ are computed as the averages of the energy measurements presented in Table 11 for QL1 and QL5, respectively, in each considered scenario. Thus their values are $E_{max}$ = 983.4 J and $E_{min}$ = 434.75 J. User preferences, represented by the weight values, can be collected from the users in many ways. As previously mentioned, some of the existing weighted solutions obtain the weights through questionnaires on user and service requirements. Other solutions integrate a GUI in the user’s mobile terminal in order to collect the user preferences, and others use AHP or ANP to determine the weight values. An important aspect is to find a trade-off between the cost of involving the user and the decision mechanism. One way to minimize user interaction may be to implement an intelligent learning mechanism that predicts the user preferences over time. We will consider this for future work. In this work, to analyze the energy-quality trade-off of each ranking method, the weight for the cost is set to zero and the weights for energy and quality are set to be equal: $w_e$ = 0.5, $w_q$ = 0.5, and $w_c$ = 0.

Table 11 Ranking method results: GRA versus MEW versus SAW versus TOPSIS

The best reference network is built from the best value of each parameter, while the worst reference network is built from the worst value of each parameter. In this context, across the five networks, the best reference network is considered to be the one that provides the highest quality level, QL1 ($u_q^b$ = 1), with the lowest energy consumption of 413 J ($u_e^b$ = 1), whereas the worst reference network is considered to provide the lowest quality level, QL5 ($u_q^w$ = 0.0292), with the highest energy consumption of 1,300 J ($u_e^w$ = 0). The results of each ranking method (GRA, MEW, SAW, and TOPSIS) for each quality level and for each network are given in Table 11. The first three choices of each ranking method within each network are indicated by colors: the first choice in green, the second in blue, and the third in orange. From a global point of view, all the methods select QL2 on WLAN1 as their first choice. Within a single network (e.g., WLAN1), GRA and SAW provide similar results, ranking the quality levels as QL2, QL1, and then QL3, which shows that they are more quality-oriented methods. Note that both of them produce very small differences between the scores. For example, between QL1 and QL3 for WLAN1, the GRA score difference is only 0.0007 and the SAW score difference is 0.0013. This makes their rankings very sensitive to changing conditions. For example, for WLAN2, WLAN3, and WLAN4, the order of the quality levels becomes QL2, QL3, and then QL1, but again the difference between the scores is very small.

On the other hand, TOPSIS provides a clear distance between the best solution and the rest for each individual RAN, but the differences between the scores of the remaining solutions are again small. The only method that provides a clear distance between all the quality levels is MEW. Consider also the results for WLAN4, which can be regarded as the worst-case scenario for a WLAN choice, as the mobile user is located in a poor signal area on a loaded network. Here GRA, SAW, and TOPSIS provide the same score order (QL2, QL3, QL4, QL1, QL5), whereas MEW eliminates the choice of QL1 altogether (QL2, QL3, QL4, QL5). This is because QL1 has the highest energy consumption, and in extreme situations the user will be better off with Fair quality (QL5) and moderate energy consumption than with high quality (QL1) at the risk of draining the mobile device’s battery.

Figure 7 compares the four ranking methods for a varying quality weight ($w_q$) within the same network (WLAN1). For each method, the total rank score versus quality level versus quality weight is illustrated in a colored 3D graph, where dark red is associated with high score values and dark blue with low score values. The quality weight ($w_q$) is varied between 0 and 1 (quality-oriented), meaning that the energy weight ($w_e$) varies between 1 (energy-oriented) and 0. For example, $w_e$ = 0 when $w_q$ = 1, which means that the user is quality-oriented and does not care about energy conservation at all. This is visible in Fig. 7: when $w_q$ = 1, all the ranking methods give the highest score (dark red) to QL1. Conversely, $w_e$ = 1 when $w_q$ = 0, meaning that the user is highly energy-oriented and wants to conserve the energy of the mobile device regardless of the quality level. In this situation the methods give the highest score to QL5 (dark red, see Fig. 7). QL2 keeps, more or less, the same rank score (same range of color) for all quality weights and is therefore a more stable choice overall. It can be seen that MEW provides a more distinct difference between the quality levels for the same value of the quality weight.
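The weight sweep behind this kind of comparison can be sketched as follows, here for SAW and MEW only, with the cost weight set to zero and the energy weight taken as 1 minus the quality weight. The utility pairs are invented for illustration so that QL1 has the best quality and the worst energy utility; they are not the utilities behind Fig. 7.

```python
# Hypothetical (energy utility, quality utility) pairs per quality level;
# illustrative values only, not the utilities behind Fig. 7.
UTILITIES = {
    "QL1": (0.2, 1.0),
    "QL2": (0.5, 0.9),
    "QL3": (0.6, 0.6),
    "QL4": (0.8, 0.3),
    "QL5": (1.0, 0.1),
}


def saw(u_e, u_q, w_q):
    """SAW score with w_e = 1 - w_q and w_c = 0."""
    return (1.0 - w_q) * u_e + w_q * u_q


def mew(u_e, u_q, w_q):
    """MEW score with w_e = 1 - w_q and w_c = 0."""
    return (u_e ** (1.0 - w_q)) * (u_q ** w_q)


def best_level(score_fn, w_q):
    """Quality level with the highest score for a given quality weight."""
    return max(UTILITIES, key=lambda ql: score_fn(*UTILITIES[ql], w_q))


for step in range(0, 11, 5):  # w_q = 0.0, 0.5, 1.0
    w_q = step / 10
    print(w_q, best_level(saw, w_q), best_level(mew, w_q))
```

With these made-up utilities, both methods move from the lowest quality level at one end of the sweep to the highest at the other, which is the qualitative behavior described for Fig. 7; GRA and TOPSIS could be swept in the same way.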

Fig. 7 Ranking methods comparison with varying quality weight for QL within WLAN1 (no load, near AP). QL1: highest quality level; QL5: lowest quality level

Figure 8 illustrates the score results of each ranking method for a varying quality weight ($w_q$) when choosing between different networks (WLAN1, WLAN2, WLAN3, and WLAN4) at the same quality level (QL1). As seen in the experimental part, the impact of the network conditions (WLAN4: loaded network, far from the AP) is more visible on QL1 than on the other quality levels. These conditions increase the playout duration of the multimedia stream (because of re-buffering), which leads to an extreme increase in energy consumption and a decrease in MOS. The increase in energy makes QL1 on WLAN4 the worst option among the 23 possible ones, which translates into $u_e$ being zero. However, despite all these disadvantages, GRA, SAW, and TOPSIS all end up selecting QL1 on WLAN4, as seen in Fig. 8. MEW selects QL1 on WLAN4 only when $w_q$ = 1.
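A short worked case makes the different behavior of MEW explicit. Substituting a zero energy utility and a zero cost weight into Eqs. (5) and (6), and assuming the usual convention that a zero exponent yields 1 (including for a zero base), gives:

$$u_{{e_{i} }} = 0,\;w_{c} = 0:\quad SAW_{i} = w_{q} \cdot u_{{q_{i} }} ,\qquad MEW_{i} = 0^{{w_{e} }} \cdot u_{{q_{i} }}^{{w_{q} }} = \begin{cases} 0, & w_{e} > 0 \\ u_{{q_{i} }} , & w_{e} = 0\;(w_{q} = 1) \end{cases}$$

Under these assumptions the additive score stays positive whenever the quality weight and quality utility are positive, whereas the multiplicative score collapses to zero for any non-zero energy weight. This is consistent with MEW retaining such an option only in the purely quality-oriented case.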

Fig. 8 Ranking methods comparison with varying quality weight within WLANs for QL1: WLAN1 (no load, near AP), WLAN2 (no load, far AP), WLAN3 (load, near AP), WLAN4 (load, far AP)

The analysis of the main ranking methods presented in this section has shown that MEW models the network selection best in comparison with the other well-known ranking methods: GRA, SAW, and TOPSIS. The main advantages of MEW over the other methods are that it provides a clear difference between the score results of each option and that it penalizes alternatives with poor criteria values more heavily.

6 Conclusions

This paper presents a performance evaluation of widely used MADM methods for network selection using real user data. The evaluation is carried out in terms of energy efficiency and user-perceived quality levels for multimedia streaming over a heterogeneous wireless environment. Real energy measurements were conducted on a Google Nexus One Android mobile device while receiving streams with various amounts of multimedia data (quality levels). A mathematical model of the energy consumption pattern for each of the available interfaces (e.g., WLAN and UMTS) was then built based on these measurements. Similar measurements could be taken on other smartphones for each of the supported wireless interface technologies (e.g., 802.11n, LTE, etc.). In the future, this energy-related information could be provided by energy-conscious device manufacturers in their device specifications. In this study, the experimental results were used to validate the choice of the energy equation for a multimedia-based wireless environment.

The well-known MADM ranking methods GRA, MEW, SAW, and TOPSIS were evaluated through mathematical performance analysis in order to examine whether they produce similar results under different conditions. The analysis shows that MEW finds a better quality-energy trade-off, and its main advantage is that it provides distinct differences between the score results for each multimedia quality level. It also penalizes alternatives with poor criteria values more heavily than the other tested MADM schemes.

Network operators currently assume that providing high individual throughput translates into satisfied users. However, as this paper shows, excellent perceived quality of service does not always result from providing the highest throughput, and a good quality-energy trade-off is needed in order to keep today’s battery-conscious users satisfied. Thus, network operators need to integrate adaptive mechanisms that cater for user preferences and enable a good balance between energy and quality.