1 Introduction

Wireless Multimedia Sensor Networks (WMSNs) have enhanced the data-gathering capability of traditional Wireless Sensor Networks (WSNs), which were restricted to gathering scalar data. WMSNs have sensor nodes equipped with cameras and microphones that enable these networks to gather multimedia data in various forms such as live data streams, video, audio and images [1]. Recent advances in feature engineering, image-processing techniques, machine learning and communication technologies have given rise to a variety of research on applications of WMSNs. Applications include the health care industry, military and general surveillance systems, real-time intelligent transportation systems and environmental monitoring [2, 3].

WMSNs are descendants of WSNs; hence they share the same benefits, such as self-organization, flexibility, ease of deployment and scalability. However, the added features and capabilities of WMSNs present a number of challenges that are inherent in these constrained networks, such as limited energy, storage, communication bandwidth and processing capacity. The large volumes of data generated by these multimedia networks require reliable transmission over the wireless medium in real time, further exacerbating these challenges. Research in this domain aims at developing computation algorithms and protocols that are highly energy-efficient and Quality of Service (QoS) aware. Because of these differences, solutions developed for WSNs do not apply directly to WMSNs and need to be modified before they can be applied. Furthermore, new techniques suitable for these networks are required at all layers, from the physical layer to the application layer. Surveys of such research, ranging from hardware to the network model layers and cross-layer designs, are given in [1, 4,5,6]. Extensive studies of various hardware and software architecture test beds are in [7]. Transport protocols designed to be reliable are in [8]. Energy-efficient and QoS-aware routing protocols are compared in [9,10,11]. Reviews of QoS-cognizant and multi-channel Media Access Control (MAC) protocols are in [12, 13]. AlSkaif et al. [14] present a comparative study of WSN MAC protocols, investigating their suitability for WMSNs by analysing the effect of some network parameters on node energy drain. References [15, 16] identify cross-layer optimization solutions to problems inherent in WMSN packet delivery, energy preservation and error recovery. Security requirements in WMSNs, a classification of the security threats and some protection mechanisms are discussed in [17, 18]. Finally, [19] discusses energy-efficiency issues with regard to all sensor application designs as well as the extension of network lifetime, while [20] proposes a classification of energy-efficient target tracking schemes according to the sensing and communication subsystems on a particular node.

This survey therefore concentrates on the aspects required to deliver QoS-aware routing protocols in WMSNs, namely energy efficiency, real-time multimedia streaming and data volumes. The paper also highlights challenges and proposed solutions to guide related research. Network designers and architects will benefit from the clarity on the characteristics and requirements of WMSNs as well as on existing solutions. Furthermore, the paper surveys MAC and routing protocols with emphasis on energy efficiency, scalability, QoS guarantees, prioritisation schemes, multipath routing and service differentiation. The conclusion gives future directions on the issues discussed.

The remainder of the paper is organised as follows: Sect. 2 highlights the characteristics and design requirements of WMSNs together with design challenges and existing remedies. Sect. 3 classifies WMSN routing protocols, followed by proposed QoS-aware WMSN routing protocols in Sect. 4. Sect. 5 presents QoS-aware MAC protocols for WMSNs. Lastly, Sect. 6 draws the conclusions of the survey.

2 Wireless multimedia sensor networks

WMSNs are an emergent technology that grew out of traditional WSNs. As such, they inherit many of the constraints that exist in these networks, along with new challenges and requirements that arise from the need for real-time multimedia services and the handling of increased volumes of data. The data traffic gathered by these networks must be delivered in real time because of the nature of the applications that consume it. Examples of such applications include security surveillance, health systems and traffic management systems. The multimedia data collected by the camera sensors for a particular event is voluminous; hence the bandwidth requirements for transmission are increased. As summarised in Table 1, WMSNs have opened many doors to research due to their characteristics and capabilities. This section discusses the characteristics and design requirements of WMSNs as well as proposed approaches.

Table 1 Characteristics and design requirements of wireless multimedia sensor networks

2.1 Power constraints

The camera sensor nodes in WMSNs are generally battery-powered. The batteries should power the sensor nodes for protracted periods without replacement. Therefore, the functionality of such nodes should take these power constraints into account and limit the energy consumed by computation and communication [52]. In traditional WSNs, the energy drain due to computation can be insignificant, whereas in WMSNs computation tends to consume a large amount of energy. According to [40], the capture and processing of a simple frame in a vehicle tracking system can constitute up to 12% of the total energy consumption of the overall event. It is therefore recommended to adopt energy-efficient algorithms in image processing [21,22,23,24,25,26] and likewise in video compression [27,28,29]. Due to the large volumes of multimedia data to be transmitted, it is prudent that the communication protocols at every layer be energy-efficient. For example, transport layer protocols can reduce the number of control messages according to the desired level of reliability [33], routing protocols can employ load balancing and energy estimation techniques across the network [34, 35], and MAC layer protocols can avoid idle listening by inactive nodes [36, 37]. Dynamic power management is another important technique, as it ensures that idle components of a sensor node are selectively shut down or hibernated to prevent unnecessary power consumption [30,31,32].

2.2 Real-time multimedia data

In most applications involving multimedia data, QoS is difficult to achieve. Transmission of data to the sink without packet loss or delays above a threshold is crucial in WMSNs. Therefore, severe QoS demands need to be imposed on the networks. Applications that involve multimedia data, for example in security surveillance or traffic management systems, cannot tolerate delays. This implies that prioritisation and service differentiation play a pivotal role in these real-time systems. MAC protocols should give access, or assign higher-quality channels, to higher-priority data [39]. Routing protocols need to select the paths with the least delay to meet the required QoS, as illustrated in [38]. Reliability is also crucial in ensuring QoS in WMSNs. Retransmissions are done at the transport layer, for example in TCP, while redundancy is added at the bit level or packet level, as presented in [8, 53, 54]. However, these methods must be used with care, since they increase traffic and hence consume more network resources. The heterogeneous traffic in WMSNs, which includes multimedia and scalar data intended for different applications with varying QoS demands, requires variable levels of priority even within the same traffic type [42].

2.3 Volumes of multimedia data

Typically, WMSNs have limited bandwidth; hence the transmission of large volumes of sensory data presents a major challenge to QoS guarantees. Techniques for data compression and redundancy reduction are vital to decrease data volumes prior to transmission. One such technique is local processing, where on-board analysis of the captured images is used to extract only important events [40, 41]. The downside of local processing is the requirement for added hardware resources. Another technique is in-network processing of multimedia data, which encompasses data fusion, where the sink node collects heterogeneous data from various nodes and creates a summarised version of events to reduce data redundancy and enhance inferences [40, 41, 43]. To deal with the resource limitations associated with centrally coding data from multiple camera sensors, WMSNs use distributed source coding (DSC), where data is encoded independently at each sensor before transmission to the sink for decoding [44, 45]. This reduces power consumption as well as the required hardware resources [55, 56]. Typically, WSNs transmit all collected data to the sink for subsequent processing and querying. Due to technological advancements, it is now possible to equip sensors with processors and flash memory that enable them to process and store data [47, 57]. After processing, only the analysed data is transmitted to the sink. Likewise, a query is evaluated against the locally stored historical data and only its result is sent over the network. However, proper data ageing schemes need to be incorporated into the local databases as they fill up in order to maintain data integrity [46, 47]. It is also important to note that the sensors form distributed databases, which require efficient query engines to retrieve the data [58, 59]. Mitigating the bandwidth constraint, which is extreme in WMSNs due to the large volumes and nature of the traffic, is also an important factor in achieving QoS communications. At the MAC layer, sensor nodes can communicate simultaneously using different channels [48, 49]. Data traffic can be routed through multiple paths [42]. In addition, radio equipment with considerable bandwidth, such as ultra-wideband (UWB), can be utilised in WMSNs [50, 51].

3 Classification of wireless multimedia sensor networks routing protocols

The classification of WMSN routing protocols depends on various aspects, including the mode of operation, the architecture employed, the desired QoS parameters, the type of sensory data gathered and the multimedia delivery mode (Table 2).

Table 2 Wireless multimedia sensor networks routing protocols classification

4 Quality of service aware multipath routing protocols for WMSNs

Routing techniques for WSNs have been studied extensively over the years to improve communications. However, these techniques are not directly applicable to WMSNs because of the differences from traditional WSNs. Routing in WSNs aims at finding the shortest path for transmitting scalar data. Applying the same routing concepts to large volumes of multimedia data results in network congestion and increased power drain on nodes [60, 61]. A more robust approach is therefore to send data in parallel through multiple paths. Routing in WSNs is concerned mainly with energy efficiency, whilst routing in WMSNs must also consider QoS because of real-time traffic and reliability concerns.

This section presents some multipath routing protocols for WMSNs with QoS assurances. This survey looks at different protocols from those recently surveyed in [62,63,64]. Furthermore, the chosen multipath routing protocols have single-path routing support. For a further comparison of the surveyed multipath routing protocols with QoS assurances for WMSNs, refer to Table 3.

A multipath routing protocol with QoS assurances based on ant colony optimization, called AntSensNet, is presented in [38]. It has three phases of operation: cluster formation, route discovery, and data transmission with route maintenance. Cluster formation is initiated by the sink, which releases cluster ants (CANTs). Nodes in close proximity to the sink receive the CANTs first and become cluster heads (CHs). Upon receiving a CANT, a CH is responsible for reducing its time-to-live (TTL). The CH then advertises the CANT to non-cluster-head nodes within its communication radius so that those willing to join the cluster can do so. Once cluster formation ends, the CHs begin route discovery. Each CH manages a pheromone table and shares it with its neighbours according to traffic classes defined by four parameters, i.e. energy, packet drop, memory and delay. Traffic-specific paths to the sink are created by broadcasting a forward ant (FANT), which collects the identities of the traversed nodes and the four parameters (queue delay, packet loss ratio, residual energy and available memory) as it propagates. When a node receives a FANT, it updates its information before sending it to the next hop that satisfies the QoS requirements, and a corresponding backward ant (BANT) is transmitted along the reverse path for path reservation. On receipt of the BANT, nodes update their pheromone tables. To establish multiple paths for video transmission, a video forward ant (VFANT) is disseminated in the same manner as the FANT and the sink responds by sending multiple video backward ants (VBANTs). The VBANTs choose the paths for sending video data. Once the routes are ready, data delivery starts. A maintenance ant (MANT) is used for route maintenance. This protocol provides differentiated service to ensure QoS delivery by offering each traffic class separate routes. The use of cluster heads is a drawback for scalability. In addition, the multipath routing technique is viable for video data only.

Bidai et al. [65] proposed the ZigBee Multipath Hierarchical Tree Routing (Z-MHTR) protocol. It allows a source to use non-parent neighbours to search for other paths. The source node maintains a record of all branches used for tree routing (TR) and constructs disjoint paths using three basic rules. If the branch of a selected next-hop node has not been utilised for a TR path by the source node, then a node-disjoint path is established from that node to the sink using TR. If the branch has already been utilised for a TR path by the source, then the next hop depends on the depth of a node common to the TR path used by the source and the node that has used the branch for TR. If all neighbours’ branches have been utilised in TR, then it selects a neighbour node that is not on any TR path. The rules apply to all subsequent nodes until the sink is reached. The number of disjoint paths corresponds to the number of branches forming the topology. Furthermore, the authors proposed an extension [66] for the reduction of interference, in which nodes list their interfering neighbours, excluding those on the same path, by checking whether they can overhear data packets that are not destined for them. Disjoint paths that reduce inter-path interference are preferred. Based on the ZigBee tree topology and address assignment, multipath routing is achieved through the neighbour table and a record of routing-tree usage on each branch. The further work mitigates the interference between multiple paths caused by route coupling. However, the protocol is restricted to the ZigBee tree topology; hence the number of paths is proportionate to the number of available branches.

Chen et al. [67] proposed the directional geographical routing (DGR) protocol for real-time video communications. On receipt of a broadcast probe, the nodes in this protocol use the global coordinate system to create virtual coordinates. The virtual coordinates are obtained by mapping the source and sink positions along the x-axis to the destination or intermediate node. A node is selected as a forwarding candidate if it falls within the transmission range, the optimal mapping location and the threshold of the source. The next hop is the candidate with the smallest distance to the optimal mapping location; hence it has a smaller timer than the other competing nodes. When its timer expires, the node sends a reply message (REP) to the source. On receipt of an REP, the source confirms with a SEL message. Nodes that hear the REP or SEL cancel their timers. The winning node will not establish any other path to the same source, in order to guarantee path disjointedness. In turn, the selected node sends its own probing messages following the same procedure with an adjusted deviation angle to create a path towards the sink. To establish multiple paths, the source sends a number of probe messages with different initial deviation angles. For video routing, the source initially broadcasts the complete frame to all single-hop neighbours. Neighbours on the chosen paths then retransmit, along their respective paths, only those packets specified by the source. Packet delivery in this protocol is fast and reliable thanks to multipath transmission and forward error correction (FEC). It also scales well due to the stateless geographic routing paradigm. However, if a node fails, path recovery and new route discovery take longer. In addition, it considers only a single active source for video transmission, which might not be practical in some scenarios.
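The timer-based next-hop election described above can be illustrated with a minimal sketch. It is not the DGR implementation: the function names, the 2-D coordinates, the linear relation between distance and timer, and the reduction of the eligibility test to the transmission range alone are all assumptions made for illustration.

```python
import math

def backoff_timer(candidate_xy, optimal_xy, scale=1.0):
    # Assumed linear mapping: the closer a candidate is to the optimal
    # mapping location, the shorter its timer.
    return scale * math.dist(candidate_xy, optimal_xy)

def elect_next_hop(candidates, source_xy, optimal_xy, tx_range):
    """Return the id of the candidate whose timer would expire first.

    candidates: dict mapping a (hypothetical) node id to its (x, y) position.
    Only the transmission-range condition is checked here; the threshold
    condition mentioned in the text is omitted for brevity.
    """
    eligible = {
        node: xy for node, xy in candidates.items()
        if math.dist(xy, source_xy) <= tx_range
    }
    if not eligible:
        return None
    # The node with the smallest timer sends REP first; the others cancel.
    return min(eligible, key=lambda n: backoff_timer(eligible[n], optimal_xy))

neighbours = {"a": (3.0, 0.0), "b": (4.0, 0.5), "c": (20.0, 0.0)}
winner = elect_next_hop(neighbours, source_xy=(0.0, 0.0),
                        optimal_xy=(4.0, 0.0), tx_range=5.0)
```

In this example node c is outside the transmission range, and b wins because it lies closer to the optimal mapping location than a.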

Following the principles of ad-hoc on-demand distance vector (AODV) routing [68], Bhattacharya and Sinha developed the least common multiple routing (LCMR) protocol [69]. Rather than choosing the shortest path by number of hops, it uses the routing time taken, i.e. the end-to-end delay, to choose multiple paths. During route discovery, the route reply message (RREP) has to arrive before the deadline, otherwise it is not accepted. The source node uses the RREP message to check the routing time taken by the corresponding route request message (RREQ) to reach the destination. From the x accepted paths with routing times \(\{T_1, T_2, \ldots, T_x\}\), it calculates the least common multiple \(L\) of \(\{T_1, T_2, \ldots, T_x\}\). The packets sent over each path are decided such that, out of every \(k = \sum\nolimits_{i = 1}^{x} L/T_i\) packets, \(L/T_i\) packets are routed along path \(i\). The total time it takes to deliver the k packets gives the maximum routing time \(T_{max}\) of \(\{T_1, T_2, \ldots, T_x\}\). This protocol avoids congested routes through the end-to-end calculation of routing time during its route discovery process. In order to reduce the transmission time, the number of packets allotted to a particular route is scaled according to \(L\) and the routing time \(T_i\) of the path. However, this may lead to early node death if most traffic is continuously routed through the node with the least end-to-end routing time. Adaptation to congestion and route breakage needs improvement.
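The packet split described above can be worked through with a short sketch. It assumes that the routing times are positive integers (for example, milliseconds); the function name `lcmr_allocation` is hypothetical and not taken from [69].

```python
from functools import reduce
from math import gcd

def lcmr_allocation(routing_times):
    """Split packets over paths in proportion to L / T_i.

    routing_times: positive integer routing times T_1..T_x of the accepted
    paths (integer units are an assumption of this sketch).
    Returns (L, packets_per_path, k) with k = sum(L / T_i).
    """
    # Least common multiple L of all routing times.
    L = reduce(lambda a, b: a * b // gcd(a, b), routing_times)
    # Path i carries L / T_i packets out of every k packets.
    per_path = [L // t for t in routing_times]
    return L, per_path, sum(per_path)

L, per_path, k = lcmr_allocation([2, 3, 6])
```

For routing times 2, 3 and 6, L is 6, so out of every k = 6 packets the three paths carry 3, 2 and 1 packets respectively: the fastest path receives the largest share.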

Unlike DGR [67], which uses the deviation angle to control the directions of multiple paths, the geographic energy-aware non-interfering multipath routing (GEAM) protocol proposed by Li et al. [70] divides the topology into different districts for specific paths. After division into virtual coordinates, just as in DGR, the source and sink areas are restricted to within the transmission radius. Before transmission, the source piggybacks the boundary information of the selected district onto each packet. The subsequent nodes then use greedy perimeter stateless routing (GPSR) [71] to forward the packet within the respective district. For load balancing and even distribution of energy, GEAM organises the data transmissions in runs of equal length. To further avoid interference between the multiple routing paths, it divides each run into three rounds, where a district Dx belongs to round k if Dx mod 3 = k. During the first run, the load is distributed evenly over all districts. After each run, the sink collects the residual energy of all nodes within a district and sends it back to the source. Based on these statistics, the source adjusts the rate of utilisation of every district, and those with higher energy levels receive more load in the next run. GEAM achieves balanced traffic loads and energy consumption and avoids interference by dividing the topology into districts. Scalability is guaranteed through GPSR. However, piggybacking border information onto every packet and making it collect network statistics increases the overhead. It also does not consider QoS metrics such as delay and reliability, which are of paramount importance to the delivery of multimedia data.
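The round assignment and the per-run load adjustment described above can be sketched as follows. The round rule (district index modulo 3) is taken from the text; the rule that the next run's load is split in direct proportion to residual energy is an assumption of this sketch, since the text only states that districts with more energy receive more load. All function names are hypothetical.

```python
def districts_by_round(district_ids):
    """Group districts into three rounds: district d is in round d % 3."""
    rounds = {0: [], 1: [], 2: []}
    for d in district_ids:
        rounds[d % 3].append(d)
    return rounds

def first_run_loads(district_ids, packets_per_run):
    """First run: the load is spread evenly over all districts."""
    ids = list(district_ids)
    return {d: packets_per_run / len(ids) for d in ids}

def next_run_loads(residual_energy, packets_per_run):
    """Later runs: districts with more residual energy get more load.

    residual_energy: dict district id -> energy reported by the sink.
    Proportional sharing is an assumed policy, not the published one.
    """
    total = sum(residual_energy.values())
    if total == 0:
        return first_run_loads(residual_energy.keys(), packets_per_run)
    return {d: packets_per_run * e / total for d, e in residual_energy.items()}

rounds = districts_by_round(range(6))
loads = next_run_loads({0: 30.0, 1: 10.0}, packets_per_run=100)
```

Here districts 0 and 3 transmit in round 0, districts 1 and 4 in round 1, and districts 2 and 5 in round 2, so districts that share a round are never adjacent in index.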

A multi-agent-based context-aware multipath routing scheme (MACMR) is presented in [72]. The scheme uses static agents to determine the context of the sensed multimedia data. According to the particular context, it triggers mobile agents from the event node to find node-disjoint paths to the sink. The mobile agent clones traverse the intermediate nodes, collecting resource information such as available bandwidth, node energy and hop count, before delivering it to the sink node. The sink node then calculates the node-disjoint paths according to the resources and context presented. After the computation, it sends mobile agents carrying the path information along the shortest path to the event node. The event node eventually sends information on the available multiple paths to the sink node. The route discovery and maintenance process makes this protocol less scalable, since traversing a large network might increase the network latency.

The work in [73] presents the Hierarchical Multi Path Routing (HMPR) protocol. The protocol uses a cluster-based approach in which data is transmitted to the sink through resource-rich nodes elected as cluster heads. The cluster head aggregates the data before forwarding it to the sink node, which reduces the number of transmissions, thereby saving energy and ultimately extending network lifetime. Transmission within the cluster is single-hop, which reduces energy consumption within a bounded delay and at the same time improves accuracy. This ensures that QoS requirements are maintained within individual clusters and across clusters. To achieve optimal QoS, another layer that considers the routing from the cluster heads to the sink needs to be included.

A Cross Layer Energy Location Aware Routing Protocol (XELARP), proposed in [74], makes use of a cross-layer design with multipath routing spanning the application, network and MAC layers. At the application layer, each frame is prioritised by encapsulating the frame type, the priority and the group of pictures (GOP) size in the frame header before the frame is passed, with its priority mark, to the network layer. The network layer then discovers three paths from the source node to the sink node, taking into account the residual energy of the nodes and their distance to the sink node. In turn, the MAC layer uses the header information to map the frame dynamically based on traffic type and network load. Generally, the protocol achieves QoS for multimedia data in terms of throughput, latency, packet delivery ratio and network lifetime.

The Efficient Multipath Routing based on Genetic Algorithm (EMRGA) presented in [75] is a cluster-based and GA-based multipath protocol. The cluster is formed by the sensor nodes in close proximity to where the event occurs. The node with better resources becomes the Cluster Head (CH). Upon sensing data, the cluster members forward it to the CH, which aggregates the data and transmits it to the sink node (base station) until the end of the event. The CH uses more energy than the other nodes; therefore, all nodes take turns as CH to avoid the premature death of a particular node. A genetic algorithm finds multiple paths for data transfer. It selects the best path according to a cost function that favours the least energy consumption and the minimum distance.

A Lyapunov optimization framework aims to handle two main challenges of WMSNs, namely constrained energy and optimal QoS in diverse applications. The framework exploits multiple algorithms in routing the multimedia streams. It also utilizes the Differentiated Queuing Services (DQS) method to regulate data queues efficiently. To handle the constrained energy issue, the framework employs two algorithms: the Distributed Gradient Projection Power Control algorithm and the Block Coordinate Descent Power Control algorithm. The framework improves network lifetime while achieving scheduling fairness.

5 Quality of service aware media access control protocols for WMSNs

MAC protocols are challenging to design and implement when aiming for energy efficiency, coordinating the transmission of large volumes of multimedia sensory data and meeting QoS requirements in WMSNs. Because of the dynamic and bursty traffic predominant in WMSNs, the application of duty-cycling techniques for saving energy requires deeper analysis. Reduction of collisions is also an important factor in MAC protocol design, especially when real-time multimedia data is involved. Controlling media access through prioritisation and differentiation of services is likewise important when handling heterogeneous traffic. This section elaborates on some of the energy-efficient MAC protocols that have QoS assurances. A summary of the same is in Table 4.

Table 3 Comparison of multipath routing protocols under review

Arifuzzaman et al. [76] proposed the intelligent hybrid MAC (IH-MAC) protocol. The protocol combines CSMA/CA and TDMA techniques in a single mechanism that implements local synchronisation. The protocol prioritises the node holding data with high QoS requirements, such as real-time data. If nodes have the same priority and are mapped to the same slot, they contend for that slot. For energy preservation, it adjusts its transmission power during contention. By fusing CSMA/CA and TDMA, the protocol scales well, reduces collisions and improves on channel utilisation and access delays, which are challenges in CSMA/CA.

An energy-efficient hybrid MAC scheme (EQ-MAC) was proposed by Yahya and Ben-Othman in [77]. It uses a cluster mechanism in which the cluster head schedules slots using TDMA, and it uses frames for communication. The cluster head sends the initial broadcast frame for synchronisation. Once synchronisation is complete, the cluster members request TDMA slots from the cluster head, which issues them with consideration of traffic priorities. The cluster head then broadcasts the allocated TDMA slots to the cluster members so that transmissions through the cluster head can begin. A sleep mechanism applies to those cluster members without data to transmit. Real-time data is placed in a queue for immediate processing. The sleep mechanism saves energy and improves channel utilisation. The protocol assures the delivery of real-time data, especially multimedia, through the prioritisation of traffic. However, this may starve low-priority traffic (Table 4).

Table 4 Comparison of media access control protocols under review

AMPH, an efficient QoS provisioning protocol by Souil [78], is a hybrid channel access method. The notable difference between AMPH and IH-MAC is that IH-MAC is CSMA/CA-centred whereas AMPH is TDMA-centred. AMPH divides transmissions into slots and considers a two-hop radius for each node. Prioritisation of medium access is achieved by separating real-time and best-effort traffic and is based on slot ownership. Contending nodes are separated into four groups according to traffic priority: real-time by owner, real-time by non-owner, best effort by owner and best effort by non-owner. To avoid starvation, the protocol allows best-effort traffic ahead of real-time traffic in a limited number of slots per cycle. To conserve energy, it allows nodes to switch off their radios in the waiting state. The use of any slot, coupled with traffic prioritisation, achieves optimum channel utilisation and QoS guarantees for heterogeneous traffic. However, there is a need for a more robust differentiation of traffic that caters for the additional traffic types that exist in WMSNs.
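The four contention groups and the starvation-avoidance rule described above can be illustrated with a minimal sketch. It assumes that the order in which the groups are listed is their priority order and that, in the designated slots, the two best-effort groups simply move ahead of the two real-time groups; the names and the tuple layout are hypothetical and do not reproduce the AMPH implementation.

```python
def contention_rank(is_real_time, is_owner, best_effort_first=False):
    """Rank of a contender in a slot; a lower rank gets the medium first.

    Normal slots: real-time owner (0), real-time non-owner (1),
    best-effort owner (2), best-effort non-owner (3).
    In slots flagged best_effort_first (assumed starvation-avoidance rule)
    the two traffic classes swap places.
    """
    favoured = is_real_time != best_effort_first
    return (0 if favoured else 2) + (0 if is_owner else 1)

def slot_winner(contenders, best_effort_first=False):
    """contenders: list of (node_id, is_real_time, is_owner) tuples."""
    if not contenders:
        return None
    best = min(
        contenders,
        key=lambda c: contention_rank(c[1], c[2], best_effort_first),
    )
    return best[0]

contenders = [("n1", False, True), ("n2", True, False), ("n3", False, False)]
normal = slot_winner(contenders)
relief = slot_winner(contenders, best_effort_first=True)
```

In a normal slot the real-time non-owner n2 wins over both best-effort nodes; in a starvation-relief slot the best-effort owner n1 wins instead.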

PA-MAC is a multi-channel priority-based adaptive MAC protocol based on the IEEE 802.15.4 standard. The protocol classifies traffic into four categories according to priority: emergency (medical), on-demand, normal and non-medical. It uses contention access periods (CAPs) that follow the four traffic classes. Traffic with higher priority can access the slots of lower-priority traffic, and the lower-priority traffic is transmitted during the contention-free period (CFP). The nodes then sleep until the next transmission. Collisions are mitigated by traffic differentiation and by transmitting lower-priority data (e.g. multimedia data in a medical scenario) in the CFP. However, the protocol gives lower priority to multimedia data; hence it cannot be applied directly to WMSNs.

Related CSMA/CA-based protocols with QoS assurances are the protocol of Saxena et al. [36] and Diff-MAC [37]. Both use an adaptive contention window (CW) and dynamic duty-cycling mechanisms. The CW sizes for real-time traffic are set smaller than those for low-priority traffic. The protocols differ in that Saxena et al. aim for fairness by making sensors adjust their CW size after checking with neighbouring sensors whether the chance of a collision remains after the last CW size change, whereas sensors in Diff-MAC continue to change their CW sizes towards the threshold CW size. Diff-MAC also employs a hybrid weighted fair queuing (WFQ) technique to grant channel access to real-time traffic, while Saxena et al. use a FIFO mechanism. Diff-MAC avoids starvation within the same traffic type by prioritising packets belonging to the same queue based on the number of hops they have traversed. It further segments video frames and transmits them in bursts to lower the retransmission cost. Both protocols use the dynamic duty-cycle technique. The protocols offer good QoS, fairness and energy efficiency in WMSNs. However, constantly monitoring various states in the network leads to idle listening and, in the case of Diff-MAC, the constant intra-queue prioritisation may not scale well under high traffic.

MQ-MAC [39] is a cluster-based slotted CSMA/CA MAC protocol. The cluster head is responsible for key tasks that include channel sensing, time slot allotment and channel allocation. It divides its super frame into active and sleep periods, with the active period sub-divided into three phases, namely sensing, channel selection and data transmission requests. Once the cluster head receives the results of channel sensing and the transmission requests from the cluster members, it allocates slots and transmission channels. The requests received from cluster members are classified according to arrival time and traffic type, with consideration of the packet lifetime. Early slots are allocated to requests with higher priority. The slot allocations allow data traffic from the cluster members to the cluster head to be collision free. After the transmission phase, the sensor nodes sleep and wake up when the next super frame starts. QoS is guaranteed through the allocation of slots and channels to different traffic types according to priority. However, the many control messages exchanged during sensing and switching are undesirable because of the overhead they introduce.
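The priority-driven slot allocation performed by the cluster head can be sketched as follows. This is an illustrative approximation rather than the MQ-MAC algorithm: the text names traffic type, packet lifetime and arrival time as the classification criteria, but the exact ordering used here (priority first, then remaining lifetime, then arrival time), the numeric priority encoding and all identifiers are assumptions. Channel allocation is omitted.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Request:
    node: str        # requesting cluster member (hypothetical id)
    priority: int    # 0 = highest priority traffic type (assumed encoding)
    lifetime: float  # remaining packet lifetime; smaller is more urgent
    arrival: float   # arrival time of the request at the cluster head

def allocate_slots(requests, num_slots):
    """Give the earliest slots to the most urgent requests.

    Returns a dict node -> slot index for at most num_slots requests;
    requests that do not fit wait for the next super frame.
    Assumes one pending request per node.
    """
    ordered = sorted(
        requests, key=lambda r: (r.priority, r.lifetime, r.arrival)
    )
    return {r.node: slot for slot, r in enumerate(ordered[:num_slots])}

requests = [
    Request("a", priority=1, lifetime=5.0, arrival=0.0),
    Request("b", priority=0, lifetime=9.0, arrival=1.0),
    Request("c", priority=0, lifetime=3.0, arrival=2.0),
]
schedule = allocate_slots(requests, num_slots=2)
```

With two slots available, the two high-priority requests are served, c before b because its packet lifetime is shorter, while the lower-priority request from a is deferred. Because each node owns its slot, transmissions to the cluster head do not collide.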

6 Conclusion

The challenges and issues of WMSNs stem from their distinctive characteristics and resource constraints, highlighted in Table 1. This paper covered the unique characteristics and requirements of WMSNs as well as some design approaches to mitigate the constraints. Multipath routing is fundamental to QoS provision and the delivery of multimedia data in WMSNs. It is important for the protocols to counter interference between multiple parallel paths to circumvent route-coupling issues. However, most multipath routing protocols consider load balancing and energy management without due regard for other QoS aspects such as prioritisation and differentiation of services. Traffic in these networks is heterogeneous in nature; therefore prioritisation and service differentiation are of paramount importance. Route recovery and congestion control are also of great significance to the provision of QoS in WMSNs. Finally, efficient MAC protocols intended for WMSNs must be able to handle the heterogeneous traffic and vast volumes of multimedia data that are characteristic of these networks. In the literature, there exist CSMA/CA-based MAC protocols that are scalable and adapt to variable traffic situations, although they suffer bottlenecks in QoS provision and energy efficiency. Hybrid protocols combining CSMA/CA and TDMA are an important part of WMSNs, since CSMA/CA and TDMA can handle low data rates and high data rates respectively, thereby improving throughput and reducing collisions.