Keywords

1 Introduction

The sensor networks are comprised of a large number of sensors with limited energy and calculation sources. Each sensor maintains the ability to receive especial data such as temperature, light, sound, movement, and so forth, and can deliver its received data from the environment to its neighbors. In the wireless sensor network this communication and contact is established via radio waves. In the wireless sensor networks, the sensed data are sent toward the sink node via a node in direct manner and/or via other nodes. Given the limitation of sources of energy of sensor nodes, the efficiency of consumption of energy in these networks is of paramount importance. Clustering is one of the appropriate approaches for optimization of consumption of energy in wireless sensor networks. In a sensor network, whose structure is based on clustering, the sensor nodes are in the form of categorized clusters, with each node sending its data via the cluster head toward the sink node. In data routing toward the sink node, in a sensor network which is based on clustering, the cluster heads close to the sink node, in addition to delivery of the data of the members of their cluster, are also duty-bound for routing the delivered data from cluster heads, further apart from the sink hole. This in turn results in a surge in consumption of energy by the cluster head nodes, close to the sink node. One method for prevention of this process is to reduce the size of clusters close to the sink hole and to increase the size of clusters which are further apart. This approach is referred to as unequal clustering. The application of unequal clustering technique leads to balanced distribution of load across the network and optimization of the energy consumption parameters, in addition to improvement of the rate of delivery of package. One of the important topics in regard to unequal clustering algorithms is the determination of the cluster head and the size of each cluster. Selection of optimal values for these parameters can shape an optimal topology, and lead to the improvement of the fundamental parameters of the wireless sensor network. The technique of reinforcement learning can be employed as an efficient approach for resolution of this issue. In the reinforcement learning, the learner, throughout the learning process, and repetitive interaction with the environment, attains an optimal control policy. The effectiveness of these interactions with the environment is evaluated via the maximum (minimum) numerical reward (fine) which is received from the environment. The main advantage of reinforcement learning in comparison to other learning methods, is the lack of need for any data from the environment in this approach. One of the methods of reinforcement learning, is the random learning automata. The random automata, in the absence of any data on the optimal act, intends to find an answer to the related issue. One act of automata is chosen randomly and is implemented in the environment. Thereafter, the response of the environment is received and the possible actions are updated based on the learning algorithm, with the said procedure being repeated. Based on this approach, an algorithm can be presented for controlling the topology based on unequal clustering in wireless sensor networks. In this study we intend to make use of learning automata in order to determine the cluster head and the size of sensor network clusters, and to present a new algorithm for the purpose of controlling topology based on unequal clustering in wireless sensor networks. The purpose of this study is to optimize parameters of consumption of energy, rate of delivery of package, and rear-to-rear delay in wireless sensor networks [1,2,3,4,5].

2 Methodology

The constant changes of topology in wireless sensor networks are the consequences of impairment and potential stimulation of host nodes. Moreover, the unsustainable presence of nodes in the said networks, alongside restriction of the band width of wireless channels, has caused many complexities and difficulties in regard to controlling topology within these networks. In order to design efficient algorithms for their implementation in wireless sensor networks; topics such as the band width of wireless connections; stimulation of nodes and low energy of the network; which are some of the main features of mobile calculations, should be attended. The stimulation and impairment of the nodes within the corresponding wireless networks lead to continuous changes in the topology of the network and surges the volatility of the network’s data. The goal of this thesis is to design an efficient algorithm in a bid to solve the issue of control of topology in wireless sensor networks, so that despite the restrictions of the said networks; by reliance upon the features and capacities of learning automata and their combinations, an appropriate analytical and applied solution would be presented for this topic of importance [6, 7].

The impairment and stimulation of the nodes in wireless sensor networks is one of the main reasons behind constant topological changes across the network, which in turn escalates the volatility of data. On this basis, it is necessary to continuously update the data of the network; as a result of which a significant control is exerted on the network. On the other hand, given the restriction of the band width of wireless connections in these networks, in comparison to wired networks; in addition to restriction of sources of energy in these networks, the models and patterns which are common in wired networks are inappropriate for usage in corresponding mobile network environments that maintain a much higher level of dynamism. During the recent years, different algorithms have been designed for development and control of topology in wireless sensor networks; the majority of which are based on unrealistic common assumptions, making the theoretical analysis of the algorithms feasible. However, on the other hand, their implementation practically leads to many difficulties and challenges with due regard to the restrictions and potentials of the said networks. The development of the network’s backbone is one of the most applicable and inspirational approaches for establishment and control of wireless sensor networks. Development of the spanning tree of the network, formation of a maximal independent set for the network; development of dominating set from the network’s graph, and the clustering technique of the network are part of the common and most applicable methods for development of the network’s backbone. In this thesis, a topology control algorithm is presented based on clustering for wireless sensor networks.

In the topology control algorithm proposed in this thesis, which is hereinafter abbreviated as CTCM (Topology control mechanism based on cluster); the network’s topology is shaped based on establishment of the network’s backbone. In order to build the topology of the network, in the proposed approach; initially the network is completely clustered, while later on the network’s topology is shaped with the connection of cluster head nodes. On this basis, it can be said that the process of control of topology in the proposed algorithm takes place in two phases. In the first phase, the network’s clustering is carried out, and in the second phase, the development of the backbone via the connection of cluster head nodes takes place [8].

Within the first phase of the proposed algorithm, upon the usage of learning automata, efforts are made for clustering of inner-network nodes in a manner that the said clusters (or cluster heads) would maintain the maximum energy, and thereby maintain the longest lifetime. For this purpose, efforts are made in each section to select a network of nodes, which maintains the highest remaining energy compared to its other adjacent nodes, as the cluster head node for that section. The selected node plays the role of cluster head for the other nodes within that section, as long as its average remaining energy is higher than the average energy of the nodes in that section. In this method, each node is equipped with a learning automata, within which each action of the automata is described as selection of that node and/or one of its adjacent neighbors as the cluster head node for that section. Throughout a learning repetitive process, each node with the assistance of its automata, selects the best and most energetic adjacent node as the cluster head node for itself. All nodes in a section will reach a joint decision on the selection of the cluster head node. On the other hand, given that the cluster head node maintains the duty to respond to all requests for transference of data, allocation of a channel and so forth from the other nodes of their cluster; it loses it energy at a faster pace compared to other nodes, as the result of which it loses its priority on playing the role of a cluster head. Hence, throughout consecutive intervals, the cluster head node should be re-selected and be once again replaced as the most energetic node in comparison to other nodes in that section. In this method, upon the application of the repetitive learning process in the automata, the permanent selection of the cluster head takes place until the end of the network’s lifetime [9].

In the second phase of the algorithm, the topology of the network should be carried out via the connection of the cluster head nodes. To this end, each cluster head node, upon sending a message, identified and connects with its adjacent cluster head nodes. In the end, the most resistant network topology is established via the formation of the most energetic backbone of the network and upon the connection of the most energetic cluster head nodes. Furthermore, prior to presentation of the proposed topology control method, and with due regard to the features, specifications and strong points of learning automata; the reasons behind usage of this method in this thesis as an efficient approach for clustering wireless sensor networks, in accordance to an energy-based method, are detailed [10, 11].

As you know, in order to use learning automata; the fundamental features of learning automata, and nature of its application should be concurrently taken into consideration. The learning automata, in accordance to following features, have been applied as powerful tools for resolution of many problems.

  1. 1.

    Learning automata perform appropriately while there is no data at hand.

  2. 2.

    Learning automata perform appropriately when there is lack of assurance.

  3. 3.

    Learning automata search in the possibility atmosphere.

  4. 4.

    Learning automata need a simple feedback from the environment in order to improve their situation in each phase.

  5. 5.

    Learning automata are highly useful as a model for learning in distributed and multifactor environments, with restricted communications and inadequate data.

  6. 6.

    Learning automata maintain a simple structure and therefore can be easily placed in a software or hardware.

  7. 7.

    Learning automata need to use a theoretical and derived efficiency criterion for optimization applications.

  8. 8.

    Learning automata maintain a negligible computational load and can thus can be simply used for immediate functions.

  9. 9.

    There are powerful mathematical analytical methods for analysis of learning automata.

In addition to above features, usage of learning automata is beneficial for applications which maintain one or few of the following traits. Meanwhile, given the innate features of the wireless sensor networks, as a completely distributed environment with a high degree of dynamism and uncertainty, usage of learning automata would be highly effective.

  1. (1)

    The application is sufficiently complicated and uncertain, such that there would be no mathematical approach available for it

  2. (2)

    The application would be able to control distribution and create role models via a series of self-autonomous factors

  3. (3)

    The reinforcement signal would be a random variable and would be produced based on an efficiency criterion

  4. (4)

    Quantitative improvement in efficiency would be highly economically justified

  5. (5)

    There would be no certain algorithm in regard to the considered application [12, 13]

Meanwhile, learning automata also maintain a number of restrictions; which makes their usage for many applications, as problematic. These restrictions are as follows:

  1. (1)

    Learning automata make use of a few initial data and the additional data about the environment cannot be always used by learning automata

  2. (2)

    Learning automata maintain a low rate of convergence for many functions

  3. (3)

    Learning automata are non-reminiscent models

Given the above, and the features, specifications, restrictions, and capacities of wireless sensor networks, which we will mention later on; the usage of learning automata in this thesis for resolution of topology control issue based on clustering in the wireless sensor network can be justified [14].

As previously mentioned, in the wireless sensor networks, the continuous changes in the network’s topology are due to emergence of malfunction of nodes; limitation of the band width of wireless connections; restricted energy of nodes; interferences of channel; noisy connections, and a number of other factors; resulting in uncertainty, unpredictability, and changeability of the specifications of the said networks, such as channel’s band width; channel’s capacity; potential for delivery and receipt of nodes; and so forth; with the passage of time. This in turn causes numerous challenges in analysis and design of topology control algorithms. The majority of algorithms which have been presented for control of topology in wireless sensor networks, have been based on the following assumptions:

  • In majority of algorithms, the assumption is that each node should always have complete and accurate data on the network’s topology. The assumption that in wireless sensor networks, in addition to absence of a fixed and determined network infrastructure; the movement (limited) of nodes and/or their malfunction leads to emergence of continuous topologic changes across the network; imposes many communication slags on the network in order to update nodes’ data. This, in turn, leads to wastage of energy in network’s nodes, in addition to wastage of band width and the existing capacity of the network’s channels.

  • Determination of the network’s topology based on momentary image of the topologic structure of the network and adoption of measures for supporting future changes; is an approach which has been adopted in majority of the proposed algorithms. Assuming a complete separation between the two phases of formation of clusters and support for change, within networks whose one innate nature is continuous topologic changes, will result in a fall in the efficiency of presented algorithm and wastage of energy within the nodes and band widths of wireless connections.

  • Given the continuous changes of topology of network, due to movement (limited) of nodes and/or their impairment; the environments of wireless sensor network are completely dynamic, unpredictable, and uncertain environments. Thus, assumption of the accumulation of all topological data of network’s nodes, and implementation of algorithm in a single node, especially within networks that maintain highly volatile nodes, on one hand results in transformation of the single node into a point of failure for the algorithm; and on the other hand will lead to wastage of the existing sources such as the communication channels and source of energy.

  • In the majority of proposed algorithms, given the necessity to coordinate and match the performance of nodes; the presence of assuring models for publication and/or collection of control messages across the network is an obvious assumption. This comes while the wireless base of the said networks, movement (restricted) of mobile nodes; and their dynamic presence, despite the corresponding costs, put assuring conditions in place for materialization of this assumption [15] .

  • The wireless sensor networks maintain a completely distributed network, in which collection of data in one node and/or their publication across the network, imposes countless excessive communication demands on the network; especially in networks with large scales. Additionally, the existing limitations in the band width of wireless connections and energy of nodes, raise inclinations toward design of algorithms that are capable of making different decisions across the network solely based on the data of nodes, and with the low consumption of sources in a distributed manner. Design of such distributed algorithms makes it possible to carry out updating and supporting operations, in a local form and shape, in case of need and upon the emergence of topologic changes [16].

  • Continuous topologic changes of the network, resulting from limited stimulation of nodes and/or their impairment, restriction of the band width of wireless connections, limited energy of nodes, interferences of wireless channels; noisy connections; and a number of other factors have led to focusing upon, and assessing the specifications of the said networks, such as the channel band width; channel’s capacity; capacity of node for delivery and receipt; within the framework of topology control algorithms, turning them into random, unpredictable and changeable specifications with the passage of time, causing numerous challenges in the analysis and design of the said algorithms for these random networks. Negligence of the random and variable specifications, and assumption of presence of a network with completely fixed, stable, and certain specifications leads to adoption of inappropriate decisions which are distant from the innate realities of the said networks, and are therefore inappropriate to implement, resulting in a sharp fall in output.

    Given the said cases, and upon precise focus on the features and capacities of learning automata model, which were previously mentioned, it is evident that majority of assumptions are only capable of easing and justifying the presentation and theoretical analysis of proposed algorithms, while implementation of some of them has proven to be a difficult task due to the existing capacities and limitations in wireless sensor networks, and could possibly result in a sharp fall in the efficiency of said algorithms. On this basis, learning automata, and their corresponding combined models can be considered as an appropriate model for resolution of the said issues in wireless sensor networks, due to their following features.

  • Learning automata are appropriately capable of matching themselves with the environmental changes. This is a highly appropriate feature for usage in the wireless sensor networks environments that maintain a very high degree of dynamism.

  • The learning automata, in addition to low computational needs, impose minor communication costs on the environment throughout their interaction with the environment. This feature makes learning automata an appropriate option for usage in environments, within which there are restrictions of energy and band width, in comparison to other models.

  • Learning automata in their interactions with each other are capable of building models of distributed nature of wireless sensor networks and to simulate the changeable behavioral patterns of nodes in communication with each other and the environment, given learning automata’s learning and matching potentials [17].

  • Learning automata, in interaction with each other, are capable of convergence for resolution of issues related to optimization, only by reliance on local decisions. Thus, algorithms which are based on learning automata are considered as an appropriate choice for wireless sensor networks, given that they dispel the slags resulting from the collection and/or dissemination of information in centralized algorithms.

  • Learning automata, in a repeatable process, and with the passage of time, complete the information that they need for decision-making from their living environment. On this basis, the endurance of algorithms which are dependent on learning automata, in the face of possible errors in receipt of data by nodes, which is common in wireless sensor networks, will not leave an impact as such on the performance of algorithm, in comparison to other algorithms.

The main goal of this thesis is presentation of a smart algorithm based on learning automata model, which upon consideration of the limitations of the said networks, and upon reliance on prudent assumptions would be able to provide an appropriate solution for the issue of topology control within wireless sensor networks.

  • Proposed topology control algorithm based on learning automata

Given that the proposed algorithm is an algorithm aware of energy and efficient energy; with due regard to the remaining energy of the network’s nodes, and within the framework of a repeatable learning process, it tries to form the most durable topology of the network via selection of the most energetic nodes of the network and establishment of the most resistant clusters against impairment of the sensor nodes.

The phases of the proposed algorithm are as follows [18,19,20]:

Initially, a network of learning automata in harmony with the wireless sensor network is established. For this purpose, a learning automata (For instance Automata Ai) is allocated to each of the network’s nodes (For instance node Nj). Each automata shapes its internal structure (Action set and possibility vector in selection of action) as follows:

Given that in the proposed algorithm, the action set of each automata are defined based on adjacent nodes; it is necessary for each node of the proposed algorithm to continuously update its topologic data. To this end, each node periodically airs a hello message within the boundary of its delivery. This message is sent by all nodes once in a while in order to update the topologic data of the network. The proposed algorithm in this thesis, is an energy-aware algorithm and on this basis any node, in its hello message, in addition to its ID number, sends and notifies its remaining energy. In this manner, all of the inner-network nodes are continuously informed of the amount of remaining energy in their neighboring nodes.

Each node, upon the receipt of hello message, includes its ID No. and the data on the remaining energy of the node, which has received the said message, within its list of neighbors. The number of fields for each of the nodes of the network in the list of neighbors is equal to the number of its neighboring nodes. In this manner, each of the nodes will maintain the local data of its neighbors. In fact, list of neighbors of each node such as Ni node is presented as equivalent to the length of its neighbors and each of its homes belongs to a neighboring node such as Ni node which maintain the two fields of ID and Energy within the framework of List of Neighbors.

  • Assume that Ai Automata is equivalent to Node Ni.

  • Assume that αi is the action set for Automata Aj.

  • Assume that Ei is the remaining energy of Ni Node.

  • Assume that Ei is the average remaining energy of nodes adjacent to Node Ni.

  • Assume that α is the action selection of Nj Node by Ni Node.

  • Assume that ρi is the vector on possibility of selectin of action of Automata Ai.

  • Assume that ρj is the possibility of selection of action αj.

    Now, the action set of each automata allocated to each node is defined as follows with due regard to the location of that node in the network.

  • The letter r is the number of actions of each automata which is equivalent to the number of neighboring nodes, from which the automata has received a hello message, plus one.

  • Each action of automata is equivalent to its own selection and/or the selection of one of the neighboring nodes from the List of Neighbors.

  • Selection of each action of α by the automata is equivalent to Node Ni, tantamount to selection of Node Nj as the cluster head for Node Ni.

  • Action probability vector is formed with allocation of the initial value of 1/r to each action when r is equal to the number of actions of each automata.

Given the possibility of impairment of nodes; the network’s topology is changeable with the passage of time, and since the structure of automata is shaped based on data on nodes’ neighbors; the data of the topology of network should be continuously updated in nodes. Thus, the proposed algorithm continuously updates its topologic data. If, after a specified duration, a hello message is not received from a node which has been adjacent to Node Ni, it would be eliminated from the list of neighbors of the said node and its equivalent action from Automata Ai is also eliminated. Thus, the learning automata which is used in this thesis is the learning automata with a number of variable actions. One of the other reasons for usage of learning automata with a variable action set is to prevent the emergence of loop in the process of development of topology, which will be fully detailed later on.

As previously mentioned, the process of control of topology in the proposed algorithm is carried out in two phases. The first phase is the clustering of network, while in the second phase; the backbone is built via the connection of the cluster head nodes.

  • The first phase: Clustering of the Network [21,22,23]

  1. (1)

    Initially, each Ni node activates its allocated Ai.

  2. (2)

    Automata Ai chooses one of its possible actions in random and based on the Action Probability Vector ρi.

    Each action selected by automata means that the equivalent node to the selected action is considered as the cluster head for the said node.

  • Assume and select Nj as the cluster head

  • Now, the learning automata equivalent to each node, assesses the optimality of the selected action as follows:

  1. (3)

    Ni Node calculates the average energy of the remaining adjacent nodes Ei.

  2. (4)

    If the remaining energy of the chosen cluster head (Ei) would be larger and/or equivalent to the average energy remaining from all neighboring nodes to Node Ni; in this case.

  3. (5)

    The learning automata rewards its selected action via the usage of following relationship and raises the possibility of selection of that action for the following repetitions.

$${\text{Pi}}\,(\upeta + 1) = {\text{Pi}}\,(\upeta) +\upalpha(1 - {\text{Pi}}\,(\upeta))$$

The possibility of selection of other actions of the automata is reduced upon fining them via the following relationship; with a and b respectfully being the fine and reward rates [24, 25].

  1. (6)

    Otherwise, if the energy of the node of selected cluster head would be less than the average remaining energy of all neighboring nodes.

  2. (7)

    The learning automata reduces the possibility of selection of the said action (node) via fining it based on the following relationship:

$${\text{Pi}}\,(\upeta + 1) = (1 - {\text{b}}){\text{Pi}}\,(\upeta)$$

The learning automata increases the possibility of selection of other actions of automata upon rewarding them via the following relationship [26]:

  1. (8)

    The steps one to seven in each node is repeated until the automata equivalent to that node selects one of its actions with the possibility of more than one threshold limit referred to as Pr; that is also referred to as the condition for termination of algorithm. Obviously, the higher the value of Pr, the more will be the cost of implementation of algorithm, and on the other hand the convergence of automata with the optimal cluster (most energetic cluster) takes place with further precision. This means that a more appropriate node is chosen as the cluster head. However, on the other hand, with the reduction of the value of Pr, the possibility of convergence with the optimal cluster is reduced, in addition to the reduction of the cost of implementation of algorithm.

In this manner, in each section of the network, the node which maintains the highest level of energy in comparison to other neighboring nodes, is chosen as the cluster head node of that section. The selected node, as long as having a remaining energy higher than the energy of other nodes in that section, plays the role of cluster head for the clusters within that section. As a result, upon the usage of learning automata, the inner-network nodes are clustered such that the said clusters (or cluster heads) maintain maximum energy and thereby the maximum lifetime.

  • Second Phase: Development of Topology

In the second phase of algorithm; the topology of the network is established via connection of nodes of cluster head. To this end, each node of the cluster head identified itself and connects to them, via delivery of the message of nodes of its neighboring cluster head. The process of connection of cluster heads is as follows:

Each node of the cluster head CH forms a message of request for connection, known as CR, in short, sending it to its neighboring nodes.

In this case, it is possible for one of the following cases to occur for each node neighboring Nj, which receives the message CR from cluster head CH [27]:

  1. (A)

    The two cluster heads are located a step from each other. The node neighboring Nj is a cluster head node. In this manner, two cluster heads Chi and CHj are directly connected to each other.

    In this manner, Node CHi also receives its CR message from CHj Node.

  2. (B)

    Two cluster heads are two steps apart. The Node neighboring Nj neighbors another cluster head such as Chi Cluster Head. In this case, the node neighboring Nj as the gateway node, is duty-bound to connect Chi Cluster Head and CHj Cluster Head.

  3. (C)

    In the last mode, two cluster heads are three steps apart. In this case, the two cluster heads Chi and CHj are respectively neighboring two gateway nodes Ni and Nj; with the two said nodes also being next to each other. In this mode, the two cluster heads Chi and CHj are connected to each other via two gateway nodes Ni and Nj.

In the end of the second phase of proposed algorithm, the most resistant network topology is formed via the connection of the most energetic cluster head nodes, thereby shaping the most resistant backbone of the network (against the movement and impairment of sensor nodes).

One of the capacities and abilities of learning automata is compatibility with the environment and environmental changes. Given that each topology maintains a limited lifetime, and after a while, the formed topology will lose its energy and will break; under these conditions, in each of the topology control methods, the process of redevelopment of topology should be repeated. But, in the proposed algorithm, with the passage of time, whenever each node of the backbone loses its energy, the learning automata intelligently converges toward the second energetic neighboring node, in the following repetitions; preventing the failure of topology and removing the need for the repetition of algorithm. Thus, it is expected from the proposed method to significantly rise topology’s lifetime and to reduce the costs of calculations and communications.

3 Conclusion

In this thesis, a fresh approach based on learning automata was presented for unequal clustering of wireless sensor networks. In unequal clustering, the sizes of formed structures are not equal and the majority of clusters close to sink node will be smaller in size, while the further away clusters will maintain larger sizes. This feature reduces the consumption of energy in the vicinity of the sink node and will increase the network’s lifetime. The proposed model can maintain the following steps.

  1. (1)

    Determination of the cluster head with the usage of a learning automata

  2. (2)

    Determination of the size of each cluster with the application of optimization model and formation of clusters

  3. (3)

    Routing data in the clustered network, with the application of routing algorithm

  4. (4)

    Measures of the automata are updated in the end of each cycle

In the proposed method, the cluster heads are determined with the usage of possibility vector of the learning automata, and the number of neighbors to sensor node is determined. At the end of each cycle, the measures of learning automata are updated with the application of laws on rewards and fines. This feature results in the better performance of the proposed method in selection of the cluster head after a number of cycles.

In order to assess the performance of the proposed model, NS2 software is used. In this measure, the performance of the clustering algorithm in the network’s environment was studied from different angles such as the network’s lifetime, rate of consumption of energy, rate of delivery of package, and participation of nodes in the delivery process. The results of this study revealed that the proposed method maintained a better performance compared to the performance of other methods, and in addition to a longer lifetime, the proposed method will result in lower consumption of energy, and larger number of delivery of packages.