A fault-tolerant hierarchical hybrid mesh-based wireless network-on-chip architecture for multicore platforms

Dehghani, Abbas; Jamshidi, Kamal

doi:10.1007/s11227-015-1430-z

A fault-tolerant hierarchical hybrid mesh-based wireless network-on-chip architecture for multicore platforms

Published: 24 April 2015

Volume 71, pages 3116–3148, (2015)
Cite this article

Download PDF

Access provided by Autonomous University of Puebla

The Journal of Supercomputing Aims and scope Submit manuscript

A fault-tolerant hierarchical hybrid mesh-based wireless network-on-chip architecture for multicore platforms

Download PDF

Abbas Dehghani¹ &
Kamal Jamshidi¹

351 Accesses
15 Citations
Explore all metrics

Abstract

Wireless network on chip (WNoC) is a promising new solution for overcoming the constraints in the traditional electrical interconnections. However, the occurrence of faults has become more prevalent because of the continuous shrinkage of CMOS technology and integration of wireless technology in such complex circuits. This can lead to formation of faulty regions on chip, where the probability of the entire system failure increases in a significant manner. This issue is not addressed in the previous works on WNoC systems. In this article, a fault-tolerant hierarchical hybrid WNoC architecture is proposed. First, an innovative strategy is proposed for solving the problem of fault-tolerant wireless routers placement in standard mesh networks inspired by node-disjoint communication structures. Next, efficient fault-tolerant communication protocols are presented for applying this structure. The experimental results demonstrate the robustness of this proposed architecture in the presence of various fault regions under different traffic patterns.

A fault-tolerant and congestion-aware architecture for wireless networks-on-chip

Article 21 February 2019

An efficient fault-tolerant routing algorithm in NoCs to tolerate permanent faults

Article 25 May 2016

Performance-Enhanced $$d^2$$ -LBDR for 2D Mesh Network-on-Chip

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

1 Introduction

According to Moore’s law, continuous shrinking of silicon technology leads to the integration of a large number of processing cores and memory elements into a single chip as multiprocessor system-on-chip (MPSoC). Communication plays a crucial role in the design and performance of these systems. Conventionally, the network-on-chip (NoC) approach is proposed as a scalable interconnection for on-chip communications using integrated switching network [1]. Despite the advantages of this approach, the traditional NoC architectures face serious performance limitations, such as high transmission latency and high power consumption for communication between distant cores [2–6]. According to the International Technology Roadmap for Semiconductors (ITRS) [7], the wired interconnections are the major bottleneck of satisfying performance requirements in future multi-core systems; therefore, new interconnection approaches are needed. Different interconnection strategies are proposed to overcome the limitations of wired interconnections [2–4].

Wireless network on chip (WNoC) is one of the promising solutions to achieve higher performance compared to the traditional NoCs [4]. This approach provides a flexible communication platform with high bandwidth and low power consumption for on-chip data transmission. Different types of WNoC architectures are proposed which demonstrate high efficiency of these architectures [4]. As the wired interconnections are highly effective and proper for short distances, the hybrid approach currently dominates WNoC architecture. Many researchers have adopted hybrid hierarchical wireless architectures to achieve excellent trade-offs between latency and energy consumption with limited area overhead [8–13]. In spite of all hybrid WNoC advantages, wireless technology is prone to various failures due to the high integration density and the high complexity in system circuits. It should be noted that aggressive scaling of VLSI technology affects the reliability of the conventional NoCs [14, 15]. Adding wireless technology increases the fault rate due to the higher sensitivity of wireless communication components regarding noise sources and manufacturing issues [10, 16–18]. In the hybrid WNoC architectures, the wireless interconnections are in charge of long-distance transmissions, and if they fail the express links will be lost. This degrades the system performance in a significant manner, due to the prominent role of wireless routers in enhancing system performance. Hence, fault tolerance is a critical factor in WNoC architecture design.

In general, the faults of system on chips (SoCs) can be classified as permanent and transient, each of which can occur due to various reasons [15]. The probability of transient faults in WNoC architectures is higher than the traditional NoC architectures due to the inherent nature of wireless channels and electromagnetic interference of wireless system with other elements on the chip. Attempts are made in [17, 19] to handle transient faults in WNoC architectures. They presented a unified error control code (ECC) framework applicable to hybrid WNoC architecture. This scheme improves system reliability, but incurs additional area overhead. The reliability of hybrid WNoC architectures based on code division multiple access (CDMA) is studied in [20]. They demonstrated that the WNoC architectures achieve reliable communication using CDMA-based wireless interconnects. The rate of permanent faults is also increased due to the serious challenges of design and integration of wireless interface in the nanoscale domain. Moreover, other common problems like the manufacturing process defects are effective in this increase, because the wireless interconnection approach is still in its early stage [18]. Few studies are available on permanent faults in WNoC architectures. The authors in [16, 18] demonstrate that WNoC architectures based on complex networks satisfy the desired performance level in the presence of wireless link failures. They have utilized a power-law model to establish wireless links between routers. This approach enhances NoC performance and tolerates wireless link faults with minimal performance degradation in a significant manner. The trade-offs for different options of hierarchical hybrid WNoC architectures are proposed in [10]. They indicate that although the small-world millimeter-wave wireless NoC architecture WNoC has less performance compared to hierarchical WNoC, it provides more fault tolerance against wireless link failures. In these studies, the issue of complete failure of nodes is not addressed.

The focus of this article is on the permanent faults which may create fault regions in the WNoC systems. In the previous proposed WNoC architectures, the faulty regions are not considered. Although many studies refer to managing faulty regions in conventional NoCs, they have not considered problems of long-distance communications [15]. With respect to the prominent role of the wireless routers in the system performance, it is essential to design a WNoC architecture which would handle faulty regions properly. For this purpose, the network structures which provide node-disjoint paths between nodes can be used. These structures offer alternative routes without sharing nodes and channels [21, 22]. To provide this capability in the hybrid WNoC architectures, wireless routers must be established using an appropriate plan due to constraints such as limited area and power budget. Due to the existence of node-disjoint paths in standard mesh structures and their simple layout, the mesh structure is used here as the basic topology of WNoC architectures.

At first, an innovative strategy of placing a reasonably low number of wireless routers is proposed in a standard mesh topology with the objective of fault tolerance. This approach leads to the construction of a fault-tolerant cluster-based hybrid WNoC architecture. In this approach, each core can reach at least two wireless routers with an acceptable Manhattan distance. This design provides a regular arrangement of wireless routers in a manner where there are at least two wireless node-disjoint paths for any long-distance communications. To apply this design, a wireless communication mechanism and a fully adaptive fault-tolerant routing protocol are suggested here, providing a fault-tolerant solution for handling permanent faults. The simulation-based evaluations here demonstrate the effectiveness of the proposed architecture in the presence of fault regions. It is found that here the fault-tolerant WNoC architecture consumes less power and provides considerable improvements in latency and throughput compared to fault-tolerant traditional mesh-based NoC.

The rest of this article is organized as follows. Section 2 describes the proposed fault-tolerant WNoC architecture design, including topology, an approach for determining locations of wireless routers, the wireless infrastructure, and fault-tolerant communication protocols. In Sect. 3, simulation results under synthetic traffics are discussed, followed by the conclusion in Sect. 4.

2 Fault-tolerant WNoC architecture

WNoC architectures are in their early stages and are more prone to manufacturing defects and system lifetime faults. These faults can greatly reduce WNoC performance. Therefore, a fault-tolerant hierarchical hybrid WNoC architecture is proposed using an analytical method here. In the following sections, the formation mechanism of a fault-tolerant WNoC architecture including topology, physical layer, and communication protocols is explained in detail.

2.1 Topology

Node-disjoint communication structures can tolerate node and link failures in different networks by establishing routes with no common nodes [21, 22]. This article advocates adoption of such node-disjoint structures for reducing the effect of failures on WNoC systems performance. It is proposed here that each core can communicate with at least two wireless routers through disjoint paths, in addition to having at least two wireless node-disjoint paths between every pair of wireless routers, meaning that the wireless network must be at least a two-connected network [21].

2.1.1 System model

This design applies a hierarchical two-tier model for hybrid WNoC architectures. The basic infrastructure of this model is a standard mesh topology due to having inherent disjoint paths and applying regularity benefits [5, 6, 23]. Communications of the lower tier are performed through wire links used in the mesh structure (wire links). For the higher tier, some routers of mesh topology are selected to be equipped with the wireless interface. These enhanced routers play the role of gateway for long-range communications between distant cores. In fact, this architecture is a cluster-based network where the cluster heads are used in a different manner compared to that of the traditional clustering model. The features and requirements of this proposed model are described as follows:

The cores communicate with wireless routers through the mesh infrastructure. This approach does not need any extra links for accessing wireless routers. Moreover, this model increases fault tolerance due to the availability of disjoint paths to wireless routers.
The use of wireless routers is dependent on the distance between cores. It means that two cores with different wireless routers may communicate with each other through wire links if the distance between them is short. In this regard, we define a threshold distance parameter represented by T. This value represents a maximum Manhattan distance where no wireless link is necessary for communications while wired communications satisfy the performance requirements of the system.
Each core can communicate with at least two wireless routers. The distance between cores and their respective wireless routers must be reasonable. Here, the next parameter is represented by L, which is the maximum Manhattan distance between cores and their corresponding wireless routers. Parameter L has an important role in the performance of this proposed architecture, because it affects the number of wireless routers and their locations in a direct manner. If the value of L is too large, reaching wireless routers will require high energy dissipation and high transmission latency. If the value of L is too low, the number of wireless routers will increase, and this increase would not only lead to area overhead problems, but also cause serious problems in accessing the wireless medium [7, 18]. In the proposed model, the travel distance between any two cores in wired communications must not exceed the threshold distance. Hence, the value of L should be equal to or less than $h=\left\lfloor \frac{T}{2}\right\rfloor $.

2.1.2 Fault-tolerant wireless router placement

An attempt is made to determine the location of wireless routers in an (m+1)* (m+1) mesh topology with the following six conditions:

Condition1 :: All cores can reach at least two wireless routers with a maximum Manhattan distance L $(1 \le L \le h)$.
Condition2 :: Wireless network must be of at least a two-connected network.
Condition3 :: Regular arrangement of wireless routers is required to apply the communication protocols with the reasonable cost for optimal usage of node-disjoint paths. Moreover, this regularity can help the architecture scalability and easy implementation of architecture.
Condition4 :: Considering the area overhead of wireless interface and challenges of on-chip wireless communications, this proposed approach should be designed with a reasonably low number of wireless routers, so that the suggested architecture could be justified.
Condition5 :: The number of wireless routers should be sufficient to prevent bottleneck, since in addition to long-range communication, the normal operations of a router must be performed by these wireless routers.
Condition6 :: Considering the wireless communication range, wireless routers must be placed at a maximum possible distance from one another so that in case of faulty region occurrence, a core would not lose all of its wireless routers.

In general, the value of L is very low compared to the network size, which makes the search space vast. Hence, with respect to the aforementioned requirements, an innovative procedure for obtaining locations of wireless routers is proposed here.

It is obvious that at least three wireless routers are needed to build a two-connected wireless network. In such a network, there will be two node-disjoint paths between each pair of wireless routers. On the other hand, in a mesh structure with three wireless routers, the maximum size of the mesh can be $\left( {L+1} \right) *\left( {L+1} \right) ,$ so that all nodes can reach these wireless routers with a maximum distance of L. To prove the validity of this issue, it must be shown that the placement of three wireless routers in a $\left( {{\textit{L}}+1} \right) {*}\left( {{\textit{L}}+1} \right) $ mesh can provide the desired specifications of this proposed design.

If L has an even value, there exists a central node in the mesh, in which its maximum Manhattan distance to all other nodes is equal to L. Hence, by choosing the central node as the position of the wireless router, all nodes would have a wireless router with the maximum distance of L. If any node other than the central node is selected as the second wireless router, at least half of the network’s nodes can reach the second wireless router with a distance less than L. However, with this option, the distance of the second wireless routers from nodes located on the corner of the other half of the network will be more than L. Hence, another wireless router is needed which can be placed at any location in the second half of the network. As an example, for L = 4, some of the possible locations for wireless routers are shown in Fig. 1a.

If L has an odd value, there exists a central region where a location can be selected for placing the first wireless router. With this selection, all nodes except for one corner explained below can be connected to this wireless router with a maximum Manhattan distance of L. The network will be divided into two asymmetric sections. In the bigger section of $\left( {L+1} \right) *\left\lceil \frac{L+1}{2}\right\rceil $ size, the distance from one of the corner nodes to the selected wireless router is $L+1$, and this corner node will need two new wireless routers to satisfy Condition1. These two new wireless routers can be placed in the $x$ and $y$ direction with the distance of L from this corner node. Hence, all nodes in the network can have two wireless routers with a maximum distance of L, e.g., for L = 3, some of the possible locations for wireless routers are shown in Fig. 1b.

However, if even one unit is added to this ($L+1)*(L+1$) mesh size (i.e., we have an (L + 2)*(L + 2) mesh), then there will be one node that does not have access to these three wireless routers with the maximum Manhattan distance of L. This is due to the fact that the distance between the nodes in opposite corners is 2L + 2. If two wireless routers with a maximum distance of L for the node in one corner are placed, then the distance of the node in the opposite corner to these wireless routers becomes $L+2$. Therefore, as mentioned earlier, with three wireless routers placed at the proper locations, the nodes in an (L + 1)*(L + 1) mesh can be connected to two of these wireless routers with the maximum distance of L (Condition1).

For locating wireless routers in the mesh network with ${m}>{L}$ size, first the proper wireless routers placement in a (L + 1)*(L + 1) subnet must be selected. Regardless of the whole network size, in each subnet, three wireless routers are needed. Note that this issue satisfies Condition5. For placing fewer numbers of wireless routers in the network, neighboring subnets should be able to use common wireless routers. For this purpose, it is better that the wireless routers be placed in the corners of the given subnet. In this regard, two approaches could be used.

In the first one, two wireless routers are placed in the subnet corners which satisfy Condition6. Here, for satisfying Condition1, the third wireless router must be placed in the center (Fig. 1). This is named the diagonal approach. Note that when L is odd, instead of the center node, there exists a central region. One location is chosen in the central region as a wireless router. For satisfying Condition6, the two other wireless routers would be placed on the corners of the diagonal where the first wireless router is not placed (Fig. 1b).

In the second approach, the three wireless routers are placed in the three corners of the subnet. For satisfying Condition1, another wireless router is placed in the fourth corner. With this approach, Condition6 is also satisfied. This new approach is named as the corner approach. The reason here is that the corner approach may reduce the number of wireless routers by sharing more of them with subnets compared to the first approach. It should be noticed that in this approach similar to the diagonal approach, the adjacent subnets need two wireless routers at most, while the distance between the wireless routers is observed (Condition6).

With respect to these two approaches, the network is divided into mesh subnets of (L + 1)*(L + 1) from one corner (for example, the top left corner as in Fig. 2a). In the diagonal approach, in each subnet, the diagonal positions are in such a manner that the two neighboring subnets have one wireless router in common (Fig. 2a). In the corner approach, the wireless routers are placed in the four corners of the subnets (Fig. 2b).

For an (m + 1)*(m + 1) mesh network, if “$m\,mod\,L = 0$” (i.e., the $m$ value is dividable by L), and “$m\,div\,L= a$”, through deduction, if the value of $a$ is even, the number of wireless routers in the diagonal approach is $(\frac{{5\mathrm{a}}^{2}+{2\mathrm{a}}}{4}+\left\lceil \frac{{\mathrm{a}+1}}{2}\right\rceil {*}\frac{\mathrm{a}}{2})$. If the value of $a$ is odd, the number of wireless routers in the diagonal approach is $\frac{\hbox {3a}^{2}+\hbox {2a+1}}{2}$. In the corner approach, for any value of $a$, the number of wireless routers is $(a+1)^{2}$.

By comparing the obtained numbers of both the approaches, it would become clear that in this case, for the even value of $a$, the diagonal approach and, for odd values of $a$, the corner approach will have less wireless routers (Fig. 2).

If “$m\,mod\,L= d$”, some subnets of (L + 1)*(d + 1) and (d + 1)*(L + 1) size at the edges of the network would be formed in the $x$ and $y$ directions. Moreover, a subnet of (d + 1)*(d + 1) size at the bottom right corner of the network would be created. With respect to the value of $a$, the location order of wireless routers in these subnets is almost similar to the location order of an (L + 1)*(L + 1) mesh, so that regularity is retained (Condiion3) and Condition5 is satisfied as well. The difference here is that, when $d \le \left( {\textit{L}/2} \right) $ in the subnet of (d + 1)*(d + 1), no wireless router is needed in the central area in the diagonal approach, and no wireless router is needed in the fourth corner position in the corner approach. Furthermore, in the corner approach when $d \le \left( {\textit{L}/2} \right) $, for satisfying Condition5, the positions of wireless routers in the partial subnets are shifted by $d$ unit in the $x$ and $y$ directions of each subnet. Note that when d = 1, the above-mentioned procedure is true in all partial subnets as shown in Fig. 2c.

Consequently, a fault-tolerant wireless placement procedure “FTWP” is proposed for the standard mesh networks. As it can be noticed, both the FTWP approaches satisfy the regularity of wireless routers placement (Condition3). Since the subnets are placed next to one another in the mesh structure, the nodes can have access to more than two wireless routers. Normal nodes can register one of the wireless routers as the primary and the rest as the backups.

The other effective parameter in this proposed architecture is the wireless communication range, which is presented by $R$. It is more appropriate to describe $R$ as $r*w$, where r is a number calculated according to the maximum wireless communication range, and $w$ is the length of a wire hop. With respect to the wireless routers arrangement in the network, the minimum value of R is $r = \sqrt{2}L\left( {\sim \!2L} \right) $ where a connection is made between each one of the two wireless routers in a subnet. The maximum value of R depends on the characterizations of on-chip wireless communication which is discussed in Sect. 3. Nevertheless, even with the minimum value of R, in both the FTWP approaches, each wireless router has connections with two other wireless routers, which have connections to at least two other different wireless routers. This fact indicates that there is at least one pair of wireless node-disjoint paths in the network. Hence, Condition2 is satisfied.

To study the performance of FTWP, the obtained minimum number of wireless routers by FTWP is compared with the number obtained through an exhaustive search for various system sizes. Table 1 shows the minimum number of wireless routers required to satisfy Condition1 through FTWP and the exhaustive search. It can be seen from this table that the results of FTWP are close to the exhaustive search. Note that the minimum number of wireless routers obtained through exhaustive search does not satisfy the required six conditions of this proposed architecture. With an increase in N, finding the minimum number solution by exhaustive search becomes increasingly difficult to a degree where the search time is unjustifiable.

Table 1 The number of wireless routers for different network sizes with different placement approaches

Full size table

Considering the challenges of on-chip wireless communication [4, 8, 10], the complexity of the proposed architecture will increase significantly for small values of L and it will not be cost-effective, meaning that Condition4 is not satisfied. Hence, a new approach based on FTWP is proposed to decrease the number of wireless routers while obtaining and maintaining the mentioned conditions in this article.

In FTWP, backup wireless routers are used in the case of probable failure. Hence, we can increase their distance from the nodes beyond L which leads to a change regarding Condition1 for the purpose of decreasing the number of wireless routers. The primary wireless routers must still satisfy the condition of the maximum distance of L, while the concern here is the backup wireless routers. A new rule must be defined for the acceptable maximum distance between the nodes and their backup wireless routers.

Considering a subnet of (L + 1)*(L + 1) size and wireless routers placement as shown in Fig. 1, with only two wireless routers properly placed, all nodes will have access to their primary wireless routers with the maximum distance of L. For satisfying condition of the maximum distance between wireless routers (Condition6), the best placement is the diagonal two end points (Fig. 3). This new placement stratifies the other conditions as FTWP does.

In this approach, the nodes adjacent to the primary wireless router would have the farthest distance to backup wireless routers. With the failure of the primary wireless router, in high probability these nodes will fail as well. However, the distance between nodes and backup wireless routers must be subject to an appropriate limit. For this purpose, the L value is limited with respect to the T value. According to Fig. 3, the maximum distance from normal nodes to their corresponding backup wireless routers is $2L-1$ hops. With the failure of a primary wireless router, the packets will travel at most $2L-1+L$ hops and this number should not exceed the T value. Therefore, the value of L should be chosen in the $1\le {\textit{L}}\le \left\lfloor \frac{T}{3}\right\rfloor $ range. It should be mentioned that in a case where the primary and backup wireless routers of a single node both fail, the communication protocol should provide effectively the communications of this node through wire links as will be discussed later. In this new approach, by setting $r\ge 2L$, the wireless connectivity among wireless routers will be maintained due to the available wireless routers in the neighboring subnets. Considering the regular arrangement of wireless routers, the chip can be virtually divided into four regions from the perspective of any wireless router; here, in each region, there will still be two different wireless routers providing node-disjoint paths using quadrant-based communication protocols [24, 25].

According to the maximum Manhattan distance of 2L between nodes in a subnet, regardless of network size, two wireless routers are required in each subnet for providing primary and backup wireless routers for all normal nodes in the subnet. It is obvious that this new approach provides the minimum number of wireless routers if $m$ is dividable by L. When “$m\,mod\,L= d$”, the number of wireless routers in some cases can be further reduced. In this case, some subnets with sizes smaller than an (L + 1)*(L + 1) mesh will be formed at the edges of the network. For satisfying Conditions 2 and 3, the location of wireless routers in these subnets is similar to an (L + 1)*(L + 1) mesh (Fig. 4a). With the exact investigation of these cases in this proposed design, the reduction method of the wireless routers is determined. If $d=1$ and the selection of wireless router locations have begun from the right corners of the subnet as shown in Fig. 4b, each pair of wireless routers in the $x$-direction or $y$-direction of the mesh are eliminated. For each elimination, the closest wireless router in the adjacent subnet with a size of (L + 1)*(L + 1) is shifted to the edge of the chip where the elimination has occurred. Then another wireless router is placed with the distance of L hops from this position of the network in the mentioned shift direction. Moreover, in a subnet of (d + 1)*(d + 1) size formed at the bottom right corner, if $d \le \left( {L/2} \right) $, then only one wireless router will be placed at that corner of the network (Fig. 4b).

In brief, in this new approach, each node can communicate with the primary and backup wireless routers with a maximum distance of L and $2L-1$, respectively, and an upper limit for L. In this approach, each node will have less backup wireless routers, but the cost will be reduced. This proposed fault-tolerant wireless network on chip (FWNoC) is formed based on this new placement approach. In the following sections, the communication protocols for the FWNoC architecture will be discussed.

2.2 Wireless interconnection system

Several infrastructures are proposed for on-chip wireless communications at different frequency ranges [4]. In [26], the authors proposed a wireless infrastructure based on CMOS Ultra Wideband (UWB) for the NoC architecture. This infrastructure provides a data rate of 1.16 Gbps at a central frequency of 3.6 GHz with a transmission range of 1 mm. In [25], a multi-channel wireless infrastructure using UWB transceivers is proposed where the communication capacity is increased. In their method, the wired links are used for a synchronous distributed MAC protocol. The advances in CMOS millimeter-wave circuits have made bandwidth of hundreds of GHz available. In [27], the authors proposed a 0.38 mm metal zigzag antenna to design a single channel millimeter-wave WNOC (mWNoC) architecture at a center frequency of 57.5 GHz with data rate of 16 Gbps and transmission range of 20 mm. In [9], three distinct channels with a 3 dB bandwidth of 16 GHz are used for a multi-channel mWNoC architecture with center frequencies of 31, 57.5 and 120 GHz. In [28], the authors proposed a WNoC architecture using sub-THz antennas placed in a polyimide layer. This architecture has 16 non-overlapping channels in the frequency range of 100–500 GHz. Each channel can transmit data at a 20 Gbps rate with a 10–20 mm transmission range. In [12], an infrastructure is proposed where low power and compact design transceivers are used based on on–off keying (OOK) modulation. This infrastructure provides 16 non-overlapping channels with a total of 512 GHz bandwidth. In [8, 29], the authors introduced the use of nanoscale antennas based on carbon nanotubes (CNTs) operating in the THz frequency range and the design of a WNoC architecture operating in the THz frequency range using CNT antennas. By applying this technology, 24 distinct wireless channels each with 10 Gbps bandwidth and 0.33 pJ/bit of energy can be created [8].

Some of these introduced infrastructures, due to their closeness to what is proposed here, can be used for FWNoC architecture. When the number of wireless routers is low, the infrastructures based on a single-channel scheme can be used as the physical layer of this proposed architecture, while when there is a wide distribution of wireless routers, these infrastructures are not appropriate due to their high-speed concurrent transmission demand.

According to the architectural parameters and the number of wireless routers, the physical layers are required based on the multi-channel schemes. However, a low-cost infrastructure should be selected with the ability to provide energy-efficient wireless communications. The proposed infrastructures in [8, 9, 12, 28] can satisfy the requirements of this newly FWNoC architecture. After specifying the physical layer, an effective wireless medium access control (WMAC) becomes necessary.

In this proposed approach, the wireless routers are organized into separate groups, and a specific carrier frequency is allocated to each one of the groups. This is a type of frequency division multiplexing (FDM) following a process where the neighboring groups (i.e., groups with wireless routers located within the communication range of one another) do not have the same carrier frequencies. Moreover, the problem of hidden terminal [26] should be of concern; this means a wireless router may be within the range of two non-neighboring groups. In this case, these two groups must not have the same frequencies. If each group is considered as a vertex, all groups can be modeled with a single graph with respect to the parameter R. Therefore, the allocating frequency is similar to the vertex coloring problem [30], such that none of the two adjacent vertices have the same color with respect to the hidden terminal problem. For avoiding interference in each group, time division multiplexing (TDM) technique is applied. As the token scheme has simpler synchronization mechanism and less waiting time for transmitting data compared to the classic TDM mechanism, a token sharing scheme is used for assigning time slots to wireless routers of a group. The number of tokens is equal to the number of wireless groups. In fact, each group has its own special token. Tokens are passed among wireless routers and represent the right of transmitting on a certain channel frequency. In summary, in the proposed WMAC, each wireless router is equipped with a transceiver based on the token scheme at the carrier frequency of its own group. Moreover, if a wireless router is in the communication range of other groups, it will be tuned to the frequency bands of those groups. Hence, each wireless router will be equipped with one transmitter and multiple receivers. In fact, this method is a combination of the TDM and FDM schemes.

The cost and complexity of this method depend on the wireless router grouping method, because it affects the required number of channels and receivers in the wireless routers. To reduce the system cost, a procedure is suggested for grouping the wireless routers as follows.

These groups are created by selecting one wireless router as the base and adding other wireless routers one by one in such a manner that the following conditions are met:

1.
Any pair of wireless routers in the group should be in the transmission range of each other.
2.
Groups should contain the maximum possible wireless routers.
3.
If there are wireless routers available on the edges of the network capable of being added to the group, they should be added even if Condition2 here is not satisfied. This is because these wireless routers will create single wireless router groups and this would increase the total number of groups in the network.

The first wireless router selection begins from one corner of the network and thereafter the nearest wireless router to the previously created group will be selected. For example, in the groups of a mesh network with $m>>L$ and $r=2L$ (Fig. 5.), five distinct channels are necessary for satisfying the required conditions of this proposed WMAC. An increase in the communication range would lead to having less channels, indicating the creation of less groups. Consequently here, the group size will be larger and, hence, an increase in the token scheme cost. On the other hand, this high communication range is appropriate, because it decreases the number of hops and consequently an increase in energy efficiency. Note that the transmission range can be increased without changing the size of the groups, but in this case the number of receivers will be increased. Hence, selecting an R value is a trade-off between cost and performance, as will be investigated in Sect. 3.

2.3 Fault-tolerant routing protocols

The design of fault-tolerant and deadlock-free routing protocols is necessary for applying the proposed communication platform with respect to the following preliminaries and assumptions.

The convex block fault model in [31] is adopted here with some modification. In the adopted models, both node and link faults are addressed. The focus of this study is on the faulty nodes. If one node fails, all the links related to it are considered as faulty links. In this model, nodes are aware of the status of their neighboring nodes. The modification made here is to divide the entire chip into five zones (Fig. 6). One of them is the central zone named the Center-Zone which does not contain any edges of the chip. The other four zones, including their respective edges of the chip, are named Right-Zone, Left-Zone, Up-Zone, and Down-Zone (Fig. 6). In this model, the corners of the chip are included in the Right-Zone and Left-Zone, because the management of the faulty region in these zones could be equally applied for the chip corners.
Each router has a status register with a default value of zero (i.e., normal status). For the nodes surrounding a given faulty region, in this status register, each node will save its own position in relation to the faulty region (East, west, north, ...) and the zone of the faulty region (Fig. 6). An example of the above description is illustrated in Fig. 7. In the wireless routers, the status register has an additional bit named P which becomes set in case of any wireless router permanent fault. The wireless routers manage their communication mechanism by using this bit, as will be discussed later.
If there exists a wireless router in the faulty region, nodes around this faulty region can recognize this wireless router and will issue a special packet with a time-to-live of 2L, which would make the other nodes aware of the faulty wireless router. It is assumed that detection of faults is done in a static manner, meaning that in case of any permanent fault, the system will restart and the faulty components will be recognized [15]. The detection of faults can be performed in different ways [15] which are beyond the scope of this article. When the system begins operation, each node is aware of the status of its corresponding wireless routers. Moreover, the faulty regions are not the source or destination of any communication.
The cores should be properly assigned to the wireless routers to reduce the possibility of hotspot occurrences. Assuming uniform traffic, the assignment of wireless routers is made in a balanced manner with respect to their distance from the cores. Note that wireless router assignment could be made more appropriately by knowing the traffic pattern. This issue requires more investigations and can be addressed in future works
The wormhole packet switching technique [32] is used for this FWNoC infrastructure. In this study, some new fields are added to header flit to manage wireless communications which will be described later. According to this proposed mesh-based infrastructure, the addresses of the nodes are identified in (x,y) coordinates. It is assumed that the left bottom corner node has coordination of $\left( {0,0} \right) $. Manhattan distance between source $(s)$ and destination $(d)$ nodes can be computed by $dis(s,d)=\left| {x_d -x_s } \right| +\left| {y_d -y_s } \right| $.

This proposed communication protocol is performed on the basis that if $\hbox {dis}\left( {\hbox {s}, \hbox {d}} \right) $ is smaller than or equal to the T value, communications are performed through wired links; otherwise, for long-range communication, the packets are sent to the corresponding wireless router through the wired links.

2.3.1 Wired routing protocol

This wired routing protocol to be designed must guarantee access to the wireless routers through the shortest path in a fault-free state. Moreover, the design of this routing unit should be simple and low in cost. In this regard, a fully adaptive fault-tolerant routing algorithm (FAFTR) is proposed based on the minimal West-first algorithm.

The advantages of the minimal West-first routing algorithm are its high speed and low design cost [32]. In this routing algorithm if the destination is in the west, it acts like a deterministic XY algorithm; otherwise, it acts adaptively, provided that it does not have any movement toward the west. In fact, in this procedure, S–W (south to west) and N–W (north to west) turns are not allowed to avoid deadlock. However, this algorithm is partially adaptive [32]. For example, in this study with respect to this proposed fault model, for moving westward, the minimal West-first routing algorithm will be correct when the faulty region is located in the Right-Zone, but it will not be correct when the faulty region occurs in other zones presented in Fig. 6; accordingly, this algorithm must be modified to restore all broken paths. This modification needs to identify the location of the faulty region with respect to the given zone. Here, it is suggested that the failed paths be replaced with new deterministic paths to route the packets around the faulty regions. For this purpose, the nodes around faulty regions must send the packet to the proper output port with respect to the information stored in their status register. For example, if the faulty region is in the Centre-Zone, when the packets cannot move westward, they are directed toward south, deterministically (Fig. 8a). With these procedures, non-allowable turns of the minimal West-first algorithm may occur around the faulty region. Here, the nodes around the faulty region in Center-Zone may use the S–W turn. According to channel dependency graphs (CDG) [32, 33], there is the probability of deadlock around the faulty regions in the Center-Zone (Fig. 8b).

A turn-based solution for managing this deadlock is suggested here. In this solution, instead of unallowable S–W turn, the E–S turn will be disallowed around the faulty region which makes the S–W turn allowable. Hence, routes around the faulty region will be restored according to forbidden E–S and N–E turns (Fig. 8c). With these new rules, when the faulty region is located in the Center-Zone, packets which need E–S turn are forwarded toward the west. Note that these results in longer paths for some of the routes compared to the minimal West-first algorithm (Fig. 8d). In other zones of the network, the broken paths will be renovated almost similar to the Center-Zone. In these zones, the routers around the faulty region will direct packets toward north or south depending on the destination position. Through this approach, the basic rules of the minimal West-first routing algorithm are violated as well. But here, the circular dependency in nodes around the faulty region will not occur because one edge of the network is involved in the faulty region. This newly proposed algorithm changes the normal behavior of the minimal West-fist routing algorithm in the nodes adjacent to nodes surrounding the faulty region causing deadlock. To ensure deadlock avoidance, the distance between two faulty regions should be greater than two hops; otherwise, both the faulty regions would be combined. With this approach, nodes adjacent to the nodes surrounding the faulty region have normal status and they would follow the base algorithm before and after encountering the faulty region to ensure deadlock avoidance in the network.

In brief, the FAFTR algorithm is represented based on the minimal West-first algorithm. In this algorithm, routing is decided based on the value stored in the status register, after specifying the position of the destination node, as shown in Fig. 9. That is, if the destination node is westward $(\hbox {Xoffset}<0)$, one of the cases, and the value of the status register is down (s) (i.e., the faulty region is in the Down-Zone and the router is located in the south of this zone), the packets will be forwarded to the north port. The other cases of this proposed algorithm are illustrated in Fig. 9. In this figure, Select () function, adopted form [32].

2.3.2 Wireless routing protocol

As it was mentioned earlier, if $dis\left( {s, d} \right) > T$, the source node must send the packet to the corresponding wireless router. For this purpose, the source node must apply the necessary changes in the header flit. In this article, some new fields are added to the header flit consisting of the field of the wireless router address and two control bits, G and F. Here, the G bit specifies the need for wireless transmission and the F bit indicates that all wireless routers of a node failed. When the distance between source and destination is greater than T, the source node sets the G-bit and inserts its wireless router address into the header flit. Next, it sends the packet to the wireless router according to the FAFTR algorithm. The intermediate nodes will ignore the real destination address and perform routing according to the wireless router address when $G=1$. When the wireless router receives the packet, and observes $G=1$, it will direct the packet to the wireless interface. Furthermore, if all wireless routers of a node failed, both the G bit and F bit will be set and the packet will be sent through the wired links. The intermediate nodes having healthy wireless router and will check their Manhattan distance to the destination; if this distance is less than or equal to T, the intermediate node resets the G and F bits and will send the packets according to the FAFTR algorithm. On the contrary, if the checked distance is greater than T, the intermediate node will reset the F bit and send the packet to its corresponding wireless router.

The wireless communication protocol depends on the proposed architectural parameters like L and T. This dependency is related to the number of wireless routers: (1) in case of small numbers, wireless transmission can be accomplished with one single hop by selecting the appropriate R. In this case, there is no need for a specific wireless routing unit; the wireless communication will be made through the WMAC mechanism. I2) in case of large numbers, the number of wireless routers is such that the single-hop communications cannot be used due to the challenges of on-chip wireless communication, and hence the need for an appropriate wireless routing algorithm.

Considering the fixed arrangement of wireless routers in a regular structure, this newly introduced FWNoC architecture can use the quadrant-based routing algorithms [24] with some modification. If the wireless routers receive a packet through a wired port, which requires long-distance communications (i.e., G $=$ 1), the wireless transmitter unit will broadcast the packet. From the perspective of any transmitter wireless router, the network is divided into four logical regions like four quadrants of the coordinate plane. When a wireless router receives a packet through the wireless interface, it must recognize whether its entity and the real destination are in the same quadrant or not. The receiver wireless router address $\left( {\hbox {Xcurrent}, \hbox {Ycurrent}} \right) ,$ together with the real destination address $\left( {\hbox {Xd},\hbox {Yd}} \right) $ and the transmitter wireless router address $\left( {\hbox {XT},\hbox { YT}} \right) $ can be applied in finding in which quadrant the receiver and the destination are located (Fig. 10). If they are not in the same quadrant, the receiver will discard the packet; otherwise, it will check its Manhattan distance from the real destination. If this distance is less than or equal to 2L, wireless transmission is not needed; therefore the G bit is reset and the packet will be sent to the destination through the wired links like normal packets. In case this distance is larger than 2L, it will replace the address of the transmitter with its own address and will broadcast the packet again. The decision-making unit for the wireless interface is shown in Fig. 10.

In this proposed wireless routing, conditions like deadlock, redundant paths, and duplicate packets exist. The following are the suggestion to handle them in a more appropriate manner.

Deadlock To avoid deadlock in wireless routing, the virtual channels are applied instead of turn model used in the wired routing. According to this proposed infrastructure in Sect. 2.2, there exist two types of wireless communications: intra-group and extra-group. The intra-group communications take place in one hop and after receiving the packet it will either be sent through the wired link to the destination or will be sent to another group. Therefore, there is no cyclic dependency among the members of a certain group through one-hop communications. In extra-group communications, each wireless router has a distinct receiver for each frequency channel corresponding to each group. To avoid deadlock, a distinct buffer will be allocated in the transmitter unit for each one of these receivers, indicating that the sending space of each receiver is separate. In fact, instead of using one FIFO, the sender channel is divided into several virtual channels where each channel corresponds to one receiver, leading to possible dependency cycle elimination. It should be noted that with respect to this proposed channel allocation (Fig. 5) and this proposed wireless routing algorithm, from the perspective of each wireless router, the receiving buffers of each frequency channel accept packets only from one side of the network. Therefore, there is no need for a specific order in using receiver channels. The only requirement here is that the depth of receivers’ buffers should be at least equal to the biggest neighbor size. Furthermore, for assuring deadlock avoidance, a distinct buffer should be considered in the transmitter unit for packets received from the wired port which need wireless communications. It can be observed that for avoiding deadlock, there exists a direct relation between the numbers of receivers in a wireless router and the number of needed virtual channels. Therefore, deadlock avoidance cost is added to the limitation in selecting R, indicating the importance of choosing the appropriate value for R in this proposed architecture.
Redundant paths In this study, data packets can be transmitted through multiple wireless paths simultaneously, which can manage permanent faults. But in fault-free cases, these simultaneous transmissions impose additional overhead with a negative impact on the system performance. Therefore, the number of simultaneous transmissions should be reduced as much as possible in fault-free case. For this purpose, it is suggested that a distinction should be made between packet forwarding by wireless routers in normal and faulty modes. With respect to the wireless router positions in relation to one another, the decision-making unit (Fig. 10) should be modified to handle the simultaneous wireless transmission. To accomplish this, the policy of deciding whether the receiver and the real destination are in the same quadrant or not should be changed. In fault-free cases, if the receiver and the real destination are located horizontally or vertically in-line with each other, the receiver is not considered to be in the same quadrant with the real destination, except for the cases where the destination is exactly in the same direction as the transmitter. For example, in a network mesh with r $=$ 2L (Fig. 5), with respect to wireless routing algorithm, this new approach leads where only one wireless router tries to forward the packet in the fault-free cases. If the value of R is large, in addition to this approach, the wireless routers closer to the transmitter should be exempted from forwarding. Not that the bit P in the status register (Fig. 10) is considered to distinguish the behavior of wireless routers in normal and faulty modes.
Duplicated packets This proposed wireless routing algorithm can lead to transmission of the duplicated packets. To manage this issue, before taking any action, it should be made sure that the packets of wireless routers are not duplicated. With respect to adopted wormhole-based flow control mechanism, by applying fields like packet id and the source address in the header flit, duplicate packets can be recognized. Therefore, wireless routers would save this information after sending each packet. Here, the important point is the validity time duration of the fields like packet id and the source address. This validity time duration usually begins with the time of sending the first and ends with sending the last flit. In this study, considering the fixed-regular arrangement of wireless routers, the maximum number of wireless routers which send the same packets to a certain wireless router is known. Therefore, the validity time duration can be shortened so that the number of comparisons is reduced. For example, in a mesh network with $r = 2L$, the maximum number of wireless routers which can send the same packet to a certain wireless router is three (Fig. 5), meaning that only three comparisons are required.

3 Experimental studies

The characteristics of this FWNoC architecture are analyzed and their performances evaluated in the presence of permanent faults through simulation under different traffic patterns.

Although the proposed method regarding the wireless routers placement is applicable for any value of T, a reasonable value should be chosen. This value is obtained through analyzing the achievements of the previous studies as follows.

Assuming ${\textit{N}}=\left\lfloor \frac{{\textit{T}}}{2}\right\rfloor +1$ and using a minimal routing algorithm, an N*N mesh will not need the use of wireless routers because the maximum Manhattan distance between nodes (MMDN) is not greater than T hops. If the mesh size is increased by one, i.e., (N + 1)*(N + 1), the new MMDN value for even T and odd T will be T + 2 and T + 1 hops, respectively. Therefore, in an (N + 1)*(N + 1) mesh, the maximum number of corner nodes (MCN) is 3 for even T and 1 for odd T, provided that the distance of these corner nodes from other nodes exceed T hops. Assuming L = 3, the above statements are illustrated in Fig. 11. With respect to the symmetry of the mesh structure, the number of communications, where the Manhattan distance between source and destination is greater than T hops, can be calculated; that is the number of wireless communications required in the mesh network.

The V is defined as the set of all network nodes. As each node can be used as the source or destination in a communication, the number of possible communications is equal to the total number of ordered pairs in ${\textit{V}}*{\textit{V}}$. In an (N + 1)*(N + 1) mesh with an even T, the maximum number of communications with the travel distance greater than T hops is 20. In case of odd T, this value will be equal to 4. If the size of mesh is increased by X units, i.e., (N + X)*(N + X), with respect to the Manhattan distance between nodes in opposite corners and symmetry of the network, the new value of MCN can be obtained using mathematical induction. Therefore, the maximum number of communications with the travel distance greater than T can be calculated, while, in practice, the actual number of the communications in the network depends on the traffic pattern.

It is demonstrated that the traditional wire-based NoC is inefficient for an 8*8 mesh [2, 12, 18]; therefore, in this article the maximum number possible with the travel distance exceeding T hops is obtained for an 8*8 mesh as shown in Fig. 12. Assuming the uniform traffic distribution, for ${\textit{T}}=11$ only 1.5 % of all possible communications need wireless routers. Hence, it is realized that ${\textit{T}} \le 11$ can be a reasonable value for this architecture. Here, L is chosen as $=$ 2, 3, 4, 5 in a manner that the all possible values of T are covered. The percentage of routers which should be equipped with wireless interface for different sizes of the network is almost equal (Fig. 13). This percentage can be 16 at most, which may not be very significant.

Since the T value change depends on the characteristics of wired links, comparing different thresholds for a given network is of no use. The important point is whether these obtained values of parameter L are proper for other thresholds. This issue is determined according to the L limit $(1\le {L}\le \left\lfloor \frac{T}{3}\right\rfloor )$. It is clear that these obtained values of L are very appropriate for bigger T. Note that less wireless communications is needed for bigger T. For smaller T, selecting of L will be more restricted. Note that the previous studies indicate that there is almost no advantage in utilizing long links with mesh networks the size of which are smaller than 4*4 [34]. Therefore, the T value is greater than four hops. However, the appropriate value of parameter L for a given size of the network is determined through simulation.

3.1 Simulation setup

An event-driven simulator based on OMNet++ is developed to model the FWNoC architecture [35]. This simulation environment is used to simulate the traditional and Photonic NoC in [3, 36, 37].

The basic architecture contains 256 identical tiles arranged as a 16*16 2-D mesh network over a 20 mm*20 mm die. Each tile consists of a processing core and a five-port router. The routers consist of one status register, two registers for storing the address of wireless routers, and three common functional units like input port control unit, route computation/switch traversal, and output port control unit [38]. Wormhole switching technique with credit-based flow control is adopted here. Packets are composed of eight flits each with a 64 bits size. The routers here employ round-robin arbitration scheme. The grant messages are used to manage the internal operations of the routers.

Each tile is connected to its neighbor through bidirectional wired channels. The width of these channels is considered as 64 bits. Each physical channel has two virtual channels, each with a FIFO buffer with a of four flits depth. By 45 nm CMOS technology, the NoC routers are driven with a 2 GHz frequency clock. Considering the propagation velocity of electronic signals in the optimally repeated wires at 45nm technology [39], the delay incurred by wired links between cores is less than the clock period of 500 ps. The characteristics of wireless interconnections are adopted from [8], which provides 24 distinct frequency channels. Each wireless channel has a data rate of 10 Gbps. According to the method described in Sect. 2.2, wireless routers are organized in the separate groups and wireless routers of each group are tuned to a specific frequency channel. Each wireless router will be equipped with one transmitter and multiple receivers. The transmitter and receiver units have two virtual channels, each with a buffer with four flits depth. The wireless transmission unit uses round-robin arbitration for transmitting the packets of different receivers. It is assumed that the threshold distance is equal 11 and the default value of communication range is set to be ${\textit{r}}=2\hbox {l}$.

3.2 Performance evaluation with topology configuration

The performance of FWNoC architecture is evaluated in terms of average latency and network throughput. Average latency is the time needed for a packet to traverse through the network from source to destination. The network throughput is defined as the average rate of packets successfully received.

The average latency and throughput under a uniform random traffic for a traditional mesh-based NoC which uses the FAFTR algorithm (FTM-NoC) and for an FWNoC architecture with parameters $L=2, 3, 4, 5$ are illustrated in Fig. 14a, b. Four different topologies are obtained by changing the parameter L, respectively. All of these FWNoC architectures outperform the FTM-NoC architecture, because the wired travel distance in FWNoC is always less than or equal to T and average number of hops is reduced significantly. This reduction is due to the hierarchical feature of the FWNoC architecture and the use of high-speed wireless shortcut links. As observed in Fig. 14, reducing the value of parameter L will improve the system performance slightly, because even though reducing L decreases the number of wired hops, the required number of wireless hops will be increased.

For $L=2$, the minimum number of wired hops is used, while it contains the maximum number of wireless hops. In this case, at most five wireless hops are needed. If this number is reduced by an increase in R, the WMAC will become more complicated and the area overhead will increase due to the required more resource buffers. Moreover, for $L=2$, 34 wireless routers are needed which is twice as much as the number of needed wireless routers in other cases. It is deduced here that $L=2$ is not a good choice. For $L=5$, conditions are reversed, i.e., although the maximum number of wireless hops is almost one, the number of wired hops are increased. Also, because of the less number of wireless routers, hotspot occurrence will be more probable leading to an increase in transmission delay. According to Fig. 14, $L=3$ has the best performance and $L=4$ is almost the same. For $L=4$, the number of needed wireless routers is decreased, and hence an advantage regarding the cost. However, for $L=4$, in case of primary wireless routers’ failure, the number of wired hops would increase, where the possibility of hotspot occurrence is higher in comparison with $L=3$ due to less number of wireless routers.

The other parameter of FWNoC architecture is the transmission range of $R= r*w$, as described Sect. 2.1.2, the value of which should be chosen so that the wireless communication becomes more energy efficient than its wired counterparts (i.e., link length is bigger than 7 mm) [10]. Considering this issue and the fixed-regular arrangement of wireless routers, the proper communication range can be obtained by analyzing and simulating this architecture. For $L=3$, the characteristics of wireless communication such as maximum number of wireless hops, number of carrier frequencies, and group size are shown in Table. 2. To show the impact of R on system performance, the average latency and throughput are compared for various values of R.

Table 2 Characteristics of wireless communication FWNoC for $L=3$ and different values of R

Full size table

The performance improvement of the system for greater R due to the fewer number of required wireless hops is shown in Fig. 15 a, b. When $r=\surd 8$, more wireless hops are needed and this would lead to lower performances. For $r=\sqrt{50}\hbox {L}$, although all communications are of single hop, accessing channel requires more time due to bigger group size. Even in case of higher traffic, performance will be worse than $r=\sqrt{8}\hbox {L}$. Moreover, for $r=\surd 50\hbox {L}$ more buffers are needed, that is a higher cost. As shown in Fig. 15, $r=\surd 10\hbox {L}$ and $r=4\hbox {L}$ have better performance because the maximum number of wireless hops in these two cases is 2. When $r=4\hbox {L}$, this maximum number is much lower than $r=\surd 10\hbox {L}$. In fact, only the communications between the opposite corners will need this maximum number of wireless hops, while the other communications would need only one wireless hop. Hence, the maximum number of wireless hops is one. Also, because of smaller wireless groups, accessing the wireless channel will be faster. Considering all these, it is observed that $r=4\hbox {L}$ has the best performance because of smaller group sizes and more communication range. It should be noted that in this case, an increase in communication range will not have a significant effect on system performance, because wireless operation is almost one hop. Hence, an increase in communication range will only increase the cost. On the other hand, if the size of groups is reduced while fixing the communication range, more carrier frequencies will be needed, leading to a need of more receivers and more buffer resources.

3.3 Performance with permanent faults

At first, the effects of fault occurrence in different zones of network (Fig. 6) are investigated. For this purpose, five different cases, each with one faulty region located in one zone of the network, are considered. The faulty region includes 8 normal routers and one wireless router.

The comparison of average latency versus offered load under uniform random traffic for FTM-NoC architecture is illustrated in Fig. 16a. Since this proposed routing algorithm acts in a deterministic manner in the surroundings of the faulty region in addition to the fact that the central nodes of the network have a greater role than the ones at the boundary, there will be severe congestion around faulty regions in the Center-Zone. Hence, this routing protocol will behave differently in the Center-Zone. This difference is less in the FWNoC architecture (with $L=3$, $r=4\hbox {L}$) as shown in Fig. 16b. This is because the FWNoC architecture utilizes wireless shortcut links resulting in less packets encountering the faulty region. Comparison of the curves $a$ and $b$ in Fig. 16 indicates that FWNoC becomes saturated at higher loads compared to FTM-NoC architecture. This is due to the fact that the traveled distance between cores in FWNoC is shorter than that of the FTM-NoC architecture by using wireless shortcut links. Note that the wireless infrastructure would not be disrupted in the presence of only one faulty region due to this proposed arrangement of wireless routers. However, as the travel distance of some communications exceed T, there is degradation of the system performance in comparison with the fault-free case in the FWNoC architecture. This is followed by studying the effect of different fault patterns on the system performance. Since the previously already proposed WNoC architectures do not have any mechanism for handling multiple faulty regions, the simulations in this study include FWNOC1 (with $L=3$ and $r=4\hbox {L}$), FWNOC2 (with $L=4$ and $r=3\hbox {L}$), and FTM-NOC architecture. Four synthetic traffic models including uniform random, transpose, bit reverse, and bit complement are considered with node fault rates varying from 0 to 30 %. Five fault patterns are chosen for each fault rate. These patterns are the combination of normal routers and wireless routers at different locations in the network.

The system performance is evaluated in terms of average latency and the saturation throughput, the latter being the maximum accepted traffic. It should be noted that for getting meaningful results for computing the average latency, low offered load value must be selected, because higher offered loads quickly saturate the network. This offered load is set at 0.1 GB/s according to the experience discussed above. The average of values obtained from the five different fault patterns is considered (Figs. 17, 18). As observed from here, in the fault-free case, the performance of FWNoC1 and FWNoC2 is improved compared to FTM-NOC in a significant manner. This is because the travel distance between any pair of nodes is always less than T due to wireless links. This improvement is quite remarkable for the transpose traffic and bit reversal due to having the source–destination pairs with the most of the diagonal links that require long-range wireless links. When the bit complement pattern is used, lower performance of the system becomes evident (Figs. 17, 18) due to hotspot occurrence. In the presence of faults, the FWNoC architectures perform better than FTM-NOC architectures, because FWNoC architectures provide access to backup wireless routers and use wireless node-disjoint paths.

In lower fault rates (5 % node faults), the performance degradation of FWNoC architectures is low for the four traffic patterns, in comparison with the fault-free case. This is because the arrangement of wireless routers is such that the faulty regions have less impact on the wireless infrastructures. When the fault rate increases, the distribution of fault regions will increase on the chip with an effect on this proposed infrastructure in a drastic manner. This leads to further degradation of FWNoC performance as shown in Figs. 17 and 18. The performance degradation of the FWNoC2 is noticeable, because access to wireless routers is more than before. Moreover, the possibility of disruption of infrastructure in FWNoC2 is higher than that of FWNoC1 due to the availability of less wireless router in FWNoC2. This result indicates the importance of choosing a proper value for L as mentioned in Sect. 2.1. It can be deduced that the FWNoC1 has the best performance among all scenarios, because the architecture parameters (L and R) of this architecture is such that the faulty regions have less impact on the infrastructure of wireless system than that of the FWNoC2; consequently, the wireless links become less overloaded. At a high fault rate, with respect to the high possibility of infrastructure disruption and the possibility of the failure of all wireless routers of a given node, the communications will be performed through wired links; hence, there is less difference between the architectures’ performance(Figs. 17, 18).

3.4 Feasibility evaluation

To evaluate the feasibility of this proposed architecture, its respective area requirements and system power consumption are considered. In this study, based on the mesh-based interconnection model introduced by [38], the FWNoC architecture is implemented in a 45 nm CMOS technology over a 20 mm*20 mm die with a clock frequency of 2 GHz and a supply voltage of 1.1 v [39].

Area cost The area overhead of this proposed infrastructure is caused by the normal routers, wireless routers, and wire links. Orion2 [40] is used in calculating the router area. Recently, authors in [41, 42] revealed some inaccuracies in the Orion2. They reported that Orion2 calculates the power and area for SRAM array-based buffers and multiplexer-based crossbars inaccurately. Hence, the SRAM bit cell spacing is updated here according to [41]. The characterized parameters for the 45 nm technology such as interconnection and process parameters are updated in accordance with [7, 43]. Furthermore, a matrix crossbar is used here instead of the multiplexer-based crossbar model in Orion2. The area of a $5\,\times \,5$ crossbar and a 64-bit buffer is estimated as $0.0303\,\hbox {mm}^{2}$ and $0.00327\,\hbox {mm}^{2}$, respectively. The area for the wired links is also estimated as $0.0076\,\hbox {mm}^{2}$ using Orion 2.0. For the normal routers, considering the proposed fault model, the configuration information is stored in an eight-bit status register. Furthermore, the FAFTR algorithm imposes combinational logic blocks like modified multiplexers and newly developed decoders to each port of the routers in relation to the standard minimal West-first routing algorithm. By modifying the routing computation unit in Orion2, a $5\,\times \,5$ router is estimated as $0.31286\,\hbox {mm}^{2}$. Since the wireless routers are the normal routers equipped with wireless interface, this area is expanded due to wireless links, transmitter and receiver buffers, modified crossbars, and logic for controlling wireless routers’ behavior. The area overheads of the wireless links are obtained through analytical and experimental results from [8]. The total area overhead for one wireless link is $151\upmu \hbox {m}^{2}$ [8]. This area is for both the transmitter and receiver, including the modulators, demodulators, low-noise amplifiers (LNAs), and antennas. The area of 6*6 crossbars is estimated as $0.0386\,^{2}$. In wireless interface, other logics are implemented for controlling the wireless routers. Overall hardware overhead of a routing decision-making logic mainly consists of, eight 2-to-1 multiplexers, two 4-bit subtractors, and one 4-bit comparator. This overhead is estimated as an area of approximately 0.14098 $\mathrm{mm}^2$ in 45 nm technology by scaling estimated areas from [26, 44]. The area costs of other components are adopted from [44]. Note that the additional area overhead of the wireless routers is different with respect to the number of receivers in the wireless port. The area overhead of FWNOC1 and FWNOC2 is approximately increased by 36 and 27 %, in relation to the FTM-NoC architecture. Buffers used in wireless ports are the major reason of this area increase. However, there is a trade-off between additional area overhead and performance benefits achieved from this proposed architecture.

Power consumption The other major motivation in designing an alternative interconnection is the increase of power consumption in electrical interconnections in future multi-core systems. To evaluate this power consumption, a power estimation model is integrated into the simulator to perform high-level power analysis.

The energy dissipations of wireless links are obtained in an analytical manner from [8]. The energy dissipation of the longest wireless link of 23 mm is 0.33 pJ/bit, which supports the data rate of 10 Gbps on the chip. For FWNOC1, according to the parameter ${r}=4\hbox {L}$, the maximum communication distance is 15.96 mm ($\sim $16mm). Assuming an ideal line of sight channel and free space loss, a transmitted power of approximately 0.63 mW can provide acceptable transmission performance for this length and this receiver sensitivity. The modified Orion2 is used for estimating power consumption of one flit transmission across a wired link and a normal router. The energy dissipation of a 64-bit metal link is estimated as 7.68 pJ/mm. The energy required for a single hop through a normal router is the total of energy spent in units of a router (neglecting arbiter energy) [3]. The energy dissipation of a buffer and a 5*5 crossbar is estimated as 4.48 and 8.96 pJ, respectively, for a 64-bit flit. The decision-making unit of routers is characterized in terms of unit components such as full adder, 2-bit comparator, 2-to-1 multiplexer and logic gates. The energy dissipation of these unit components is estimated for 45 nm technology by scaling the model [45]. The average energy dissipation is estimated around 2.3 pJ for a 4-bit unit of a decision-making unit when all components are in the output transition state. This value is considered when a packet needs this unit.

Figure 19a–d shows the comparison of average power consumption achieved by the three architectures in four synthetic traffic patterns, respectively, under offered load = 0.1 GB/s with node fault rates varying from 0 to 30 %. The obtained values are normalized with respect to the power consumption of FTM-NoC architecture in the fault-free case. As mentioned before, the FWNoC architecture achieves more advantages by reducing latency through long-range and low-power wireless shortcut links than FTM-NoC. In fact, packets travel fewer hops and access fewer functional units in FWNoC architecture, so the overall power consumption decreases accordingly as shown in Fig. 19. With an increase in L (FWNoC2), the throughput degrades and latency increases due to the less number of wireless routers and higher required number of wired hops. Here, each packet uses network resources for a longer period of time. Hence, the average power consumption increases in the FWNoC2. The FWNoC1 configuration has the lowest power consumption; however the area overheads of FWNOC1 architectures is higher than that of the FWNOC2.

4 Conclusions and future work

In this article, based on a two-tiered hierarchical model, a fault-tolerant hybrid WNoC architecture is presented. In the lower tier, the cores are connected by a standard mesh network. By an innovative approach, a reasonably low number of routers of this mesh network are selected to be equipped with wireless interface. These wireless routers provide long-distance communication in the higher tier. The main contribution of this study is applying node-disjoint communication structures in the hybrid WNoC for handling permanent faults with moderate hardware cost overhead. Specific features of the proposed architecture can be described as follows:

Each core records at least two wireless routers addresses, considered as primary and backup wireless routers. Each core connects to its corresponding wireless router with an allowable Manhattan distance. Moreover, these wireless routers are located with a maximum possible distance between one another for tolerating faulty regions. Various configurations can be created for this proposed design depending on the maximum allowable distance and wireless transmission range. With respect to the threshold distance and system size, the proper values of architecture parameters are obtained by analysis and simulation.
This design provides a regular arrangement of wireless routers where there exists at least one pair of node-disjoint paths between each two wireless routers. This regularity can provide the communication protocols applying node-disjoint wireless paths with a low cost.
A FAFTR algorithm is presented based on the minimal West-fist algorithm for fault-tolerant wired communications. Moreover, an effective WMAC and a quadrant-based routing algorithm are adopted for wireless communications. These algorithms are deadlock free with moderate complexity.

The experimental results demonstrated that the FWNoC architectures with the proper parameters provide higher fault tolerance compared to that of their conventional wired counterparts in the presence of various fault regions. The FWNoC architectures achieve considerable improvement in network throughput, transmission latency, and power efficiency under different traffic patterns compared to fault-tolerant mesh-based NoC. This is due to the fact that FWNOC architectures provide node-disjoint paths through long-range and low-power wireless links.

A prior knowledge of traffic can assist choosing the positions of wireless routers and allocation of wireless routers to nodes. This issue needs further investigation. Hence, traffic dependency and real traffic patterns will be addressed in future works in detail. Moreover, due to data transmission through node disjoint path, this issue proposed design can help manage the transient faults on wireless links. This issue can become a possible topic for future studies. Conducting studies on other fault models would contribute to this field.

References

Benini L, De Micheli G (2002) Networks on chip: a new paradigm for systems on chip design. Proc 2002 Des Autom Test Eur Conf Exhib IEEE Comput Soc:418–419. doi:10.1109/DATE.2002.998307
Kaplan AB (2008) Architectural integration of RF-interconnect to enhance on-chip communication for Many-Core Chip Multiprocessors. Ph.D. diss., University of California Los Angeles
Shacham A, Bergman K, Carloni LP (2008) Photonic Networks-on-Chip for Future Generations of Chip Multiprocessors. IEEE Trans Comput 57:1246–1260. doi:10.1109/TC.2008.78
Article MathSciNet Google Scholar
Deb S, Ganguly A, Pande PP et al (2012) Wireless NoC as Interconnection Backbone for Multicore Chips: Promises and Challenges. IEEE J Emerg Sel Top Circuits Syst 2:228–239. doi:10.1109/JETCAS.2012.2193835
Article Google Scholar
Ogras UY, Marculescu R (2006) “It’s a small world after all”: NoC performance optimization via long-range link insertion. IEEE Trans Very Large Scale Integr Syst 14:693–706. doi:10.1109/TVLSI.2006.878263
Article Google Scholar
Sabbaghi-Nadooshan R, Modarressi M, Sarbazi-Azad H (2010) The 2D digraph-based NoCs: attractive alternatives to the 2D mesh NoCs. J Supercomput 59:1–21. doi:10.1007/s11227-010-0410-6
Article Google Scholar
ITRS Edition (2011) . http://www.itrs.net/Links/2011ITRS/Home2011.htm. Accessed 10 Oct 2014
Ganguly A, Chang K, Deb S et al (2011) Scalable Hybrid Wireless Network-on-Chip Architectures for Multicore Systems. IEEE Trans Comput 60:1485–1502. doi:10.1109/TC.2010.176
Article MathSciNet Google Scholar
Deb S, Chang K, Yu X et al (2013) Design of an Energy-Efficient CMOS-Compatible NoC Architecture with Millimeter-Wave Wireless Interconnects. IEEE Trans Comput 62:2382–2396. doi:10.1109/TC.2012.224
Article MathSciNet Google Scholar
Chang K, Deb S, Ganguly A et al (2012) Performance evaluation and design trade-offs for wireless network-on-chip architectures. ACM J Emerg Technol Comput Syst 8:1–25. doi:10.1145/2287696.2287706
Article Google Scholar
Hu W-H, Wang C, Bagherzadeh N (2014) Design and analysis of a mesh-based wireless network-on-chip. J Supercomput. doi:10.1007/s11227-014-1341-4
Google Scholar
Matolak D, Kodi A, Kaya S et al (2012) Wireless networks-on-chips: architecture, wireless channel, and devices. IEEE Wirel Commun 19:58–65. doi:10.1109/MWC.2012.6339473
Article Google Scholar
More A, Taskin B (2012) A unified design methodology for a hybrid wireless 2-D NoC. 2012 IEEE Int Symp Circuits Syst IEEE: 640–643. doi:10.1109/ISCAS.2012.6272113
Chang K-C (2009) Reliable network-on-chip design for multi-core system-on-chip. J Supercomput 55:86–102. doi:10.1007/s11227-009-0376-4
Article Google Scholar
Radetzki M, Feng C, Zhao X, Jantsch A (2013) Methods for fault tolerance in networks-on-chip. ACM Comput Surv 46:1–38. doi:10.1145/2522968.2522976
Article Google Scholar
Wettin P, Pande PP, Heo D et al (2013) Design space exploration for reliable mm-wave wireless NoC architectures. IEEE 24th Int Conf Appl Syst Archit Process IEEE:79–82. doi:10.1109/ASAP.2013.6567554
Ganguly A, Pande P, Belzer B, Nojeh A (2011) A unified error control coding scheme to enhance the reliability of a hybrid wireless network-on-chip. IEEE Int Symp Defect Fault Toler VLSI Nanotechnol Syst IEEE:277–285. doi:10.1109/DFT.2011.24
Wettin P, Vidapalapati A, Gangul A, Pande PP (2013) Complex network-enabled robust wireless network-on-chip architectures. ACM J Emerg Technol Comput Syst 9:1–19. doi:10.1145/2491676
Article Google Scholar
Nojeh A, Pande P, Ganguly A et al (2008) Reliability of wireless on-chip interconnects based on carbon nanotube antennas. IEEE 14th Int Mix Sensors Syst Test Work IEEE:1–6. doi:10.1109/IMS3TW.2008.4581628
Vijayakumaran V, Yuvaraj MP, Mansoor N et al (2014) CDMA Enabled Wireless Network-on-Chip. ACM J Emerg Technol Comput Syst 10:1–20. doi:10.1145/2536778
Article Google Scholar
Lloyd EL (2010) Fault-Tolerant Relay Node Placement in Heterogeneous Wireless Sensor Networks. IEEE Trans Mob Comput 9:643–656. doi:10.1109/TMC.2009.161
Article MathSciNet Google Scholar
Chen C-W, Chung C-P (2005) Designing a disjoint paths interconnection network with fault tolerance and collision solving. J Supercomput 34:63–80. doi:10.1007/s11227-005-0327-7
Article Google Scholar
Furhad MH, Kim J-M (2014) A shortly connected mesh topology for high performance and energy efficient network-on-chip architectures. J Supercomput 69:766–792. doi:10.1007/s11227-014-1178-x
Article Google Scholar
Zhao D, Wang Y (2008) MTNet: Design of a Wireless Test Framework for Heterogeneous Nanometer Systems-on-Chip. IEEE Trans Very Large Scale Integr Syst 16:1046–1057. doi:10.1109/TVLSI.2008.2000820
Article Google Scholar
Zhao D, Wang Y, Li J, Kikkawa T (2011) Design of multi-channel wireless NoC to improve on-chip communication capacity!. Proc Fifth ACM/IEEE Int Symp Netw Chip ACM:177–184
Zhao D, Wang Y (2008) SD-MAC: Design and Synthesis of a Hardware-Efficient Collision-Free QoS-Aware MAC Protocol for Wireless Network-on-Chip. IEEE Trans Comput 57:1230–1245. doi:10.1109/TC.2008.86
Article MathSciNet Google Scholar
Deb S, Ganguly A, Chang K et al (2010) Enhancing performance of network-on-chip architectures with millimeter-wave wireless interconnects. In: ASAP 2010–21st IEEE Int. Conf. Appl. Syst. Archit. Process. IEEE, pp 73–80. doi:10.1109/ASAP.2010.5540799
Lee S-B, Zhang L, Cong J et al (2009) A scalable micro wireless interconnect structure for CMPs. In: Proc. 15th Annu. Int. Conf. Mob. Comput. Netw.—MobiCom ’09. ACM Press, New York. doi:10.1145/1614320.1614345
Kempa K, Rybczynski J, Huang Z et al (2007) Carbon Nanotubes as Optical Antennae. Adv Mater 19:421–426. doi:10.1002/adma.200601187
Article Google Scholar
West DB (2001) Introduction to graph theory. Prentice hall, Upper Saddle River
Google Scholar
Boppana RV, Chalasani S (1995) Fault-tolerant wormhole routing algorithms for mesh networks. IEEE Trans Comput 44:848–864. doi:10.1109/12.392844
Article MATH Google Scholar
Duato J, Yalamanchili S, Ni LM (2003) Interconnection networks: an engineering approach. Morgan Kaufmann Publishers Inc, USA
Google Scholar
Zhang Z, Greiner A, Taktak S (2008) A reconfigurable routing algorithm for a fault-tolerant 2D-Mesh Network-on-Chip. Proc Des Autom Conf:441–446. doi:10.1109/DAC.2008.4555858
Manevich R, Polishuk L, Cidon I, Kolodny A (2014) Designing single-cycle long links in hierarchical NoCs. Microprocess Microsyst 38:814–825. doi:10.1016/j.micpro.2014.05.005
Article Google Scholar
Varga A (2001) The OMNeT++ discrete event simulation system. Proc. Eur. Simul. Multiconference. sn. p 185
Ben-Itzhak Y, Zahavi E, Cidon I, Kolodny A (2012) HNOCS: modular open-source simulator for Heterogeneous NoCs. Int Conf Embed Comput Syst IEEE. pp 51–57. doi:10.1109/SAMOS.2012.6404157
Koohi S, Hessabi S (2011) Hierarchical opto-electrical on-chip network for future multiprocessor architectures. J Syst Archit 57:4–23. doi:10.1016/j.sysarc.2010.07.003
Article Google Scholar
Pande PP, Grecu C, Jones M et al (2005) Performance evaluation and design trade-offs for network-on-chip interconnect architectures. IEEE Trans Comput 54:1025–1040. doi:10.1109/TC.2005.134
Article Google Scholar
Zhang Y, Hu X, Deutsch A et al (2011) Prediction and comparison of high-performance on-chip global interconnection. IEEE Trans Very Large Scale Integr Syst 19:1154–1166. doi:10.1109/TVLSI.2010.2047415
Article Google Scholar
Kahng a. B, Li BL Bin, Peh L-SPL-S, Samadi K (2009) ORION 2.0: A fast and accurate NoCapower and area model for early-stage design space exploration. 2009 Des Autom Test Eur Conf Exhib .pp 423–428. doi:10.1109/DATE.2009.5090700
Hayenga M, Johnson DR, Lipasti M (2010) Pitfalls of orion-based simulation. ORION 35:0–40
Google Scholar
Sun C, Chen C-HO, Kurian G, et al. (2012) DSENT—a Tool Connecting Emerging Photonics with Electronics for Opto-Electronic Networks-on-Chip Modeling. 2012 IEEE/ACM Sixth Int. Symp. Networks-on-Chip. IEEE. pp 201–210. doi:10.1109/NOCS.2012.31
PTM—Latest models. http://ptm.asu.edu/latest.html. Accessed 10 Oct 2014
Francis RM (2009) Exploring networks-on-chip for FPGAs. Ph.D. diss., University of Cambridge
Banerjee N, Vellanki P, Chatha KS (2004) A power and performance model for network-on-chip architectures. Proc Des Autom Test Eur Conf Exhib IEEE Comput Soc:1250–1255. doi:10.1109/DATE.2004.1269067

Download references

Author information

Authors and Affiliations

Department of Computer Engineering, University of Isfahan, Isfahan, Iran
Abbas Dehghani & Kamal Jamshidi

Authors

Abbas Dehghani
View author publications
You can also search for this author in PubMed Google Scholar
Kamal Jamshidi
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Kamal Jamshidi.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Dehghani, A., Jamshidi, K. A fault-tolerant hierarchical hybrid mesh-based wireless network-on-chip architecture for multicore platforms. J Supercomput 71, 3116–3148 (2015). https://doi.org/10.1007/s11227-015-1430-z

Download citation

Published: 24 April 2015
Issue Date: August 2015
DOI: https://doi.org/10.1007/s11227-015-1430-z

Keywords

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

A fault-tolerant hierarchical hybrid mesh-based wireless network-on-chip architecture for multicore platforms

Abstract

Similar content being viewed by others

A fault-tolerant and congestion-aware architecture for wireless networks-on-chip

An efficient fault-tolerant routing algorithm in NoCs to tolerate permanent faults

Performance-Enhanced $$d^2$$ -LBDR for 2D Mesh Network-on-Chip

1 Introduction