Keywords

1 Introduction

Semiconductor technology has advanced towards System on Chip (SoC), in which every one of the functions required a specific product, it is incorporated into a single chip [1]. SoC tools is capable of developing cycle while increasing the performance and usefulness. At the end of the day, previous processors memory and interfacing devices are separately. SoC based processors are reduce the complexity and cost [2].

Systems on Chip (SoC) are planned utilizing prior components, for example, processors, memory clusters and DSPs. These are cores. In these cores have communication problem because cores are not scalable in present technology, can be solved through concurrent communication and point to point communication using Network on Chip (NoC). It can be composed various ways, according to the system design [3]. A generic NoC architecture consist of several cores, Routers (R) and Network Interface (NI). The cores in NoC generally resembles the mesh, which is clearly depicted in Fig. 1. These cores can be homogeneous i.e. central processing unit, or heterogeneous i.e. audio-video cores, wireless transmitter and receiver, etc. In NoC every core is linked with local router through Network Interface. In the similar fashion, every router is consequently associated with neighboring router, establishing a packet based NoC [4].

Fig. 1.
figure 1

NoC architecture

In this paper, we confine our contemplation to mesh topology and data transferred in the form of packets. Take note of that our work are not restricted to mesh topologies, but rather we adhere to this confinement to be more particular in this paper. Researchers have observed around communication energy issues and proposed different strategies to the optimal parallelization of the mapping. Our paper deals with the obtained issue of task allocation onto the parallel processors, by proposing an elective way to deal with above mentioned systems. We model energy efficient mapping (EMAP) algorithm, main goal of EMAP is saving communication energy.

The remaining part of this paper is organized as follows. Section 2 presented Literature review, Sect. 3 explained our Methodology and Mapping algorithm is elaborated with example in Sect. 4. Experimental results and hardware verification are presented in Sects. 5 and 6, followed by conclusion in Sect. 7.

2 Previous Work

We refer the reader to a few recent surveys on core mapping NoCs for pointers to modern research and improvement. Nectarios Koziris and et al., proposed Physical Mapping of Clustered Task Graphs onto Multiprocessor Architectures. Map the task graphs on nodes of the processor network with respect low complexity [5]. In [6] Srinivasan Murali and et al., presented NMAP algorithm, which maps the cores onto NoC architectures with respect to bandwidth and it is presented both single minimum-path routing and split traffic routing. It contains three phases, initial one is mapping generation, based on a which module has highest number of communication, next one is find shortest path using Dijkstr’s algorithm and finally iterative improvement, by changing pairs of modules and recomputing shortest paths. A simple technique for low energy mapping and routing in Network-on-Chip architectures. This problem projected in the occurrence of bandwidth and latency explained Krishnan Srinivasan and Karam Chatha [7]. In [8], described mapping algorithm which consequently maps a given set of property onto a NoC and builds a deadlock-free deterministic routing function for performance and minimize the communication energy.

Wein-Tsung Shen et al., proposed a well-organized binomial core mapping and optimization algorithm (BMAP) to reduce the hardware cost of on-chip network (OCN) infrastructure. BMAP complexity is O(N2log(N)) and provides more economical when compared to traditional mapping techniques and it is a fast and efficient technique when compared to NMAP [9]. Wei Hu et al., [10] discussed fast algorithm, which maps the cores onto a WK-Recursive NoC under Performance Constraints. A new energy efficient and bandwidth aware topological core mapping on NoC platform. Which is significant reduction in execution time presented Saurabh Agrawal et al., [11]. Nithin Michael et al., proposed map the applications on network on chip nodes (processors) with respect to communication cost. The idea is that minimizing communication cost translates to minimizing dynamic power and leads to better energy-efficiency [12]. They have done incredible work and gave an excellent guideline for this paper. The commitments of this paper are to propose mapping algorithm with respect to communication energy.

3 Methodology

We assume to have an application that should be mapped onto the NoC platform. Every application has multiple parallel tasks. By method for static analysis, it is probably regulate the data traded among cores. Which indicates how frequently a packet is sent between the source to destination in the network. Our problem is to map the cores onto the mesh based NoC Platform. This paper illustrates energy efficient mapping on NoC under communication rate constraints, which minimize the total communication energy. It is formalized in the next section.

3.1 Mathematical Formulation of the Mapping Problem

The communication rate between the vertices is denoted by the core graph:

Definition 1:

An Application Core graph is a directed graph ACG(V, E), V represents Vertex (\(\forall {V_i} \in V, \forall {V_j} \in V \)) E represents communication rate between the two vertices \(\forall \mathrm{e}_{ij} \in E , e_{ij}\) denoted by rate, signifies the communication rate from \(V_{i}\) to \(V_{j}\) [13, 14].

The connectivity and communication of the NoC is represented by topology graph:

Definition 2:

NoC preferred mesh topology, this topology graph can be uniquely described TG(N, D) is a directed graph. N representing a node or tile in the topology \(\forall {\mathrm{t}_{xy}} \in N, \mathrm{t}_{xy}\) represents \(x^{th}\) row and \(y^{th}\) column of tile. D represents communication distance \(\forall {\mathrm{N}_{ij}} \in D,\,\mathrm{N}_{ij}\) denotes distance between node (\(N_{i}\)) and node (\(N_{j}\)) [15].

Problem 1:

Given an Application Core Graph (V, E) and NoC Topology Graph (N, D); Find a mapping function

\(\varOmega \) : V \(\rightarrow \) N have \(\forall {V_i} \in V \), \(\forall {V_j} \in V \) and total communication energy.

figure a

energy is calculated in terms of distance(hops) and communication rate using [16].

$$\begin{aligned} \text {WMD}(V_i, V_j) = e_{ij} \text { X MD (MAP}(V_i), \text { MAP}(V_j)) \end{aligned}$$
$$\begin{aligned} \text {MD (MAP}(V_i), \text {MAP}(V_j))=|x_1 - x_2| + |y_1 - y_2| \end{aligned}$$

here \(V_i\) parameters are (\(x_1\), \(y_1\)) and \(V_j\) parameters are \((x_2, y_2)\). Let map(\(V_i\)) = (\(x_1\), \(y_1\)) and map(\(V_j\)) = (\(x_2\), \(y_2\)). The parameters MD(MAP(\(V_i\)), MAP(\(V_j\))) represents the Manhattan distance between cores. More preciously the resulting communication energy is

$$\begin{aligned} CE_{ij} = e_{ij} \text { x }N_{ij} \end{aligned}$$

\(N_{ij}\) denotes distance between node (\(N_{i}\)) and node (\(N_{j}\)). \(e_{ij}\) signifies the communication rate from \(V_{i}\) to \(V_{j}\).

\(\text {Total communication energy} =\sum \limits _{\forall {(i,j)}}{\mathrm{e}_{ij} \text { x } N_{ij}}\)

Problem 2:

Given an Application Core Graph (V, E) and NoC Topology Graph (N, D); Find a mapping function and minimum communication energy. Which

$$\begin{aligned} \text {CE} = \sum \limits _{\forall {(i,j)}}{\mathrm{e}_{ij}} \text { x } N_{ij} \end{aligned}$$

such that : Using these definitions, the problem of mapping can be formulated as follows. Given an ACG and an NoC topology that satisfy

$$\begin{aligned}&\text {size (ACG)} \le \text { size (NoC Topology)} \\&\text {map}(\mathrm{V}_{i}) \in N, \,\, \end{aligned}$$

Problem 1 clearly explained mapping and communication energy. For solution of minimum communication energy is

\(\text {f(CE)}= \left\{ \begin{array}{cl} \text {min CE} &{}: \text {intial mapping} \\ \text {min } \{\text {ce:} \text { ce} \in \text {CE}\} &{}: \text {changing mapping} \end{array} \right\} \)

4 EMAP Algorithm

In this section, we present the energy efficient mapping algorithm (EMAP), which have little complexity and gives ideal results. Which minimize the total communication energy. In proposed algorithm pseudo code as shown in Fig. 2. It explains map the first vertex, which has maximum communicate rate and select node which has maximum free neighbouring nodes. Map the vertex on nodes. Same procedure repeated for mapping of all vertices and it should be minimum communication energy, otherwise mappings with pairwise swapping of vertices are calculated to find the mapping with the minimum communication energy.

Fig. 2.
figure 2

EMAP algorithm

Fig. 3.
figure 3

Mapping of core graph onto NoC graph (a) an example core graph (b) NoC graph (c) Mapping

Figure 3, shows an EMAP algorithm example. A simple Application core graph is shown in Fig. 3(a) and Mesh NoC graph for a 25-node is shown in Fig. 3(b). In our example 8 vertices mapped to the 25 nodes (8 \(<<\) 25) EMAP is shown in Fig. 3(c) based on mapping algorithm. Selection of nodes based on maximum communication and communication energy.

5 Evaluation

We simulated the our energy efficient core mapping algorithm (EMAP) using Noxim simulator [17] for synthetically generated random traffic patterns as well as for common benchmark patterns like MPEG4 decoder, MWD, Video Object Plane Decoder (VOPD) and Dual Screen Display (DSD) shown in Table 1. Figure 4, shows the communication energy for the applications with the same communication rate for all algorithms. As seen from the figure, MMAP algorithm perform well for all applications when compared to the other algorithms (NMAP [6], BMAP [9] and PMAP [5]).

Table 1. Benchmarks
Table 2. Characteristics of different algorithms and optimization algorithm
Fig. 4.
figure 4

Communication energy of different mappings for benchmarks (normalized)

In order to compare algorithm domain and optimization is shown in Table 2. NMAP algorithm is mainly based on iteration of vertices and path optimization, BMAP algorithm mainly based on greedy binomial merge, PMAP algorithm is mainly based on nodes of the processor network with respect to low complexity and EMAP algorithm efficient mapping by swapping nodes in the network. Finally we can see that EMAP algorithm gives better mapping with minimum communication energy.

6 FPGA Implementation and Verification Environment

The energy efficient mapping (EMAP) algorithm was simulated, synthesized and verified on Kintex 7 FPGA provided by Xilinx [18]. The 7 series FPGAs offered by Xilinx has 4 FPGA families that address a wide range of system requirements that spans from cost sensitive - small form factor requirements to ultra-high-end connectivity bandwidth for high volume applications, and signal processing architectures for high speed applications. Kintex 7 FPGA, which is a part of the cutting edge FPGA family from Xilinx has special features such as high-speed transceivers, clock management timers, and DSP blocks. Because of the advanced architecture and technology, Kintex 7 has a high frequency of operation. Moreover because of the smaller process technology, the power consumption is very less. The advantages of such a platform are not only providing logic resources to emulate real communications but also to carryout the performance evaluation in a short time. The main clock frequency of the platform is 66 MHz. Its maximum bandwidth can reach 62.94 MByte / s [19].

The energy efficient mapping (EMAP) algorithm was designed using Verilog hardware description language. The design was simulated on Vivado 2016.4 tool provided by the Xilinx [20].

Table 3. Hardware utilization of different mapping algorithms

The proposed energy efficient mapping (EMAP) algorithm was implemented on Kintex 7 FPGA platform using Verilog HDL. Table 3 shows the hardware utilization for different mapping algorithms.

Table 4 shows the power utilization for EMAP on Kintex 7 FPGA family. It is seen that the proposed mapping shows reduced the power consumption when compared to the different mapping algorithms.

Table 4. Power consumption for existing and proposed mapping algorithm

7 Conclusion

In this paper presented an energy efficient core mapping algorithm onto the mesh NoC platform. We have accessible mapping algorithm that maps the cores onto the NoC under communication rate constraints to minimize the total communication energy. EMAP algorithm evaluated for random generated benchmarks. There is a significant saving communication energy when compared to existence algorithm. An experimentation environment was developed for simulating the proposed approach and the simulation was verified on FPGA board (Kintex-7 FPGA housed on KC705 evaluation kit).