1 Introduction

Hardware/software partitioning divides a complex heterogeneous system into hardware co-processor functions and compatible software programs. It is a prominent practice in system-on-chip (SoC) design that can achieve results beyond software-only or hardware-only solutions, improving system performance [1] and reducing total energy consumption [2]. The proposed partial dynamic reconfiguration method does not depend on any particular tool. It uses a set of algorithms to detect crucial code regions, compile/synthesize hardware/software modules, and update the communication logic. Hence, it can tune the system for full efficiency without disrupting other SoC-related operations. Here, a genetic algorithm (GA) is used for the optimization process. This is essential in system-level design, since the decision-making process affects the total performance of the system. This paper presents a novel system partitioning technique with in-depth analysis. The paper is organized as follows. Section 2 reviews previous work in this field. Section 3 presents the proposed system model for the partitioning problem. Section 4 gives the results and their analysis. Section 5 concludes the paper and discusses future work.

2 Related Work

Compared to dynamic partitioning using standard software, run-time (partial) dynamically reconfigurable systems have attained superior performance with manually specified, predetermined hardware regions. Multiple preplanned reconfiguration choices were rapidly executed in run-time reconfigurable systems using the PipeRench architecture [3] and dynamically programmable gate arrays (DPGAs) [4]. The binary-level partitioning technique [5] provided a better solution than source-level partitioning methods because it works with any high-level language and software compiler. However, since performance satisfaction was not considered in this system's cost function, it may become trapped in local minima. A mapping technique for nodes and hardware/software components, the GCLP algorithm, was developed in [6]. The hardware cost was minimized by combining a hill-climbing heuristic with the hardware/software partitioning algorithm [7].

3 System Model for Partitioning

Resolving the problem requires defining a system model that represents the important issues in hardware/software co-design for a specific problem [8]. The system partitioning problem is modeled as a task graph (TG) flow diagram. A TG is a directed acyclic graph (DAG) with weight vectors. Formally, it is defined as \(G=(V, E)\), where ‘V’ represents the nodes and ‘E’ represents the edges. Each edge indicates the flow direction. To reduce the complexity of the TG, it can be normalized to a single start node and a single end node. Figure 1 gives an overview of the partitioning procedure. Design constraints and design specifications are given as input to the partitioning process in a high-level specification language. Nodes can represent large units of work such as tasks and processes (coarse granularity) or small units such as instructions and operations (fine granularity).

Fig. 1 System model for partitioning
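As a concrete (and purely illustrative) rendering of this model, the following Python sketch stores a TG as a DAG and normalizes it to a single start node and a single end node; the class and method names are hypothetical, not from the paper.

```python
from collections import defaultdict

class TaskGraph:
    """Hypothetical TG representation: a DAG G = (V, E)."""

    def __init__(self):
        self.succ = defaultdict(set)   # node -> set of successors
        self.nodes = set()

    def add_edge(self, u, v):
        self.nodes.update((u, v))
        self.succ[u].add(v)            # edge direction = flow direction

    def normalize(self):
        """Reduce complexity: one start node feeding every source,
        one end node fed by every sink."""
        has_pred = {v for succs in self.succ.values() for v in succs}
        sources = [n for n in self.nodes if n not in has_pred]
        sinks = [n for n in self.nodes if not self.succ[n]]
        for s in sources:
            self.add_edge("start", s)
        for t in sinks:
            self.add_edge(t, "end")

g = TaskGraph()
g.add_edge("v1", "v2"); g.add_edge("v1", "v3"); g.add_edge("v2", "v4")
g.normalize()   # the graph now runs from "start" to "end"
```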

After the system space estimation, every node is tagged with attributes. Each node \((V_\mathrm{i,j} )\) carries five attributes, as follows:

  1. Hardware area \((\text {HA}_\mathrm{i,j} )\)

  2. Hardware implementation time \((\text {HT}_\mathrm{i,j} )\)

  3. Software memory size \((\text {SS}_\mathrm{i,j} )\)

  4. Software execution time \((\text {ST}_\mathrm{i,j} )\)

  5. Average number of executions \((N_\mathrm{i,j} )\)

In short,

  • Hardware module \(\left( {\text {HM}_\mathrm{i,j} } \right) =\left( {\text {HA}_\mathrm{i,j} } \right) +\left( {\text {HT}_\mathrm{i,j} } \right) +(N_\mathrm{i,j} )\)

  • Software module \(\left( {\text {SM}_\mathrm{i,j} } \right) =\left( {\text {SS}_\mathrm{i,j} } \right) +\left( {\text {ST}_\mathrm{i,j} } \right) +(N_\mathrm{i,j} )\)

Communication values \((C_\mathrm{i,j} )\) of every node are represented by three components, as follows:

  1. Transfer time \((\text {TT}_\mathrm{i,j} )\)

  2. Synchronization time \((\text {SynT}_\mathrm{i,j} )\)

  3. Average number of communications \((M_\mathrm{i,j} )\)

In short,

Communication value of node \(\left( {C_\mathrm{i,j} } \right) =\left( {\text {TT}_\mathrm{i,j} } \right) +\left( {\text {SynT}_\mathrm{i,j} } \right) +({M}_\mathrm{i,j} )\)

$$\begin{aligned} C_\mathrm{i,j} =\frac{\left( {N_{i} *\Delta \text {TT}_{i} } \right) +\left( {{N}_{j} *\Delta \text {TT}_{j} } \right) +(\text {SynT}_\mathrm{i,j} )}{\left( {\text {HT}_{i} } \right) +(\text {HT}_{j} )} \end{aligned}$$

where \((\Delta \text {TT}_{i} )=\left( {\text {ST}_{i} } \right) -\left( {\text {HT}_{i} } \right) \) and \((\Delta \text {TT}_{j} )=\left( {\text {ST}_{j} } \right) -\left( {\text {HT}_{j} } \right) .\)
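For clarity, the communication-value formula translates directly into code; the numeric inputs below are invented for illustration only.

```python
def comm_value(N_i, N_j, ST_i, HT_i, ST_j, HT_j, SynT_ij):
    """C_ij from the formula above: execution-count-weighted software/
    hardware time differences plus synchronization time, normalized by
    the combined hardware implementation time."""
    dTT_i = ST_i - HT_i            # delta-TT_i = ST_i - HT_i
    dTT_j = ST_j - HT_j            # delta-TT_j = ST_j - HT_j
    return (N_i * dTT_i + N_j * dTT_j + SynT_ij) / (HT_i + HT_j)

# Illustrative numbers only (not taken from the paper):
c = comm_value(N_i=10, N_j=4, ST_i=8.0, HT_i=2.0, ST_j=6.0, HT_j=1.5,
               SynT_ij=0.7)
```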

The efficiency of the hardware/software system partitioning process depends on the target architecture and its mapping technique. Hence, this work considers the ‘Dynamically Reconfigurable Architecture for Mobile Systems’ (DReAM) as the target architecture. Hardware and software processes execute concurrently on the standard processor and the application-specific co-processor. The partitioning process determines the assignment of modules to hardware and software implementation stages, the implementation schedule (timing), and the communication interface between software and hardware modules. In general, a partitioning solution can be validated by measuring salient attributes such as performance and cost parameters. Hence, this paper uses three quality attributes related to design elements, as follows:

  1. The estimated hardware area is \({A}_{E}\), and the maximum available area is A.

  2. The estimated design latency is \(T_{E}\), and the maximum allowed latency is T.

  3. The estimated software (memory) space is \({M}_{E}\), and the maximum available space is M.

The static-list scheduling method is used for the scheduling process [9]. It is a subtype of resource-constrained scheduling algorithms. The scheduler considers the timing estimate of every vertex and its interconnections, and it provides the design latency (\({T}_{E}\)) and the communication cost of the hardware–software co-design. Based on the hardware and software implementations, another four parameters, listed after the scheduling sketch below, are considered for the co-design realization.
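The paper does not list the scheduler's steps, so the following Python sketch shows one common form of resource-constrained static-list scheduling: the highest-priority ready vertex is repeatedly placed on the earliest-available unit. The priority function and the identical-unit resource model are assumptions.

```python
def list_schedule(tasks, preds, delay, priority, num_units):
    """Resource-constrained static-list scheduling. Returns per-task
    start times and the resulting design latency T_E."""
    remaining, start, finish = set(tasks), {}, {}
    busy_until = [0] * num_units            # availability of each unit
    while remaining:
        # among vertices whose predecessors are all scheduled,
        # pick the one with the highest static priority
        ready = [v for v in remaining if all(p in finish for p in preds[v])]
        v = max(ready, key=lambda x: priority[x])
        earliest = max((finish[p] for p in preds[v]), default=0)
        u = min(range(num_units), key=lambda k: busy_until[k])
        start[v] = max(earliest, busy_until[u])
        finish[v] = start[v] + delay[v]
        busy_until[u] = finish[v]
        remaining.remove(v)
    return start, max(finish.values(), default=0)

# Toy example (all values are illustrative):
preds = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}
delay = {"a": 2, "b": 3, "c": 1, "d": 2}
prio  = {"a": 4, "b": 3, "c": 3, "d": 1}   # e.g. longest-path priorities
starts, T_E = list_schedule(preds.keys(), preds, delay, prio, num_units=2)
```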

When the entire system is implemented in hardware,

  1. The minimum design latency is MinT.

  2. The maximum hardware area is MaxA.

When the entire system is implemented in software,

  1. The maximum design latency is MaxT.

  2. The maximum memory space is MaxM.

These parameters are used to create the bounding constraints for the design space.

\(0\le {A} \le \) MaxA; \(0\le {M} \le \) MaxM; MinT \(\le {T} \le \) MaxT.
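These bounds can be encoded as a simple feasibility check on a candidate partition (a sketch; the estimates \(A_E\), \(M_E\), \(T_E\) come from the estimators of Sect. 3.2).

```python
def within_design_space(A_E, M_E, T_E, MaxA, MaxM, MinT, MaxT):
    """Bounding constraints of the design space:
    0 <= A <= MaxA, 0 <= M <= MaxM, MinT <= T <= MaxT."""
    return (0 <= A_E <= MaxA) and (0 <= M_E <= MaxM) and (MinT <= T_E <= MaxT)
```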

3.1 System Operations

The design specifications are given as circuit netlists in the ISPD98 benchmark suite format [10]. The partitioning process has three stages.

In the first stage, the processing of the design specifications is divided into three subtasks. The first subtask separates the hardware (\(\text {HA}_{i}\) and \(\text {HT}_{i}\)) and software (\(\text {SS}_{i}\) and \(\text {ST}_{i}\)) estimations from the design specifications. The second subtask translates the design specifications into a hypergraph-based control data flow graph (CDFG) representation \({G}=({V}, {E})\). The third subtask schedules (\({N}_{i}\) and \({N}_\mathrm{i,j}\)) each operation in the CDFG while satisfying the design constraints and the priority of operations.

In the second stage, the outputs of these three subtasks are fed into the system-level partitioning module through registers. The module has three functions. The first is operational-level analysis, which classifies whether each task is suitable for hardware realization or software execution. Next, the allocation process allocates the required supporting entities, such as functional units, interconnections, and storage elements, for the scheduled hardware and software systems. This allocation is based on the speed constraint (i.e., parallel processing) and the area constraint (i.e., dynamic partial reconfiguration). Finally, an absolute data path is generated by integrating components on the basis of the hardware and software partitions. The partitioning data are then given to the specific hardware (\(\text {HM}_{i}\)) and software (\(\text {SM}_{i}\)) models.
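The paper does not spell out the GA encoding, so the sketch below uses common assumptions: a chromosome is a binary vector (gene i = 1 maps node i to hardware, 0 to software), and the cost adds heavy penalties for violating the area, memory, and latency bounds of Sect. 3. The serialized-latency estimate and all parameter values are illustrative; a real latency would come from the scheduler, which accounts for parallelism.

```python
import random

def cost(chrom, HA, SS, HT, ST, MaxA, MaxM, MaxT):
    """Hypothetical GA cost: serialized latency plus penalties for
    exceeding the area, memory, and latency bounds."""
    area = sum(HA[i] for i, g in enumerate(chrom) if g == 1)
    mem  = sum(SS[i] for i, g in enumerate(chrom) if g == 0)
    lat  = sum(HT[i] if g == 1 else ST[i] for i, g in enumerate(chrom))
    penalty = max(0, area - MaxA) + max(0, mem - MaxM) + max(0, lat - MaxT)
    return lat + 1e3 * penalty

def ga_partition(HA, SS, HT, ST, MaxA, MaxM, MaxT, pop=40, gens=200):
    n = len(HA)
    key = lambda c: cost(c, HA, SS, HT, ST, MaxA, MaxM, MaxT)
    P = [[random.randint(0, 1) for _ in range(n)] for _ in range(pop)]
    for _ in range(gens):
        P.sort(key=key)
        elite = P[:pop // 2]                       # truncation selection
        while len(elite) < pop:
            a, b = random.sample(elite[:pop // 2], 2)
            cut = random.randrange(1, n)           # one-point crossover
            child = a[:cut] + b[cut:]
            child[random.randrange(n)] ^= 1        # point mutation
            elite.append(child)
        P = elite
    return min(P, key=key)

# Example with made-up estimates for four nodes:
best = ga_partition(HA=[4, 2, 6, 3], SS=[5, 3, 7, 4], HT=[1, 2, 1, 2],
                    ST=[4, 6, 5, 7], MaxA=10, MaxM=12, MaxT=18)
```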

In the third stage, the hardware and software models are executed separately, and the outcomes are compared with their estimated values from the first stage. If any discrepancy arises, feedback is given to the second-stage process. This loop continues until all criteria are satisfied.

Next, the performance (\({C}_{\mathrm{{i,j}}}\)) of the hardware–software co-design is estimated and compared with the target performance metrics. If any mismatch arises, feedback is sent to the system-level partitioning stage. The entire second and third stages are then recompiled until the target performance measures are achieved. Finally, hardware/software co-simulation and co-verification are performed, and the SoC is realized.

3.2 Hardware/Software Estimation

The CDFG file is given as input to both the hardware and software estimation processes, along with the target technology files and processor specifications. The hardware execution is a parallel process, since the specifications are modeled with a VHDL library. The software execution is a sequential process, since the specifications are modeled in C code. The GA technique is used to optimize these parallel and sequential processes.

Hardware estimation is based on high-level synthesizable components, which share the control and data paths between hardware and software processes. The GA is used to optimize this resource sharing [11]. The quality measures are closely associated with performance metrics such as execution, implementation, transfer, and synchronization times, collectively called the reaction time. This reaction time is associated with each node in each execution of a local DFG. For convenience, the CDFG is split into several small DFGs called local DFGs.

The response times for the three statement classes are as follows:

  • Routine statements: \(T_{\text {RS}} = T_{\text {DFG}} \)

  • Conditional statements: \(T_{\text {CS}} =\sum \limits _{n} {P}_{n} {T}_{\text {DFGn}} \), where n indexes the possible outcomes and \({P}_{n} \) is the probability of outcome n

  • Looping statements: \({T}_{\text {LS}} ={nT}_{\text {DFG}} \), where n is the number of loop iterations
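These three response-time rules translate directly into code (a sketch; the argument names are hypothetical):

```python
def t_routine(t_dfg):
    """T_RS = T_DFG."""
    return t_dfg

def t_conditional(probs, t_dfgs):
    """T_CS = sum_n P_n * T_DFGn over the branch outcomes."""
    return sum(p * t for p, t in zip(probs, t_dfgs))

def t_loop(n, t_dfg):
    """T_LS = n * T_DFG for n loop iterations."""
    return n * t_dfg
```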

$$\begin{aligned} {T}_{\text {CDFG}}&= {F}({T}_{\text {DFG}1} , {F}_{\text {DFG}1} ,\ldots , {T}_{\text {DFGi}} , {F}_{\text {DFGi}} )\\&\quad + {F}({T}_{\text {DFG}1} , {F}_{\text {DFG}1} ,\ldots , {T}_{\text {DFGj}} , {F}_{\text {DFGj}} ) \end{aligned}$$
$$\begin{aligned} \text {MinT}=\alpha [(\text {MaxA}*{C}_{\mathrm{{i,j}}} )+\mathop \sum \limits _{i} {T}_{i} {N}_\mathrm{i,j} ] \end{aligned}$$
  • \({T}_{i}\)—Time delay for each node

  • \(\alpha \)—Co-estimation factor

$$\begin{aligned} \text {MaxT}=\text {MinT}+\beta \mathop \sum \limits _{i} [{T}_{i} \mathop \sum \limits _{{j}=1}^{{R}_{i} } {N}_\mathrm{i,j} ] \end{aligned}$$
  • \({R}_{i}\)—Required components of each node ‘i’

  • \(\beta \)—Constant, since MaxT is a higher-order term

  • \({F}_{i}\)—Number of fixed components for each node ‘i’

$$\begin{aligned} {T}_{\text {CDFG}} =\text {MinT}+\beta \mathop \sum \limits _{i} [\frac{{T}_{i} }{{F}_{i} }\mathop \sum \limits _{{j}={F}_{i} +1}^{{R}_{i} } {N}_\mathrm{i,j} ] \end{aligned}$$
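A hedged sketch of evaluating these latency bounds: T holds per-node delays \(T_i\), N holds per-node lists of execution counts \(N_{i,j}\), and R and F hold the required and fixed component counts, with \(\alpha\) and \(\beta\) the constants above. Since the paper leaves the j index of \(N_{i,j}\) in MinT open, the total count per node is assumed there.

```python
def min_T(alpha, MaxA, C_ij, T, N):
    """MinT = alpha * [MaxA * C_ij + sum_i T_i * N_i]; the total
    execution count per node is an assumption (see lead-in)."""
    return alpha * (MaxA * C_ij +
                    sum(T[i] * sum(N[i]) for i in range(len(T))))

def max_T(minT, beta, T, N, R):
    """MaxT = MinT + beta * sum_i T_i * sum_{j=1..R_i} N_ij."""
    return minT + beta * sum(T[i] * sum(N[i][:R[i]]) for i in range(len(T)))

def t_cdfg(minT, beta, T, N, R, F):
    """T_CDFG = MinT + beta * sum_i (T_i / F_i) * sum_{j=F_i+1..R_i} N_ij."""
    return minT + beta * sum((T[i] / F[i]) * sum(N[i][F[i]:R[i]])
                             for i in range(len(T)))
```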

Register estimation [12]:

  • Number of input multiplexers: \({i}\) MUXs

  • Control lines required by the state machine-based control logic: \(\text {log}_2 {i}\)

  • ROM size: \((\text {STA}^{*}[\left( {1+\text {log}_2 {i}} \right) \left( {\text {REG}+\sum \limits _{i} {F}_{i} } \right) +\text {log}_2 {S}])\) bits

where

  • STA—Number of states

  • REG—Number of registers
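The register-estimation formulas can be evaluated as follows (a sketch; S is kept as an opaque parameter, as in [12]):

```python
import math

def control_lines(i):
    """Control lines needed by the state machine for i input MUXs: log2(i)."""
    return math.log2(i)

def rom_size_bits(STA, REG, i, sum_F, S):
    """ROM size = STA * [(1 + log2 i) * (REG + sum_i F_i) + log2 S] bits."""
    return STA * ((1 + math.log2(i)) * (REG + sum_F) + math.log2(S))
```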

Software estimation is based on calculating the memory space occupied by the instruction set and by user-defined data types and data structures. The average queuing time for each memory access is modeled as \({T}_{q} \), and the number of accesses is represented by \({N}_{\text {mem}} \). This calculation is necessary to estimate \(\left( {\text {TT}_\mathrm{i,j} } \right) \) and \(\left( {\text {SynT}_\mathrm{i,j} } \right) \).

Hardware estimation \(\left( {{T}_{\text {HM}} } \right) =\left( {{T}_{(\text {CDFG},\text {HM})} } \right) +\alpha T_{q} ({N}_{\text {mem},\text {HM}} )\)

Software estimation \(\left( {{T}_{\text {SM}} } \right) =\left( {{T}_{(\text {CDFG},\text {SM})} } \right) +{T}_{q} ({N}_{(\text {mem},\text {SM})} )\)

Co-estimation \(\left( {{T}_{\text {HM}/\text {SM}} } \right) =\sigma \left( {{T}_{q} } \right) +\varphi (\frac{{N}_{\text {mem}} }{{T}_{q} })\); where \(\sigma \, \text {and}\,\varphi \) are complex structures.
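Reading \({T}_{q}({N}_{\text {mem}})\) as "queuing time per access times the access count" (an assumption), the hardware and software estimates become simple expressions; the co-estimation term is omitted here because \(\sigma\) and \(\varphi\) are not given in closed form.

```python
def t_hw(T_cdfg_hm, alpha, T_q, N_mem_hm):
    """Hardware estimation: T_HM = T_(CDFG,HM) + alpha * T_q * N_(mem,HM)."""
    return T_cdfg_hm + alpha * T_q * N_mem_hm

def t_sw(T_cdfg_sm, T_q, N_mem_sm):
    """Software estimation: T_SM = T_(CDFG,SM) + T_q * N_(mem,SM)."""
    return T_cdfg_sm + T_q * N_mem_sm
```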

Table 1 Design characteristics for ISPD’98 benchmark suite

4 Analysis of Results

All the hardware/software partitioning algorithms were evaluated on a set of benchmarks from the ISPD’98 suite, whose characteristics are shown in Table 1. The size and values of the system graph must be bounded within the design space. All the examples are given as directed acyclic graphs specifying coarse-grain tasks. Every example was tested under different constraints, always within the specified boundary conditions. The results are summarized in Table 2 and are analyzed from both qualitative and quantitative perspectives. The qualitative aspect is represented mainly by the resulting cost of the solutions obtained by each method under different constraints; the quantitative aspect is shown by the computation time of each technique.

Table 2 Results acquired with the ISPD’98 examples

5 Conclusion and Future Work

In this paper, a commonly used biologically inspired optimization algorithm addressing the hardware/software partitioning problem for SoC designs was implemented using a clustering approach, and its performance was evaluated. The evaluation process imposes no constraints on the cluster size or the number of clusters. Hence, this approach is well suited to reducing the design complexity of systems. This paper has shown how the problem can be solved by very different partitioning techniques at system run time (dynamic partial reconfiguration). The problem resolution is based on the definition of a common system model that allows different procedures to be compared. These extensions improve on previous implementations because they include issues not previously considered. The constraints of these algorithms have been integrated into the cost function in a general and efficient way. The proposed genetic algorithm-based dynamic partitioning technique produced an average accuracy improvement of 16.19% in hardware/software partitioning over [13] and [14].

A future study could extend the system model to encompass other quality attributes, such as power consumption, the influence of communications, and the degree of parallelism. Hybrids of these biologically inspired algorithms and their compilation are also currently under study.