Performance (Timing) Analysis

Tu, Kaihui; Tang, Xifan; Yu, Cunxi; Josipović, Lana; Chu, Zhufei

doi:10.1007/978-981-99-7755-0_5

Kaihui Tu⁶,
Xifan Tang⁷,
Cunxi Yu⁸,
Lana Josipović⁹ &
…
Zhufei Chu¹⁰

547 Accesses

Abstract

Timing analysis can be static or dynamic. Dynamic timing analysis (DTA) verifies functionality of the design by applying input vectors and checking for correct output vectors whereas static timing analysis (STA) checks static delay requirements of the circuit without any input or output vectors. In this chapter, STA techniques is focused since it is widely used in FPGA design flow to make sure the timing requirements are met.

Access provided by Autonomous University of Puebla. Download chapter PDF

1 Overview

Dynamic timing analysis (DTA), also known as simulation-based timing analysis technique, is complicated for even small FPGAs because of huge number of input vectors and unbearable long simulation time, while static timing analysis (STA), which could analyze a design in a very short time, is then thriving. As a mainstay of modern FPGA design flows, STA breaks a design down into timing paths, calculates the signal propagation delay along each path, and checks for violations of timing constraints inside the design and at the input/output interface. STA also has been integrated with timing-driven EDA engines to optimize FPGA’s timing performance.

The target design checkpoint (containing timing constraints and timing graph) and device library (containing timing models) are the main inputs that a timing analysis engine needs. The final output is the timing report (Fig. 5.1).

Standard Delay Format (SDF) is another optional output of timing engine. SDF is an IEEE standard for the representation and interpretation of timing data (both cell delays and interconnect delays) for use at any stage of the electronic design process [1]. This can be used along with the netlist in a simulator to verify that design meets its functional and timing requirements.

A flowchart of the inputs and output in timing analysis. The device library and design checkpoint are given to the timing analysis engine. The timing analysis engine consists of timing models, graphs, and constraints. The timing report and S D F are the outputs received. — **Fig. 5.1**

Given that device timing model has been discussed in Sect. 2.2.2, timing constraints has been discussed in Sect. 3.2.2, another main input–timing graph, derived from target design checkpoint, will be introduced in the following section.

Before we dive into the timing calculation algorithms, here are some basic concepts about STA. Figure 5.2 is the most common used picture to illustrate this.

A schematic diagram of the setup or hold timing analysis. The data input is given to register 1, followed by the combinational T logic, register 2, and ends with the data output. C l k to the 2 registers 1 and 2 has T c l k skew. — **Fig. 5.2**

Equations 5.1 and 5.2 can accurately represents the calculations of setup time slack (${\text {Slack}}_{\text {setup}}$) and hold time slack (${\text {Slack}}_{\text {hold}}$).

$$\begin{aligned} {\text {Slack}}_{\text {setup}} = T_{\text {period}} - (T_{\text {cq}} + T_{\text {logic}} + T_{\text {net}} + T_{\text {setup}} - T_{\text {clk}_\text {skew}}) \end{aligned}$$

(5.1)

$$\begin{aligned} {\text {Slack}}_{\text {hold}} = T_{\text {cq}} + T_{\text {logic}} + T_{\text {net}} - T_{\text {hold}} - T_{\text {clk}_\text {skew}}) \end{aligned}$$

(5.2)

where $T_{\text {period}}$ is clock period, $T_{\text {cq}}$ is defined as time it takes for data to appear on output Q once clock is triggered (pos edge or neg edge), $T_{\text {logic}}$ is the delay of the combinational logic, $T_{\text {net}}$ is the delay of the routing net, $T_{\text {clk}_\text {skew}}$ is the time difference between the clock arriving time at the two flip-flops.

To simplify the equation, $T_{\text {net}}$ and $T_{\text {clk}_\text {skew}}$ can be ignored. In order to make sure that ${\text {Slack}}_{\text {setup}}$ and ${\text {Slack}}_{\text {hold}}$ are positive, we can derive Eqs. 5.3 and 5.4 (plus $T_{\text {setup}}$ on both sides) from Eqs. 5.1 and 5.2.

$$\begin{aligned} T_{\text {period}} > T_{\text {cq}} + T_{\text {logic}} + T_{\text {setup}} \end{aligned}$$

(5.3)

$$\begin{aligned} T_{\text {cq}} + T_{\text {logic}} + T_{\text {setup}} > T_{\text {hold}} + T_{\text {setup}} \end{aligned}$$

(5.4)

Combine Eqs. 5.1 and 5.2, we can have Eq. 5.5.

$$\begin{aligned} T_{\text {hold}} + T_{\text {setup}} < T_{\text {cq}} + T_{\text {logic}} + T_{\text {setup}} < T_{\text {period}} \end{aligned}$$

(5.5)

$T_{\text {cq}} + T_{\text {logic}} + T_{\text {setup}}$ is the data propagation delay, if it is greater than $T_{\text {period}}$, the data will not arriving when the second register is sampling, on the other hand, if it is smaller than the register sampling window ($T_{\text {hold}} + T_{\text {setup}}$), the registers could fall into metastability.

In FPGA design, STA can be performed in different stages: post-synthesis (logical level) and post-implementation (physical level). Post-synthesis STA (based on ideal implementation information) is faster but less accurate than post-implementation STA (based on real implementation information).

2 Timing Analysis Techniques

STA usually requires a timing graph that describes the target design from the timing perspective, identifying all the timing paths. The timing graph consists of nodes and edges, nodes correspond to component pins or input/output ports, and edges are the timing path between them. Edges have attached weights that can denote some characteristics such as delay values [2].

Timing Graph Definition: A timing graph G = N, E, s, t is a directed graph having exactly one source node s and one sink node t , where N is a set of nodes, and E is a set of edges. The weight associated with an edge corresponds to either the gate delay or the interconnect delay (Fig. 5.3).

A directed node graph of timing. The source is followed by primal input, tile, primal output, and ends with sink. The arrows represent the timing path. — **Fig. 5.3**

Traditional STA is deterministic (DSTA) and compute the circuit delay for a specific condition. In practice, the worst-case slow or best-case fast process is typically used and this could lead to over-design, leaving a lot of margin on the table in terms of PPA. Statistical STA (SSTA) then come out to address this problem. It combines the delays along the timing paths which is expressed statistically (with mean and standard deviations) to obtain the overall delay data.

SSTA is also employed by Intel in its Quartus Prime software to mitigate the effect of random variation on longer paths [3]. By discounting the minimum/maximum delay spread on these paths, the FPGA performance reported by STA may increase. There are two main categories of SSTA techniques–path-based and block-based.

1.
Path-based

In path-based STA technique, critical path is searched in an exhaustive way. The statistical calculation is simple, but the paths of interest must be identified prior to running the analysis [4,5,6].
2.
Block-based

In block-based STA technique, the circuit timing graph is traversed in a topological manner. In [7], two basic graph traversal algorithms–depth first search (DFS) and breadth first search (BFS) are applied to STA module and the runtime efficiencies is compared by testing a large number of sequential circuit instances. The conclusion is that BFS algorithm can implement STA module more efficiently than DFS algorithm. Due to its runtime advantage, many research [8,9,10,11] and commercial efforts have taken the block-based approach. The advantage is completeness, and no need for path selection, however, to compute statistical max (or min) of random variables is not trivial.

The choice of using path-based analysis or block-based analysis depends on several factors, such as the design complexity, stage, and goal. Generally, path-based analysis is more suitable for small or medium-sized designs, where the number of paths is manageable and the accuracy is important. It can also be used for final verification or optimization, where the timing margins are tight and the details are needed. On the other hand, block-based analysis is more suitable for large designs, where the number of paths is overwhelming and the runtime is important. It can also be used for FPGA architecture exploration, where the timing budget is loose and the trends are sufficient.

In some cases, it could be more optimized to combine both techniques and use them in different stages or levels of the design. For example, one can use block-based analysis for the system-level design, where the blocks are abstracted and the overall timing is estimated. Then, one can use path-based analysis for the block-level design, where the paths are detailed. The balance between accuracy and efficiency can be obtained in this way [12].

3 Summary and Trends

The state-of-the-art STA engines still can not replace DTA (simulation) completely because there are some aspects of timing verification that cannot yet be completely captured and verified in STA [13]. Some of these limitations include:

1.
Inaccurate timing models

The timing models used in FPGA STA may not accurately represent the behavior of the actual circuit due to the complexity of the FPGA architecture.
2.
Lack of support for dynamic circuits

FPGA STA assumes that the circuit is static and does not take into account dynamic circuits such as state machines or circuits with feedback paths.
3.
Impact of environmental conditions

FPGA STA assumes ideal environmental conditions, such as constant temperature and voltage, which may not hold true in the real world.

Although FPGA STA has been matured for many years, it still benefits from emerging technologies. The following are some of the recent trends in FPGA STA:

1.
Parallel acceleration

Parallel STA on different computing platforms is one of the researching hot spots, such as multi-core CPUs [4, 14,15,16,17] and GPUs [16, 18].
2.
AI (machine learning) acceleration

ML algorithms are increasingly being used to analyze the timing characteristics of FPGA designs [19,20,21]. ML-based timing analysis can quickly identify critical paths in the design, predict the timing behavior of the design, and optimize the design for timing performance.

References

IEEE standard for standard delay format (SDF) for the electronic design process, in IEEE Std. 1497-2001 (2001), pp. 1–80
Google Scholar
J.L.M. Lee, A scalable method to measure similarity between two EDA-generated timing graphs, in 2015 International Conference on Computer, Communications, and Control Technology (I4CT) (2015), pp. 44–48
Google Scholar
Intel, Guaranteeing silicon performance with FPGA timing models. https://cdrdv2-public.intel.com/650314/wp-01139-timing-model.pdf
T.-W. Huang, M.D.F. Wong, UI-timer 1.0: an ultrafast path-based timing analysis algorithm for CPPR. IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 35(11), pp. 1862–1875 (2016)
Google Scholar
D. Mishagli, E. Koskin, E. Blokhina, Path-based statistical static timing analysis for large integrated circuits in a weak correlation approximation, in 2019 IEEE International Symposium on Circuits and Systems (ISCAS) (2019), pp. 1–5
Google Scholar
L.-W. Chen, Y.-N. Sui, T.-C. Lee, Y.-L. Li, M.C.-T. Chao, I.-C. Tsai, T.-W. Kung, E.-C. Liu, Y.-C. Chang, Path-based pre-routing timing prediction for modern very large-scale integration designs, in 2022 23rd International Symposium on Quality Electronic Design (ISQED) (2022), pp. 1–6
Google Scholar
J. Lu, N. Xu, J. Yu, T. Weng, Research of timing graph traversal algorithm in static timing analysis based on FPGA, in 2017 IEEE 3rd Information Technology and Mechatronics Engineering Conference (ITOEC) (2017), pp. 334–338
Google Scholar
L. Zhang, Y. Hu, C.-P. Chen, Block based statistical timing analysis with extended canonical timing model, in Proceedings of the ASP-DAC 2005. Asia and South Pacific Design Automation Conference, 2005., vol. 1 (2005), pp. 250–253
Google Scholar
R. Chen, H. Zhou, New block-based statistical timing analysis approaches without moment matching, in Proceedings of the ASP-DAC 2007—Asia and South Pacific Design Automation Conference 2007, Series. Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC (2007), pp. 462–467
Google Scholar
G. Luo, B. Jin, W. Zhang, A fast and simple block-based approach for common path pessimism removal in static timing analysis, in 2015 14th International Conference on Computer-Aided Design and Computer Graphics (CAD/Graphics) (2015), pp. 234–235
Google Scholar
L. Jin, W. Fu, Y. Zheng, H. Yan, A precise block-based statistical timing analysis with max approximation using multivariate adaptive regression splines, in 2019 IEEE 13th International Conference on ASIC (ASICON) (2019), pp. 1–4
Google Scholar
T.-W. Huang, M.D.F. Wong, OpenTimer: a high-performance timing analysis tool, in 2015 IEEE/ACM International Conference on Computer-Aided Design (ICCAD) (2015), pp. 895–902
Google Scholar
J. Bhasker, R. Chadha, Static Timing Analysis for Nanometer Designs: A Practical Approach, 1st edn. (Springer, 2009)
Google Scholar
Y.-M. Yang, Y.-W. Chang, I.H.-R. Jiang, iTimerC: common path pessimism removal using effective reduction methods, in 2014 IEEE/ACM International Conference on Computer-Aided Design (ICCAD) (2014), pp. 600–605
Google Scholar
T.-W. Huang, M.D.F. Wong, D. Sinha, K. Kalafala, N. Venkateswaran, A distributed timing analysis framework for large designs, in Proceedings of the 53rd Annual Design Automation Conference, Series DAC ’16 (Association for Computing Machinery, New York, NY, USA, 2016). Available https://doi.org/10.1145/2897937.2897959
K. E. Murray, V. Betz, Tatum: parallel timing analysis for faster design cycles and improved optimization, in 2018 International Conference on Field-Programmable Technology (FPT) (2018), pp. 110–117
Google Scholar
T.-W. Huang, G. Guo, C.-X. Lin, M.D.F. Wong, OpenTimer v2: a new parallel incremental timing analysis engine. IEEE Trans. Comput.-Aided Des. Integr. Circ. Syst. 40(4):776–789 (2021)
Google Scholar
Z. Guo, T.-W. Huang, Y. Lin, GPU-accelerated static timing analysis, in 2020 IEEE/ACM International Conference On Computer Aided Design (ICCAD) (2020), pp. 1–9
Google Scholar
S. Bian, M. Shintani, M. Hiromoto, T. Sato, LSTA: learning-based static timing analysis for high-dimensional correlated on-chip variations, in Proceedings of the 54th Annual Design Automation Conference 2017, Series DAC ’17 (Association for Computing Machinery, New York, NY, USA, 2017). Available https://doi.org/10.1145/3061639.3062280
A.B. Kahng, U. Mallappa, L. Saul, Using machine learning to predict path-based slack from graph-based timing analysis, in 2018 IEEE 36th International Conference on Computer Design (ICCD) (2018), pp. 603–612
Google Scholar
M.A. Savari, H. Jahanirad, NN-SSTA: a deep neural network approach for statistical static timing analysis, Expert Syst. Appl. 149, 113309 (2020). Available https://www.sciencedirect.com/science/article/pii/S0957417420301342

Download references

Author information

Authors and Affiliations

Beijing, China
Kaihui Tu
Los Gatos, CA, USA
Xifan Tang
University of Maryland, College Park, College Park, MD, USA
Cunxi Yu
ETH Zürich, Zürich, Switzerland
Lana Josipović
Ningbo University, Ningbo, China
Zhufei Chu

Authors

Kaihui Tu
View author publications
You can also search for this author in PubMed Google Scholar
Xifan Tang
View author publications
You can also search for this author in PubMed Google Scholar
Cunxi Yu
View author publications
You can also search for this author in PubMed Google Scholar
Lana Josipović
View author publications
You can also search for this author in PubMed Google Scholar
Zhufei Chu
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Kaihui Tu .

Rights and permissions

Reprints and permissions

Copyright information

About this chapter

Cite this chapter

Tu, K., Tang, X., Yu, C., Josipović, L., Chu, Z. (2024). Performance (Timing) Analysis. In: FPGA EDA. Springer, Singapore. https://doi.org/10.1007/978-981-99-7755-0_5

Download citation

DOI: https://doi.org/10.1007/978-981-99-7755-0_5
Published: 01 February 2024
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-7754-3
Online ISBN: 978-981-99-7755-0
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics