1 Introduction

In information security, True Random Number Generators (TRNGs) are circuits designed to generate truly random binary sequences to be used in cryptographic protocols. These circuits implement entropy sources, being able to generate unpredictable random bits deemed to be sufficiently secure for the considered application [1, 2].

A relevant effort has been made by researchers to investigate solutions suitable for being implemented in digital hardware. Traditionally, such sources of entropy, designed in silicon integrated circuits, are based on the exploitation of electronic noise and meta-stable cells, combined in different ways [3,4,5,6,7,8,9,10,11,12,13,14,15,16,17]. TRNGs must be distinguished from Pseudo Random Number Generators (PRNGs), that are deterministic digital circuits aiming to simulate a truly random source, i.e., not capable to generate information by themselves [1, 2, 18].

Recently, a new class of circuits named Digital Nonlinear Oscillators (DNOs) has been proposed for the design of fully digital TRNGs [19,20,21,22]. As defined in [19,20,21,22], DNOs are networks of electronic digital circuits, each one originally designed to behave as an asynchronous logic gate, implementing autonomous nonlinear dynamical systems, exhibiting oscillations in the time-continuous domain. In [19] it has been shown that DNOs can define dynamical systems supporting structurally stable chaotic dynamics.

In this work we discuss the low-level advanced design of TRNGs based on chaotic DNOs, specialized for FPGAs.

2 DNOs as Dynamical Systems

Conceptually, the topology of a DNO can be represented as the interconnection of subcircuits called Elementary Logic Blocks (ELBs).

In Fig. 1a DNO made of six ELBs is presented. The asynchronous domain, in which the DNO operates as the entropy source, has been separated by the surrounding synchronous domain, in which a digital state machine can be used to implement a TRNG. The two domains are joined by low-complexity synchronous circuitry that, in the simplest case, performs the uniform 1-bit sampling of the DNO analog dynamics, e.g., by means of a single D flip flop. The interface between the synchronous and the asynchronous domains represents another possible source of randomness, since the flip-flop can be affected by metastability. This condition represents an advantage for the circuit purposes, because it increases the entropy provided by the source.

Fig. 1.
figure 1

(a) A Digital Nonlinear Oscillator (DNO) made of six Elementary Logic Blocks (ELBs). The synchronous and asynchronous domains are joined by low-complexity interfacing circuitry (a single D flip flop in the simplest case). (b) DNO based on the topology shown in Fig. 1a suitable for being implemented in FPGAs. Each ELB incorporates a logic functionality and the active digital routing of signals, reported at their inputs.

The specific internal structure of the DNO reported in the figure can be interpreted as a coupled oscillators. A special case of coupled oscillators is obtained when an autonomous dynamical system \(\mathbf {x}\) is used to generate a driving signal exciting a second dynamical system \(\mathbf {y}\):

$$\begin{aligned} {\left\{ \begin{array}{ll} \dot{\mathbf {x}} = \mathbf {f}(\mathbf {x}),\\ \dot{\mathbf {y}} = \mathbf {g}(\mathbf {x}, \mathbf {y}),\\ \end{array}\right. } \end{aligned}$$
(1)

being \(\mathbf {x}:\mathbb {R}\rightarrow \mathbb {R}^N, \mathbf {y}:\mathbb {R}\rightarrow \mathbb {R}^M\) real-valued functions of time t, and \(\mathbf {f}, \mathbf {g}\) nonlinear smooth real-valued functions of \(\mathbf {x}\) and \(\mathbf {y}\), respectively. If \(\dot{\mathbf {x}} = \mathbf {f}(\mathbf {x})\) and \(\dot{\mathbf {y}} = \mathbf {g}(\mathbf {0}, \mathbf {y})\) define two periodic dynamical systems, we may call \(\mathbf {y}\) in (1) the forced oscillator, being \(\mathbf {x}\) the forcing periodic driver.

As it can be seen in Fig. 1b, the driving oscillator is composed of three ELBs, and is topologically equivalent to a Ring Oscillator made of three inverting gates. On the other hand, the forced oscillator \(\mathbf {y}\) is composed by three ELBs implementing xor or nxor boolean gates. As represented in the figure, each ELB incorporates both a logic functionality and the active digital routing of its input signals, represented in the scheme as active delays. In the next Section, the low-level FPGA implementation of the DNO is discussed in detail.

3 FPGA Hardware Implementation

The low-level design of the circuit shown in Fig. 1b in a FPGA, discussed hereafter, is a constrained implementation, in which we took control of the hardware resources utilization, using device primitives. In this section we discuss how this constrained implementation should be performed. For lack of space, the HDL description of the investigated system implemented referring to a Xilinx Artix 7 FPGA is reported in [23].

The ELBs low-level design have been carried out using device primitives. The access to these resources is guaranteed invoking the UNISIM library by Xilinx, in any entity directly involved in the design [23]. The logic functionality of any ELB can be implemented by means of one LUT: this means that the DNO in Fig. 1b requires a total of six LUTs (less than the resources available in two Artix 7 slices). For clarity of presentation, each ELB in [23] has been associated to a VHDL entity. In the code, special directives have to be used to force the resource utilization at specific chip locations. Since the DNO is an asynchronous circuit, registers have not to be used. The active routing, necessary to interconnect the ELBs, completes the DNO architecture.

In DNO design, combinatorial loops are mandatory. Normally, combinatorial loops should be avoided in digital designs. They occur when a combinational logic feed back to itself without registers, potentially creating logic race conditions or spoiling the timing analysis during the design phases. For these reasons, most design tools generate Design Rules Check (DRC) errors during the synthesis. To allow combinatorial loops, specific directives to enable the synthesis of intentional loops in asynchronous digital structures have been provided [23].

To its minimal terms, the synchronization interface in Fig. 1a can be reduced to a single D flip-flop performing, at the same time, the 1-bit A/D conversion and the uniform sampling of the chaotic signal (provided by the ELB4 in Fig. 1a). To optimize hardware utilization, the 1-bit register FF primitive has been located in the same slice of the 6LUT primitive implementing the logic functionality of the ELB4. The Synchronization Interface takes part in the static timing analysis of the full design, and its constrained location may have an impact on the successful meet of timing constraints. In complex projects, this issue has to be carefully addressed by the designer.

Finally, in a FPGA the configurable routing is organized by means of programmable switches and connection boxes, according to a hierarchical architecture offering local and regional connectivity. Once that the ELBs have been placed in specific chip locations, the final routing is left to the compiler. To minimize the impact of the DNO on the general project routing, it is recommended to have the ELBs concentrated in few slices, placed close to each others. As discussed in the next Section, this also mitigates the impact of hardware variability on the entropy levels of the TRNG.

4 Impacts of Variability: Experiments

The entropy that can be extracted from a chaotic system is always sensitive to the perturbation of its dynamical parameters. The magnitude of the perturbation is linked to its effects on the entropy in a nonlinear way and, with the exception of few cases, the issue has to be investigated by means of numerical simulations or experiments.

In this Section, we present an exhaustive measurement campaign based on experiments devised to assess the impact of both the chip-to-chip variability and the intra-device variability on the entropy. To this aim, we designed 16 instances of the DNO, in different chip areas of the FPGA, repeating the measurements for six different chips (using the same slice locations), reaching a total of 96 DNO instances. The measurements have been performed implementing the system in Xilinx Artix 7 xc7a35 FPGAs, using an internal sampling clock frequency of 400 MHz.

For each case we estimated the Average Shannon Redundancy (ASR) evaluated on binary words of 10 bits, defined as \(\text {ASR}_{10}=1 + \frac{1}{10}\sum _{i=1}^{2^{10}}P(w_i)\log _2P(w_i)\), \(\text {[bit/sym]}\), where \(P(w_i)\) is the generation probability for \(w_i\in \{0,1\}^{10}\), being the summation extended to the 1024 possible binary words of 10 bits. The estimations were obtained acquiring streams of 1 million bits per experiment at room temperature.

Fig. 2.
figure 2

Effects on the ASR of chip-to-chip and intra-device variabilities, for a condensed and a scattered layout. Red square symbols were used to highlight the chip location 1 (A) or the chip number 1 (B). (Color figure online)

4.1 Condensed Layout

Logic resources in a Xilinx Artix 7 FPGA are organized as a matrix of Configurable Logic Blocks (CLBs), each one containing two slices, and each slice being composed of four 6-input Look Up Tables (LUTs) and eight storage elements.

In the upper subplots of Fig. 2 we reported the experimental results highlighting the effects on the ASR of both the chip-to-chip variability and the intra-device variability, for a condensed layout in which the entire DNO in Fig. 1a was concentrated in two slices (same CLB), including the sampling flip flop D. The percentile levels \(L_{x}\), expressed in bit/sym for \(x=10,50,80,90,95\), were estimated on the base of the entire data set (96 DNO instances). Red square symbols were used to highlight the chip location 1 (A) or the chip number 1 (B).

As it can be appreciated from the figure, 90% of the chaotic DNOs are capable to provide outstanding levels of \(\text {ASR}_{10}\) below \(L_{90} = 0.077\) bit/sym with a sampling frequency of 400 MHz. This corresponds, in principle, to 369.2 Mbit/s of truly random information generated with the minimal usage of only two FPGA slices.

4.2 Scattered Layout

The effects on the ASR of both the chip-to-chip variability and the intra-device variability have been evaluated as a function of the routing complexity. As confirmed by experimental results, adopting a scattered layout in which the ELBs are separated by longer distances in the CLBs matrix increases the impact of variability. The same experiments presented in the previous Subsection have been repeated adopting a scattered layout in which each LUT of the DNO is positioned in a different FPGA CLB.

When adopting a scattered layout, the ELBs are connected through an higher number of switch boxes. As it can be appreciated from the lower subplots of Fig. 2, this solution worsen the entropy, in average, enhancing the effects of variability. However, it is worth noting that the best cases in both of the layout (10th percentile) share similar levels of ASR.

In this case 90% of the chaotic DNOs are capable to provide levels of \(\text {ASR}_{10}\) below \(L_{90} = 0.17\) bit/sym with a sampling frequency of 400MHz. This corresponds, in principle, to 332.0 Mbit/s of information. The results are still exceptional, considering the reduced amount of resource utilization.

5 Conclusion

In this work we have discussed the low-level advanced design of TRNGs based on chaotic DNOs, specialized for FPGAs. In detail, starting from a specific DNO topology, we have discussed technical solutions to implement these systems exploiting FPGA device primitives, performing constrained layout implementations. The proposed solutions, capable to provide high entropy levels at a minimal cost of resources utilization, have been characterized by means of exhaustive measurement campaigns to assess and investigate the impact on the entropy of both chip-to-chip and intra-device variability.

According to the observed results, compact layouts allow the system to achieve higher performances in terms of generated entropy. This implies that we can variate the circuit performance simply acting on the routing. An interesting aspect to be investigated would be the effect of the chosen layout on the resulting system power consumption. This kind of investigation will be realized in the future works, as we plan to adapt the presented low-level design approaches for the design of fully digital ASIC based TRNGs.