1 Introduction

Big data solutions are increasingly being adopted in various social systems for including the fields of transportation, agriculture and energy [24]. These systems require embedded devices to collect useful information, such as mobile devices, physical sensors and smart-meters. Counterfeiting of these embedded devices is prevented by storing a secret key into a chip of the devices and using secure cryptographic protocols between the devices and the systems. However, the secret key can be revealed by de-packaging the embedded devices and analyzing digital chip design in the devices [32]. The possibility of counterfeiting the embedded devices is serious because it greatly reduces the security of the system as a whole. Counterfeit chips are often found in consumer and even in military devices [9].

Recently, physically unclonable functions (PUFs) have come under focus as a solution to this issue [26]. When PUFs are mounted in individual digital chips, the circuit structure remains identical but the PUFs generate unique output IDs (responses) to the same input value (challenge) for each individual digital chip. This uniqueness is provided by random process variations such as wire/gate delay or memory characteristics [7, 8], which occur in the manufacturing process of each digital chip. It is expected that PUFs will represent a breakthrough in technology for anti-counterfeiting devices through their application in the key generation being utilized by secure cryptographic protocols, making cloning impossible even when the chip design is revealed.

There are two categories of PUF on digital chips: delay-based PUFs utilizing wire/gate delay variations and memory-based PUFs utilizing memory characteristics [22]. Delay-based PUFs include Arbiter PUFs [18] and ring oscillator (RO) PUFs [31], etc. The strength of these PUFs is to have an exponential number of challenge–response pairs. Arbiter PUFs, however, have a vulnerability in that their responses can be predicted by a machine learning attack [27]. This attack can also be applied to RO PUFs in the same way. However, RO PUFs can be used securely through restriction to \(\mathrm {O}(x)\) different challenges, where \(x\) is the number of blocks in the RO PUF. Memory-based PUFs include SRAM PUFs [10, 13], latch-based PUFs (LPUFs) [29, 30], Butterfly PUFs [17], Flip-flop PUFs [20] and Buskeeper PUFs [28], etc. What is common to these memory-based PUFs is that each bit of responses is generated based on various kinds of memory cells, i.e., SRAM cell, latch cell, flip-flop, butterfly (cross-coupled latches) and buskeeper cell (cross-coupled inverters). The memory-based PUFs can be regarded as secure storage elements, the stored value of which cannot be specified through their mask pattern images. However, all of the above PUFs are vulnerable to invasive attacks, namely, drilling a hole in the chip with a focused ion beam (FIB), etc., and then using a microprobe to read the value of the response outputted from the PUFs. Therefore, PUFs need to be used together with other techniques to prevent invasive attacks, e.g., active shielding techniques [5]. In contrast, Coating PUFs have been proposed to defend against such threats without the use of such techniques [33]. Their responses are based on the randomness in the local capacitance of the protective coating on the chip. FIBs and other invasive attacks change the capacitance, which makes it difficult for the attacker to read the original response.

This paper focuses on memory-based PUFs and especially LPUFs, multiple responses of which can be quickly obtained without switching the power off/on. This is an advantage of LPUFs over SRAM PUFs, which is one of the most feasible and secure memory-based PUFs. Our proposed method in this paper needs multiple responses for repeated measurements, hence we focus on LPUFs. LPUFs generate an \(N\)-bit response whose 1-bit value is generated based on the output from a RS (Reset-Set) latch. LPUFs have many RS latches outputting constant numbers (consecutive ‘1’s or ‘0’s), i.e., “fixed latches”. The 1-bit value corresponding to a fixed latch is 1 or 0 at any time, so the part of response corresponding to fixed latches is highly reliable. In contrast, the LPUF has several RS latches that generate a mixture of ‘1’s and ‘0’s (random numbers), i.e., “random latches”. A problem caused by these random latches is to reduce the reliability of the response, i.e., the consistency of the values of PUF responses for repeated measurements. This is because the values of the response corresponding to the random latches flip every time a response is generated. Hence, the random responses should be eliminated in order to generate responses with high reliability, i.e., “reliable responses”. As a result, the response bits become lower as the number of random latches increases, which reduces the variety of responses in manufactured PUFs. For example, if LPUF with 256 RS latches has 100 random latches, the maximum variety of responses reduces from \(2^{256}\) to \(2^{156}\). Hence the increase of random latches results in reducing the variety of reliable responses.

We have proposed an efficient method of using the random outputs to enhance the variety of responses [35]. This method regards the three types of output patterns from the RS latches (0’s, 1’s, and random numbers) as three values (00/11/10), respectively. The method can ideally increase the variety of responses from \(\log _2 (2^{N})\) to \(\log _2 (3^{N})\), where \(N\) is the number of implemented RS latches in an LPUF. We verified the effectiveness of the proposed method by using two types of different Xilinx FPGA chips: Spartan-3E and Spartan-6. However, this method does not make the maximum use of the entropy extracted from random latches in LPUFs. Further, we have evaluated the proposed method just on FPGA chips and not on ASIC chips, which are often used in the embedded devices. An evaluation on ASIC chips is very important because PUF performances are strongly affected by chip properties, which are quite different between FPGA and ASIC chips.

1.1 Our contributions

This paper makes three contributions: (1) we propose a new method of increasing the variety of responses over that in [35], while maintaining their reliability; (2) we manufacture 73 LPUF ASIC chips with 256 RS latches, and evaluate the effectiveness of our method using the chips; (3) we evaluate the reliability of responses against both temperature and voltage fluctuations.

In detail: (1) our proposed method utilizes the proportion of ‘1’s in the random numbers outputted from each random latch. This method enables us to distinguish each random latch, while the previous method in [35] regards all random latches as the same. Consequently, our proposed method extracts more entropy from random latches than that in [35], and enhances the variety of responses. However, it is not desirable to use the value of the proportion of ‘1’s without any consideration. This is because this information is easily affected by environmental conditions such as temperature and voltage, which causes the problem of reducing the reliability of responses. To avoid this problem, our proposed method categorizes the random latches not according to the value but the range of the proportion of ‘1’s. The proportion of ‘1’s falls within a particular range of values even when temperature and supplied voltage fluctuate. As a result, the number of random latches in each category is expected to be relatively reliable, so can be used to enhance the variety of responses while maintaining reliability. Here, the most important parameter is the number of categories \(K\), because a large \(K\) improves the variety but worsens the reliability.

(2) In order to evaluate the effectiveness of the proposed method and determine the appropriate value of \(K\), we fabricate 73 LPUFs with 256 RS latches on Fujitsu 0.18-\(\upmu \)m CMOS technology. According to our experiments using the 73 chips, the varieties of responses are \(2^{220}\), \(2^{314}\) and \(2^{379}\) when \(K=2, 3, 16\), respectively. The proposed method in this paper (i.e., \(K=16\)) and the previous method presented in [35] (i.e., \(K=3\)) generate 1.72 and 1.42 times a larger variety of responses than the conventional method of eliminating random latches (i.e., \(K=2\)), respectively.

(3) We confirm that all LPUFs in the 73 chips provide ideal performance as PUFs even when \(K=16\) and both temperature and voltage change to \(-20\) \(^\circ \)C \(\sim \) \(60\) \(^\circ \)C and 1.80 \(\pm \) 0.15 V, respectively. The maximum error rate of responses is approximately 0.096, which is less than the 0.15 assumed in [4] for reliable responses based on Error Correcting Code (ECC) with a reasonable size of redundant data. Our proposed PUF dramatically enhances the variety of responses while maintaining reliability, which is very practical and useful.

1.2 Organization of the paper

The rest of the paper is organized as follows. In Sect.  2, we discuss previous related work. Section 3 gives an outline of an RS latch and an LPUF. Section 4 proposes our method of generating a larger variety of responses. Section 5 describes implementing the LPUFs on ASIC chips and shows our experimental evaluation system. In Sect. 6, we evaluate the effectiveness of our proposed method using the chips. In addition, we evaluate various PUF performances such as reliability with respect to changing environmental temperature and power supply voltage. Finally, in Sect. 7 we summarize our work.

2 Related work

SRAM PUFs [10, 13] and LPUFs [29, 30] have been proposed to extract one-bit entropy from a one-bit memory cell, e.g., a SRAM cell and an RS latch, respectively. In contrast, the motivation of our proposed method is to extract multi-bit entropy from an RS latch. Similar concepts are presented in previous work.

The MECCA PUF [16] is basically based on an SRAM PUF, but has a mechanism for changing the word line duty cycle of SRAM cells (i.e., challenge). The value of SRAM cells (i.e., response) is influenced by the duty cycle duration. The SRAM PUF using a data retention voltage (DRV) [14] also provides a more informative non-binary identifier from each cell. This PUF utilizes the DRVs of SRAM cells by repeatedly lowering the SRAM supply voltage and observing the highest voltage at which each cell fails. The highest voltage resulting in a fail is unique, and so used to generate a multi-bit response. These PUFs need a hardware modification to change the duty cycle duration or the SRAM supply voltage. In [21], a method for error correction using soft-decision information has been proposed to realize a smaller entropy loss in responses. This method can be used as an efficient ECC mechanism for conventional PUFs and even for PUFs based on our proposed method.

The theoretical upper bound of extractable entropy from memory-based PUFs has been evaluated based on mutual information [2]. While the paper [2] discussed a similar topic to this paper in terms of the upper bound of extractable entropy, our paper is significantly different in that we show a concrete method and evaluate the performance of the PUF using a custom-designed ASIC implementation of LPUFs. Moreover, we have evaluated the combined effect of environmental parameters such as supplied voltage and temperature.

In [1], the idea of excluding dark bits, corresponding to the responses from random latches, has been proposed to improve reliability of responses. In contrast, the above-mentioned approaches, including our proposed method, utilize the dark bits to extract more entropy for PUF responses.

3 Conventional methods

3.1 RS latch

The behavior of an RS latch consisting of two cross-coupled NAND gates is shown in Fig. 1. The NAND-based RS latch is one component of our LPUF. An RS latch can be configured by NAND, NOR, and other types of gate. This difference, however, does not influence the performance of LPUFs. A general RS latch has two separate input signals connected to an upper NAND gate and a lower one. Note that the RS latch for the LPUF in Fig. 1 has a single combined input \(A\). The RS latch is in a stable state with outputs \((B,C)=(1,1)\) when input \(A=0\), while it temporarily enters a metastable state right after input \(A\) changes from 0 to 1 (= rising edge). Right after this, it enters into one of two stable states: its outputs are \((B,C)=(1,0)\) or \((B,C)=(0,1)\). The probability of transition to either of these states is equal in ideal RS latches. Actually, however, most RS latches have a high probability of entering one specific state. This is caused by a slight difference in the drive capabilities of the two NAND gates or the wire length of the cross-coupled part. When a signal with consecutive rising edges, i.e., a clock signal, is applied to input \(A\), output \(B\) sampled in the stable state after the rising edges falls into one of three patterns: all ‘0’s, all ‘1’s, or a mixture of ‘0’s and ‘1’s (= random numbers). The ratio of ‘1’s in the random numbers depends on each RS latch, which is important and useful in this paper.

Fig. 1
figure 1

NAND-based RS latch

3.2 Latch-based PUF

This section explains the mechanism of an LPUF, shown in Fig. 2. An LPUF consists of \(N\) parallel-implemented RS latches, which generate an \(N\)-bit response:

$$\begin{aligned}&RES = R_{N-1} \parallel R_{N-2} \parallel \cdots \parallel R_{i} \parallel \cdots \parallel R_{1} \parallel R_{0}, \nonumber \end{aligned}$$

where \(R_{i}\) is a unique value outputted from an RS latch \(i\) (\(0 \le i \le N-1\)) and the operation \(\parallel \) means a concatenation of two variables. Note that the more significant bits of the response correspond to the outputs of RS latches with larger latch numbers, in order to simplify discussion in this paper. The LPUF in Fig. 2 has some random latches such as \(\mathrm{LATCH_{2}}\) and \(\mathrm{LATCH_{{ N}-2}}\). These random latches cause a problem insomuch that the reliability of the response RES is reduced since their outputs are unstable random numbers. There are two widely known conventional approaches to solve this problem.

The first approach does not use random latches for the generation of responses. This approach maintains the reliability of responses, but reduces the bit length of responses, i.e., the variety of responses, as the number of random latches increases. Thus it is necessary to implement extra RS latches in an LPUF in accordance with the number of random latches. However, such a solution is not suitable for embedded systems with limited hardware resources. This is because it is necessary to make the area size of RS latches and peripheral circuits as small as possible in LPUFs in embedded devices.

The second approach uses ECCs to correct the non-reliable responses caused by the random latches. This approach requires larger redundant data for response correction as the number of random latches increases. The large redundant data increases the ROM code size, which is not suitable for embedded devices. In addition, an LPUF with \(n\) RS latches naturally extracts \(k (<n)\) bits of entropy even if \([n, k, d]\)-code is used as an ECC.

From the above discussion, the first approach is not desirable when extracting more entropy from PUFs. The second approach is essential for memory-based PUFs used for secure key storage, although it is not sufficient to use this approach alone. In Sect. 4, we propose a method for extracting more entropy from PUFs by utilizing these unwanted random latches. The proposed method dramatically improves the variety of responses and maintains the reliability of responses.

Fig. 2
figure 2

Latch-based PUF

4 Proposed method

The conventional LPUF in Fig. 2 generates responses based on the output values themselves (‘0’s or ‘1’s). We introduced the method in [35] (i.e., the previous method), which extracts the entropy of locations of random latches, rather than eliminating them. If LPUF with \(N\) RS latches has \(T\) random latches, the number of locations of random latches equals \({}_N\mathrm {C}_T\). Here, \({}_N \mathrm {C}_M\) is defined as the number of combinations of \(N\) elements taken \(M\) at a time. The PUF based on the previous method utilizes the entropy derived from the locations of random latches. In this paper, we propose a novel method which extracts more entropy from random latches in order to increase the variety of responses. Concretely, our proposed method uses the information of the proportion of ‘1’s (‘0’s) in the random numbers outputted from each RS latch. The reason why we focus on the proportion of ‘1’s is that this information is different for each RS latch, so is expected to include high entropy. LPUFs using this proposed method are expected to generate a larger variety of responses than the LPUFs using the previous method.

However, the value of the proportion of ‘1’s is affected by environmental conditions such as temperature and voltage. Hence we propose a simple and efficient mechanism to keep the responses as reliable as possible and enhance the variety of responses. This mechanism consists of two processes: a dividing process and a labeling process.

First, the dividing process must determine an important factor \(K\): the number of output patterns resulting from RS latches. The previous method distinguishes just three (i.e., \(K=3\)) types of output patterns from the RS latches (‘0’s, ‘1’s and random numbers). In contrast, the LPUFs using the proposed method distinguish \(K (> 3)\) types of RS latches (\(T_{0} \sim T_{K-1}\)). Consequently, \(\mathrm{LATCH}_{i}\) is defined as belonging to type \(T_{k} (0 \le k \le K-1)\) as follows:

$$\begin{aligned} \left\{ \begin{array}{ll} T_{K-1} &{} \quad \text {if}\;\; X_{i}=1, \\ T_{K-2} &{} \quad \text {if}\;\; \frac{K-3}{K-2} < X_{i} < 1, \\ T_{k} (2 \le k \le K-3) &{} \quad \text {if} \;\; \frac{k-1}{K-2} < X_{i} \le \frac{k}{K-2}, \\ T_{1} &{} \quad \text {if} \;\; 0 < X_{i} \le \frac{1}{K-2}, \\ T_{0} &{} \quad \text {if} \;\; X_{i}=0, \\ \end{array} \right. \end{aligned}$$

where \(X_{i}\) is the percentage of ‘1’s within a certain amount of random numbers. Parameter \(K\) is very significant for the dividing process because it has a great impact on the variety and reliability of responses. A larger value of \(K\) increases the variety of responses, but is anticipated to make reliability worse. This is because the smaller range of \(X_{i}\) (i.e., the larger value of \(K\)) we define, the more RS latches are distinguished into different types before/after temperature or voltage changes. This leads to large error bits of responses. Therefore, a large size of redundant data is necessary for ECC. Further, a larger value of \(K\) causes a larger bit length of responses, which increases the area size of peripheral circuits (e.g., flip-flops for storing responses). Hence we should determine the appropriate value of \(K\) through experiments using LPUF implementations.

Next, the labeling process determines unique values \(L_{k}\) for each type of \(T_{k}\), where \(0 \le k \le K-1\). In the previous method corresponding to \(K=3\), the unique values were just simply labeled as 00/11/10 according to the RS latches outputting ‘0’s, ‘1’s and random numbers, respectively. When the proposed method is used (i.e., \(K>3\)), the unique values \(L_{k}\) can be labeled in various ways. Figure 3 shows a proposed method of labeling the unique value of \(L_{k}\) for each \(T_{k} (0 \le k \le K-1)\) in various \(K\) settings \((3 \le K \le 16)\). We will verify if this labeling process is suitable for LPUFs based on the proposed method. This labeling is principally based on the binary represents, where the unique value corresponding to \(T_{k}\) is simply labeled as \(k (0 \le k \le K-1)\). The reason why we use binary represents is that this simplicity results in almost no additional increase in the design cost to decide the labeling way. The naive binary represents are, however, not suitable for the labeling of unique values because PUF performances such as uniqueness and uniformity are not close to ideal (\(\approx 0.50\)).

If we use the naive binary represents, the Hamming weight of a unique value for \(T_{K-1}\) is not \(\lceil \log _{2} K \rceil \) except when \(\lceil \log _{2} K \rceil = \log _{2} K\) (i.e., \(K=4,8,16\)). When \(K=6\), for example, the unique values \(L_{0}\) and \(L_{K-1} (=L_{5})\) for \(T_{0}\) and \(T_{K-1} (=T_{5})\) are ‘0b000’ (\(k=0\)) and ‘0b101’ (\(k=5\)), respectively. As the number of these two types of RS latches (i.e., fixed latches) is almost the same in all implemented RS latches, the proportion of ‘1’s in the response bits \(RES\) (i.e., uniformity) is approximately 0.33 (\(\approx 2/6\)), which is smaller than ideal 0.5. Hence the unique value for \(T_{K-1}\) should be \(2^{\lceil \log _{2} K \rceil } -1\) (e.g., ‘0b111’ when \(K=6\)). The method of labeling described in Fig. 3 satisfies the above-mentioned conditions by simply eliminating the middle range of binary represents.

Fig. 3
figure 3

Labeling method for unique value \(L_{k}\) corresponding to type \(T_{k}\)

The reason why we regard the labeling process as important is that, if the labeling method is not appropriate, the entropy derived from PUFs becomes lower or the reliability of PUF responses becomes worse, which increases ECC costs. Further, there are various methods of labeling, and the Gray code seems to be an effective labeling method. The Gray code realizes high tolerance to noise, i.e., high reliability of PUF responses. However, the uniqueness becomes lower due to the same reason as the naive binary represents, as mentioned before. This is why we use the labeling method as shown in Fig. 3.

We explain the dividing and labeling processes for the specific example of when \(K=6\) in Fig. 4, as follows. If \(\mathrm{LATCH_{0}}\), \(\mathrm{LATCH_{1}}\), \(\mathrm{LATCH_{2}}\) and \(\mathrm{LATCH_{{ N}-2}}\) include 175, 1,024, 686 and 850 ‘1’s in a data stream of 1,024 bits, \(X_{0}\), \(X_{1}\), \(X_{2}\) and \(X_{N-2}\) are approximately 0.17, 1, 0.67 and 0.83, respectively. \(\mathrm{LATCH_{{ N}-1}}\) has no ‘1’s in the data stream, so \(X_{N-1}\) is 0. The proposed method for \(K=6\) classifies RS latches into six types according to the range of \(X_{i}\): \((T_{0})\) \(X_{i}=0\), \((T_{1})\) \(0 < X_{i} \le 0.25\), \((T_{2})\) \(0.25 < X_{i} \le 0.50\), \((T_{3})\) \(0.50 < X_{i} \le 0.75\), \((T_{4})\) \(0.75 < X_{i} < 1\) and \((T_{5})\) \(X_{i}=1\), respectively. According to the labeling method in Fig. 3 for \(K=6\), \(L_{0}\), \(L_{1}\), \(L_{2}\), \(L_{3}\), \(L_{4}\) and \(L_{5}\) are ‘0b000’, ‘0b001’, ‘0b010’, ‘0b101’, ‘0b110’ and ‘0b111’, respectively. \(R_{i}\), the unique value for \(\mathrm{LATCH}_{i}\) (\(0 \le i \le N-1)\), is shown in Fig. 4. Finally, our LPUF generates a \(3N\)-bit response.

Fig. 4
figure 4

Example of the proposed LPUF with the dividing and labeling processes (\(K=6\))

Next we discuss the way of implementing the proposed method, consisting of the dividing and labeling processes. The proposed method has to be implemented on a co-processor alone if our proposed LPUF is to be implemented as a pure-ASIC design. This dedicated circuit on ASICs, however, causes additional overhead in the circuit area for the PUF implementation. Note that embedded systems consist not only of a co-processor with a PUF circuit, but also a microprocessor, ROM, RAM, etc. We therefore assume that a software approach enables us to realize both processes. Output data from RS latches are stored in the RAM and processed by the microprocessor. This approach does not need additional hardware resources, but needs a slight increase in ROM code size. However, this software approach might lead to serious security threats such as response eavesdropping on the microprocessor or the RAM. Even the hardware approach, implementing the proposed method on the co-processor, might face the same threat due to physical analysis, e.g., side channel attacks [12]. Concrete ways of implementing the proposed method and their security evaluation are very important and need to be discussed in detail, and this is included in future work.

Section 5 implements LPUFs with multiple RS latches into ASIC chips. In Sect. 6, we discuss the appropriate value of \(K\) through experimental results using the ASIC chips. We also evaluate the performances of LPUFs such as the variety and reliability of responses by actually generating responses according to the proposed method.

5 Implementation and setup

5.1 ASIC implementation

We fabricated LPUFs on 73 ASIC chips using the Fujitsu 0.18-\(\upmu \)m CMOS process (CS86 technology [19]) in order to evaluate LPUFs with the proposed method. An RS latch was custom-designed as a hard-macro in the process of designing an IC mask layout. The purpose of this design is to equalize wire lengths between the cross-coupled NAND gates shown in Fig. 1. This enables the RS latch to enter a metastable state more readily and improve the probability of the RS latch outputting random numbers. We implemented an LPUF with 256 RS latches on a chip by an automatic placement of the 256 instances of the hard-macro. RS latches in our chips do not include flip flops (FFs) in front of the two NAND gates [11, 35] in order to reduce circuit area size. The 73 chips were embedded in DIP-28 non-sealed packages. Note that in fact we fabricated 80 ASIC chips, of which 73 chips work correctly. The other seven chips have problems concerning the bonding wires, which are disconnected or short-circuited as a result of the non-sealed packages for other studies (e.g., side channel analysis). The rated supply voltage range of the chips is \(1.80 \pm 0.15\) V.

5.2 Measurement setup

We setup the experimental evaluation system shown in Fig. 5. This system consists of two boards: a custom-made expansion board with six sockets for fabricated chips, and a Spartan-3E (SP3E) starter kit board [34] with a Xilinx SP3E FPGA (XC3S500E-4FG320C). The expansion board can evaluate six fabricated chips at the same measurement time. A complex programmable logic device (CPLD) was implemented on the expansion board, allowing us to select one target chip out of the six chips. The core voltage of the chips can be changed by 0.01 V using an external stabilized power supply. The starter kit board possesses peripheral circuits for data acquisition processes such as a Digital Clock Manager (DCM), a block RAM, an RS232C module and a SD write module. A 50-MHz clock signal generated by an oscillator on the SP3E board was applied to the DCM primitive yielding a 2.5-MHz clock signal that was applied to the ASIC chips. The two boards were connected with user I/O interfaces by a connector cable. The clock signal was provided separately through a SMA cable and port from the SP3E to the expansion board in order to prevent signal degradation. A micro-SD adapter and card were also connected to the SP3E board to store output data from the chips.

Fig. 5
figure 5

Experimental evaluation system

The data acquisition process is as follows. When the RS232C module receives a start command from a user PC, the module sends a start signal to the control (CTRL) module. The CTRL module sends a signal \(\mathrm{SEL}_\mathrm{skt}\) to a 6-1 multiplexor (MUX) in order to select one socket. It also sends a signal \(\mathrm{SEL}_\mathrm{lat}\) to a 256-1 MUX in the chips to select a target RS latch. First, \(\mathrm{SEL}_\mathrm{skt}\) and \(\mathrm{SEL}_\mathrm{lat}\) are set to one and zero, respectively. This means that \(\mathrm{LATCH_{0}}\) in the chip on socket 1 is selected for measurement. The CTRL module measures twenty-one 1,024-bit (= 21,504-bit) output streams from \(\mathrm{LATCH_{0}}\) in our evaluation. \(\mathrm{SEL}_\mathrm{lat}\) is incremented by 1 from 0 to 255 in order to obtain output streams from all 256 RS latches. After obtaining all data from the chip, \(\mathrm{SEL}_\mathrm{skt}\) is incremented by 1 from 2 to 6. During this process we evaluate 73 LPUFs implemented on 73 ASIC chips. The output stream data is stored in the block RAM through an FF, sent to the SD write module, and written into the micro-SD card. The PC can obtain the data via the micro-SD card. In our evaluation, software on the PC provides the dividing and labeling processes rather than this being done on the chips. We consider that the technique for the processes does not influence PUF performance because this performance depends just on the output of the RS latches itself.

Next, Sect. 6.1 evaluates the appropriate value of \(K\) in the proposed method in consideration of the tradeoff between variety and reliability of LPUF responses. Later, in Sect. 6.2 we evaluate the following performance-related metrics (reliability, uniqueness, uniformity and bit-aliasing) [23] of LPUFs when changing both voltage and temperature. We chose the standard voltage of 1.80 V and the maximum allowed ASIC chip rated voltages of 1.65 and 1.95 V. We also chose the room temperature of 27\(^\circ \)C and set the minimum and maximum allowed temperatures of \(-20^\circ \) and 60\(^\circ \)C in a thermostatic chamber. Only the expansion board was put in the thermostatic chamber, while the other experimental items were outside it throughout the experiment. The temperature change does not impact the peripheral circuits and does not cause data garbling, which enhances the confidence of our experimental results.

6 Performance evaluation

6.1 Evaluation of the proposed method: appropriate value of \(K\)

This section evaluates the appropriate value of \(K\) according to two metrics: the response error rate (related to reliability) and the variety of responses.

Figure 6 shows the response error rate in each \(K\) at each operating condition \(c\) as shown in the upper part of Fig. 6. The response error rate is defined as follows with the notation summarized in Table 1. We extract a reference response (\(\mathrm{RES}_{i}^{K}\)) from the \(i\)th ASIC chip (\(1 \le i \le w\), \(w = 73\) in this paper) in normal operating conditions (room temperature of 27\(^\circ \)C and a standard supply voltage of 1.80 V) when setting \(K (3 \le K \le 16)\). Similarly, the response (\(\mathrm{RES}_{i}^{\prime K,c}\)) is extracted at an operating condition \(c\). Then, \(m\) samples (\(m = 20\) in this paper) of \(\mathrm{RES}_{i}^{\prime K,c}\) are collected. Here, \(\mathrm{RES}_{i,t}^{\prime K,c}\) is the \(t\)th (\(1 \le t \le m\)) sample of \(\mathrm{RES}_{i}^{\prime K,c}\). The average of error bits for the parameter \(K\) and the operating condition \(c\) (\(\mathrm{AEB}^{K,c}\)) is defined as follows:

$$\begin{aligned} \mathrm{AEB}^{K,c}&= \frac{1}{w \cdot m} \sum _{i=1}^{w} \sum _{t=1}^{m} \mathrm{HD}_{i,t}^{K,c}, \end{aligned}$$
(1)

where \(\mathrm{HD}_{i,t}^{K,c} = \mathrm{HD} \{ {RES_{i}^{K}, RES_{i,t}^{\prime K,c}} \}\), and \(\mathrm{HD}\{x, y\}\) is the Hamming distance between variable \(x\) and \(y\). Our next interest is the response error rate (\(\mathrm{RER}^{K,c}\)), which is defined as follows:

$$\begin{aligned} \mathrm{RER}^{K,c} = \mathrm{AEB}^{K,c}/\mathrm{RES}_\mathrm{bit}^{K}, \end{aligned}$$
(2)

where \(\mathrm{RES}_\mathrm{bit}^{K}\) is the number of response bits obtained from 256 RS latches for \(K\), this being calculated from \(\lceil \log _{2} K \rceil \cdot 256\).

Basically, from Fig. 6, a larger value of \(K\) gives a slightly larger response error rate (i.e., a lower reliability). An unexpected positive result is that the response error rate does not increase dramatically as the value of \(K\) increases. This is because the parameter \(K\) only has an impact on random latches (\(T_{1} \sim T_{K-2}\)) and not on fixed latches (\(T_{0}\) and \(T_{K-1}\)). The average number of random latches is just 36 of the total implemented 256 latches in a fabricated chip.

It is desirable that \(\mathrm{RER}^{K,c}\) be less than 0.15 assumed in [4] for a reasonable size of redundant data in ECC. \(\mathrm{RER}^{K,c}\) is less than 0.15 for all values of \(K\) from 3 to 16 according to Fig. 6, which is the reason why the value of 16 is appropriate for \(K\) in our LPUF in terms of the reliability of responses. However, some LPUFs implemented on different types of CMOS process might include many random latches. In that case, excessively large values of \(K\) should not be used since \(\mathrm{RER}^{K,c}\) is anticipated to increase, which leads to large costs for ECC.

Fig. 6
figure 6

Response error rate \(\mathrm{RER}^{K,c}\) for \(K\) and \(c\). The decrease between \(K=4\) and 5 is due to the value of \(\mathrm{RES}_\mathrm{bit}^{K}\) increasing from 512 to 768. The decrease between \(K=8\) and 9 is for the same reason

Table 1 Notation summary for this paper

Figure 7 shows the entropy of responses with respect to \(K = 3, \ldots ,16\), which contains three graphs: (1) the ideal upper bound on Shannon entropy of responses, (2) the experimental Shannon entropy and (3) the entropy based on the mutual information of responses. These graphs are experimentally calculated based on responses derived from the 73 fabricated LPUFs.

First, we explain how the graphs (1) and (2) are constructed. Let the ratios of the RS latches numbered as \(\mathrm{LATCH_{{ i}}}\) and classified as types (\(T_{0} \sim T_{K-1}\)) be \(P_i(T_{0})\), \(\ldots \), \(P_i(T_{K-1})\), respectively. Assuming that each RS latch is independent, the Shannon entropy derived from \(\mathrm{LATCH_{0}}\) to \(\mathrm{LATCH_{255}}\) are given as

$$\begin{aligned} \sum _{i=0}^{n-1} E_{i}, \end{aligned}$$
(3)

where \(n=256\) and \(E_{i}\), the Shannon entropy derived from \(\mathrm{LATCH_{{ i}}}\), is defined as

$$\begin{aligned} E_{i}&= -\sum _{j=0}^{K-1} P_i(T_{j}) \log _{2}P_i(T_{j}). \nonumber \end{aligned}$$

The graphs (1) and (2) are given by Eq. (3). The graph (1) assumes that the number and ratio of random latches are 36 and 0.14 (\(\approx 36/256\)) strictly on every chip, respectively. This value of 36 comes from the average number of random latches in our LPUFs on ASIC chips. This ideal upper bound is also based on the following two requirements: (1) the numbers of random latches belonging to all types (\(T_{1} \sim T_{K-2}\)) are equally \(36/(K-2)\), so \(P_i(T_{1})= \ldots =P_i(T_{K-2}) = \{36/(K-2)\}/256\), (2) the numbers of fixed latches belonging to \(T_{0}\) and \(T_{K-1}\) are equally \((256-36)/2=110\), so \(P_i(T_{0})= P_i(T_{K-1}) = 110/256\).

Fig. 7
figure 7

Estimations of entropy of responses: (1) the ideal upper bound on Shannon entropy of responses, (2) the experimental Shannon entropy, (3) the entropy based on the mutual information of responses

Next, we explain the graphs (3) as follows. The graph (2) assumes that the responses are completely reliable; they are identical in both enrollment and reconstruction phase. Actually, however, the responses have some error bits (i.e., noise) due to environmental fluctuations, therefore some bits have to be sacrificed as redundancy bits for error correction. In order to estimate the entropy bits that actually survive the noise, we calculate \(\mathbf{I}(X;Y)\): the mutual information between responses obtained in enrollment (27\(^\circ \)C/1.80 V, normal condition), \(X\), and in reconstruction (\(-20^\circ \)C/1.65 V, worst condition, see Fig. 8), \(Y\). This estimation is based on the method introduced in [15], whose core idea is presented in [25].

From Fig. 7, \(K\) increases with the difference between the experimental results and the upper bound. This means that a larger value of \(K\) cannot necessarily result in a much larger variety of responses. For example, the experimental Shannon entropy and the entropy based on the mutual information increase approximately 62 and 45 bits from \(K=3\) to \(8\), in contrast, it increases only 16 and 20 bits from \(K=8\) to \(16\), respectively. This is because the aforementioned requirement (1) is not satisfied, that is, there are a lot of random latches outputting random numbers whose proportion of ‘1’s is very low or high, such as \(T_{1}\) or \(T_{K-2}\). Our LPUFs can generate responses with maximum variety by setting \(K=16\) since the response error bit is relatively small with a larger value of \(K\).

The entropy of responses based on the mutual information is estimated to be 220 bits when LPUFs using 256 RS latches generate responses eliminating 36 random latches. The LPUFs based on the previous method, i.e., the proposed method for \(K=3\), generate responses with approximately 314 bits of entropy, which is about 1.42 times as large as 220 bits. In contrast, the LPUFs using the proposed method for \(K=16\) generate responses with approximately 379 bits of entropy, which is about 1.72 times as large as 220 bits and about 1.21 times as large as 314 bits. Our proposed method therefore dramatically increases the Shannon entropy of responses, i.e., the variety of responses.

Note that appropriate values of \(K\) depend on the methods of implementing RS latches and the process technologies of ASIC chips. Hence the values of \(K\) should be carefully decided in consideration of the tradeoff between reliability and variety of responses.

6.2 Evaluation of basic PUF performance

This section evaluates our LPUFs in terms of the basic performance defined in [23], i.e., reliability, uniqueness, uniformity and bit-aliasing at \(K=3,8\), and \(16\). The LPUF based on the proposed method gives the results for reliability and the other three metrics shown in Figs. 8 and 9, respectively. Our LPUF with 256 RS latches generates a \(\lceil \log _{2} K \rceil \cdot 256\)-bit response.

Fig. 8
figure 8

Reliability at various conditions

Fig. 9
figure 9

PUF performances under normal conditions (\(27^\circ \)C/1.80 V)

In the reliability evaluation, the reliability of responses is evaluated under the condition that the supply voltage and environmental temperature are changed within the rated voltage range of the ASIC chips (1.65, 1.80, 1.95 V) and the allowed temperature range of the thermostatic chamber (\(-20^\circ \), 27\(^\circ \), 60\(^\circ \)C). Different from Fig. 6, this reliability evaluation focuses not on the average but on the histogram of response error rates when \(K=3,8\), and \(16\). In this evaluation, one response (i.e., \(\mathrm{RES}_{i}^{K}\)) is generated as a reference at normal operating conditions (\(27^\circ \)C and 1.80 V), and the remaining 20 responses (i.e., \(\mathrm{RES}_{i,t}^{\prime K,c}\)) are generated for analysis at each condition \(c\) at \(K=3,8\), and \(16\) from \(i\)th ASIC chip. Figure 8 shows histograms of normalized Hamming distances (NHD) between the reference response and each repeated one (i.e., 20 \(\times \) 73(chips) = 1,460 elements). For chip \(i\) and sample \(t\), each data element of the reliability histogram is calculated as follows:

$$\begin{aligned} \mathrm{NHD}_{i,t}^{K,c} = \frac{\mathrm{HD} \{ {RES_{i}^{K}, \mathrm{RES}_{i,t}^{\prime K,c}} \}}{\lceil \log _{2} K \rceil \cdot 256}. \end{aligned}$$

Our LPUFs are the most susceptible to conditions under the low temperature of \(-20^\circ \)C and the low supply voltage of 1.65 V. Even under this condition and \(K=16\), the average and maximum of NHD (i.e., error rate) are approximately 0.064 and 0.096, respectively. These error rates are much less than the 0.15 assumed in [4] for reliable responses based on a Fuzzy Extractor [6] with a reasonable size of redundant data. Hence our result shows that the LPUFs implemented on ASIC chips with our proposed method yields highly reliable responses even under environmental fluctuations.

The uniqueness evaluation generates a total of 73 responses using all 73 ASIC chips (one response per chip). Figure 9(I-a), (I-b) and (I-c) show histograms of NHD between every combination of two responses, i.e., \({}_{73}\mathrm {C}_{2}=2,628\) combinations at \(K=3,8,16\), respectively. The NHDs between the responses of two arbitrary LPUFs at \(K=3\), 8 and 16 are approximately 0.489, 0.497 and 0.497, respectively. The ideal NHD at \(K=8,16\) is 0.5, so our LPUF gives responses with a high level of uniqueness. In contrast, the ideal NHD at \(K=3\) is around 0.444 because ‘10’ is not used for a unique value (see Fig. 3). This is why the NHD at \(K=3\) is a little smaller than the others. The NHD is, however, a little larger than the ideal 0.444 because the average number of random latches is only 36 in our LPUFs, which is smaller than 85 (\(\approx \) 256/3). Consequently, most of the 2-bit unique values are ‘00’ or ‘11’, so the NHD approaches 0.5.

The uniformity evaluation also generates 73 responses using all 73 ASIC chips, i.e., 73 data elements. Figure 9(II-a), (II-b) and (II-c) show the uniformity: how uniform the proportion of ‘0’s and ‘1’s is in the response bits of an LPUF at \(K=3,8\), and \(16\), respectively. For our LPUFs on ASIC chips, the averages of uniformity at \(K=3\), 8 and 16 are approximately 0.486, 0.485 and 0.484, respectively. Since the ideal uniformity is 0.5 for truly random PUF responses [23], our LPUFs satisfy the requirement for uniformity. However, we can see two isolated data elements around 0.34 and 0.67 in three uniformity figures. This is because two particular chips have more one-typed fix latches (\(T_{0}\) or \(T_{K-1}\)) than the other-typed ones (\(T_{K-1}\) or \(T_{0}\)).

The bit-aliasing evaluation measures the difference in the proportion of ‘0’s and ‘1’s in the 73 \(R_{i}\)’s extracted, respectively from the 73 LPUFs corresponding to \(\mathrm{LATCH}_{i} (0 \le i \le 255)\), i.e., 256 data elements. Figure 9(III-a), (III-b) and (III-c) show histograms of the proportion of ‘1’s at \(K=3,8\), and \(16\), respectively. The averages of bit-aliasing at \(K=3\), 8 and 16 are approximately 0.486, 0.485 and 0.484, respectively.

The ideal bit-aliasing is also 0.5 because, if the bit-aliasing is close to 0 or 1, it means that different ASIC chips may generate nearly identical PUF responses [23]. Hence our LPUFs satisfy the requirement for bit-aliasing.

If a user needs a secure 128-bit AES key, he can obtain it as an output from a 128-bit hash function (e.g., SPONGENT-128 [3]), the input of which is the response of an LPUF after being corrected by ECC. If the uniqueness, uniformity and bit-aliasing of the responses are not close to 0.5 (i.e., close to 0 or 1), an attacker may be able to predict the responses and even the output of the hash function (i.e., 128-bit AES key). This is why we evaluate these metrics of PUF responses themselves.

6.3 Cost

Table 2 indicates the processing time and the gate count of our LPUF on a chip, shown in Fig. 5.

The processing time is estimated around 105 ms, this being the total time taken to extract a 1,024-bit output stream from each RS latch. One way of improving the processing time is to reduce the bitstream length, which was 1,024 bits in our experiment. However, too short a length may result in misdividing, an inaccuracy of \(X_{i}\) corresponding to \(\mathrm{LATCH}_{i}\). For example, RS latches outputting a large number of ‘0’s and very few ‘1’s (i.e., \(T_{1}\)) might be detected not as random latches, but as fixed latches (i.e., \(T_{0}\)). This misdividing leads to a decrease in reliability of responses, so our LPUFs make a tradeoff between reliability and processing time.

Table 2 Processing time and gate count of our LPUF

The gate count is obtained by synthesizing the LPUF on the Fujitsu 0.18-\(\mu \)m CMOS process [19] with the Design Compiler 2003.03. Note that one gate is equivalent to a 2-input/1-output (2-to-1) 1-bit NAND gate. The total gate count of the LPUF is about 1.2K gates. This cost is necessary for extracting constant 379 bits of entropy. We consider, therefore, that our LPUF is sufficiently small to be implemented in embedded systems. Note that our proposed method requires additional costs for multiple enrollment and reconstruction measurements. Here, we do not consider these costs since the concrete way of implementing the proposed method must be careful, as mentioned in Sect. 4.

7 Conclusion

We proposed a method of enhancing the variety and maintaining the reliability of responses from LPUFs. We focused on the information of the proportion of ‘1’s in the output stream from each random latch. The dividing process classifies implemented RS latches into \(K\) types according to the proportion of ‘1’s in the output stream. The labeling process defines the unique values generated by \(K\)-type RS latches. According to our experiment with 73 fabricated ASIC chips, LPUFs with 256 RS latches can generate responses with 379-bit entropy based on the proposed method for \(K=16\), considering their errors caused by environmental fluctuations. This is about 1.72 times as large as the 220-bit entropy achieved by a conventional method of eliminating random latches, and is approximately 1.21 times as large as 314-bit entropy achieved by our previous method in [35], corresponding to the proposed method for \(K=3\). Even in the worst-case condition (\(-20^{\,\circ }\)C/1.65 V), the error rate of responses is 0.096. This means that our LPUFs have high robustness (reliability) against both temperature and voltage variation. Our LPUFs also satisfy the basic requirements of PUFs such as uniqueness, uniformity and bit-aliasing.

Future work could include a discussion of the concrete ways of implementing the proposed method and their security evaluation.