1 Sensor Nodes Empowered by SoC FPAA Devices

Analog computing has matured, fueled by the emergence of large-scale Field-Programmable Analog Array (FPAA) devices (e.g., the SoC FPAA [11]), which generalize FPGAs. Physical computing [12], which includes analog computing, enables improvements in computational efficiency (speed and/or larger complexity) of ×1000 or more compared to digital solutions (as predicted by [27]), as well as potential improvements in area efficiency of ×100. Physical computing is now programmable and configurable (e.g., [11]). The rise of programmable and configurable analog techniques, integrated with digital processing, enables wide use of physical computing techniques that is not limited to a few analog IC design specialists [11, 12].

Figure 1 illustrates a wireless sensor node utilizing an FPAA. FPAA devices allow the user to investigate many physical computing designs within a few weeks. The alternative for one design would require years of IC design by potentially multiple individuals. The sensor node could classify (e.g., [11]) and learn (e.g., [15]) from original sensor signals, performing all of the required computation and operating the entire system in its real-world application environment. Embedded learning approaches, implemented in a single FPAA device, illustrate the small area and ultra-low power capabilities of configurable physical computing. Section 2 discusses the context-aware opportunities in FPAA architectures.

Fig. 1

Embedded configurable physical computation. (a) Physical computation in an embedded platform enables a sea of interacting analog and digital computation, providing significant computing resources from sensors through to the decisions to be communicated. (b) Overview picture of the recently published SoC FPAA device [11]. (c) A wireless sensor node using this FPAA device, heavily utilizing context-aware techniques. The data from these experimentally measured structures will guide further scaling efforts (size, energy consumed). One application for this sensor network would be ground-level monitoring of people, cars, trucks, machinery, or other elements through acoustic or MEMS vibration/accelerometer sensors. A second application would be a body-level sensing network, monitoring the behavior of knees, heart, and other internal organs through a combination of vibrational and acoustic sensors

Although low-power physical computing could have a huge impact on networks of autonomous sensor nodes, these FPAA-enabled nodes often require secure operation. Although FPAAs are a recent technology, widespread adoption of these devices will eventually require some level of security against malicious users. This discussion overviews low-power context-aware FPAA architectures (Sect. 2) and then addresses FPAAs as physical computing devices for low-power embedded applications (Sect. 3). The conversation moves to secure FPAA devices (Sect. 4), showing positive FPAA security attributes (Sect. 4.1) and addressing FPAA security issues (Sect. 4.2). FPAA devices can be used to investigate the security of analog/mixed-signal capabilities (Sect. 5), as well as be part of the resulting secure computation, such as implementing unique functions (Sect. 5.3). The final section summarizes the discussion and addresses remaining issues for secure ultra-low power embedded FPAA devices.

2 Low-Power Context-Aware FPAA Architectures

Many portable and wearable devices are constrained by their energy efficiency. Figure 2 illustrates the energy impact for cloud computation, on-device digital computation, and FPAA-assisted computation. The digital communication typically dominates the overall energy consumption [14].

Fig. 2

Comparison of cloud computation, on-device computation, and FPAA computation. For cloud computation and for on-device computation, we only consider the energy required for communication. All devices might have an RF radio; we consider just the part required for this core computation. For FPAA computation we include the entire device. If cloud computation were considered free, then cloud and on-device computation would appear of similar complexity. FPAA computation dramatically decreases the resulting on-device computation

Cloud-based computing moves the issues of real-time embedded computation (e.g., fixed-point arithmetic) to some faraway (and supposedly free) server programmed with MATLAB-style code. Computation done off-device is not seen and is considered effectively endless; eventually the resulting energy and infrastructure still have significant impacts. The host system still must constantly transmit and receive data through its wireless communication system to perform these computations. The network connectivity must have a minimum quality at all times; otherwise performance noticeably drops. One often assumes the cloud is nearly free for a small number of users; as the product scales to the consumer market, these assumptions can break down. Although, for a good wireless network, cloud and on-device digital computation require similar energy (at a 100 MMAC(/s) level), physical computation, such as in FPAA-empowered devices, enables a factor of ×1000 improvement in the overall power requirements.

These approaches enable ultra-low power physical computing, which in turn enables small devices capable of computationally intensive (> 10 MMAC(/s)) context-aware processing. This approach requires low-power components, operating continuously or frequently, that can decide when to wake up the expensive hardware components. A node requiring 100 μW average power could operate for several months on a single battery. These computing components are enabled by the ×1000 energy improvement (and ×100 area improvement) from FPAA classification algorithms. The always-on computation in the first stage must be physical computation, both because of its computational power and because of its close proximity to the sensor inputs.

The need for low average power consumption requires that higher-power devices, like wireless transceivers and even the embedded μP, be shut down most of the time. These devices should be active only in those rare cases where they are needed, such as when messages need to be passed between nodes. Similarly, high-power sensors and actuators (e.g., an acoustic speaker) need to be shut down except when they are being used. Without physical computation, the sensor node would be a simple, low-speed data acquisition node and likely could not stay under 1 mW given the power constraints of the embedded processor and wireless transceiver. The rest of the processing would probably still need to take place on some other digital system.
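The duty-cycling argument can be made concrete with a small power-budget calculation. The following is a minimal sketch, not a model of any measured node: the 100 μW budget comes from the text, the always-on figure uses the 23 μW reported in Sect. 3 for the SoC FPAA command-word classifier, and the 30 mW figure for an awake radio plus μP is purely an assumed illustrative value.

```python
def average_power_w(always_on_w: float, active_w: float, duty_cycle: float) -> float:
    """Average power when a high-power stage is awake a fraction of the time."""
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be in [0, 1]")
    return always_on_w + duty_cycle * active_w


def max_duty_cycle(budget_w: float, always_on_w: float, active_w: float) -> float:
    """Largest awake fraction that keeps the node within its average budget."""
    if budget_w <= always_on_w:
        return 0.0  # the always-on stage alone already exhausts the budget
    return min(1.0, (budget_w - always_on_w) / active_w)


BUDGET_W = 100e-6     # average power target quoted in the text
ALWAYS_ON_W = 23e-6   # always-on FPAA classifier (figure quoted in Sect. 3)
ACTIVE_W = 30e-3      # ASSUMPTION: radio + embedded processor while awake

duty = max_duty_cycle(BUDGET_W, ALWAYS_ON_W, ACTIVE_W)
print(f"maximum awake fraction: {duty:.4%}")
print(f"resulting average power: {average_power_w(ALWAYS_ON_W, ACTIVE_W, duty) * 1e6:.1f} uW")
```

Under these assumed numbers the high-power stage may be awake only about 0.26% of the time, which is why the always-on stage has to make the wake-up decision itself.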

Physical computing in context-aware architectures enables potential energy-harvesting opportunities. Most energy-harvesting devices supply ≈10 μW of power per cm² except in unusual environments. Figure 3b shows the device lifetime (due to average power consumption) for a single coin cell battery (0.1–0.5 Ah). A 10 cm² energy-harvesting device could supply 100 μW of average power, a manageable area for an embedded sensor node.
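The lifetime and harvesting figures follow from simple arithmetic, sketched below. The capacity range (0.1–0.5 Ah), the 100 μW average power, and the 10 μW per cm² harvesting density are taken from the text; the 3 V nominal cell voltage is an assumption (typical of lithium coin cells), and the sketch ignores self-discharge and converter losses.

```python
HOURS_PER_DAY = 24.0
CELL_VOLTAGE_V = 3.0  # ASSUMPTION: nominal coin-cell voltage, not given in the text


def lifetime_days(capacity_ah: float, cell_voltage_v: float, avg_power_w: float) -> float:
    """Idealized battery lifetime: stored energy divided by average power."""
    energy_wh = capacity_ah * cell_voltage_v
    return energy_wh / avg_power_w / HOURS_PER_DAY


def harvested_power_w(area_cm2: float, density_w_per_cm2: float = 10e-6) -> float:
    """Average harvested power for a given harvester area."""
    return area_cm2 * density_w_per_cm2


for capacity_ah in (0.1, 0.5):
    days = lifetime_days(capacity_ah, CELL_VOLTAGE_V, 100e-6)
    print(f"{capacity_ah} Ah cell at 100 uW: about {days:.0f} days")

print(f"10 cm^2 harvester: {harvested_power_w(10.0) * 1e6:.0f} uW")
```

With the assumed 3 V cell, a 100 μW node lasts roughly 125 days on 0.1 Ah and roughly 625 days on 0.5 Ah, consistent with the "several months" estimate in Sect. 2, and a 10 cm² harvester matches that same 100 μW load.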

Fig. 3

Small sensor nodes require energy-efficient computation, computation enabled through physical computing approaches. Device lifetime available for wireless sensor nodes. The graph shows the opportunity for energy-harvesting systems at the size of 1 and 10 cm² form factors; most energy-harvesting systems output 10 μW of power per cm², with the exception of solar cells in direct sunlight in a desert

3 FPAAs as Physical Computation Devices

FPAA devices are our vehicle for discussing ultra-low energy computing. FPAA devices allow the user to investigate many physical computing designs within a few weeks of time. The alternative for one design would require years of IC design by potentially multiple individuals. These FPAAs compare favorably against custom designs, and unlike FPGA designs, FPAA architectures are open to the academic community.

Floating-gate (FG) devices empower FPAA by providing a ubiquitous, small, dense, nonvolatile memory element [19] (Fig. 4). A single device can store a weight value, compute signal(s) with that weight value, and program or adapt that weight value, all in a single device available in standard CMOS [17, 18]. The circuit components involve FG-programmed transconductance amplifiers and transistors (and similar components) with current sources programmable over six orders of magnitude in current (and therefore time constant) [24]. Devices not used are programmed to require virtually zero power. FG devices enable programming around device mismatch characteristics, enabling each device in a batch of ICs to perform similarly.

Fig. 4

Circuit diagram for basic FG device. The device can store a nonvolatile charge (Q), enables feedforward computation of functions involving Q, can program, and can adapt Q, all in a compact device structure. Multiple transistors sometimes ease the programming infrastructure for generic programming architectures

The SoC FPAA [11] ecosystem represents a device-to-system, user-configurable platform. An SoC FPAA implemented a command-word acoustic classifier utilizing hand-tuned weights, demonstrating command-word recognition in less than 23 μW of power while utilizing standard digital interfaces (Fig. 5) [11]. Multiple analog signal processing functions are a factor of ×1000 more efficient than digital processing, such as Vector-Matrix Multiplication (VMM), frequency decomposition, adaptive filtering, and classification (e.g., [11] and references within). Embedded classifiers have found initial success using this SoC FPAA device for acoustic classification and learning (e.g., [11, 15]) at 10–30 μW average power consumption. The circuits compute from sensor to classified output in a single structure, handling all of the initial sensor processing and early-stage signal processing. This ecosystem will scale with newer ICs built to this standard, as expected of all future FPAA devices [20].

Fig. 5

High-level picture of low-power classification using an acoustic classifier for command-word classification. This approach enables computation from sensor to classified output in a single structure, handling all of the initial sensor processing and early-stage signal processing. The SoC FPAA device classified a word, such as "dark," from the TIMIT database phrases. This analog computation (< 23 μW) is radically different from the class of expected analog operations

This new capability creates opportunities, but also creates design stress to address the resulting large co-design problem. The designer must choose the sensors as well as where to implement algorithms between the analog front end, analog signal processing blocks, classification (mixed-signal computation) which includes symbolic (e.g., digital) representations, digital computation blocks, and resulting μP computation. Moving heavy processing to analog computation tends to have less impact on signal line and substrate coupling to neighboring elements compared to digital systems, an issue often affecting the integration of analog components with mostly digital computing systems. Often the line between digital and analog computation is blurred, for example, for data converters or their more general concepts, analog classifier blocks that typically have digital outputs. The digital processor will be invaluable for bookkeeping functions, including interfacing, memory buffering, and related computations, as well as serial computations that are just better understood at the time of a particular design. Some heuristic concepts have been used previously, but far more research is required in building applications and the framework of these applications to enable co-design procedures in this space.

Analog computation [12] becomes relevant with the advent of FPAA devices, particularly the SoC FPAA devices [11]. Figure 6 shows a high-level view of the demonstrated infrastructure and tools for the SoC FPAA, from FG programming, device scaling, and PC board infrastructure, through system-enabling technologies such as calibration and built-in self-test methodologies, to high-level tools for design as well as education (e.g., [21]). This framework is essential for application-based system design using physical systems, particularly given modern comfort with structured and automated digital design from code to working application. This framework utilizes abstractions in a mixed analog–digital framework [13], as well as high-level tools that enable designers who are not device or analog circuit specialists to use these approaches effectively [9]. High-level design tools implemented in Scilab/Xcos enable automated compilation to working FPAA hardware [9]. These tools give the user the ability to create, model, and simulate analog and digital designs. Physical algorithms may show improvements beyond just energy efficiency for digital computing machines [12].

Fig. 6

The SoC FPAA approach consists of key innovations in FPAA hardware, innovations and developments in FPAA tool structure, as well as innovations in the bridges between them. One typically focuses on what circuit and system applications can be built on the FPAA platform, but every solution is built up from a large number of components ideally abstracted away from the user

4 Embedded FPAA Security Concerns

The FPAA opportunities presented in the last section, particularly the ultra-low energy and small size characteristics, require consideration of how to make these embedded nodes secure. This section discusses multiple opportunities toward secure FPAA devices. Sections 4.1 and 4.2 discuss the positive characteristics and security issues, in turn, for the FPAA device family. Sections 5.1 and 5.2 discuss the aspects of using FPAA devices to develop training and procedures for deconstructing custom ICs, giving a sense of the security of a compromised FPAA device. Section 5.3 discusses using the FPAA infrastructure to build a unique function for security, potentially for securing the given FPAA device.

4.1 Positive FPAA Security Attributes

The FPAA structure has a number of good security aspects. The FPAA uses FG devices to store the device state without any SRAM loading vulnerability, particularly from an external IC. Once the FG values on the chip are programmed and loaded, the FPAA code is secure, unless one can scan out the states of the FG elements. The programmed FG values of an IC will change minimally over the lifetime (e.g., 10-year rating) of the part. The programming code does not remain in the IC μP SRAM; it is used only for programming and is purged afterward. Analog values can be hard to measure without disturbing the values significantly, and digital computation can be encoded with analog computation and storage. Further, very low-power circuits are challenging to measure externally due to the low circuit currents (e.g., pA and nA). These transistors do not have enough current or field to generate light for measuring transistor behavior, and their external fields become very hard to measure.

In addition, the FPAA structure is a platform for creating secure applications. The SoC FPAA structure is a generic structure, openly published, and built from general components. None of the particular components are unknown or confidential. The IC layout says almost nothing about the programmed IC functions. The motivation to steal the knowledge of on-chip FPAA circuits is minimal. The infrastructure can measure the analog behavior at any given node in the FPAA. FPAAs allow for scanning every hardware node internal to the circuit (e.g., [11, 33]). If the core FG programming on the IC is verified, effectively as part of the calibration procedure and measurement [25], then the entire IC can be verified. Secure analog and digital code can be programmed in a secure space.

The IC could have intelligence, using internal signals and voltages, to choose to erase its contents. If tampering is suspected, the operating device could pull up on the tunneling voltage line(s) in an attempt to erase the previous operating code. The device parallel erase occurs from a combination of electron tunneling and reverse tunneling. The result leaves little chance of recovering any previous code even with a short erase cycle. One is more likely to pick up device mismatch patterns rather than anything of the previous code.

4.2 Addressing FPAA Security Issues

FPAA devices are far from safe from a potential malicious agent, even with a number of good starting properties. For example, the current FPAA devices do not have encryption and related security on the input control of the device. If an actor could connect to the particular control connections, even if the IC pins are disconnected or disabled, they could get direct control of the device and programming infrastructure. Future FPAA devices will have encryption on the control structure, particularly as they move to a wider user community. The encrypted access can make use of a PUF from the particular FPAA, such as the approach shown in Sect. 5.3. Encryption is a straightforward solution used on secure FPGA devices. This section will consider the resulting issues for these devices.

Figure 7 illustrates possible security issues and types of attacks for an embedded system built with an SoC FPAA device. FPAA attacks could happen by physical tampering with an existing device, as well as by electronic attacks through a communication port, such as a transceiver port. In a physical FPAA attack, the device is obtained, while avoiding the self-destruct sequence, in order to be explicitly deconstructed. If the internal code can be obtained, likely at considerable expense, one could potentially reconstruct the FPAA function. Mismatch-encoded functions would require additional computational and measurement structures. An alternate physical FPAA attack could use a compiled digital serial port to gain access to the digital control and resulting programming interface. When digital interfaces (e.g., SPI) are controlled by the processor, getting control of the processor is unlikely. A more likely situation is finding a way to stall the computation through a physical attack on the clock structure. Many systems are far less secure against physical tampering if the device has been obtained and any self-destruct/erase mechanism was somehow avoided. A more likely situation is a nonphysical attack through the transceiver interface into the IC. These can include attacks to gain control of the FPAA device in order to reprogram it, or constant attacks on a device to drain the node battery power.

Fig. 7

Possible security issues for an embedded system built with an SoC FPAA device. Some attacks could occur through the known communication path, such as through the wireless transceiver port, and other attacks could occur through direct physical access to the device

Low-energy computation opens application opportunities at 10 mW, 1 mW, and lower average power consumption, and yet the low power consumption constrains the system security capabilities. Embedded FPAA applications have limited digital memory because of the system cost. Network security is characterized in terms of classes of networked devices, summarized in Table 1 [6, 22, 26, 29, 31]. The SoC FPAA is a C0 device having only 32 kB total digital memory. Digital memory is expensive in terms of relative on-chip area, complexity, and energy dissipation. Many systems going forward might have less total digital memory, and many systems will not rise to the C1 memory level. FPAAs open up a whole range of C0 devices, devices many assume are impossible to secure over a network. Running a minimal OS and security code may exceed the rest of the system energy budget.

Table 1 Summary classification of IoT systems

So how do we build an ultra-low power secure IoT system? Part of the opportunity is coding systems outside of a minimal OS, consistent with the rest of the event-based FPAA μP code, as well as enabling a tight secure stack and security aspects in MSP430 assembly language. Digital FPAA event code is written in assembly language and encapsulated in graphical code for easy user reuse.

5 FPAAs for Investigating IC Validation

When a user has a programmed FPAA device, it looks like any other custom IC that performs one or a set of functions. Further, the IC layout says nothing about the actual device performance. If the user knows it is an FPAA device and has sufficient knowledge of its programming functions, they might have additional information to figure out the function; otherwise, all they have is the device to characterize.

FPAA devices become good test platforms to investigate how individuals might deconstruct a particular IC. FPAA devices allow for many reprogrammed circuits, so the approach can be repeated many times. In the following subsections, we will discuss two such cases. First we will overview the inspiration of this study, the Black Box (BB) exam at Caltech (CNS 182). Second, we will discuss how this approach was modified, in an academic setting enabled by FPAA devices, to Training IC Deconstruction.

5.1 Black Box (BB) Exam: CNS 182

Academic groups are not usually interested in deconstructing known ICs, and when they are, the details are rarely discussed. One particular exception I personally experienced, both as a student (1993) and a teacher (1994–1996), was the Black Box (BB) exam at Caltech (CNS 182). This particular exercise was the final exam for the second quarter (Winter quarter) of CNS 182, Analog VLSI and Neural Systems, between 1989 and 1996.

The exam consisted of a 2-h lab session followed by several days (4–5) to write up the results discovered during the lab session. The students in the class spent every week for two quarters measuring custom-built ICs, starting with transistors through small systems, using typical computer-controlled bench equipment. When the students arrived in the lab for the BB exam, a particular circuit with 3–5 pins (besides power (Vdd) and ground (GND)) was operating correctly in one possible mode. Typically the circuit was a single transconductance amplifier (TA) or a two-TA circuit with a couple of transistors, and was known to be somewhat related to course topics over the first two quarters. This circuit was part of a 40-pin chip custom fabricated for the course; the students did not have access to any layout information. At least one element was a bias, set by a potentiometer. No FG devices were used. In the end, roughly half of the students would correctly identify the circuit with various levels of experimental justification.

5.2 Training IC Deconstruction Using FPAA BB Approach

The BB experience was recreated between 2011 and 2012 using currently available FPAA devices. The FPAA enables investigating deconstructing circuits, by providing a structured platform to instantiate a large number of circuits and systems. Each case would look from the IC pins to be some custom IC device and could be tested accordingly. The deconstruction capabilities can be quantified for different amounts of IC knowledge, such as routing information or netlists. These techniques could be used to verify a desired circuit implementation, as well as search for any additional component that was placed in the circuit. The FPAAs used for these experiments were designed between 2007 and 2010 (e.g., [4, 33]); the results should directly extend to using the SoC FPAA devices.

A group of graduate student IC designers were trained through a set of six BB events (Table 2) over a 9-month timeframe to eventually deconstruct a custom-fabricated IC. This BB approach arose from the constant interaction between courses and research. One person designed, compiled, and experimentally characterized the design completely without the knowledge of others. The groups had no idea of the functionality of the circuits before they arrived in the lab. Each person on the student team was previously familiar with measuring the FPAA devices. Between events students developed additional tools to assist in deconstructing the IC design.

Table 2 Summary of FPAA black box experiments

Different events had different levels of information (Table 2). The first case paralleled the Caltech experience to get a baseline performance, but with roughly double the number of chip pins and number of components, and the students involved did not prepare before this starting exercise. The groups did a number of I–V measurements at the chip pins to identify the resulting circuit. In the second case, the groups had a switch list (Fig. 8c), similar in format to the SoC FPAA approach [24]. The groups made extensive use of the routing visualization tool, Routing Activity Tool (RAT), to uncover the resulting circuits. Whiteboard pictures document this solution approach. Figure 8b shows the expected demodulation circuit, which all groups found; the groups also found an unexpected extra oscillator that was explicitly added. In later cases the groups were given a form of netlist, compatible with the existing tools, for their analysis. All of the groups developed clustering algorithms to assist with grouping and identifying the resulting circuits. At each level, the speed to fully recognize and experimentally verify a particular circuit increased with the increasing circuit complexity.

Fig. 8

Illustration of the Black Box exam setup. (a) A student would arrive in the lab with a working device demonstrating some characteristic of the circuit. The task is to find the entire circuit, a circuit inside an integrated circuit with a few I/O pins, in a finite amount of time (e.g., 2 h). This experience has parallels to security issues when deconstructing an unknown analog or mixed-mode circuit. (b) A low-frequency signal demodulator is an example system (BB2) to deconstruct from the available data. Each of these components was built using available CAB components and routed into the FPAA infrastructure. Typical electrical engineers might predict such an architecture when faced with multiple components. If an additional component is sitting in this circuit, it might create confusion or might just be overlooked. (c) In BB2, the groups had the switch list programmed into the FPAA device. The switch list communicates the physical routing on the FPAA IC. The first two columns are the position in the x and y directions on chip. The third column is a log-encoded value for the current level; 1.8 is the value that programs an element as a switch
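The three-column switch-list format described in the caption (x position, y position, log-encoded current level, with 1.8 marking a switch) lends itself to a short parsing step before any routing visualization or clustering. The following is a minimal sketch under stated assumptions: the whitespace-separated text layout and the sample rows are hypothetical, and the log encoding of non-switch entries is left undecoded because the text does not define it.

```python
import math

SWITCH_CODE = 1.8  # third-column value that marks an element programmed as a switch


def parse_switch_list(text):
    """Parse rows of 'x y level' into (int, int, float) tuples, skipping other lines."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # ignore blank or malformed lines
        entries.append((int(fields[0]), int(fields[1]), float(fields[2])))
    return entries


def split_switches(entries):
    """Separate routing switches from elements programmed to a bias current level."""
    switches, biases = [], []
    for x, y, level in entries:
        if math.isclose(level, SWITCH_CODE):
            switches.append((x, y))
        else:
            biases.append((x, y, level))
    return switches, biases


# HYPOTHETICAL sample rows, for illustration only (not taken from a real device).
SAMPLE = """\
12 3 1.8
12 7 1.8
40 5 0.62
41 5 1.8
"""

switches, biases = split_switches(parse_switch_list(SAMPLE))
print("switch positions:", switches)
print("biased elements:", biases)
```

Separating pure routing switches from biased elements is one plausible first step toward the grouping that the students performed, since the switches define connectivity while the biased elements indicate where active circuits sit.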

The final goal was to extract and verify an entire custom IC developed by another group. A group of four Ph.D. students who were involved in the BB experiences, together with Dr. Hasler, spent three isolated days analyzing this IC. Although the promised information varied throughout, in the end, the group was given (approximate) delayered information extracted from the IC, not including n-type or p-type selections. After 3 days and 2 additional days to write the report, the group found all four interleaved DACs, although only one was fully populated. The group discovered an error in the VCO due to a misplaced GND line.

This process showed that FPAAs could be used to train individuals to deconstruct the circuitry on a particular device, and it provided important insights into securing a particular FPAA device. Nonvolatile analog FG storage makes discovering the internal code of a programmed device extremely difficult without huge expense. The approach showed some unique aspects of using physical computation related to security; the wider opportunities in physical computing [12] show these items are just scratching the surface of what is possible.

5.3 FPAAs for Unique Functions

FPAA IC devices are rich platforms for constructing unique functions, particularly for security. The FPAA device allows for the selection of many devices, devices that have mismatch specific to a particular IC and mismatch that can be selected and compiled into a particular circuit. The mismatch between pFETs for a FG device enables almost 1M mismatched components.

Unique functions and PUFs have been implemented in FPGAs (digital) [28, 37] and analog circuits [8, 32]. For example, [37] uses delay variability in the FPGA to create a specific code directly affected by the component variability. All of these functions are based on the mismatch of the resulting device, whether custom fabricated or compiled in the structure [36]. This FPAA approach is similar to the FPGA approach for making unique functions and PUF, in that a function is compiled on the device and utilized to create a unique output code for a particular input stimulus code.

Figure 9 shows an example of FPAA circuit for generating a unique function [16]. This approach utilizes the mismatch available in the FPAA circuit, mismatch we typically remove from the device. The structure yields a code for encryption of data, enabled by programming the desired code by the user. One use for the input code (stimulation) is the address of the FG elements to measure. The resulting outputs, scanned through shift registers available throughout the IC, would be thresholded to yield a digital code (Fig. 10). The FG elements would be programmed to bias the resulting code as desired, modulating the mismatch pattern. Typically one would program all elements to the same current to bring out the mismatch pattern (e.g., [11]). The programmed values would be retained for the operation of the FPAA IC, showing μV shift over a typical 10-year lifetime. The function could be compiled right into the rest of the circuitry, where implementation and routing of other circuits would obfuscate the resulting devices. This technique allows for an evolution of the codes through secure FG updates. If a code was suspected to be discovered, one could easily just move the sensing circuitry to an open circuit area. This unique function circuit may not have to be on the chip, but can be compiled onto a particular IC when needed [16]. If the IC is erased, knowledge of the PUF is erased except in the secure space originally used.
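The challenge-response flow described above (an input code selects FG addresses, the measured outputs are thresholded into a digital code) can be illustrated with a small simulation. This is a sketch under stated assumptions, not the circuit of [16]: device mismatch is modeled as a random fractional deviation in current, all elements are assumed to be programmed to the same 1 nA target, measurements are noise-free, and the threshold is taken as the median of the selected currents. All function names are hypothetical.

```python
import random
import statistics


def make_chip(num_devices, seed, sigma=0.1):
    """Simulated per-device mismatch (fractional current deviation) for one IC."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, sigma) for _ in range(num_devices)]


def measure(chip, address, target_current_a=1e-9):
    """Current read from one FG element programmed to a common target current."""
    return target_current_a * (1.0 + chip[address])


def response(chip, challenge):
    """Threshold the currents at the challenged addresses into a bit string."""
    currents = [measure(chip, address) for address in challenge]
    threshold = statistics.median(currents)
    return "".join("1" if current > threshold else "0" for current in currents)


NUM_DEVICES = 1024  # ASSUMPTION: small array for illustration
challenge = random.Random(0).sample(range(NUM_DEVICES), 64)  # addresses to measure

chip_a = make_chip(NUM_DEVICES, seed=1)  # two ICs with independent mismatch
chip_b = make_chip(NUM_DEVICES, seed=2)

code_a = response(chip_a, challenge)
code_b = response(chip_b, challenge)
differing = sum(a != b for a, b in zip(code_a, code_b))
print("chip A code:", code_a)
print("chip B code:", code_b)
print("bits differing between chips:", differing)
```

The same challenge applied to the same simulated chip always reproduces its code, while a chip with a different mismatch pattern yields a different code. In the real device, programming the FG elements to other values would bias this pattern, which this sketch does not model.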

Fig. 9

Example of generating unique functions for secure codes implemented in an SoC FPAA. Threshold voltage (ΔVT0) mismatch at a chosen location in nearly one million FG devices gives the resulting code

Fig. 10

Effective circuit diagram for the PVT analysis of the unique function circuit. The volatile switch line is set to Vdd for this circuit

6 Summary and Next Directions

Physical computing opens great opportunities in energy-constrained IoT environments while creating significant security challenges for these IoT devices. FPAA devices enable the large-scale deployment of physical computation, and yet, these FPAA-enabled nodes often require secure operation against malicious users. Low-power context-aware FPAA architectures enable a number of autonomous sensor nodes. FPAA devices have a number of positive security attributes and security issues. FPAA devices can be used to investigate security and be part of the resulting secure computation.

We want to summarize current issues for building and deploying secure ultra-low power embedded FPAA devices. These directions include:

  • Encrypt the control (and therefore programming) data stream, likely using a PUF circuit for the encryption code as part of the FPAA IC.

  • Develop ultra-small security framework in dedicated assembly code + mixed-signal classification that integrates with event-based μP operation.

Network traffic attacks on FPAA-based systems are likely to be a point of vulnerability, requiring building tables and metrics of proper and improper network activity and classifying the resulting responses [1, 2, 3, 5, 7, 10, 23, 30, 34, 35]. These functions must be done with as little computational energy as possible. The functions require minimal digital energy in parsing and creating these tables. Classification energy would be minimized using learning classifiers compiled on the FPAA infrastructure [15].

Security for ultra-low power embedded computing platforms based on FPAA devices is possible and is a space rich in potential research opportunities. The need for secure ultra-low power embedded computing platforms will likely only grow in the near future.