1 Introduction

Intelligent vehicles are making serious inroads into our daily lives. Of primary importance are technologies that promise to make vehicles safer and more convenient. Twenty years ago the state of the art in autonomous vehicles was illustrated by the results of the 2004 DARPA Grand Challenge: none of the many vehicles competing in the desert race came close to completing the mission of driving over 100 miles in an off-road setting (the top-scoring vehicle traveled only 7.5 miles). Clearly, the technology was not ready [1].

Just three years later the DARPA Urban Challenge showed that a specially designed and built vehicle could drive autonomously in a relatively simple setting (a mock-up town with no vulnerable road users present). Several vehicles completed the mission, and they did so without incident [2, 3].

Emboldened by the success of the 2007 DARPA Urban Challenge, Google decided to develop its own autonomous vehicles. Google hired talented researchers from universities such as Stanford and CMU, under the leadership of Prof. Thrun, and challenged them to produce a robot capable of driving autonomously on public roads in traffic [4]. Google’s example is illustrative of a popular approach undertaken by many research groups throughout the world: if you want an autonomous vehicle, you need a vehicle with a variety of sensors, each complementing the others, so that the vehicle can build an accurate picture of the world around it and move both swiftly and precisely while navigating a changing and often uncertain driving environment.

Another popular feature of the original Google approach, later improved and enhanced by its spin-off Waymo, is the use of high-definition maps to ease vehicle localization and navigation [5]. Such maps contain not only detailed elements of the road structure, e.g., the location of curbs and lanes, but also road infrastructure elements, e.g., traffic lights, signs and adjacent buildings. The availability of these elements in maps recorded as point clouds simplifies the driving task significantly, as anything not on the map can confidently be treated as an obstacle, static or dynamic, even if only a few lidar returns have been registered for it.

Researchers at Tesla are pursuing an arguably more challenging path toward autonomous vehicles: relying on cameras as the primary perception sensor, sometimes complemented by radar, but with no lidar at all [6]. Moreover, Tesla does not use pre-recorded high-fidelity maps to navigate its vehicles. The company argues that human drivers rely on their eyes and do not need maps to drive safely in all kinds of driving conditions, during the day and at night [7, 8].

Whether one belongs to the Google camp or the Tesla camp, the following question looms large: why can’t we yet do what humans do so readily? Humans not only have two eyes (no camera can yet match the capabilities of an eye, which is why equating an eye with a camera should only be done metaphorically). They also have a natural computer tuned by millions of years of evolution that computes according to yet-to-be-understood algorithms [9] and, moreover, can adapt existing algorithms to new tasks, learning new skills, often with little effort and remarkable ease.

In autonomous driving, a significant problem for decision making is the great variety of driving scenarios. Driving scenarios are classes of situations that may happen in the real world. Urban, suburban and rural settings are numerous and can come in very different shapes and forms. However, the classes of such settings are limited and manageable by a suitably designed decision-making system. Indeed, staying on the road in whatever lane is chosen and performing primitive maneuvers like moving left or right, without collisions, is basically all it takes to be a safe driver! Of course, the complexity of driving and the associated expansion of driving situations quickly grow once we include other objects on the road, signs, and traffic lights, but the basic primitives of driving remain the same. Thus, it may be argued that safe driving is a relatively easy human skill to acquire compared to, e.g., playing chess well, and this fact must have something to do not only with humans' excellent ability to recognize all kinds of contexts and generalize their driving accordingly but also with the simple nature of driving as a sequential perception-action problem; were it otherwise, few people would be able to start driving so quickly, after just a few hours of practice [10].

Perception problems continue to be the greatest challenge of driving automation; see, e.g., [11,12,13]. Populating the obstacle map and maintaining a reliable, easily computer-interpretable picture of the environment in the vicinity of the vehicle are key features of the perception system of current and future intelligent vehicles, including partially autonomous, active driver assistance systems such as Toyota Safety Sense [14] or Toyota Guarding [15]. We believe that, in the foreseeable future, intelligent vehicles will need a variety of sensors in order to gradually approach the competency of an attentive human driver, whether or not the vehicle relies on a pre-recorded high-fidelity map of the driving environment. A rich sensor suite becomes a necessary condition of sorts, while the sufficient condition is still expected to come from the relentless pace of advances in autonomous driving algorithms for perception and decision making, i.e., better processing of sensor information and more effective driving decisions. Algorithm analysis is beyond the scope of this chapter, since we focus on sensing, specifically on the latest developments in radar, lidar and optical computing that enable smart sensing.

This chapter is arranged as follows. Section 2 discusses advancements in radars and lidars. Section 3 describes our take on edge computing and photonic information processing, followed by the Conclusion.

2 Advancements in Radar and Lidar Sensing

In this section, the characteristics of three typical sensors and their future trends are discussed. A sensor, by definition, detects the surrounding environment and translates it into different forms of information such as electrical or mechanical signals. From the vehicle’s perspective, sensors are subsystems that detect other objects such as vehicles, pedestrians and other vulnerable road users, elements of road infrastructure, and any other obstacles around the vehicle.

The camera is essential among sensors because it readily captures two-dimensional information, color, and texture of targets. For understanding traffic signs and signals, lane markings, and roadside furniture, which are imperative for autonomous driving, a camera offers the most cost-efficient solution. Moreover, it is widely available across many industries. However, its drawbacks are the inability to estimate range directly, the susceptibility of its dynamic range to illumination and weather, and the need to process excessive amounts of data. The future of the camera in the broad area of mobility depends on how the combination of sensors and their data is managed with the support of AI and machine learning [6]. Smart sensing emerges as an attractive proposition, since much of the data is expected to be preprocessed at the individual sensor before reaching the central computation unit, increasing the overall efficiency of computation and energy use. As discussed in Sect. 3, metasurface-based image processing technology is also drawing attention from the same perspective [16].

Ever since its commercial debut in heavy-duty trucks in the late 1990s by Eaton Corporation, automotive radar technology has undergone several generations of evolution in parallel with its commercial proliferation. In the early era of automotive radar, the quality and quantity of speed and range information were so limited that its application was feasible only for vehicles in well-defined highway conditions. For many years radar was the enabling sensor for features such as Active Cruise Control (ACC). For this application absolute positioning of detections is not necessary: assigning a detection to a lane and computing the time-to-collision is sufficient for the system to actuate the accelerator and the brake so that a safety buffer to vehicles in front is maintained. As long as some simplifying assumptions are made, such as other vehicles being on the road surface, simple motion models, and fixed lane widths on a highway, the radar sensor only needs to determine the location of detections in range and azimuthal bearing. Therefore, the ability to scan in elevation can be removed entirely, which simplifies the system considerably by requiring only a single linear antenna array.
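To make the ACC logic above concrete, the following Python sketch estimates time-to-collision from a single radar detection's range and range rate and maps it to a coarse accelerate/brake decision. It is only an illustration of the computation; the thresholds, time-gap target, and function names are our own assumptions, not those of any production system.

```python
# Illustrative sketch (not a production ACC controller): given a radar
# detection's range and range rate, estimate time-to-collision (TTC) and a
# simple accelerate/brake decision that keeps a time-gap buffer.
# All variable names and thresholds here are hypothetical.

def time_to_collision(range_m: float, range_rate_mps: float) -> float:
    """Return TTC in seconds; inf if the gap is not closing."""
    if range_rate_mps >= 0.0:          # target pulling away or keeping pace
        return float("inf")
    return range_m / -range_rate_mps   # closing: positive TTC

def acc_command(range_m: float, range_rate_mps: float, ego_speed_mps: float,
                desired_time_gap_s: float = 1.8) -> str:
    ttc = time_to_collision(range_m, range_rate_mps)
    time_gap = range_m / max(ego_speed_mps, 0.1)
    if ttc < 3.0:
        return "brake hard"
    if time_gap < desired_time_gap_s:
        return "ease off / brake gently"
    return "hold or accelerate toward set speed"

print(acc_command(range_m=40.0, range_rate_mps=-5.0, ego_speed_mps=25.0))
```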

As more automated control features are developed for consumer vehicles, these radars originally designed for ACC applications are pressed into wider service. Radar’s superior performance in all-weather conditions and its detection of metal objects hundreds of meters away meant that it could not simply be excluded from advanced driving sensor suites. However, the highly filtered output of the radar sensors left something lacking for the teams developing algorithms that operate on raw sensor data. In fact, modern radars are expected to provide a multitude of advanced features to accommodate other applications, from parking assistance to fully autonomous driving. Higher detection resolution in both azimuth and range, a wider field of view, a larger number of tracked targets and an additional detection dimension in elevation have become requirements. These have been addressed by adding more channels of active or virtual elements in conjunction with widening the usable bandwidth. Because of strict regulation of spatially combined electromagnetic energy density, however, the number of channels or the power per channel cannot be increased arbitrarily.

Locating detections in 3D space and determining object shape would allow the radar to contribute useful data to advanced autonomous driving systems. A single scan direction is insufficient for 3D localization, so automotive radars will need to be expanded to two scan directions plus ranging. To determine object shape, improved resolution is necessary, which corresponds to adding more elements to the antenna arrays (a back-of-the-envelope illustration of this relation is sketched after Fig. 1). Measuring object shape, especially when the object is partially obscured, through shallow-angle multi-path and waveform processing techniques would fill a gap in current sensor suites. Figure 1 demonstrates an early example of target-behind-target detection [17]. Essentially, a high-resolution 3D imaging radar is required for future driving functions. To implement the imaging function, the antenna array should be scaled from one dimension to two; a simple and intuitive method is to extend the ‘N array’ of antennas into an ‘N x M array’. Alternatively, in the most used antenna architectures a series of N gain elements is connected per channel and could be stacked up, resulting in a ‘multi-N x M array’. In this case, the end-fire architecture may be preferred because of its feasible feed line connectivity [18].

Fig. 1

One of the authors walking in front of a vehicle (left) does not disturb the return from the vehicle, shown in a bird’s-eye view (center), in a phased-array automotive radar prototype. This example is driven by a 16-channel RF phase shifter chip produced by one of our collaborators (right)
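As a back-of-the-envelope illustration of why object-shape measurement calls for more antenna elements, the short Python sketch below relates the number of half-wavelength-spaced elements in a uniform linear array to its approximate beamwidth and the resulting cross-range cell size. The 77 GHz carrier and the 0.886 λ/aperture rule of thumb are assumptions chosen purely for illustration.

```python
import math

C = 3e8             # speed of light, m/s
F_CARRIER = 77e9    # typical automotive radar band, Hz (illustrative)
WAVELENGTH = C / F_CARRIER

def beamwidth_deg(n_elements: int, spacing_m: float = WAVELENGTH / 2) -> float:
    """Approximate 3 dB beamwidth of a uniform linear array at broadside,
    using the common rule of thumb ~0.886 * lambda / aperture (radians)."""
    aperture = n_elements * spacing_m
    return math.degrees(0.886 * WAVELENGTH / aperture)

def cross_range_cell_m(n_elements: int, range_m: float) -> float:
    """Cross-range cell size at a given range for that beamwidth."""
    return range_m * math.radians(beamwidth_deg(n_elements))

for n in (4, 16, 64, 256):
    print(f"{n:4d} elements: ~{beamwidth_deg(n):5.2f} deg beam, "
          f"~{cross_range_cell_m(n, 100):5.2f} m cell at 100 m")
```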

Steering of high-resolution beams in 3D space at RF wavelengths has been commercialized for 5G applications, and automotive radar systems could adopt these technologies at low cost. One example is multiple-input multiple-output (MIMO) 2D antenna technology for base station communication in 5G networks [19]. The MIMO technique can use non-uniform array spacing plus orthogonal waveforms to create virtual channels and reap the benefits of a fully populated array while minimizing the number of physical channels [20], thus reducing cost. This is one way to achieve high-resolution radar without the cost of a fully populated phased array. MIMO antennas have even been miniaturized for use in cell phones, demonstrating that the concept is not limited to fixed base stations [21]. On the receiver side, MIMO boosts the signals of several cellphone users simultaneously by localizing them and increasing the apparent gain. For radar applications, the returns from multiple target reflections simply play the role of the cell phone users when implementing MIMO.
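The virtual-array idea can be illustrated with a few lines of NumPy: with orthogonal transmit waveforms, each transmit-receive pair behaves like a single element located at the sum of the transmitter and receiver positions, so a handful of physical channels can emulate a much larger filled aperture. The element spacings below are illustrative and not taken from any specific design.

```python
import numpy as np

# Toy sketch of the MIMO "virtual array": each TX/RX pair acts like one
# element located at the sum of the TX and RX positions, so Ntx * Nrx
# physical channels emulate a larger aperture. Positions are in units of
# half-wavelength and are purely illustrative.

tx = np.array([0, 4, 8])          # 3 TX spaced 4 half-wavelengths apart
rx = np.array([0, 1, 2, 3])       # 4 RX spaced 1 half-wavelength apart

virtual = np.sort((tx[:, None] + rx[None, :]).ravel())
print("virtual element positions:", virtual)
# -> 12 virtual elements filling positions 0..11 with only 3 + 4 physical channels
```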

Expanding the capability of the radar system comes with some drawbacks, mainly cost. A 2D MIMO array requires far more down-conversion mixers than current radars, and computing the MIMO-related algorithms in an extra dimension requires more on-board processing power. Both of these points clash with the expectation that radar sensors for vehicular applications should cost less than $100 per unit. However, advancements in the area of RF-CMOS may allow us to overcome this challenge.

From the onset of automotive radar’s transition away from its original military purpose, the high-frequency semiconductor industry played an important role by miniaturizing microwave components into integrated circuits. In the early 2000s, the RF semiconductor used for vehicle radar, especially for millimeter-wave radar, was Gallium Arsenide (GaAs). At that time, the RF circuitry was made up of several discrete GaAs chips. With multiple chips and a high-priced fabrication process, the radar cost was exorbitant and such sensors could only be found in luxury vehicles. Heavy investment in cheaper RF-SiGe (Silicon–Germanium) technology pushed the maximum operating frequency beyond 100 GHz, enabling its use in millimeter-wave radar systems [22]. From the economic perspective, the number of operating channels per chip should be reduced as much as possible, in tune with integrating other functions. The advantage of SiGe technology is that many RF components can be integrated onto a single die due to larger wafer sizes. Recently, RF-CMOS has emerged to compete with SiGe, with the obvious benefit of cost attractiveness in mass production and its capability of integration with signal processing units. RF-CMOS is compatible with digital CMOS, and both can be integrated into the same die, which means that digital processing can sit on the same die as the millimeter-wave circuitry, reducing costs further. This type of integration also lends itself well to increasing the number of unique down-converting mixers in the system, which enables techniques such as 2D MIMO or a fusion hardware platform combining radar with a camera. This is to be contrasted with software-focused efforts to use different sensing modalities to improve overall performance; see, e.g., [23,24,25,26]. Furthermore, optimizing the number of active channels in massive MIMO with innovative array distributions is expected to be a challenging but attractive solution [27].

Lidar is a useful sensor for automated driving applications as it can generate a high-definition 3D point cloud of the surrounding area. Within the point cloud is information about all surrounding obstacles, and features of fixed geometry such as the ground and buildings can be matched to map data to improve self-localization. The short wavelengths of lasers are used to create focused beams, which enable high-resolution point clouds.

Off the back of the DARPA Challenges mentioned in the Introduction, prominent scanning lidars became a hallmark of vehicles fitted with automated driving systems [28]. These lidars contained an array of discrete lasers and detectors. Pulsed time-of-flight (ToF) methods determined the range to targets, mechanical scanning covered the azimuth scan plane, and the array of lasers and detectors behind lenses covered the elevation scan plane. Their bulk, weight, and cost confined these sensors to research-grade vehicles. Mechanical scanning of the whole sensor head with extreme precision meant that a robust brushless AC motor, a substantial cost item, was a necessity. ToF methods accurately determine the range to a target but can be subject to interference from the sun or other light sources. They can also return only a single target within a beam, which causes problems when aerosols or other small particulates, such as rain, snow, or dust, are suspended in the air. Recently, large strides have been made in silicon photonic integrated circuits (PICs), which allow processing of laser light signals in silicon chips. Investment in PICs has been bolstered by their importance in optical data communication; one massive national collaborative effort is the American Institute for Manufacturing Integrated Photonics (AIM Photonics), a US national initiative targeting deployment of silicon photonic mass manufacturing methods throughout industry with funding from federal agencies such as DARPA, state-level sources, and private interests [29]. With silicon photonics technology an all-solid-state lidar has been demonstrated, and complex waveform encoding can be applied to the lidar signal.

Solid-state techniques utilizing silicon photonics have been demonstrated in two ways. One is electronic beam steering by optical phased arrays (OPAs). Due to the short wavelengths of the lasers used in near-infrared lidar, thousands of array elements can be packed into a single chip to enable high-resolution, thin beam formation. One OPA [30] demonstrates a sub-9 cm2 chip with more than 8000 array elements producing a half-power beamwidth of 0.01° × 0.04°, scannable in two directions: a 100° scan by phase shifters and a 17° scan in the second direction by wavelength modulation. Co-packaged CMOS dies demonstrate remarkable miniaturization compared with classic mechanical beam scanning systems. One challenge of this type of beam steering, given the high-resolution beam, is covering all scan points in a reasonable amount of time. Wavelength division multiplexing (WDM) techniques can naturally be applied in silicon photonics to allow wavelength-unique signals to coexist in the same PIC. Lidar with more than 100 comb-generated signals transformed into simultaneously scanned laser beams has been demonstrated [31], which would overcome the time crunch of scanning thin pencil beams.
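A minimal numeric sketch of the phase-shifter steering axis is given below: a constant phase step between adjacent emitters of pitch d steers the beam according to sin(θ) = Δφ·λ/(2πd). The 1550 nm wavelength and 2 µm pitch are assumed purely for illustration and do not reproduce the parameters of the device in [30].

```python
import math

# Minimal sketch of optical phased-array (OPA) beam steering along the
# phase-shifter axis: a constant phase step dphi between adjacent emitters
# of pitch d steers the beam to sin(theta) = dphi * lambda / (2*pi*d).
# Wavelength and pitch values are assumptions for illustration only.

WAVELENGTH = 1.55e-6   # m, common telecom-band lidar wavelength
PITCH = 2.0e-6         # m, assumed emitter spacing

def steer_angle_deg(phase_step_rad: float) -> float:
    s = phase_step_rad * WAVELENGTH / (2 * math.pi * PITCH)
    return math.degrees(math.asin(s))

for dphi in (0.0, math.pi / 4, math.pi / 2, math.pi):
    print(f"phase step {dphi:5.3f} rad -> beam at {steer_angle_deg(dphi):6.2f} deg")
```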

Another way to realize solid-state lidar is to utilize integrated focal plane arrays, similar to digital cameras. These so-called flash lidars, named for the scene-filling transmission of a laser pulse similar to a camera flash, are among the first solid-state lidars to be developed. Large-scale flash lidars have been demonstrated [32] and are already on the market for OEM vehicle use [33]. In a flash lidar, the detector array is exposed to free space, as in mechanical lidar, which has limited these devices to ToF ranging techniques. Recently, per-pixel integrated heterodyne circuitry through a hybrid CMOS-silicon photonics process offered by GlobalFoundries has been demonstrated [34]. This allows for the simplicity of flash-type lidar while opening the door to complex encoding of the laser signal. Still, flash lidar suffers from a tradeoff between resolution and a wide-angle field of view that does not exist in beam-steering lidar, as a lens must be chosen to map the focal plane array to angular detection.
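The resolution versus field-of-view tradeoff follows from simple lens geometry: for a fixed focal-plane array, a shorter focal length widens the field of view but coarsens the angle subtended by each pixel. The array format and pixel pitch in the sketch below are hypothetical values chosen only to show the trend.

```python
import math

# Back-of-the-envelope look at the flash-lidar tradeoff: with a fixed
# focal-plane array, the lens focal length sets both the field of view and
# the per-pixel angular resolution, so improving one degrades the other.
# Array format and pixel pitch are illustrative assumptions.

N_PIXELS = 512          # pixels across one axis of the detector array
PIXEL_PITCH = 10e-6     # m

def fov_deg(focal_length_m: float) -> float:
    half_width = N_PIXELS * PIXEL_PITCH / 2
    return math.degrees(2 * math.atan(half_width / focal_length_m))

def per_pixel_deg(focal_length_m: float) -> float:
    return math.degrees(PIXEL_PITCH / focal_length_m)

for f_mm in (5, 15, 50):
    f = f_mm * 1e-3
    print(f"f = {f_mm:2d} mm: FoV ~{fov_deg(f):5.1f} deg, "
          f"~{per_pixel_deg(f):6.3f} deg per pixel")
```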

One of the benefits of solid-state lidar using silicon photonics is the introduction of advanced encoding of the transmitted signal. Modulating the signal is natural in silicon photonics through ring resonators, Mach-Zehnder interferometers or other similar active structures. The Frequency Modulated Continuous Wave (FMCW) method, often found in vehicular radar sensors, is a simple modulation that generates a continuous frequency ramp. This method allows determination of both range and velocity instantaneously (instead of estimating velocity by piecing together several detections). Moreover, multiple targets can be detected within a single beam, which is especially useful in poor weather conditions: rain and snow clutter can be filtered out [35]. If the system has a sufficient number of unique heterodyne detectors, more advanced encoding can be imagined, such as CDMA [36], where scene scanning is essentially done in the digital domain.
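The following sketch shows, with illustrative waveform parameters in the style of an automotive FMCW sensor, how range follows from the beat frequency of the ramp and velocity from the Doppler shift. It is a numerical illustration of the principle rather than a complete signal-processing chain.

```python
# Minimal numeric sketch of how an FMCW waveform yields range and velocity
# from two measured frequencies. Parameter values are illustrative, loosely
# in the style of an automotive FMCW system, not any particular product.

C = 3e8                  # speed of light, m/s
BANDWIDTH = 1e9          # Hz swept per chirp (assumed)
CHIRP_TIME = 20e-6       # s (assumed)
CARRIER = 77e9           # Hz (assumed)
SLOPE = BANDWIDTH / CHIRP_TIME

def range_from_beat(f_beat_hz: float) -> float:
    # round-trip delay maps to a beat frequency f_b = slope * 2R / c
    return f_beat_hz * C / (2 * SLOPE)

def velocity_from_doppler(f_doppler_hz: float) -> float:
    # Doppler shift f_d = 2 v / lambda (positive taken as approaching here)
    wavelength = C / CARRIER
    return f_doppler_hz * wavelength / 2

print(f"range    ~ {range_from_beat(3.3e6):6.1f} m")        # ~9.9 m
print(f"velocity ~ {velocity_from_doppler(5.1e3):6.2f} m/s")
```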

Before fully solid-state-scanning lidar is introduced, Micro-Electro-Mechanical Systems (MEMS) based mirror architectures will be popular in the near term. The priority for solid-state lidar’s success in market penetration is cost-competitive miniaturization of the essential components into ICs. As learned from the earlier growth of the radar market, it is important for semiconductor IC providers, tier-1 suppliers and OEMs to form a virtuous ecosystem.

With the improvement of all types of sensors mentioned above, i.e., higher detection resolution in all directions and more precise Doppler signatures, it is now possible to estimate even the rotation of a moving target. Not only the increased number of voxels but also innovative ways of associating them can significantly improve overall data processing. For example, optimized data sizes and advanced waveforms can improve refresh time while maintaining detection accuracy. With advances in sensor data processing by AI and machine learning algorithms, sensors can provide much more distinctive characteristics of targets; see, e.g., [37,38,39]. Algorithmic advances focused on embedded software will continue to drive sensor information processing because of their marginal added cost.
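As a purely hypothetical illustration of estimating target rotation from Doppler, the sketch below fits a line to per-detection radial velocities across a rigid target's lateral extent; to first order the slope approximates the yaw rate. The detection values are synthetic and chosen only to show the idea.

```python
import numpy as np

# Hypothetical illustration: for a rotating rigid body, the radial velocity
# varies roughly linearly across its lateral extent, so a line fit of
# Doppler velocity vs. lateral position gives a first-order estimate of the
# rotation (yaw) rate. All numbers below are synthetic.

lateral_pos = np.array([-1.8, -0.9, 0.0, 0.9, 1.8])        # m across the target
radial_vel  = np.array([9.72, 9.86, 10.0, 10.14, 10.28])   # m/s per detection

yaw_rate, bulk_velocity = np.polyfit(lateral_pos, radial_vel, 1)
print(f"bulk radial velocity ~ {bulk_velocity:.2f} m/s, yaw rate ~ {yaw_rate:.3f} rad/s")
```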

3 Toward Energy Efficient Edge Computing via Optical Advances

In the coming years the amount of data acquired by the growing number of sensors on intelligent vehicles is expected to increase significantly, and so will the amount of processing required to utilize these data. This leaves two options for data processing: operate on edge computing systems in the vehicle, or partially rely on mobile networks, in order to make timely transportation decisions.

Figure 2 provides a breakdown of connected computing technologies and where they process data. The role of this figure is to elucidate technologies in the connected infrastructure of the automotive sector, rather than the broader IoT domain. Sensors lie at the base of the figure and are broken into passive, active and smart sensors. Passive sensors include items like RFID tags, where no energy needs to be supplied to the sensor (more on passive optical sensors below). Active sensors are those that require power to sense. Smart sensors are those that can provide processed data using only the sensor signal; an example is how a radar sensor can provide not only position information but also velocity information without the need to convert the signal to another domain. Additionally, we define edge computing as any system that can process data at the point of collection, as opposed to shuttling data over a network. While sensors and edge computing are separated in Fig. 2 to clarify their differences, any sensor that can actively process data can be included under the broader umbrella of edge computing; the sensor is not required to have a microprocessor attached in order to be considered edge computing. Instead, edge computing is the existence of processing without the need to communicate information to a server or cloud network. Traditionally, edge computing meant processing sensor data on a programmable controller; however, this description has expanded to include systems with graphics processing units, tensor processing units and other more advanced computing infrastructure that was previously found only at data centers. For this reason, we define edge computing as computation by any system that does not rely on access to fog or cloud computing services. Fog computing relies on a local area network architecture. Cloud computing relies on an internet access point that allows information to travel from the sensor to a globally accessible server.

Fig. 2

A schematic of the layers at which processing may occur for the automotive sector: cloud, fog, edge, or sensor-level computing. The left of the figure designates the types of connections that might be activated between these compute layers. The right-hand side of the figure displays the types of vehicle communications that might occur, based on the computing structure on the left. For instance, V2X communications require fog or cloud computing, while internal communications within the vehicle rely on a slew of cabled connections

Figure 2 also describes how these different processing locations impact specific vehicle communication functions. For instance, fog computing allows for vehicle-to-infrastructure processing; this may include data being transmitted wirelessly via Bluetooth or WiFi to local access points that are allowed to make decisions. For example, one could imagine an intersection whose traffic lights are replaced by a local network server. Vehicles entering the intersection would be sent wireless commands from the server based on immediate information gathered at the intersection.

With these definitions in mind, we can better understand how modern and future vehicles may depend on each of these computing levels. For example, relying on the cloud carries the potential for high latency due to large data volumes and transmission rates, network congestion, and frequently occurring dead zones or urban canyons. At the same time, traditional computer architectures remain energy intensive when running neural network algorithms in any intelligent vehicle, whether a car, a drone, or a remote sensor in the IoT. The vehicles of today are even being recognized as cloud-accessible hubs that can be leased for bitcoin mining or algorithm training while such datacenters-on-wheels sit idle in their garages.

As vehicles become more intelligent, the on-board power budget must cover not only the most vital functions like safety and convenience but also an increasing share of the diverse needs of the vehicle’s occupants. With the growth of autonomous functions, there will be higher expectations for the vehicle to provide more in-vehicle services (entertainment, shopping, etc.). While power has always been a priority in vehicle design, it will become an even more critical issue as electric vehicles become more autonomous. In our opinion, new hardware systems are required to achieve increasingly intensive computations at various edge interfaces without straining the power source of the vehicle [40].

While the decision to operate partially in the cloud, over mobile networks, or solely on the vehicle’s own mobile computing system is very much a matter of transportation system design, any opportunity to unburden the power system from sensor processing and the related computational costs is a universally welcome proposition. For this reason, a new trend in intelligent vehicles is to off-load some of the needs of edge computing onto passive hardware. In this section, we review some opportunities to utilize passive optical systems for pre-processing before the optical-electrical transduction interface is reached [41].

One of the most immediate methods to relieve the computational burden is to reformat the data coming from the sensors on the vehicle. For example, consider the image shown in Fig. 3a. A camera describes this image with a set of 256 intensity values per pixel; however, the image conveys a range of information features such as depth cues, color, and lighting. To computationally process this image, feature extraction is applied across the data matrix.

Fig. 3

Methods for passive optical image differentiation. a Grayscale image of a flower. b The image in a passed through a numerically applied Laplacian image kernel. c Metasurface composed of an array of silicon pillars for image differentiation; the white horizontal bar represents 1 micron. d Optical train needed for coherent optical differentiation utilizing a spatial filter

The most common image kernel applied first to an image for feature extraction or image segmentation is a differentiator. Here we consider how adding a passive optical filter could be used to achieve such algorithmic computations on a scene.
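The digital counterpart of such a differentiator, which is exactly the work a passive optical filter could take over, is a convolution with a Laplacian kernel, as in Fig. 3b. A minimal sketch with a synthetic placeholder image in place of the flower:

```python
import numpy as np
from scipy.ndimage import convolve

# Digital counterpart of Fig. 3b: applying a Laplacian kernel to a grayscale
# image to extract edges. The image here is a synthetic placeholder.

laplacian = np.array([[0,  1, 0],
                      [1, -4, 1],
                      [0,  1, 0]], dtype=float)

image = np.zeros((64, 64))
image[16:48, 16:48] = 1.0            # a bright square on a dark background

edges = convolve(image, laplacian, mode="constant")
print("nonzero responses, found only along the square's outline:",
      np.count_nonzero(edges))
```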

One of the most obvious applications of a passive optical filter for vehicles is image differentiation. While the technology is still at an early stage, researchers have demonstrated the ability to differentiate coherent light passed through an optical metasurface [42]. In this reference, a metasurface composed of two sets of nanobeam arrays computes first- or second-order spatial differentiation in the x-direction. Rather than relying on digital software for processing, the image is passed through a passive, analog optical filter. Figure 3d demonstrates how a metasurface like the one described in [42] can be utilized in a 4F optical system to implement image differentiation at the Fourier plane.

In fact, the concept of optical differentiation has been around since the 1960s, with the idea of utilizing spatial filters as a tool to alter the structure of a beam of light. Spatial filtering exploits the Fourier transformation that occurs after an image passes through a lens: because of diffraction, light propagating in different directions is focused onto different portions of the focal plane. By applying a spatial filter at the focal plane, a variety of analogue operations can be applied to an image. The example shown in Fig. 3d utilizes this approach to spatial filtering; however, advances in the field have enabled researchers to utilize bound states in the continuum to achieve differentiation at any location along the propagation direction of the image [43, 44].
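The spatial-filtering idea of Fig. 3d can be mimicked numerically: transform the image to the Fourier plane, multiply by a filter proportional to i·k_x, and transform back, which yields the first derivative along x. The sketch below uses the same synthetic image as above and stands in for what the lens-filter-lens train does optically.

```python
import numpy as np

# Numerical analogue of the 4F arrangement in Fig. 3d: a lens maps the image
# to its Fourier plane, a filter with transmission proportional to i*kx is
# applied there, and a second lens transforms back, implementing d/dx.

image = np.zeros((64, 64))
image[16:48, 16:48] = 1.0            # same synthetic square as above

kx = 2 * np.pi * np.fft.fftfreq(image.shape[1])    # spatial frequencies along x
fourier_plane = np.fft.fft2(image)
filtered = fourier_plane * (1j * kx)[None, :]      # "spatial filter" at the focal plane
d_dx = np.real(np.fft.ifft2(filtered))

print("largest derivative magnitude sits on a vertical edge, at index",
      np.unravel_index(np.argmax(np.abs(d_dx)), d_dx.shape))
```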

The use of metasurfaces for analogue processing has gained significant momentum. Not only are the custom-made optical elements just several hundred nanometers thick, but they can also be placed at any location in the optical train. It should also be noted that metamaterials can often be designed to have high transparency. However, a serious disadvantage remains before such systems can be perfected for in-vehicle optical analogue processing. Most metasurface systems have an optical response limited to a narrow band of wavelengths, and existing technologies cannot implement this differentiation on incoherent light, which means that a standard optical image cannot yet be differentiated optically. Incoherent optical processing is a hot topic of research, and while some progress has been made [45], several hurdles remain before this technology can be implemented in vehicles.

The ability of an optical system to perform image differentiation or edge detection is not only a useful technique in image processing; such an operation is also typically one of the layers in convolutional neural networks. Thus, applying passive optical filters to sensing could reduce the convolutional processing needed for these large matrix transformations at the edge, eliminating one of the steps in the computationally intensive task of image segmentation.
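A rough, assumption-laden estimate of the digital work such a passive pre-filter could remove is the multiply-accumulate count of a single 3 x 3 convolutional layer over one camera frame; the frame size, channel count, and frame rate below are hypothetical.

```python
# Rough estimate of the digital work a passive optical pre-filter could
# remove: the multiply-accumulate (MAC) count of one 3x3 convolutional layer
# over one camera frame. All figures are hypothetical assumptions.

HEIGHT, WIDTH = 1080, 1920      # pixels per frame
KERNEL_TAPS = 3 * 3             # taps per output pixel
OUT_CHANNELS = 8                # feature maps in a first edge-extraction layer
FPS = 30

macs_per_frame = HEIGHT * WIDTH * KERNEL_TAPS * OUT_CHANNELS
print(f"~{macs_per_frame / 1e6:.0f} MMAC per frame, "
      f"~{macs_per_frame * FPS / 1e9:.1f} GMAC/s at {FPS} fps "
      "potentially shifted from the digital processor to passive optics")
```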

Beyond image differentiation, other passive optical elements have the potential to realize a variety of algorithms, e.g., integration [46, 47]. Algorithm-specific metasurfaces can also be employed as convolutional elements for a variety of other image kernels, including box blur, sharpen, or unsharp masking. The size of the kernel will depend on the resolution of the metasurface relative to the image size. With the help of inverse design and machine learning to create metasurfaces, new and unique methods of achieving these computational systems are rapidly evolving [48].

The current trend for communication systems in vehicles is a demand to shuttle higher loads of data, often due to the integration of higher-resolution sensors (see Sect. 2). The need to transmit data at Gb/s speeds from sensors to their controllers (e.g., those for future advanced driver assistance systems (ADAS)) is becoming a considerable communication hurdle. To comprehend the significance of this transition from the traditional data communication approach to that of future vehicles, we should first review the electronic ecosystem of the present-day vehicle. At present, most traditional OEMs continue to rely on the electronic control unit (ECU), which through a single module controls a set of processes such as those in the radar, cameras, powertrain, transmission, suspension and other systems. The ECU may simply process raw data and transmit it to another hub, or it may act as a subsystem for sensor fusion, processing and controlling data from a multitude of other ECUs. In 2021, there are vehicles with upwards of 150 ECUs; each ECU contains a microcontroller, memory, embedded software, and communication ports for the systems, power, and data communications. With so many control units, each with its own protocols, the OEMs have taken on the burden of driving the performance of these processors while eliminating excess cabling and redundancy. This has been particularly difficult given the low-bandwidth communication channels that most standard ECUs maintain; these are typically handled via automotive bus systems like LIN (Local Interconnect Network), CAN (Controller Area Network), MOST (Media Oriented Systems Transport) and FlexRay. However, given the transition to more data-heavy sensors such as those serving ADAS, higher bandwidths have become a requisite, which has led to the adoption of SerDes (Serializer-Deserializer), Automotive Ethernet, and HDBaseT Automotive. In particular, HDBaseT has allowed communication over 15 m links with limited shielding requirements for both point-to-point and daisy-chain connectivity [49]. It should be noted that while these technologies have enabled transfer rates of several Gb/s, there are signs that the future vehicle ecosystem will soon need to shuttle information in the form of bits structured as vectors.
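A quick estimate shows why multi-Gb/s links are becoming necessary; the sensor figures below are illustrative assumptions rather than specifications of any particular camera or lidar.

```python
# Quick estimate of raw sensor data rates. All figures are illustrative
# assumptions, not the specifications of any particular sensor.

camera_bps = 1920 * 1080 * 24 * 30          # pixels * bits/pixel * frames/s
lidar_bps  = 600_000 * (3 * 32 + 16)        # points/s * (xyz floats + intensity)

print(f"camera: ~{camera_bps / 1e9:.2f} Gb/s raw")
print(f"lidar : ~{lidar_bps / 1e6:.0f} Mb/s raw")
print("a handful of such sensors quickly dwarfs the ~1 Mb/s of classic CAN")
```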

Moreover, trends in machine learning are leaning towards understanding the data structure of an entire tensor, rather than performing a convolution on one matrix of the tensor at a time. To achieve this today, the data would need to be stored locally in RAM after being shuttled bit by bit. To accelerate this process, Peripheral Component Interconnect Express (PCIe) links are becoming a greater necessity. PCIe is a computer expansion bus that traditionally allowed direct, short connections (on the order of centimeters) between a motherboard and a graphics card or solid-state drive. As of 2018, PCIe can be carried over HDBaseT technology, permitting signals such as audio and video, power, and controls to be transmitted over a single cable. Today PCIe is in its 6th generation, with each generation roughly doubling the data rate every few years. However, the future bus communication standard may move to fiber-optic communications, either via PCIe over fiber or standard fiber-optic cabling [50, 51]. Given that many sensors utilize optical inputs, and given the growing desire to process optical inputs in the optical domain, shuttling information from sensors to hardware without transduction to electronic signals could prove fruitful. It is possible that OEMs will continue to try to process data at the point of occurrence, thus reducing the need for high-bandwidth shuttling; however, the question remains how long this trend can be managed successfully.

With the future of communication systems heavily reliant on optical networking, there exists an opportunity for analogue processing before transduction to an electrical signal, where information would traditionally be processed on a standard CMOS chip. One possible application would be to implement not only sensor processing but also other functions, such as control, in an optical architecture. The architecture would use a lidar or other light-based sensor whose outputs feed into an optical neural network performing model predictive control (MPC). The concept of using MPC with a neuromorphic photonic processor for general nonlinear programming was demonstrated in [52]. In this reference, nonlinear processing was demonstrated for the high-speed control application of tracking a moving target, e.g., in the case of missile targeting. A similar case could be envisioned for driving automation applications, such as path planning at an on-ramp, a traffic circle, or a parking lot. The light perception device, i.e., the sensing solution, directly informs a vehicle computer implemented as an optical neural network (ONN) of the positions and velocities of surrounding agents, so that their trajectories are respected and collisions avoided while the vehicle's computer plans its maneuvers on the road [53].
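To give a flavor of the planning loop described above, the toy receding-horizon sketch below, written in ordinary NumPy rather than for a photonic processor, picks the ego acceleration that tracks a desired speed while keeping a safe gap to a perceived lead vehicle over a short horizon. The dynamics, cost weights, and limits are all illustrative assumptions and not the method of [52] or [53].

```python
import numpy as np

# Toy receding-horizon (MPC-style) planning step for a 1-D following
# scenario: choose the ego acceleration minimizing a short-horizon cost that
# penalizes deviation from the desired speed, control effort, and gap
# violations. All parameters are illustrative assumptions.

DT, HORIZON = 0.2, 10                        # 2 s lookahead
A_CANDIDATES = np.linspace(-4.0, 2.0, 25)    # candidate accelerations, m/s^2
V_DESIRED, MIN_GAP = 25.0, 10.0              # m/s, m

def rollout_cost(a, ego_pos, ego_v, lead_pos, lead_v):
    cost = 0.0
    for _ in range(HORIZON):
        ego_v = max(ego_v + a * DT, 0.0)
        ego_pos += ego_v * DT
        lead_pos += lead_v * DT
        gap = lead_pos - ego_pos
        cost += (ego_v - V_DESIRED) ** 2 + 0.1 * a ** 2
        if gap < MIN_GAP:
            cost += 1e4 * (MIN_GAP - gap) ** 2    # soft collision barrier
    return cost

def mpc_step(ego_pos, ego_v, lead_pos, lead_v):
    costs = [rollout_cost(a, ego_pos, ego_v, lead_pos, lead_v) for a in A_CANDIDATES]
    return A_CANDIDATES[int(np.argmin(costs))]

# one planning step: lead vehicle 30 m ahead, moving slower than the set speed
print("commanded acceleration:", mpc_step(0.0, 25.0, 30.0, 18.0), "m/s^2")
```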

4 Conclusion

We have overviewed the state of the art and promising developments in the field of automotive sensing, focusing on radar, lidar and optical processing for driving environment perception as exemplified by driving automation. Since the DARPA Challenges at the beginning of this century, the enabling technologies have advanced by leaps and bounds. Phased-array radars and solid-state lidars are taking the place of mechanically scanning devices, enabling more precise temporal snapshots of the data and delivering more accurate range and angular measurements for the ever-growing number of mobility features. Advances in RF-CMOS and, similarly, in silicon photonic ICs for lidar, both compatible with digital CMOS, the widespread technology behind cameras, will pave the way for integrating sensing modalities at the hardware level, simplifying sensor fusion. In terms of optical processing, computational metasurfaces, i.e., devices specially designed to implement image differentiation, convolution and other functions essential to AI algorithms, are on the rise. As with artificial neural network software years ago, we expect to live through another renaissance in the field of optical processing focused on integrated photonics, driven by the needs of smart sensing and ultra-low power consumption. The sensor advances described here will help develop the next generation of intelligent vehicles.