

# All-optical ternary-content addressable memory (T-CAM) cell and row architectures for address lookup at 20 Gb/s

P. Maniotis<sup>1</sup> · N. Pleros<sup>1</sup>

Received: 9 June 2017 / Accepted: 10 October 2017 / Published online: 13 October 2017 · © Springer Science+Business Media, LLC 2017

**Abstract** Content addressable memories form a popular design choice for routing table implementations thanks to their fast searching capabilities. However, high-speed address look-up operation is still challenging due to the speed limitations imposed by conventional electronic technologies. In this paper, we demonstrate an all-optical ternary-content addressable memory (T-CAM) cell architecture that extends the capabilities of the experimentally demonstrated 10 Gb/s optical binary-content addressable memory cell to the third matching state "X" or "Care/Don't Care", enabling in this way the subnet-masked operation that is necessary in modern router applications. Additionally, we present a 4-cell T-CAM row architecture introducing wavelength division multiplexing-enabled matchline decoding so as to allow for comparison operation with complete optical words. The performance of both the optical T-CAM cell and the optical T-CAM row architectures is evaluated by means of physical layer simulation results, presenting successful Search and Write operation at 20 Gb/s.

**Keywords** Optical memories · Optical ternary content addressable memories (ternary-CAM) · All-optical T-CAM · Optical address look-up · Optical data processing

## 1 Introduction

The ever-increasing demand for faster Internet speeds has resulted in the continuous penetration of high-speed optical interconnection technologies closer to the Internet Service Provider (ISP) customers. This has led to a tremendous increase in Internet Protocol (IP) network traffic, pushing at the same time for faster Address Look-up (AL) speeds in modern router applications (Beheshti et al. 2010). On top of that, the massive growth in the number of devices connected to the Internet dictates the use of large Routing Tables (Krioukov et al. 2007), which makes fast AL operation even more challenging. This has created the need for hardware-based AL solutions that can cope with the demand for faster routing operations (Jiang et al. 2008). Content Addressable Memories (CAMs) have been adopted to achieve low-latency routing operation; CAMs are a special type of memory targeted at latency-sensitive search applications that allows a search operation to be completed within a single clock cycle (Pagiamtzis and Sheikholeslami 2006). To accomplish this in a 2-Dimensional (2D) array of *n* CAM cells, each CAM cell embodies an XOR gate that allows for parallel search throughout the complete 2D array (Pagiamtzis and Sheikholeslami 2006).

P. Maniotis (ppmaniot@csd.auth.gr)

<sup>1</sup> Department of Informatics, Aristotle University of Thessaloniki, Thessaloniki, Greece

However, the AL operation speed is still limited by the relatively low operation speeds of electronic logic circuits and of the electronic interconnects that connect the different memory cells and link the 2D CAM-cell array with the forwarding table, which usually relies on Random Access Memories (RAMs). Despite the impressive increase in optical packet payload data rates during recent years (Link 1 2017), modern electronic CAMs can barely offer AL speeds of a few GHz (Agarwal et al. 2011). To the best of our knowledge, the first effort towards breaking this speed barrier by transferring the CAM cell layout into the optical domain was initiated by our research group, with the experimental demonstration of the first optical Binary-CAM (B-CAM) cell utilizing a monolithically integrated Semiconductor Optical Amplifier-Mach-Zehnder Interferometer (SOA-MZI) switch-based Flip-Flop (FF) (Pitris et al. 2016). Despite being the first prototype, the optical B-CAM cell implementation already offers an operation speed of up to 10 Gb/s, i.e. $1.5\times$ faster than the fastest conventional CAM cell (Pitris et al. 2016). However, for practical routing AL tables where subnet-masked routing has to be accomplished, optical CAMs must also support the "X" or "Care/Don't Care" operation in addition to the simple comparison offered by B-CAMs (Pitris et al. 2016).

This paper extends our work on optical CAM technology by presenting the first all-optical T-CAM cell and its interconnection in an optical T-CAM row architecture, where 4 T-CAM cells and a novel Wavelength Division Multiplexing (WDM)-encoded matchline design provide comparison operation for a complete 4-bit optical word. The optical T-CAM cell extends the B-CAM cell operation by allowing the storage of a third state "X", enabling in this way the essential subnet-masked operation needed in modern router applications. The proposed optical T-CAM cell architecture comprises two optical FFs and an optical XOR gate: the 1st optical FF stores the actual T-CAM cell content, the 2nd FF implements the "X" state support, and the XOR gate enables the T-CAM cell search capability. The 4-cell T-CAM row architecture follows a wavelength encoding scheme based on an Arrayed Waveguide Grating (AWG) multiplexer; the multi-wavelength output signal produced at the final row output determines whether a successful comparison result is achieved throughout the complete T-CAM row. The performance of the 4-cell T-CAM row architecture has been evaluated using the VPI Photonics simulation suite and experimentally verified SOA-based building blocks. The simulation results demonstrate successful T-CAM row operation at 20 Gb/s for both Search and Write functionalities. The proposed T-CAM row architecture can be easily scaled to form the complete optical T-CAM tables required in AL, while the recent experimental developments in high-speed and ultra-low-power integrated photonic crystal InP-on-Si flip-flop devices (Alexoudi et al. 2016) could potentially allow for its experimental implementation in low-footprint and low-energy prototypes.

## 2 Optical T-CAM cell and row architectures

A CAM-based routing table usually consists of a T-CAM table interlinked to a RAM table. The T-CAM table stores all possible destination addresses, typically following a layout where every CAM row includes a single address, while the RAM table stores the addresses of the corresponding router output ports. In this way, the T-CAM table is responsible for identifying the desired destination address of the incoming packet so as to activate the corresponding RAM table entry, which in turn activates the appropriate router output port. An indicative T-CAM-based routing scheme is presented in the example of Fig. 1a. The destination address of an incoming packet is fed as the *search-input* into the T-CAM table, while the router output port that should be used for forwarding the incoming packet to the desired destination emerges as the *search-output* signal at the RAM table output. The T-CAM AL operation is realized in a single step, since the destination address of the incoming packet is broadcast to all T-CAM rows and its constituent bits are compared in parallel with the content of every T-CAM row. When the search-input matches a word stored in a T-CAM row, a matchline signal identifier emerges at the corresponding row output. An encoding and decoding circuit is used between the interlinked T-CAM and RAM tables in order to associate the *matchline signal* of the matched T-CAM row with the RAM table row that stores the proper router output port. In the example of Fig. 1a, the *search-input* matches successfully only the word "X011" stored in the 2nd T-CAM row, with the "X" state denoting that this bit can be successfully matched with an input search value of either 1 or 0. The matchline signal generated at the 2nd T-CAM row is then translated via the encoding and decoding circuitry into the address "01" of the RAM table, which designates that "port B" should be activated at the router output in order to allow the incoming packet to safely propagate to its desired next hop.
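The look-up flow of Fig. 1a can be illustrated with a short behavioural sketch in Python. It is not a model of the optical hardware, and it rests on several assumptions: only the stored word "X011" in the 2nd row, the RAM address "01" and "port B" come from the text, whereas the remaining table contents, the other port names and the helper names `row_matches` and `lookup` are hypothetical. The loop visits the rows one after the other and returns the first match, while a T-CAM compares all rows in parallel.

```python
from typing import Optional

# Hypothetical table contents: only the second row ("X011" -> RAM address "01"
# -> "port B") is taken from the text; all other entries are illustrative.
TCAM_ROWS = ["1100", "X011", "0X10", "111X"]
RAM_TABLE = {"00": "port A", "01": "port B", "10": "port C", "11": "port D"}


def row_matches(stored: str, search: str) -> bool:
    """A row matches when every stored bit is 'X' or equals the search bit."""
    return len(stored) == len(search) and all(
        s == "X" or s == b for s, b in zip(stored, search)
    )


def lookup(search: str) -> Optional[str]:
    """Return the output port of the first matching row, or None if no row matches."""
    for index, stored in enumerate(TCAM_ROWS):
        if row_matches(stored, search):
            ram_address = format(index, "02b")  # encoder: row index -> RAM address
            return RAM_TABLE[ram_address]
    return None


print(lookup("0011"))  # matches "X011" in the 2nd row
print(lookup("1011"))  # the "X" bit also accepts a leading 1
```

Both search words resolve to the RAM address "01", since the "X" position of the stored word accepts either bit value.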



Fig. 1 a T-CAM-based routing table along with operation example, b T-CAM row architecture comprising an indicative number of 4 T-CAM cells

The proposed optical T-CAM row architecture is shown in Fig. 1b. The row incorporates an indicative number of 4 T-CAM cells, each consisting of 2 FF modules and 1 XOR gate. The XOR gates realize the comparison between the *search-input* bits and the values stored in the T-CAM cells. The left-side FF of each cell is named XFF and implements the third state "X", enabling in this way the subnet-masked operation that is widely used in modern router applications. The right-side FF of each cell is named T-CAM Content FF (TCFF) and stores the actual T-CAM cell content, which can be either a logical 0 or 1. When subnet-masked operation is desired, the XFF content equals 0, implying that the respective TCFF content has to be ignored. As such, the respective XOR operation does not take the TCFF content into account and the comparison result equals a logical 0 independently of the value of the *search-input* bit. On the contrary, when the TCFF value has to be taken into account, the XFF content equals a logical 1 and the XOR output depends on the comparison between the TCFF value and the respective *search-input* bit. By assigning a different wavelength to the optical XOR output of every individual T-CAM cell within a row, all 4 T-CAM cell outputs can be combined at the row output by an Arrayed-Waveguide Grating (AWG) multiplexer unit (Fig. 1b). This leads to a WDM-encoding scheme that produces the corresponding *matchline signal* at the final row output. In this way, a *matchline signal* of logical value 0 indicates a completely matched comparison result, since all the individual XOR outputs equal a logical 0. On the contrary, a non-zero optical power level at the encoder input indicates that at least one individual XOR output has produced a comparison miss, denoting a row that is not completely matched.
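The cell and row behaviour described above can be summarized compactly. The following is a sketch using symbols introduced here for illustration only (they are not part of the original notation): $x_i$, $c_i$ and $s_i$ denote the XFF content, the TCFF content and the *search-input* bit of cell $i$, $y_i$ the cell's XOR output, and $P_i$ the optical power of a logical 1 on the wavelength assigned to cell $i$. The power sum is idealized, assuming a perfect extinction ratio and a lossless multiplexer.

$$
\begin{aligned}
y_i &= x_i \cdot (c_i \oplus s_i), \qquad i = 1,\dots,4,\\
P_{\mathrm{ML}} &\approx \sum_{i=1}^{4} P_i\, y_i,\\
\text{row match} &\iff y_1 = y_2 = y_3 = y_4 = 0 \iff P_{\mathrm{ML}} \approx 0.
\end{aligned}
$$

With $x_i = 0$ the cell is in the "X" state and $y_i = 0$ for any $c_i$ and $s_i$; with $x_i = 1$ the output reduces to the plain comparison $c_i \oplus s_i$.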

Figure 2 presents the proposed all-optical T-CAM cell architecture, comprising the TCFF and XFF that subsequently feed an optical XOR gate. The optical XOR gate consists of a single SOA-MZI switch, while both the TCFF and XFF units comprise two interlinked SOA-MZI switches that form a well-known optical master–slave FF configuration (Pitris et al. 2016; Pleros et al. 2009; Maniotis et al. 2013; Liu et al. 2006). For both FFs, a Set/Reset pulse mechanism is used to switch between the two possible logical states (Pleros et al. 2009; Maniotis et al. 2013). Both XFF and TCFF are powered by 2 Continuous Wave (CW) laser beams: $\lambda_e$ is used as the input signal at the right-side switches of both XFF and TCFF, while CW signals at $\lambda_a$ and $\lambda_f$ are launched as the input signals at the left-side SOA-MZI switches of the XFF and TCFF, respectively. As such, the contents of the XFF and the TCFF are encoded on the $\lambda_a$ and $\lambda_f$ wavelengths as the respective FF output signals. The XFF output signal is then fed as the input signal to the XOR gate, after being filtered in an Optical Bandpass Filter (OBF). The TCFF output at $\lambda_f$, in turn, enters the XOR gate as the control signal of the upper-branch SOA. The lower-branch SOA of the XOR gate is fed with the input *search bit*, which acts as the second control signal. In this way, the TCFF output and *search bit* values are logically XORed and the comparison result is imprinted on the XFF output signal at $\lambda_a$ that is used as the XOR input. Whenever the T-CAM cell is in the "X" state, the XFF output equals a logical 0, resulting in a logical 0 at the final XOR output irrespective of the TCFF and *search bit* values. On the contrary, when the XFF output equals a logical 1, the final XOR output depends on the comparison between the TCFF output and *search bit* values: when both signals have the same value the XOR output is 0, whereas when they differ the XOR output equals 1 and is imprinted on the XFF output at $\lambda_a$.

**Fig. 2** All-optical T-CAM cell architecture with 2 FFs (TCFF and XFF) and an XOR gate, and the T-CAM row's AWG multiplexer for 4 indicative T-CAM cells

Assigning different wavelengths to all the T-CAM cell outputs within a single row allows for the realization of a simple wavelength encoding scheme by using an AWG multiplexer, as presented at the right side of Fig. 2; $\lambda_a$ through $\lambda_d$ are used for the different cell outputs, while $\lambda_e$, $\lambda_f$ and the wavelengths used for the Set/Reset signals are employed in all T-CAM cells. The complete absence of optical power at the final multiplexed T-CAM row output indicates a perfect match between the *search-input* bits and the row's contents.

## 3 Simulation results

This section presents the physical-layer, simulation-based performance analysis of the T-CAM row architecture for both Search and Write operations at a line rate of 20 Gb/s. The simulation models have been developed using the VPI Photonics suite, and both the XOR gate and FF models are based on experimentally verified building blocks. More specifically, the XOR gates follow the SOA-MZI-based dual-rail logic model presented by Wang et al. (2004), while the FF models closely follow the FF configuration used in the extended simulation analysis of the all-optical cache memory architectures for Chip Multiprocessors (CMPs) (Maniotis et al. 2013). The only difference is the use of co-propagating instead of counter-propagating Set/Reset signals for switching between the two possible FF logical states. More details on the principle of operation of the all-optical FF technology can be found in Pleros et al. (2009) and Liu et al. (2006). The SOA model used in both the XOR gates and the FFs is identical to the one presented and experimentally validated by Kehayas et al. (2006). The wavelengths used in the 4-cell arrangement of Fig. 2 are: $\lambda_a$ = 1564.19 nm, $\lambda_b$ = 1562.56 nm, $\lambda_c$ = 1559.31 nm, $\lambda_d$ = 1557.36 nm, $\lambda_e$ = 1554.78 nm, $\lambda_f$ = 1546.12 nm, Set = 1548.35 nm and Reset = 1551.88 nm.

Figure 3 presents the simulation results for all 4 T-CAM cells employed in a single row. Figure 3(i), (ii) illustrate the Set/Reset pulse traces that are fed into the XFFs and determine whether or not the XFF has to define an "X" state for the T-CAM cell. Figure 3(iii), (iv) illustrate the Set/Reset pulse traces that are fed into the TCFFs of the 4 cells, dictating the logical content of every TCFF. Figure 3(v) depicts the XFF output signal, while Fig. 3(vi) illustrates the TCFF content transitions for every T-CAM cell. As can be seen in both Fig. 3(v) and (vi), successful bit storage operation is achieved according to the respective Set/Reset pulse traces; the presence of a Set pulse leads to an FF content transition to the logical state 0, while the presence of a Reset pulse leads to an FF content transition to the logical state 1. Figure 3(vii) presents the *search-bit* pulse traces that are fed into the XOR gates of the 4 T-CAM cells as parallel streams in order to be compared with the respective T-CAM cell contents. The *search-bit* pulse traces are Non-Return-to-Zero (NRZ) $2^7-1$ Pseudorandom Binary Sequences (PRBS) at a line rate of 20 Gb/s. Figure 3(viii) shows the XOR output signals, which also form the T-CAM cell outputs, and Fig. 3(ix) illustrates the power level of the final matchline signal that is produced at the row output, just after the AWG multiplexer. As can be noticed, this is a multilevel signal, with every power level corresponding to a different number of bit-level search misses. When all T-CAM cells match the 4 bits of the incoming *search-input* signal, no optical power is recorded at the AWG output.

Fig. 3 20 Gb/s simulation results for the T-CAM row architecture of Fig. 1b. Time scale: 50 ps/div for traces and 25 ps/div for eye diagrams
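For readers who wish to reproduce a comparable test pattern, a $2^7-1$ PRBS such as the one used for the *search-bit* streams can be generated with a 7-bit linear feedback shift register. The sketch below is an assumption-laden illustration: the text does not state the generator polynomial, the seed or how the 4 parallel streams differ from each other, so the commonly used PRBS7 polynomial $x^7 + x^6 + 1$ and the function name `prbs7` are choices made here.

```python
from typing import List


def prbs7(seed: int = 0x7F, length: int = 127) -> List[int]:
    """Generate PRBS7 bits with a 7-bit LFSR (assumed polynomial x^7 + x^6 + 1)."""
    if not 0 < seed < 128:
        raise ValueError("seed must be a non-zero 7-bit value")
    state = seed
    bits = []
    for _ in range(length):
        new_bit = ((state >> 6) ^ (state >> 5)) & 1  # XOR of the two tapped stages
        state = ((state << 1) | new_bit) & 0x7F      # shift left, keep 7 bits
        bits.append(new_bit)
    return bits


sequence = prbs7()
print(len(sequence), sum(sequence))  # sequence length and number of ones
```

The sequence repeats after 127 bits; different seeds yield shifted versions of the same pattern, which is one possible (hypothetical) way to derive several parallel streams.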

Successful matchline operation of the complete T-CAM row can be verified for the entire pulse traces used as the 4 parallel *search bit* sequences. Three representative examples at timeslots #1, #9 and #27 have been highlighted in order to illustrate the T-CAM row performance in different situations. In the example of timeslot #1, all 4 T-CAM cells are in the "X" state since all respective XFF outputs are equal to 0, which results in an XOR output of 0 regardless of the TCFF and search-bit values. As expected, the final matchline signal at timeslot #1 is also equal to 0, corresponding to a complete match between the T-CAM row and the *search-input* contents. Within timeslot #9, none of the T-CAM cells is in the "X" state since all XFF outputs are equal to a logical 1. For T-CAM cell #1, the XOR output equals a logical 0 since the TCFF and the *search-bit* contents are equal. However, for the remaining three T-CAM cells the respective XOR outputs equal a logical 1, denoting that the corresponding TCFF and *search-bit* signals differ. The presence of three optical pulses at different wavelengths but within the same timeslot #9 means that the optical power obtained at the AWG output equals the sum of the power levels of the three individual pulses, leading to a matchline signal with non-zero power that indicates a search operation that is not perfectly matched. In the example of timeslot #27, T-CAM cells #2 and #4 are in the "X" state since their XFF content equals a logical 0. As such, the respective XOR outputs are also equal to 0, and this holds even for T-CAM cell #4, where the TCFF content and its respective search-bit are different. Regarding cell #3, the XOR output equals 0 because the TCFF content and the *search-bit* are equal. However, cell #1 has an XOR output of 1 since the TCFF content and the respective *search-bit* have different values. This single optical pulse obtained as the result of the comparison along the entire T-CAM row is likewise translated into a non-zero power level at the AWG output, again indicating a non-matched row. Figure 3(x) presents clearly open eyes for all 4 T-CAM cells with an average extinction ratio of 9.1 dB.
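The three highlighted timeslots can be reproduced at the logic level with a few lines of Python. This is a behavioural sketch only: the individual bit values below are hypothetical and were chosen solely to satisfy the relationships described in the text (which cells are in the "X" state and which TCFF contents equal their search bits); the actual values are those of the traces in Fig. 3. The names `cell_output` and `row_misses` are likewise introduced here for illustration.

```python
from typing import List, Tuple

Cell = Tuple[int, int]  # (XFF content, TCFF content)


def cell_output(xff: int, tcff: int, search_bit: int) -> int:
    """XOR output of one cell; forced to 0 when XFF = 0 (the "X" state)."""
    return xff & (tcff ^ search_bit)


def row_misses(cells: List[Cell], search: List[int]) -> int:
    """Number of cells reporting a miss; 0 corresponds to a dark matchline."""
    return sum(cell_output(x, c, s) for (x, c), s in zip(cells, search))


# Hypothetical bit values, consistent with the description of each timeslot.
slot_1 = ([(0, 1), (0, 0), (0, 1), (0, 0)], [0, 1, 1, 0])   # all cells in "X"
slot_9 = ([(1, 1), (1, 0), (1, 1), (1, 0)], [1, 1, 0, 1])   # cell 1 equal, cells 2-4 differ
slot_27 = ([(1, 0), (0, 1), (1, 1), (0, 0)], [1, 1, 1, 1])  # cells 2 and 4 in "X"

for name, (cells, search) in [("#1", slot_1), ("#9", slot_9), ("#27", slot_27)]:
    print(name, row_misses(cells, search))
```

The miss counts of 0, 3 and 1 correspond to the dark matchline at timeslot #1, the three-pulse power level at timeslot #9 and the single-pulse power level at timeslot #27.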

## 4 Conclusion

An all-optical T-CAM cell and row architecture for high-speed Address Look-up operation in future network router applications has been presented. Physical layer simulation results demonstrate successful Search and Write operation at a line rate of 20 Gb/s for a complete 4-bit optical word. Although the proposed T-CAM cell and row architectures employ SOA-based switch and FF modules, they are compatible with alternative optical switch and FF technologies (Alexoudi et al. 2016) that can in principle yield T-CAM-based routing table implementations with higher integration levels, lower footprint and lower energy consumption.

Acknowledgements This work has been supported by the FP7-PEOPLE-2013-IAPP-COMANDER Project (Contract Number 612257).

## References

- Agarwal, A., Hsu, S., Mathew, S., Anders, M., Kaul, H., Sheikh, F., Krishnamurthy, R.: A 128 × 128b high-speed wide-and match-line content addressable memory in 32 nm CMOS. ESSCIRC 2011, Finland, pp. 83–86 (2011)
- Alexoudi, T., Fitsios, D., Bazin, A., Monnier, P., Raj, R., Miliou, A., Kanellos, G.T., Pleros, N., Raineri, F.: III–V-on-Si photonic crystal nanolaser technology for optical random access memories (RAMs). IEEE J. Sel. Top. Quantum Electron. 22(6), 1–10 (2016)
- Beheshti, N., Burmeister, E., Ganjali, Y., Bowers, J.E., Blumenthal, D.J., McKeown, N.: Optical packet buffers for backbone internet routers. IEEE/ACM Trans. Netw. 18(5), 1599–1609 (2010)
- Jiang, W., Wang, Q., Prasanna, V.K.: Beyond TCAMs: an SRAM-based parallel multi-pipeline architecture for terabit IP lookup. INFOCOM 2008, Phoenix, USA, pp. 2458–2466 (2008)
- Kehayas, E., Vyrsokinos, K., Stampoulidis, L., Christodoulopoulos, K., Vlachos, K., Avramopoulos, H.: ARTEMIS: 40-Gb/s all-optical self-routing node and network architecture employing asynchronous bit and packet-level optical signal processing. JLT 24(8), 2967–2977 (2006)
- Krioukov, D., Claffy, K.C., Fall, K., Brady, A.: On compact routing for the internet. ACM SIGCOMM Comput. Commun. Rev. 37(3), 41–52 (2007)
- Link 1. http://www.ethernetalliance.org/roadmap/. Accessed 8 June 2017
- Liu, Y., Mcdougall, R., Hill, M.T., Maxwell, G., Zhang, S., Harmon, R., Huijskens, F.M., Rivers, L., Dorren, H.J.S., Poustie, A.: Packaged and hybrid integrated all-optical flip-flop memory. Electron. Lett. 42(24), 1399–1400 (2006)
- Maniotis, P., Fitsios, D., Kanellos, G.T., Pleros, N.: Optical buffering for CMPs: a 16 GHz optical cache memory architecture. JLT 31(24), 4175–4191 (2013)
- Pagiamtzis, K., Sheikholeslami, A.: Content-addressable memory (CAM) circuits and architectures: a tutorial and survey. IEEE J. Solid State Circuits 41(3), 712–727 (2006)
- Pitris, S., Vagionas, C., Maniotis, P., Kanellos, G.T., Pleros, N.: An optical content addressable memory (CAM) cell for address look-up at 10 Gb/s. IEEE PTL 28(16), 1790–1793 (2016)
- Pleros, N., Apostolopoulos, D., Petrantonakis, D., Stamatiadis, C., Avramopoulos, H.: Optical static RAM cell. IEEE Photon. Technol. Lett. 21(2), 73–75 (2009)
- Wang, Q., Zhu, G., Chen, H., Jaques, J., Leuthold, J., Piccirilli, A.B., Dutta, N.K.: Study of all-optical XOR using MZI and differential scheme. IEEE JQE 40(6), 703–710 (2004)