# Crosstalk Mitigated On-chip Interconnect Design for High-speed Network-on-Chip (NoC) of Full Wafer Scale Chip (FWSC)

Juneyoung Kim, Seonguk Choi, Seongguk Kim, Jihun Kim, Boogyo Sim, Junghyun Lee, Taein Shin, Hyunwoo Kim, Jonghyun Hong, Haeyeon Kim, Joonsang Park, and Joungho Kim

School of Electrical Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Republic of Korea

E-mail: Juneyoungkim@kaist.ac.kr

Abstract— The full wafer scale chip (FWSC) is a promising AI computing architecture that integrates the entire wafer into a single chip, surpassing the limitations of off-chip interconnect. The large-scale FWSC requires communication between cores separated by long and dense on-chip interconnects, which can induce signal integrity (SI) problems such as crosstalk. In this paper, we designed and analyzed the on-chip interconnect of the 2D mesh Network-on-Chip (NoC) for the FWSC. We introduced a power/ground shield for the on-chip interconnect, significantly reducing the crosstalk effect. Additionally, we added a feed-forward equalizer (FFE) to the transmitter (Tx) to compensate for the inter-symbol interference (ISI) caused by the shield. The on-chip interconnect with 5 nm process technology was designed and analyzed using full 3D electromagnetic (EM) and circuit simulations. As a result, the proposed design effectively mitigated crosstalk effects and reduced the timing jitter by 79.4%.

*Index Terms*— Crosstalk, eye-diagram, full wafer scale chip, network-on-chip, on-chip interconnect, signal integrity

#### I. INTRODUCTION

As the size of artificial intelligence (AI) models rapidly increases, the computational requirements for training AI models have grown exponentially. In this context, the full wafer scale chip (FWSC) is a promising solution. As shown in Fig. 1, FWSC integrates the entire wafer into a single chip without the use of off-chip interconnects, which increase the system's power consumption and latency. FWSC has huge on-chip memory and nearly one million programmable processing elements (PEs), making it more suitable for AI computing than conventional hardware [1]. For this wafer-scale system, FWSC uses a large 2D mesh network-on-chip (NoC) for communication between PEs.

However, with a nanometer process, signaling in dense and long on-chip interconnects induces signal integrity (SI) issues, especially crosstalk. Moreover, as the next generations of FWSC are anticipated to operate at higher frequencies, the impact of the crosstalk will become more severe. Therefore, there is a need for novel on-chip interconnect structures for FWSC to alleviate crosstalk effects. Previous studies, however, have not addressed the SI analysis of wafer-scale on-chip interconnects or a suitable on-chip interconnect structure for FWSC [2], [3].

In this paper, we proposed a crosstalk-mitigated on-chip interconnect for FWSC with high-speed signaling. The proposed structure contains both a power/ground shielding structure and a feed-forward equalizer (FFE). The proposed shielding structure significantly mitigated the crosstalk effects. However, this shield induces more inter-symbol interference (ISI) because of the increased RC delay. To reduce the ISI, the proposed design introduced the FFE in the transmitter (Tx). To verify the overall SI of the proposed design, we focused on analyzing the eye-diagram. The analysis was performed at a higher data rate than that of the existing FWSC to account for future generations of FWSC.

Fig. 1. Conceptual view of FWSC module consisting of identical dies.

Fig. 2. PEs in a single FWSC die with the 2D mesh NoC consisting of routers and on-chip interconnects.

#### II. PROPOSED DESIGN OF ON-CHIP INTERCONNECT IN 2D MESH NOC FOR FWSC

#### A. Assumption of 2D mesh NoC for FWSC

All the information regarding WSE-2, which is referenced in FWSC modeling, is based on the materials published by Cerebras [1]. The 2D mesh NoC for FWSC was configured with routers and on-chip interconnects, as shown in Fig. 2. We considered each router to have a switching arbiter and a crossbar switch to arrange communication paths. The crossbar switch consisted of multiple multiplexers (MUXes) [4]. We applied the characteristics of these routers to the SI simulation setup. Between the routers, 32 unidirectional single-ended signal lines were designed for signal transmission in each direction.

#### B. Design of on-chip interconnect of NoC for FWSC

As shown in Fig. 3(a), the on-chip interconnect stack-up assumed a 15-layer back-end of line for high-performance computing. The signal layer was designed on the 14th metal layer, which is suitable for global communication. The channel length was set as the width of a single PE, which is 228 $\mu$m. The channel width and space were both designed as 80 nm, and the channel thickness was set as 160 nm, based on the 5 nm process technology [5]. In the same way, the dielectric thickness was designed as 80 nm. Ultra-low-k carbon-doped oxide (ULK CDO), which is commonly used for reducing the interference between channels, was applied as the dielectric material [6].

Fig. 3. Cross-section view of (a) conventional on-chip interconnect stack-up and (b) proposed on-chip interconnect stack-up with power/ground shield.
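As a rough sanity check on the geometry described above, the DC resistance of one PE-wide wire segment can be estimated from $R = \rho L / (W T)$. The sketch below is only illustrative: the resistivity is that of bulk copper, which is an assumption (the effective resistivity of an 80 nm wide line is higher because of surface and grain-boundary scattering and barrier layers), and the constant and function names are hypothetical rather than taken from the simulation flow of this work.

```python
# Rough DC resistance estimate for one PE-wide wire segment.
# Assumption: bulk copper resistivity; real 80 nm wires are more resistive.

RHO_CU_BULK = 1.7e-8   # ohm*m, assumed bulk copper resistivity
LENGTH = 228e-6        # m, channel length (width of a single PE)
WIDTH = 80e-9          # m, channel width
THICKNESS = 160e-9     # m, channel thickness


def wire_resistance(rho, length, width, thickness):
    """Return R = rho * L / (W * T) for a rectangular conductor."""
    return rho * length / (width * thickness)


r_segment = wire_resistance(RHO_CU_BULK, LENGTH, WIDTH, THICKNESS)
print(f"Resistance per 228 um segment: {r_segment:.1f} ohm")
```

Under these assumptions a single segment is already on the order of a few hundred ohms, which gives a feel for why RC delay matters on long global wires.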

#### C. Proposed on-chip interconnect and Tx design for FWSC

To reduce crosstalk effects, we designed a power/ground shield structure for the 2D mesh interconnect, as shown in Fig. 3(b). The shield structure with a width of 20 nm was inserted between the signal lines without channel spacing to maintain interconnect area. The power/ground shield structure mitigates electromagnetic interference generated by adjacent aggressor channels, significantly reducing the crosstalk effects between signal lines. However, the shield structure without channel spacing increases the total capacitance of the signal line, causing ISI from the increased RC delay.

To solve this problem, a two-tap feed-forward equalizer (FFE) was added to the proposed Tx design, as shown in Fig. 4(a) [7]. Among the several equalizer schemes, FFE was utilized for the proposed design because the FFE effectively reduces the ISI and slightly mitigates the crosstalk effect with the lower incident voltage level [8]. With power/ground shield structure and FFE, the proposed design reduces the crosstalk effect and alleviates the ISI from the increased RC delay. The FFE coefficients were designed with a main-cursor of 0.85 and a post-cursor of -0.15 at 2.2 Gb/s. For 6.6 Gb/s, considering more ISI at the higher data rate, the main-cursor was set to 0.6 and the post-cursor to -0.4.
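The effect of the two-tap coefficients quoted above can be illustrated with a short sketch. It assumes an idealized transmitter in which NRZ symbols are mapped to -1/+1 and the output is normalized to the full swing; the function name `apply_two_tap_ffe` and the idle-line convention are hypothetical and are not taken from the circuit in Fig. 4(a).

```python
# Two-tap transmit FFE sketch: y[n] = c_main * x[n] + c_post * x[n-1].
# Assumption: NRZ symbols mapped to -1/+1 and normalized to the full swing.

FFE_TAPS = {
    2.2: (0.85, -0.15),  # (main-cursor, post-cursor) at 2.2 Gb/s
    6.6: (0.60, -0.40),  # (main-cursor, post-cursor) at 6.6 Gb/s
}


def apply_two_tap_ffe(bits, main, post):
    """Return normalized Tx output levels for a list of 0/1 bits."""
    symbols = [2 * b - 1 for b in bits]       # map 0/1 to -1/+1
    out = []
    prev = symbols[0] if symbols else 0       # assume the line idles at the first symbol
    for s in symbols:
        out.append(main * s + post * prev)    # main tap plus delayed post tap
        prev = s
    return out


bits = [0, 1, 1, 1, 0, 0, 1]
for rate, (main, post) in FFE_TAPS.items():
    levels = apply_two_tap_ffe(bits, main, post)
    print(rate, [round(v, 2) for v in levels])
```

In this idealized model a transition is driven at the full level, while a repeated bit is de-emphasized to 0.70 of the swing with the 2.2 Gb/s taps and to 0.20 with the 6.6 Gb/s taps, which is how the equalizer trades low-frequency amplitude for reduced ISI.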

#### III. SIGNAL INTEGRITY ANALYSIS OF THE PROPOSED ON-CHIP INTERCONNECT OF 2D MESH NOC FOR FWSC

The simulations were conducted at the farthest signaling distance in FWSC, which is 43 cm, with variations in the presence of aggressors, data rate, the proposed shield, and FFE. To evaluate the SI of the proposed design, we conducted time domain simulations, including eye-diagram and single bit response (SBR) analysis. The time domain simulation set-up is depicted in Fig. 4(b). Considering the crossbar switch, the resistance and capacitance of the Tx were assumed to be 20 $\Omega$ and 1.2 fF, while the receiver (Rx) was represented with a capacitance of 1.8 fF. The input signals were applied using a pseudo-random bit sequence (PRBS) of $2^8-1$ bits, with a defined voltage swing of 1.2 V. The data rate of the input signals was set to that of the current FWSC, which is 2.2 Gb/s. Furthermore, considering future generations of FWSC, simulations were also performed at a data rate of 6.6 Gb/s. The rise and fall times were set to 10% of one unit interval (UI). We set up a victim channel and 4 aggressor channels in the signal layer to consider the crosstalk effects that occur in dense on-chip interconnects.

Fig. 4. (a) Proposed Tx with FFE and SBR with equalization. (b) The simplified model of the initial set-up for eye-diagram simulation.
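The stimulus parameters described above can be reproduced with a short sketch that generates a PRBS of length $2^8-1$ and derives the unit interval and edge times for both data rates. The feedback polynomial ($x^8 + x^4 + x^3 + x^2 + 1$, a primitive polynomial) is an assumption, since the text does not state which one was used, and the function names are hypothetical.

```python
# Stimulus sketch: PRBS of length 2^8 - 1 plus unit-interval timing.
# Assumption: feedback polynomial x^8 + x^4 + x^3 + x^2 + 1 (primitive);
# the source does not state which polynomial was used.


def prbs8(seed=0x01, length=255):
    """Generate `length` bits from an 8-bit Fibonacci LFSR."""
    if seed & 0xFF == 0:
        raise ValueError("seed must be non-zero")
    state = seed & 0xFF
    bits = []
    for _ in range(length):
        bits.append(state & 1)                       # output the LSB
        fb = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 4)) & 1
        state = (state >> 1) | (fb << 7)             # shift in the feedback bit
    return bits


def unit_interval_ps(data_rate_gbps):
    """Return one unit interval in picoseconds."""
    return 1e3 / data_rate_gbps


for rate in (2.2, 6.6):
    ui = unit_interval_ps(rate)
    print(f"{rate} Gb/s: UI = {ui:.1f} ps, rise/fall = {0.1 * ui:.1f} ps")
```

With a maximal-length polynomial the sequence repeats every 255 bits, so one full period exercises every 8-bit pattern except all zeros.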

#### A. Eye-diagram simulations of conventional on-chip interconnect of 2D mesh NoC for FWSC

The eye-diagram simulation provides a comprehensive evaluation of the SI properties. The overall eye-diagram simulation results are presented in Fig. 5. The eye-diagrams with and without crosstalk at 2.2 Gb/s are shown in Fig. 5(a) and Fig. 5(b). Due to the crosstalk effects caused by the aggressors, the voltage overshoot increased to 149 mV, and the timing jitter increased to 77.2 ps. Moreover, the eye-height decreased to 0.799 V. These reductions in the timing and voltage margins adversely affect the performance and reliability of the system. Furthermore, at the data rate of 6.6 Gb/s, the eye was closed by crosstalk effects, as shown in Fig. 5(c) and Fig. 5(d). These results imply that, for future generations of FWSC, a novel on-chip interconnect structure for crosstalk reduction is essential.

#### B. Time domain simulations of the proposed design with the shield and FFE

Simulation results of SBR and eye-diagrams with the proposed design are shown in Fig. 6. As shown in Fig. 6(a), the two-tap FFE in the Tx reduced dispersion, which mitigates the ISI caused by the shielded channels. As shown in Fig. 6(b), with the proposed shield, the eye-diagram shows an overall reduction in crosstalk effects. However, the ISI induced by the RC delay from the shield degrades the eye. As shown in Fig. 6(c), by introducing the FFE in the Tx, the timing jitter caused by crosstalk and RC delay was improved to 15.9 ps and the eye-height increased to 0.527 V. As shown in Fig. 6(d) and Fig. 6(e), at the data rate of 6.6 Gb/s, the interconnect with only the shield was unable to open the eye, but the design with both the shield and the FFE successfully opened the eye. These results show that the proposed design reduces crosstalk effects and opens the eye at a higher data rate.

Fig. 5. Eye-diagram results of 43 cm on-chip interconnect of 2D mesh NoC for FWSC (a) without crosstalk at 2.2 Gb/s, (b) with crosstalk at 2.2 Gb/s, (c) without crosstalk at 6.6 Gb/s, (d) with crosstalk at 6.6 Gb/s.

Fig. 6. Simulation results of proposed design for FWSC (a) SBR at 6.6 Gb/s. Eye-diagram (b) with shield at 2.2 Gb/s, (c) with shield at 6.6 Gb/s, (d) with shield and FFE at 2.2 Gb/s, (e) with shield and FFE at 6.6 Gb/s.
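The jitter figures reported in the text can be put in relative terms with a few lines of arithmetic. The sketch below assumes that both the 77.2 ps and the 15.9 ps values refer to the 2.2 Gb/s case; the variable and function names are hypothetical.

```python
# Timing-jitter bookkeeping using the values reported in the text.
# Assumption: both jitter values were measured at 2.2 Gb/s.

UI_PS = 1e3 / 2.2               # unit interval at 2.2 Gb/s, in ps
JITTER_CONVENTIONAL_PS = 77.2   # conventional interconnect with crosstalk
JITTER_PROPOSED_PS = 15.9       # proposed design with shield and FFE


def relative_improvement(before, after):
    """Return the fractional reduction from `before` to `after`."""
    return (before - after) / before


improvement = relative_improvement(JITTER_CONVENTIONAL_PS, JITTER_PROPOSED_PS)
print(f"Jitter reduction: {100 * improvement:.1f} %")
print(f"Jitter before: {JITTER_CONVENTIONAL_PS / UI_PS:.3f} UI")
print(f"Jitter after:  {JITTER_PROPOSED_PS / UI_PS:.3f} UI")
```

The computed reduction matches the 79.4% improvement quoted in the abstract, and expressing the jitter as a fraction of the unit interval makes it easier to compare across data rates.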

#### IV. CONCLUSION

In this paper, we proposed an on-chip interconnect design with a power/ground shield and an FFE to reduce the crosstalk effect in FWSC. The design and analysis of the long and dense on-chip interconnect of FWSC were performed using 5 nm process technology. The simulation results highlight crosstalk as the primary factor degrading SI in the on-chip signaling of FWSC. Based on these results, the proposed design introduced the power/ground shield in the on-chip interconnect to mitigate the crosstalk effects. Furthermore, to compensate for the ISI from the RC delay added by the shield structure, the FFE was utilized in the Tx. As a result, the proposed design reduced the crosstalk effects significantly and improved the eye-opening. This study can contribute to advancing SI and performance in future FWSC architectures.

#### ACKNOWLEDGMENT

We would like to acknowledge the technical support from ANSYS Korea. This research was supported by the National R&D Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science and ICT (NRF-2022M3I7A4072293). This work was supported by Samsung Electronics Co., Ltd (IO201207-07813-01, MEM230315\_0004).

#### REFERENCES

- [1] S. Lie, "Cerebras Architecture Deep Dive: First Look Inside the Hardware/Software Co-Design for Deep Learning: Cerebras Systems," 2022 IEEE Hot Chips 34 Symposium (HCS), IEEE Computer Society, 2022, pp. 1-34.
- [2] J. Zhang and E. G. Friedman, "Effect of shield insertion on reducing crosstalk noise between coupled interconnects," 2004 IEEE International Symposium on Circuits and Systems (ISCAS), 2004, pp. 529-532.
- [3] T. Zhang and S. S. Sapatnekar, "Simultaneous Shield and Buffer Insertion for Crosstalk Noise Reduction in Global Routing," in IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 15, no. 6, 2007, pp. 624-636.
- [4] V. Tiwari et al., "Efficient Configurable Crossbar Switch Design For NoC," 2019 International Journal of Scientific & Technology Research (IJSTR), 2019, pp. 959-964.
- [5] E. Sicard et al., "Introducing 5-nm FinFET technology in Microwind," 2021.
- [6] D. Ingerly et al., "Low-K Interconnect Stack with Thick Metal 9 Redistribution Layer and Cu Die Bump for 45nm High Volume Manufacturing," 2008 International Interconnect Technology Conference, 2008, pp. 216-218.
- [7] L. A. Valenzuela et al., "A 2.85 pJ/bit, 52-Gbps NRZ VCSEL Driver with Two-Tap Feedforward Equalization," 2020 IEEE/MTT-S International Microwave Symposium (IMS), 2020, pp. 209-212.
- [8] S. Parikh et al., "A 32Gb/s wireline receiver with a low-frequency equalizer, CTLE and 2-tap DFE in 28nm CMOS," 2013 IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, IEEE, 2013, pp. 28-29.