Packaging Photonics for AI/ML Systems

Sujit Ramachandra

 sujit.r@ieee.org

Abstract—The rapid emergence of a multitude of machine learning (ML) models with trillions of parameters has highlighted the need for high-performance compute systems that pair Artificial Intelligence (AI) accelerators with disaggregated memory. Given the demanding bandwidth, density, energy, and latency requirements, silicon photonics is the technology of choice for realizing these architectures. Scalable solutions can only be implemented by developing novel and reliable packaging schemes, with emphasis on thermal budgeting, reduced parasitics, and increased bandwidth density. This article covers key challenges in packaging photonic circuits for AI/ML systems and some innovative solutions that have been developed in the field.

 I. INTRODUCTION

With the advent of GPT-4, the rapidly growing size and complexity of machine learning (ML) and artificial intelligence (AI) models has crossed the trillion mark in parameter count [1]. The turn of the decade has seen a large number of ML models made public, each with billions of parameters. Fig. 1 shows the exponential growth in the number of parameters of published ML models over the last five decades. This growth also brings the need to parallelize data over tens of thousands of memory and processor nodes. Each of these nodes must meet stringent latency and power budgets while providing Tb/s optical I/Os and high-speed interconnects between the multiple processing units involved. For instance, early publications report NVIDIA DGX systems consisting of 8 H100 GPUs, designed with 7.2 Tb/s of off-chip bandwidth per GPU [2].
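To make the bandwidth figure concrete, the minimal sketch below converts a per-GPU link rate in GB/s into the Tb/s units used above. The 900 GB/s per-GPU rate is an assumption drawn from commonly reported NVLink specifications for H100-class GPUs, not a value taken from [2].

# Back-of-the-envelope check of the off-chip bandwidth figure cited above.
# Assumption (not from [2]): each H100-class GPU exposes roughly 900 GB/s
# of off-chip link bandwidth, a commonly reported NVLink figure.

BITS_PER_BYTE = 8

def off_chip_bandwidth_tbps(link_rate_gbytes_per_s: float) -> float:
    """Convert a per-GPU link rate in GB/s to Tb/s."""
    return link_rate_gbytes_per_s * BITS_PER_BYTE / 1000.0

per_gpu_tbps = off_chip_bandwidth_tbps(900)  # 900 GB/s -> 7.2 Tb/s per GPU
n_gpus = 8
print(f"Per-GPU off-chip bandwidth: {per_gpu_tbps:.1f} Tb/s")
print(f"Aggregate across {n_gpus} GPUs: {n_gpus * per_gpu_tbps:.1f} Tb/s")

Even under these assumptions, the aggregate per node approaches 60 Tb/s, a regime in which electrical interconnects face severe energy and density constraints and optical I/O becomes attractive.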
