
Training Large-scale Foundation Models on Emerging AI Accelerators

December 1, 2023 @ 00:00 - 01:00

Foundation models such as GPT-4 have garnered significant interest from both academia and industry. A distinguishing feature of these models is their so-called emergent capabilities, including multi-step reasoning, instruction following, and model calibration, across a wide range of application domains. Such capabilities were previously attainable only with specially designed ML models, such as those built on carefully constructed knowledge graphs, in narrow domains.

As the capabilities of foundation models have grown, so have their sizes, at a rate far outpacing Moore's law, and training them demands massive computing power. For instance, training a BERT model on a single state-of-the-art multi-A100 GPU machine can take several days, while training a GPT-3 model on a large multi-instance GPU cluster can take several months to complete the estimated 3×10^23 FLOPs.

This talk provides an overview of the latest progress in supporting foundation model training and inference on new AI accelerators. It reviews progress on the modeling side, with an emphasis on the transformer architecture, and presents the system architectures that support training and serving foundation models.

Explore the frontier of AI with us as we delve into the power and potential of foundation models like GPT-4. Discover how emergent capabilities are pushing the boundaries of what's possible, and learn about the groundbreaking AI accelerators making it all happen. Join our talk to uncover the future of AI training and application!

Speaker(s): Jun (Luke) Huan

Virtual: https://events.vtools.ieee.org/m/382797
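
As a rough sanity check on the 3×10^23 FLOPs figure quoted in the abstract, a widely used rule of thumb estimates training compute as C ≈ 6·N·D, i.e., about 6 FLOPs per parameter per training token. The sketch below applies it with GPT-3's commonly reported figures (175B parameters, roughly 300B training tokens); these inputs are illustrative assumptions, not numbers taken from the talk.

```python
# Back-of-envelope training-compute estimate using the common
# C ~= 6 * N * D rule of thumb (about 6 FLOPs per parameter per token).
# N and D below are commonly reported GPT-3 figures, used here only
# as illustrative assumptions.

N = 175e9  # model parameters (GPT-3: 175 billion)
D = 300e9  # training tokens (GPT-3: roughly 300 billion)

C = 6 * N * D
print(f"Estimated training compute: {C:.2e} FLOPs")  # ~3.15e+23
```

The result, about 3.15×10^23 FLOPs, is consistent with the estimate cited in the abstract and illustrates why training at this scale takes months even on large multi-instance GPU clusters.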