Lightning AI Unveils Next-Generation AI Compiler 'Thunder' to Boost Model Training Efficiency

Open-source AI development platform Lightning AI, in collaboration with Nvidia, announced the launch of Thunder, a source-to-source compiler for the open-source machine learning (ML) framework PyTorch. The new tool is designed to speed up AI model training across multiple GPUs.

Lightning AI reports that Thunder can achieve up to a 40% speed increase in training large language models (LLMs) compared to unoptimized code in real-world applications. Importantly, Thunder is open source under the Apache 2.0 license and is available at no cost.
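To make the idea of a source-to-source compiler concrete, the following minimal sketch shows how a plain PyTorch function might be compiled with Thunder. The thunder.jit entry point and the call pattern are assumptions drawn from the project's public open-source repository, not details confirmed in this announcement, and the exact API may differ.

    # Minimal sketch (assumed API): compiling ordinary PyTorch code with Thunder
    import torch
    import thunder

    def scaled_add(a, b):
        # Plain PyTorch code; Thunder traces it and emits optimized Python/kernels
        return a + 2.0 * b

    compiled = thunder.jit(scaled_add)   # assumed entry point; returns a compiled callable
    a = torch.randn(2, 2)
    b = torch.randn(2, 2)
    out = compiled(a, b)                 # first call triggers tracing and compilation
    assert torch.allclose(out, scaled_add(a, b))  # results match eager PyTorch

In this sketch the compiled callable behaves like the original function while delegating execution to whichever optimized backends are available.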

Lightning AI showcased Thunder at Nvidia GTC, presenting it as a way to maximize GPU utilization rather than simply adding more GPUs. Since 2022, the company has been developing next-generation deep learning capabilities for PyTorch that work alongside PyTorch's own torch.compile, Nvidia tools such as nvFuser, Apex, and the CUDA Deep Neural Network library (cuDNN), as well as OpenAI's Triton.

Formerly known as Grid AI, Lightning AI is the creator of the open-source Python library PyTorch Lightning, and its mission is to streamline AI workloads through performance optimization. The company collaborates with leading AI organizations, including OpenAI, Meta AI, and Nvidia.

Led by PyTorch core developer Thomas Viehmann, recognized for his work on TorchScript and enabling PyTorch on mobile devices, Thunder is designed to support generative AI models across multiple GPUs. Lightning AI CEO William Falcon expressed his enthusiasm about Viehmann's leadership, stating, “Thomas literally wrote the book on PyTorch. At Lightning AI, he will spearhead the performance breakthroughs we plan to introduce to the PyTorch and Lightning AI community.”

Model training is time-consuming and expensive, spanning data collection, model configuration, and supervised fine-tuning, and it demands both technical expertise and careful resource management. Organizations also face growing threats from adversarial AI, in which attackers train LLMs to manipulate other AI systems.

Lightning's Chief Technology Officer Luca Antiga emphasized that performance optimization and profiling tools are essential for scaling model training. “We see that customers aren’t fully utilizing available GPUs and are instead adding more,” Antiga noted. He highlighted that, alongside Lightning Studios and its profiling tools, Thunder enables clients to optimize GPU usage and accelerate LLM training on a larger scale.

Thunder is now available following the release of Lightning 2.2 in February. Lightning Studios offers four pricing tiers: Free for individual developers, Pro for engineers and researchers, Teams for startups, and Enterprise for larger organizations.
