Sakana AI Uses Evolutionary Algorithm to Unveil Innovative Architectures for Generative Models

A groundbreaking technique developed by Tokyo-based startup Sakana AI, known as Evolutionary Model Merge, automates the creation of generative models. Inspired by natural selection, this approach combines elements from existing models to produce more advanced iterations.

Sakana AI, co-founded in August 2023 by former Google researchers David Ha and Llion Jones—co-author of the influential paper "Attention Is All You Need"—is positioned at the forefront of generative AI innovation.

Revolutionizing Model Development

Sakana's Evolutionary Model Merge enables developers and organizations to create and explore new models in a cost-effective manner, eliminating the need for expensive training and fine-tuning of proprietary models.

The startup recently introduced large language models (LLMs) and vision-language models (VLMs) developed using this innovative technique.

Understanding Model Merging

Training generative models is often prohibitively expensive and complex. However, with the emergence of open models like Llama 2 and Mistral, developers are leveraging model merging—combining various components of two or more pre-trained models to form a new model. This method allows the newly merged model to inherit the strengths of its predecessors without requiring additional training, thus making it a highly economical option. Many leading models on Open LLM leaderboards are now merged variants of popular foundational models.
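In its simplest form, weight-space merging linearly interpolates the parameters of two models with compatible architectures. The following is a minimal sketch of that idea, using plain Python dicts of floats to stand in for real checkpoint tensors; the function name and toy "checkpoints" are illustrative, not Sakana AI's actual code.

```python
def merge_weights(model_a, model_b, alpha=0.5):
    """Linearly interpolate two models' parameters (weight-space merging).

    model_a, model_b: dicts mapping parameter names to lists of floats,
    standing in for real checkpoint tensors. alpha is the mixing weight
    given to model_a; (1 - alpha) goes to model_b.
    """
    merged = {}
    for name, weights_a in model_a.items():
        weights_b = model_b[name]  # assumes both models share an architecture
        merged[name] = [alpha * a + (1 - alpha) * b
                        for a, b in zip(weights_a, weights_b)]
    return merged

# Toy "checkpoints" with a single one-layer parameter each.
a = {"layer0.weight": [1.0, 2.0]}
b = {"layer0.weight": [3.0, 4.0]}
print(merge_weights(a, b, alpha=0.5))
```

Real merging toolkits apply the same interpolation tensor-by-tensor across full checkpoints, often with a different mixing ratio per layer; choosing those ratios well is the intuition-heavy step the article refers to.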

Sakana AI's researchers note, “A vibrant community of researchers, hackers, and artists is actively developing new foundation models by fine-tuning and merging existing models.” With over 500,000 models available on Hugging Face, model merging offers extensive opportunities for creating innovative solutions at minimal costs, although it does require significant intuition and domain knowledge.

Introducing Evolutionary Model Merge

Sakana AI aims to optimize the model merging process using a systematic approach. Drawing from evolutionary algorithms—optimization techniques that mimic natural selection—Evolutionary Model Merge identifies the most effective ways to combine different models.

David Ha emphasizes, “The ability to evolve new models from diverse existing models has crucial implications.” In the face of rising resource demands for training foundational models, this evolutionary approach may prove beneficial for institutions or governments looking to quickly develop prototype models without substantial investment.

Evolutionary Model Merge operates automatically, assessing existing models' layers and weights to create new architectures tailored to user requirements.

Demonstrating Evolutionary Merging

To explore the potential of this approach, Sakana AI researchers applied Evolutionary Model Merge to create a Japanese LLM capable of mathematical reasoning and a Japanese VLM. The resulting models performed strongly on several benchmarks for which they were not explicitly optimized. For example, their EvoLLM-JP, a 7-billion-parameter Japanese math LLM, outperformed even some 70-billion-parameter competitors.

For the Japanese VLM, the team merged LLaVA-1.6-Mistral-7B with Shisa-Gamma 7B, yielding EvoVLM-JP, which surpassed both LLaVA-1.6-Mistral-7B and the pre-existing JSVLM. Both models are available on Hugging Face and GitHub.

Sakana AI is also adapting its evolutionary merging methods for image-generation diffusion models, aiming to enhance the performance of Stable Diffusion XL for Japanese prompts.

Sakana AI’s Vision

Founded by David Ha and Llion Jones, Sakana AI seeks to harness nature-inspired concepts like evolution and collective intelligence to create foundational AI models. The team believes that the future of AI will not revolve around a singular, all-encompassing system but rather a network of specialized AI systems tailored to distinct niches, collaborating and evolving to meet various needs.
