Stability AI Launches Stable Audio Open: A Game-Changer for Sound Design Professionals

Stability AI is expanding its generative AI audio lineup with the release of Stable Audio Open 1.0.

Known for its Stable Diffusion text-to-image technology, Stability AI also offers models for code, text, and now audio. In September 2023, the company unveiled Stable Audio, a text-to-audio generative AI tool. The follow-up, Stable Audio 2.0, released on April 3, 2024, improved audio clarity and extended the length of generated tracks.

Unlike Stable Audio, which is available for general commercial use, Stable Audio Open focuses on producing shorter audio pieces, such as sound effects, rather than full songs. The model is not fully open source; instead, it is released under the Stability AI non-commercial research community agreement license, which restricts commercial use.

“Our goal with Stable Audio Open is to give audio researchers and producers hands-on access to one of our generative audio models to facilitate research, adoption, and creative exploration,” said Zach Evans, head of audio research at Stability AI.

What is Stable Audio Open?

Stable Audio Open specializes in creating drum beats, instrument riffs, ambient sounds, and other audio samples for music production and sound design. Unlike the commercial Stable Audio product, which generates coherent musical tracks up to three minutes long, Stable Audio Open focuses on producing high-quality audio clips lasting up to 47 seconds, driven by text prompts.
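To make the 47-second ceiling concrete, here is a minimal Python sketch of how a text prompt and a requested duration might be packaged before being handed to the model. The helper name build_conditioning is hypothetical, and the dictionary keys (prompt, seconds_start, seconds_total) are an assumption modeled on the conditioning format used in the Stable Audio Tools examples; check the library's documentation before relying on them. The sketch only prepares the request and does not load or run the model.

```python
# Maximum clip length for Stable Audio Open, as stated in the article.
MAX_SECONDS = 47


def build_conditioning(prompt: str, seconds_total: float) -> list[dict]:
    """Hypothetical helper: package a text prompt and a duration.

    The key names are an assumption based on the conditioning
    dictionaries shown in Stable Audio Tools examples.
    """
    cleaned = prompt.strip()
    if not cleaned:
        raise ValueError("prompt must not be empty")
    if seconds_total <= 0:
        raise ValueError("seconds_total must be positive")

    # Requests longer than the model's ceiling are clamped, not rejected.
    clamped = min(seconds_total, MAX_SECONDS)
    return [{
        "prompt": cleaned,
        "seconds_start": 0,
        "seconds_total": clamped,
    }]


if __name__ == "__main__":
    # A drum-loop request well under the ceiling is passed through as-is.
    print(build_conditioning("128 BPM tech house drum loop", 30))
    # A three-minute request is clamped to the 47-second limit.
    print(build_conditioning("ambient rain on a tin roof", 180))
```

In this sketch, a request for a three-minute piece is silently shortened to 47 seconds, which mirrors the difference between the open model and the commercial Stable Audio product described above.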

Stability AI says it prioritizes responsible training practices: the model was trained on audio data from FreeSound and the Free Music Archive so that no copyrighted material was used without permission.

Fine-Tuning for Creative Freedom

A significant advantage of Stable Audio Open is its fine-tuning capability, allowing users to customize the model with their audio data. For example, drummers can refine the model using their drum recordings to generate unique beats.

Fine-tuning is done with the Stable Audio Tools library, which is released under an open-source license. The model weights are available on Hugging Face.

“The audio research team is continuously working to enhance the quality and control of our generative audio models,” Evans added. “We anticipate future commercial and open model releases that reflect our research advancements.”
