Stability AI Launches Stable Audio Open: A New AI Model for Sound Creation
Stability AI, known for its AI image generator Stable Diffusion, has introduced Stable Audio Open, an open-source model for generating sounds and music. The company says the model was trained exclusively on royalty-free recordings.
What is Stable Audio Open?
Stable Audio Open allows users to input text descriptions (e.g., “Rock beat played in a treated studio, session drumming on an acoustic kit”) and produces a sound recording lasting up to 47 seconds. The model draws from approximately 486,000 samples sourced from free music repositories, including Freesound and the Free Music Archive.
Stability AI says the model can create drum beats, instrumental riffs, ambient sounds, and production elements for videos, films, and television shows. It can also “edit” existing tracks or apply the style of one song (such as smooth jazz) to another.
“A significant advantage of this open-source release is that users can refine the model with their own audio data,” Stability AI explained in a corporate blog post. “For instance, a drummer could customize it with samples of their recordings to generate new beats.”
Limitations of Stable Audio Open
However, Stable Audio Open does come with certain limitations. The model is not capable of generating full songs, melodies, or quality vocal performances, a shortcoming that Stability AI acknowledges. Users seeking those functionalities may want to consider the company’s premium service, Stable Audio.
Furthermore, the terms of service prohibit commercial use of Stable Audio Open. The model also performs unevenly across musical styles and cultures, and it handles descriptions in languages other than English poorly. Stability AI attributes these biases in part to the training data. The company stated, “The source of data is potentially lacking in diversity, meaning that not all cultures are equally represented in the dataset. Generated samples may reflect biases present in the training data.”
Amid Controversy
Stability AI has faced challenges in recent times, particularly after the resignation of its Vice President of Generative Audio, Ed Newton-Rex, who disagreed with the company’s assertion that training generative AI models on copyrighted works falls under “fair use.” The launch of Stable Audio Open can be seen as an effort to reshape this narrative, even as it discreetly promotes Stability AI's paid offerings.
As music generators grow more popular, copyright questions and the potential for misuse by some creators are coming under increased scrutiny. In May, Sony Music, which represents artists including Billy Joel, Doja Cat, and Lil Nas X, sent a letter to 700 AI companies warning against the “unauthorized use” of its content for training audio generators. In March, Tennessee enacted the first U.S. law aimed at curbing abuses of AI in the music industry.