Meta is simplifying audio production for artists and sound designers with AudioCraft, its new open-source toolkit. AudioCraft bundles three AI models: AudioGen, which generates sound effects from text descriptions; MusicGen, which composes music from text descriptions; and EnCodec, a neural audio codec that compresses audio and lets the generative models produce higher-fidelity output with fewer artifacts.
With AudioCraft, musicians and sound designers can quickly use the pre-trained models or dive into the complete code and model weights for deeper exploration and customization. Because the project is open source, professionals and researchers can also train the models on their own datasets. Meta says its pre-trained models were trained only on public or Meta-owned audio, sidestepping copyright concerns.
Meta aims to make generative audio more straightforward and accessible, addressing the perception that sound has lagged behind visual and text-based AI. Many existing projects in the audio space tend to be complex and restrictive, but AudioCraft empowers creators to customize models and expand their creative possibilities.
AudioCraft is not the only text-to-audio AI tool available—Google's MusicLM arrived earlier—but it caters to a more technically minded audience and emphasizes research applications. The developers say they are focused on improving model performance and on better control methods to make the models more versatile.
Even in its current form, AudioCraft marks a shift in how AI shapes music production. While artists like Holly Herndon continue to blend the technology with personal creativity, the toolkit opens up new ways to generate backing tracks, samples, and more, smoothing the creative process rather than replacing it.