Meta's Fundamental AI Research (FAIR) team is unveiling several new AI models and tools for researchers, focusing on audio generation, text-to-vision capabilities, and watermarking technologies.
“By sharing our early research publicly, we aspire to inspire innovation and advance AI in a responsible manner,” the company stated in a press release.
Audio Creation Model: JASCO and Watermarking Tools
Meta is introducing JASCO, short for Joint Audio and Symbolic Conditioning for Temporally Controlled Text-to-Music Generation. The model accepts symbolic inputs, such as chords or beats, alongside a text prompt, giving users finer control over the final output. According to FAIR's research, JASCO lets users shape characteristics of the generated audio, including chords, drums, and melody, through text, making it easier to arrive at the desired sound.
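To make "temporally controlled" conditioning concrete, here is a minimal, self-contained toy (not JASCO's actual API; the function and labels are hypothetical) that expands a symbolic chord timeline into per-frame labels, the kind of time-aligned signal a model like JASCO can condition on alongside a text prompt:

```python
# Toy sketch: turn [(chord, start_sec, end_sec), ...] into one chord
# label per model frame, so generation can be steered over time.
def chords_to_frames(timeline, duration, frame_rate):
    """Map a chord timeline to a per-frame conditioning sequence."""
    n_frames = int(duration * frame_rate)
    frames = ["N"] * n_frames  # "N" = no chord active
    for chord, start, end in timeline:
        lo = int(start * frame_rate)
        hi = min(int(end * frame_rate), n_frames)
        for i in range(lo, hi):
            frames[i] = chord
    return frames

# A 4-second clip at 2 frames/sec: C major for 2 s, then A minor.
labels = chords_to_frames([("C", 0.0, 2.0), ("Am", 2.0, 4.0)], 4.0, 2)
print(labels)  # ['C', 'C', 'C', 'C', 'Am', 'Am', 'Am', 'Am']
```

The point is only that symbolic inputs are resolved to the model's time axis, which is what lets a user say where in the piece a chord or beat should occur.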
FAIR will release the JASCO inference code as part of its AudioCraft AI audio model library under an MIT license, while the pre-trained model will be available under a non-commercial Creative Commons license. Additionally, Meta is launching AudioSeal, a tool that watermarks AI-generated speech so that such content can be identified more reliably.
Meta asserts, “AudioSeal is the first audio watermarking technique designed specifically for localized detection of AI-generated speech, enabling the identification of AI-created segments within longer audio files.” The company says this localized approach is also far more efficient, reportedly detecting watermarks up to 485 times faster than traditional methods. Unlike the other releases, AudioSeal will be made available under a commercial license.
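The idea behind localized detection can be illustrated with a small self-contained toy (this is not AudioSeal's actual method, just a correlation-based sketch): embed a low-amplitude pseudorandom watermark in one segment of a signal, then slide the key over the signal to find where the watermark lives.

```python
import numpy as np

# Toy sketch of localized watermark detection (illustrative only).
rng = np.random.default_rng(0)
n, seg_start, seg_len = 4000, 1000, 800
signal = rng.normal(0.0, 1.0, n)               # stand-in for host audio
watermark = rng.choice([-1.0, 1.0], seg_len)   # pseudorandom key sequence
signal[seg_start:seg_start + seg_len] += 0.3 * watermark  # embed locally

# Correlate the key against the signal at each candidate offset; the
# correlation peaks where the watermarked segment actually sits.
step = 100
offsets = list(range(0, n - seg_len + 1, step))
scores = [abs(np.dot(signal[i:i + seg_len], watermark)) / seg_len
          for i in offsets]
best = offsets[max(range(len(scores)), key=scores.__getitem__)]
print(best)  # peak at offset 1000, where the watermark was embedded
```

Because the score is computed per offset rather than once for the whole file, the detector can flag an AI-generated span inside otherwise genuine audio, which is the property Meta is highlighting.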
Chameleon Model Release
FAIR is also planning to release two versions of its multimodal text model, Chameleon, under a research-only license. The Chameleon 7B and 34B models are designed for tasks that require visual and textual understanding, such as image captioning. However, Meta has announced that it will not make the Chameleon image generation model available at this time, limiting access to the text-related functionalities.
Furthermore, researchers will gain access to a multi-token prediction approach that trains language models to predict several future words at once, rather than one at a time. It will be released exclusively under a non-commercial, research-only license.
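A minimal way to see what changes is in how training targets are built. The toy below (not Meta's implementation; the helper is hypothetical) pairs each context with the next k tokens instead of just the next one, which a model with k output heads can then predict in parallel:

```python
# Toy sketch: training pairs for multi-token prediction. A standard LM
# would use k=1 (context -> single next token); here each context is
# paired with the next k tokens.
def multi_token_targets(tokens, context_len, k):
    """Return (context, next_k_tokens) training pairs."""
    pairs = []
    for i in range(context_len, len(tokens) - k + 1):
        pairs.append((tokens[i - context_len:i], tokens[i:i + k]))
    return pairs

toks = ["the", "cat", "sat", "on", "the", "mat"]
pairs = multi_token_targets(toks, context_len=2, k=2)
for ctx, tgt in pairs:
    print(ctx, "->", tgt)
# ['the', 'cat'] -> ['sat', 'on']
# ['cat', 'sat'] -> ['on', 'the']
# ['sat', 'on'] -> ['the', 'mat']
```

Each training step thus supervises several future positions at once, which is the sense in which the method trains on multiple future words "simultaneously rather than sequentially."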