Facebook's Latest AI Breakthrough: Direct Language Translation Made Easy

Updated on October 19 2020

Facebook's new machine translation (MT) system, M2M-100, translates directly between any pair of 100 languages without using English as an intermediary. It can translate Chinese to French, for example, in a single step. Traditional systems often pivot through English, an extra step that can lose meaning and reduce accuracy.

Angela Fan, a Facebook AI research associate, emphasized the importance of catering to the diverse linguistic needs of users worldwide. With two-thirds of daily posts on Facebook’s platform originating in languages other than English, the demand for a more effective translation solution is clear.

The M2M-100 model was trained on a data set of 7.5 billion sentences across 100 languages and has more than 15 billion parameters. This universal model captures nuances and relationships between related languages, which improves translation quality. The initiative builds on years of research and data collection from various sources.

To gather data, Facebook used CommonCrawl, a repository of web crawl data, alongside fastText for language classification. Sorting large volumes of text by language in this way is the first step toward identifying pairs of sentences that are translations of each other. Relying on human translators instead is often impractical because it is hard to find people fluent in both languages of a less common pair.

To efficiently create translation data on a large scale, Fan's team applied the LASER system, which generates mathematical representations of sentences. This enables alignment between similar sentences in different languages, facilitating accurate translation mapping.
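The alignment idea can be illustrated with a minimal sketch. It assumes each sentence already has an embedding vector and matches each source sentence to its nearest target sentence by cosine similarity. The vectors below are hand-made toy values, not LASER output, and the function names and the fixed threshold are illustrative assumptions; the article does not describe the scoring rule the team actually used.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def mine_pairs(src, tgt, threshold=0.9):
    """Pair each source sentence with its most similar target sentence.

    src and tgt map sentence text to an embedding vector. A pair is kept
    only if its similarity reaches the (hypothetical) threshold.
    """
    pairs = []
    for s_text, s_vec in src.items():
        best_text, best_score = None, -1.0
        for t_text, t_vec in tgt.items():
            score = cosine(s_vec, t_vec)
            if score > best_score:
                best_text, best_score = t_text, score
        if best_score >= threshold:
            pairs.append((s_text, best_text, best_score))
    return pairs

# Toy embeddings, invented for illustration only.
zh = {"你好": [0.9, 0.1, 0.0], "谢谢": [0.0, 0.2, 0.95]}
fr = {
    "bonjour": [0.88, 0.15, 0.02],
    "merci": [0.05, 0.18, 0.9],
    "fromage": [0.1, 0.9, 0.1],
}

pairs = mine_pairs(zh, fr)
for s_text, t_text, score in pairs:
    print(s_text, "<->", t_text, round(score, 3))
```

In this toy setup the unrelated sentence "fromage" is never selected, because its vector points in a different direction from both Chinese sentences.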

Where translated text is scarce for certain languages, the team incorporated monolingual data. For instance, to improve Chinese-to-French translation, they took high-quality French text and back-translated it into Chinese. The resulting synthetic sentence pairs supplement the training data and improve the model's accuracy.
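Back-translation can be sketched in a few lines. The example assumes a reverse-direction translator is available; here it is a hypothetical lookup table (`toy_fr_to_zh`) standing in for a real French-to-Chinese model, and none of the names come from Facebook's code.

```python
def back_translate(target_sentences, reverse_translate):
    """Build synthetic (source, target) pairs from monolingual target text.

    Each target-language sentence is translated back into the source
    language, and the machine-made source is paired with the original
    human-written target.
    """
    pairs = []
    for tgt in target_sentences:
        synthetic_src = reverse_translate(tgt)
        pairs.append((synthetic_src, tgt))
    return pairs

# Hypothetical stand-in for a French-to-Chinese model: a tiny lookup table.
TOY_FR_TO_ZH = {"bonjour": "你好", "merci": "谢谢"}

def toy_fr_to_zh(sentence):
    return TOY_FR_TO_ZH[sentence]

synthetic = back_translate(["bonjour", "merci"], toy_fr_to_zh)
print(synthetic)
```

The human-written French stays on the target side, so a Chinese-to-French model trained on these pairs still learns to produce natural French even though its Chinese input is machine-generated.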

While the M2M-100 model lays a strong foundation for language translation, challenges remain for low-resource languages. Fan points out that while progress has been made with languages like Swahili and Afrikaans, more work is needed for languages such as Zulu.

Facebook plans to release the M2M-100 data set, model, training methodologies, and evaluation setups as open source, fostering further advancements in translation technology. The company aims to integrate this innovative system into its daily operations, contributing to a more connected global community.