Recently, advanced multimodal models like GPT-4o and Sora have made significant strides in generative AI, prompting researchers and industry leaders to ask where the next major breakthrough in artificial intelligence will come from.
AI expert Andrew Ng from Stanford University has praised the potential of intelligent agents. In a recent blog post, he noted that "AI agent workflows will drive substantial progress in artificial intelligence this year." The recognition of AI agents' future potential continues to grow.
Ng has also open-sourced an AI agent designed for machine translation. He believes this development can significantly enhance traditional neural machine translation, highlighting its "tremendous potential yet to be fully realized." Alongside this, he introduced a demo of the translation agent, made available under the MIT License, enabling users to freely use, modify, and distribute the code for various purposes.
Preliminary tests conducted by his research team indicate that the open-sourced translation agent sometimes performs comparably to leading commercial solutions and sometimes underperforms them. Nonetheless, it provides a highly controllable translation experience: users can easily adjust prompts to specify tone (formal or informal), target regional variations (such as distinguishing between the Spanish spoken in Spain and in Latin America), and keep terminology consistent through glossaries. Ng sees significant potential for further enhancements, especially given the positive outcomes observed with its reflective workflow.
In this open-source project, Ng outlines the AI agent's translation workflow:
Using a Reflective Workflow for Intelligent Translation
The project is a Python demonstration of a reflective agent workflow for machine translation, consisting of three main steps:
1. Input a prompt instructing a large language model (LLM) to translate text from the source language to the target language.
2. Allow the LLM to reflect on the translation outcomes and propose improvements.
3. Implement these suggestions to enhance the translation.
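The three steps above can be sketched as a chain of three model calls. This is a minimal illustration, not the project's actual code: the function names, the prompt wording, and the `LLM` type (any function that takes a prompt string and returns the model's reply) are assumptions introduced here.

```python
from typing import Callable

# Hypothetical type: any function that sends a prompt to a model and returns its reply.
LLM = Callable[[str], str]


def initial_translation(llm: LLM, source_lang: str, target_lang: str, text: str) -> str:
    """Step 1: ask the model for a first-pass translation."""
    prompt = (
        f"Translate the following text from {source_lang} to {target_lang}. "
        f"Return only the translation.\n\n{text}"
    )
    return llm(prompt)


def reflect(llm: LLM, source_lang: str, target_lang: str, text: str, draft: str) -> str:
    """Step 2: ask the model to critique the draft and suggest improvements."""
    prompt = (
        f"You are reviewing a translation from {source_lang} to {target_lang}.\n"
        f"Source text:\n{text}\n\nDraft translation:\n{draft}\n\n"
        "List specific suggestions to improve accuracy, fluency, style, and terminology."
    )
    return llm(prompt)


def improve(llm: LLM, source_lang: str, target_lang: str, text: str,
            draft: str, suggestions: str) -> str:
    """Step 3: ask the model to rewrite the draft using the suggestions."""
    prompt = (
        f"Improve this translation from {source_lang} to {target_lang}.\n"
        f"Source text:\n{text}\n\nDraft translation:\n{draft}\n\n"
        f"Suggestions:\n{suggestions}\n\nReturn only the improved translation."
    )
    return llm(prompt)


def reflective_translate(llm: LLM, source_lang: str, target_lang: str, text: str) -> str:
    """Run translate, reflect, and improve in sequence and return the final text."""
    draft = initial_translation(llm, source_lang, target_lang, text)
    suggestions = reflect(llm, source_lang, target_lang, text, draft)
    return improve(llm, source_lang, target_lang, text, draft, suggestions)
```

Because the model is passed in as a plain function, the same sketch can be exercised with a stub during testing or wired to any LLM client.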
Customization Features
Leveraging the LLM as the core translation engine offers users a high degree of control. For instance, modifying the prompt can enable features that conventional machine translation (MT) systems struggle with:
- Adjusting the output's style, such as switching between formal and informal tones.
- Guiding the handling of idioms and special terms, like names and technical jargon. Including a glossary in the prompt helps keep translations of specific terms consistent (e.g., "open source" or "GPU").
- Adapting language use to specific regions or dialects, meeting the needs of the target audience. For example, Spanish varies significantly between Latin American countries and Spain.
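As a rough illustration of the options listed above, the sketch below assembles a prompt from optional tone, region, and glossary hints. The function name, its parameters, and the prompt wording are hypothetical and are not taken from the project.

```python
from typing import Mapping, Optional


def build_translation_prompt(
    source_lang: str,
    target_lang: str,
    text: str,
    tone: Optional[str] = None,
    country: Optional[str] = None,
    glossary: Optional[Mapping[str, str]] = None,
) -> str:
    """Assemble one translation prompt from optional style, region, and glossary hints."""
    lines = [f"Translate the following text from {source_lang} to {target_lang}."]
    if tone:
        lines.append(f"Use a {tone} tone.")
    if country:
        lines.append(f"Use the variety of {target_lang} spoken in {country}.")
    if glossary:
        lines.append("Translate these terms exactly as given:")
        for term, rendering in glossary.items():
            lines.append(f'- "{term}" -> "{rendering}"')
    lines.append("")  # blank line between the instructions and the source text
    lines.append(text)
    return "\n".join(lines)


prompt = build_translation_prompt(
    "English",
    "Spanish",
    "Our GPU cluster runs open source software.",
    tone="formal",
    country="Mexico",
    glossary={"open source": "código abierto", "GPU": "GPU"},
)
print(prompt)
```

Keeping every hint optional means the same function covers both a bare translation request and a tightly constrained one.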
Evaluating Translation Quality
Translation quality was evaluated using BLEU (Bilingual Evaluation Understudy) scores on traditional machine translation datasets. The evaluations show that the workflow is sometimes competitive with leading commercial products and sometimes worse. Occasionally, however, it achieves impressive results that surpass those of commercial alternatives.
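For readers unfamiliar with BLEU, the sketch below computes a simplified sentence-level score: clipped n-gram precisions up to 4-grams, combined by a geometric mean and scaled by a brevity penalty. It assumes whitespace tokenization, a single reference, and no smoothing, so it is not necessarily how the project's evaluations were run; published scores are normally computed at corpus level with a maintained tool such as sacreBLEU.

```python
import math
from collections import Counter
from typing import List


def ngrams(tokens: List[str], n: int) -> Counter:
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate: str, reference: str, max_n: int = 4) -> float:
    """Unsmoothed single-reference BLEU on whitespace tokens, in the range [0, 1]."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # without smoothing, a single zero precision zeroes the score
        log_precision_sum += math.log(overlap / total)
    # Brevity penalty: candidates shorter than the reference are scaled down.
    brevity = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return brevity * math.exp(log_precision_sum / max_n)
```

Because BLEU only measures n-gram overlap with a reference, a fluent translation that words things differently can still score low.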
The authors assert that this is merely the beginning for AI-driven translation, highlighting vast potential for improvement and encouraging open dialogue and experimentation within the research community.
Getting Started with the Translation Agent
To get started with the translation agent, follow these steps:
1. Installation: Ensure you have the Poetry package manager installed. Depending on your environment, this may involve running:
pip install poetry
Then, clone the repository and install the dependencies:
git clone https://github.com/andrewyng/translation-agent.git
cd translation-agent
poetry install
poetry shell  # activates the virtual environment
2. Environment Configuration: To run the workflow, create a .env file and insert your OpenAI API key, using .env.sample as a template.
3. Usage:
import translation_agent as ta
# Replace with the text you want to translate.
source_text = "Hello, world!"
source_lang, target_lang, country = "English", "Spanish", "Mexico"
translation = ta.translate(source_lang, target_lang, source_text, country)
print(translation)
Refer to examples/example_script.py for a sample script to get started.
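Before running the agent, it can help to confirm that the API key from step 2 is actually visible to the Python process. The sketch below is a hypothetical pre-flight check, not part of the project. It assumes the key is named OPENAI_API_KEY (the conventional variable for the OpenAI client; check .env.sample for the exact name) and that the .env values have already been loaded into the process environment.

```python
import os
from typing import Iterable, List, Mapping


def missing_settings(
    env: Mapping[str, str],
    required: Iterable[str] = ("OPENAI_API_KEY",),  # assumed name; see .env.sample
) -> List[str]:
    """Return the names of required settings that are absent or blank."""
    return [name for name in required if not env.get(name, "").strip()]


missing = missing_settings(os.environ)
print("Missing settings:", ", ".join(missing) if missing else "none")
```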
Future Development of the Translation Agent
Ng proposes several avenues for the open-source community to explore, aiming to maximize the translation agent's potential:
- Experiment with language models other than GPT-4-turbo, and with different hyperparameter choices, to compare translation performance across LLMs.
- Build glossaries efficiently for domain-specific terminology that might be unfamiliar to LLMs.
- Investigate how glossaries can be optimally integrated into prompts.
- Assess performance variations across different languages and explore strategies to improve results for specific source or target languages, particularly those with low resources.
- Conduct error analyses to identify where the agent performs well or poorly, both in specialized fields such as law or medicine and across different types of text.
- Develop improved assessment metrics that accurately evaluate translation quality and align with human preferences.
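Several of these directions come down to running the same test set through different systems and comparing scores. A minimal harness for that might look like the sketch below. The `Translator` and `Metric` types and the function name are hypothetical, and the metric is left pluggable so that BLEU or any other scoring function can be swapped in.

```python
from statistics import mean
from typing import Callable, Dict, List, Tuple

Translator = Callable[[str], str]      # hypothetical: source text -> translation
Metric = Callable[[str, str], float]   # hypothetical: (candidate, reference) -> score


def compare_translators(
    translators: Dict[str, Translator],
    test_set: List[Tuple[str, str]],
    metric: Metric,
) -> Dict[str, float]:
    """Return each translator's mean score over (source, reference) pairs.

    Raises statistics.StatisticsError if test_set is empty.
    """
    return {
        name: mean(metric(translate(source), reference) for source, reference in test_set)
        for name, translate in translators.items()
    }
```

Each entry in `translators` could wrap a different LLM, prompt, or hyperparameter setting, which keeps the comparison logic separate from how any one system is called.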
Although it is still in its early stages, Ng’s open-sourced translation agent shows promising results on machine translation datasets and paves the way for future advancements in AI agents.