Large Language Models (LLMs) have shown promise in tackling planning and reasoning tasks by exploring multiple candidate solutions. Nonetheless, current methods can be slow, computationally expensive, and prone to unreliable answers.
To address these challenges, researchers from Cornell University and IBM Research developed AutoToS, a technique that combines the planning capabilities of LLMs with the speed and reliability of rule-based search algorithms. AutoToS minimizes human intervention and significantly reduces the computational cost of solving planning problems, making it a practical option for LLM applications that require reasoned decision-making over large solution spaces.
Innovative Techniques for Planning
Interest in using LLMs for planning problems has surged, leading to the creation of various methods. Among the most effective, Tree of Thoughts uses the LLM itself as the search mechanism, validating candidate solutions and proposing corrections. However, these techniques face two critical challenges: a high volume of LLM calls, which can be costly, and no guarantees of “completeness” and “soundness.” Completeness ensures that a solution will eventually be found if one exists, while soundness guarantees that any solution returned is actually valid.
Thought of Search (ToS) proposes an alternative by having LLMs generate code for two pivotal components of a search algorithm: the successor function, which generates the candidate next states reachable from a given state, and the goal function, which checks whether a state satisfies the problem’s goal. This shifts the LLM’s role from performing the search to writing the code that performs it, sharply reducing LLM involvement during the search process.
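To make this concrete, here is a minimal sketch of what such LLM-generated components might look like for the 24 Game, one of the benchmarks discussed later, in which four numbers must be combined arithmetically to reach 24. The state representation and function names here are illustrative assumptions, not the authors’ actual generated code:

```python
from fractions import Fraction
from itertools import combinations

# Illustrative only: a plausible shape for LLM-generated search components
# for the 24 Game. A state is a sorted tuple of the numbers still in play,
# held as exact Fractions so that division never loses precision.

def successors(state):
    """Return every state reachable by combining two numbers with +, -, *, /."""
    results = []
    for i, j in combinations(range(len(state)), 2):
        a, b = state[i], state[j]
        rest = [state[k] for k in range(len(state)) if k not in (i, j)]
        candidates = {a + b, a * b, a - b, b - a}
        if b != 0:
            candidates.add(a / b)
        if a != 0:
            candidates.add(b / a)
        for value in candidates:
            results.append(tuple(sorted(rest + [value])))
    return results

def is_goal(state):
    """The goal: a single remaining number equal to 24."""
    return len(state) == 1 and state[0] == Fraction(24)
```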
Michael Katz, a principal research staff member at IBM Research, explains, “Historically, the planning community either manually coded these components for new problems or generated them from planning language descriptions, which were either hand-coded or learned from data. We aimed to use large language models to generate code for search components from textual problem descriptions.”
The original ToS technique yielded promising advancements in the soundness and completeness of search algorithms but required human experts for feedback on the generated code, creating a bottleneck that hampered the algorithm’s speed.
Automating the Process with AutoToS
To tackle this limitation, AutoToS automates the feedback and debugging process using unit tests and debugging statements, combined with few-shot and chain-of-thought (CoT) prompting techniques.
AutoToS operates in several steps. First, it supplies the LLM with a problem description and prompts it to generate code for the successor and goal functions. Next, unit tests exercise the goal function, and any failures are fed back to the model for revision. Once the goal function passes, the algorithm runs a limited breadth-first search to check the soundness and completeness of the successor function, iterating until both functions meet all criteria. Finally, the validated functions are plugged into a classic search algorithm, which executes the full search efficiently without further LLM calls.
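A stripped-down sketch of that loop might look like the following. The `llm_generate` function is a hypothetical placeholder for a call to any LLM API, and the two checkers are simplified stand-ins for the fuller battery of soundness and completeness tests described above:

```python
from collections import deque

def llm_generate(prompt):
    """Placeholder: send `prompt` to a language model, return Python source."""
    raise NotImplementedError("wire up an LLM client here")

def compile_component(source, name):
    """Execute the generated source and pull out the named function."""
    namespace = {}
    exec(source, namespace)  # in practice, sandbox untrusted generated code
    return namespace[name]

def goal_test_feedback(is_goal, goal_states, non_goal_states):
    """Unit-test the goal function; return human-readable failure messages."""
    feedback = [f"is_goal must accept goal state {s}"
                for s in goal_states if not is_goal(s)]
    feedback += [f"is_goal must reject non-goal state {s}"
                 for s in non_goal_states if is_goal(s)]
    return feedback

def bounded_bfs_feedback(successors, is_goal, start, max_states=10_000):
    """Run a limited BFS; complain if no goal is reachable within the budget."""
    seen, queue = {start}, deque([start])
    while queue and len(seen) < max_states:
        state = queue.popleft()
        if is_goal(state):
            return []
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return [f"limited BFS from {start} reached no goal; "
            "the successor function may be missing transitions"]

def auto_tos(problem_description, goal_states, non_goal_states, start,
             max_rounds=5):
    """Generate, test, and repair search components with LLM feedback rounds."""
    prompt = problem_description
    for _ in range(max_rounds):
        source = llm_generate(prompt)
        is_goal = compile_component(source, "is_goal")
        successors = compile_component(source, "successors")
        feedback = goal_test_feedback(is_goal, goal_states, non_goal_states)
        if not feedback:
            feedback = bounded_bfs_feedback(successors, is_goal, start)
        if not feedback:
            return successors, is_goal  # both components passed every check
        prompt = problem_description + "\nFix these issues:\n" + "\n".join(feedback)
    raise RuntimeError("failed to obtain valid search components")
```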
Evaluation of AutoToS
The researchers assessed AutoToS across several planning and reasoning tasks, including BlocksWorld, Mini Crossword, and the 24 Game, in which four integers must be combined with arithmetic operations to make 24 (for example, 4, 7, 8, 8 can yield (7 − 8 ÷ 8) × 4 = 24). They used a range of LLMs, including GPT-4o, Llama 2, and DeepSeek Coder, to analyze how performance varies with model size.
Their findings showed that AutoToS enabled all models to identify and fix errors in their code using the automated feedback. Larger models generally produced correct goal functions without feedback and needed only a few iterations to refine the successor function. Notably, GPT-4o-mini achieved strong accuracy despite its smaller size.
The researchers noted, “With just a few calls to the language model, we demonstrate that we can obtain the search components without direct human feedback, ensuring soundness, completeness, and nearly 100% accuracy across all models and domains.” AutoToS drastically reduces the number of LLM calls compared with previous approaches; for example, solving the 1,362 puzzles in the 24 Game dataset took roughly 100,000 calls to GPT-4 with earlier methods, whereas AutoToS needed only 2.2 calls on average.
Katz remarked, “With these components, we can employ the standard BFS algorithm to solve all 1,362 games in under 2 seconds with complete accuracy, something previous methods could not achieve.”
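That final step is a textbook breadth-first search. Reusing the illustrative `successors` and `is_goal` functions sketched earlier (again, an assumption-laden sketch rather than IBM’s implementation), solving a single 24 Game instance looks like this:

```python
from collections import deque
from fractions import Fraction

def bfs(start, successors, is_goal):
    """Plain breadth-first search returning the sequence of states to a goal."""
    queue = deque([(start, [start])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if is_goal(state):
            return path
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [nxt]))
    return None  # with a complete successor function, this means no solution exists

# One instance of the 24 Game, e.g. the numbers 4, 7, 8, 8.
start = tuple(sorted(Fraction(n) for n in (4, 7, 8, 8)))
print(bfs(start, successors, is_goal))  # a path of states ending in (Fraction(24, 1),)
```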
Implications for Enterprise Applications
AutoToS holds significant potential for enterprise contexts requiring planning solutions. By reducing LLM usage costs and reliance on manual input, it allows experts to focus on high-level planning and goal specifications.
Katz emphasizes, “We hope AutoToS will enhance both the development and deployment of planning-based solutions, using language models to create verifiable search components and speeding up development while circumventing issues typical with LLM deployment.”
ToS and AutoToS exemplify neuro-symbolic AI, a hybrid approach that merges deep learning and rule-based systems to tackle complex challenges. This approach is increasingly recognized as an effective direction to address the shortcomings of current AI systems.
“I have no doubt about the future role of hybrid systems in AI,” stated Harsha Kokel, research scientist at IBM. “Current language models can be viewed as hybrid systems since they perform search to determine the next tokens.”
While ToS and AutoToS show considerable promise, further exploration remains essential.
“It’s exciting to witness how planning with natural language evolves, and how LLMs can enhance the integration of planning tools in decision-making processes, paving the way for future intelligent agents,” Kokel and Katz concluded. “We are eager to explore how the world knowledge of LLMs can enrich planning and action in real-world situations.”