Google DeepMind Introduces 'Self-Discover' Framework to Enhance LLMs and Boost GPT-4 Performance

In an effort to enhance the reasoning capabilities of large language models (LLMs), researchers from Google DeepMind and the University of Southern California have introduced a groundbreaking "self-discover" prompting framework.

Published on arXiv and Hugging Face, this innovative approach surpasses existing prompting techniques and has demonstrated improvements in the performance of various models, including OpenAI's GPT-4 and Google's PaLM 2.

“Self-discover significantly boosts GPT-4 and PaLM 2's performance on demanding reasoning benchmarks, such as BigBench-Hard and MATH, by up to 32% compared to Chain of Thought (CoT) methodologies,” the researchers state in their paper.

The self-discover framework enables LLMs to autonomously identify task-specific reasoning structures for tackling problems effectively. By selecting and combining atomic reasoning modules, such as critical thinking and step-by-step reasoning, the models construct an explicit reasoning structure to follow during problem-solving.

One of the most compelling aspects of this approach is its efficiency: it requires 10 to 40 times less inference compute than inference-heavy prompting methods, making it highly advantageous for businesses.

Evolution of LLM Reasoning

LLMs have matured to tackle a wide variety of tasks, thanks to their capacity to follow instructions, reason, and generate coherent answers. Built on the transformer architecture, these models are steered with diverse prompting strategies drawn from cognitive theories of human reasoning and problem-solving. These include few-shot and zero-shot chain-of-thought prompting, decomposition of tasks into subproblems, and reflective step-back prompting to derive general principles.
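To make these strategies concrete, here is a minimal sketch of two of them; the call_llm helper and the example prompts are hypothetical stand-ins, not code from the paper.

```python
# Hypothetical helper: stands in for any chat-completion API call.
def call_llm(prompt: str) -> str:
    raise NotImplementedError("Wire this to your LLM provider of choice.")

task = "If a train travels 120 km in 1.5 hours, what is its average speed?"

# Zero-shot chain-of-thought: append a trigger phrase that elicits step-by-step reasoning.
cot_prompt = f"{task}\n\nLet's think step by step."

# Decomposition-style prompting: ask the model to split the task into subproblems first.
decompose_prompt = (
    "Break the following problem into smaller subproblems, solve each one, "
    f"then combine the results into a final answer:\n\n{task}"
)

# answer = call_llm(cot_prompt)
```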

While these methods, especially chain-of-thought, are effective, they often rely on implicit assumptions about how to approach a task. The researchers argue that this may not be optimal, as each task possesses a unique intrinsic structure that may benefit from a tailored technique.

With their latest research, the DeepMind and USC team propose a general prompting framework in which the model itself identifies a task's underlying structure and composes the most appropriate reasoning approach, while keeping inference costs low.

“Self-discover is modeled after how humans create internal reasoning programs for problem-solving. From a set of natural language atomic reasoning modules, such as ‘break down into sub-tasks’ and ‘critical thinking,’ the LLM composes a coherent reasoning structure intrinsic to the task in Stage 1, and then applies this structure in Stage 2 to solve specific instances of the task,” the researchers elaborate.
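As an illustration of that two-stage flow, here is a minimal sketch, assuming a hypothetical call_llm helper and an abbreviated module list; the paper's actual meta-prompts and its full set of reasoning modules differ.

```python
# Minimal sketch of the two-stage self-discover flow described above.
# `call_llm` is a hypothetical stand-in for any chat-completion API;
# the prompts and module list below are illustrative, not the paper's exact ones.

def call_llm(prompt: str) -> str:
    raise NotImplementedError("Wire this to your LLM provider of choice.")

# A few natural-language atomic reasoning modules (the paper uses a larger set).
REASONING_MODULES = [
    "Break the problem down into sub-tasks.",
    "Use critical thinking to question assumptions.",
    "Think step by step and verify each step.",
    "Consider whether the problem can be simplified or reframed.",
]

def self_discover_structure(task_examples: list[str]) -> str:
    """Stage 1: compose a task-specific reasoning structure from the atomic modules."""
    prompt = (
        "Given the reasoning modules below and a few example instances of a task, "
        "select the relevant modules, adapt them to the task, and write an explicit "
        "step-by-step reasoning structure for solving it.\n\n"
        "Modules:\n- " + "\n- ".join(REASONING_MODULES) + "\n\n"
        "Task examples:\n" + "\n".join(task_examples)
    )
    return call_llm(prompt)

def solve_with_structure(structure: str, instance: str) -> str:
    """Stage 2: apply the discovered structure to a specific task instance."""
    prompt = (
        "Follow this reasoning structure step by step to solve the problem, "
        "filling in each step before giving a final answer.\n\n"
        f"Reasoning structure:\n{structure}\n\nProblem:\n{instance}"
    )
    return call_llm(prompt)
```

Note that Stage 1 runs once per task rather than once per instance, so its cost is amortized across all instances, which helps explain the efficiency figures cited above.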

Remarkable Performance Gains

To evaluate the effectiveness of the new framework, the researchers tested it on multiple models, including GPT-4 and PaLM 2-L, across 25 reasoning tasks drawn from benchmarks including BigBench-Hard, Thinking for Doing (T4D), and MATH. The self-discover framework outperformed the chain-of-thought method on 21 of the 25 tasks, achieving performance gains of up to 32% while requiring 10 to 40 times less inference compute.

According to the results, when tested with GPT-4, the self-discover method achieved accuracies of 81%, 85%, and 73% on the BigBench-Hard, Thinking for Doing, and MATH tasks, respectively. In contrast, the chain-of-thought method yielded lower accuracies of 75%, 52%, and 71%. A similar performance gap was observed in comparisons with the plan-and-solve approach.

For PaLM 2-L, the accuracies achieved were 67%, 69%, and 50.5% across the three tasks, outperforming chain-of-thought (60%, 40%, and 42%) and plan-and-solve (61%, 42%, and 49%).

Advancing AI's Reasoning Capabilities

The self-discover prompting framework has the potential to revolutionize how LLMs approach problem-solving, bringing them closer to achieving general intelligence. Transferability studies indicate that the composed reasoning structures are broadly applicable across model types and share characteristics with human reasoning.

“Looking ahead, we are eager to continue exploring structured reasoning in LLMs to advance problem-solving capabilities and uncover new avenues for Human-AI collaboration,” the team concluded.
