Can the Research Process Be Fully Automated?
An international machine-learning team is venturing into uncharted territory. Recent reports highlight a collaboration between Japan's Sakana AI and scientists in Canada and the UK that has produced an "AI Scientist" built on large language models. The system can autonomously navigate the entire research cycle, from reviewing literature and generating hypotheses to testing solutions and drafting papers. Sakana AI asserts that this comprehensive AI marks the dawn of a new era in scientific discovery; while it exhibits remarkable potential, caution is warranted regarding the risks of misuse.
Streamlining the Research Process
Advances in AI already let scientists use models for brainstorming and coding, but these models typically require substantial human oversight for specific tasks. The team, which includes experts from the University of British Columbia, developed what it calls the first "AI Scientist." In the idea-generation phase, the AI starts from a foundational template and brainstorms diverse research directions, checking that ideas are both novel and interesting. In the experimental-iteration phase, it runs experiments on the proposed ideas, plots the results, and annotates each chart. In the paper-writing phase, it mimics the established style of machine-learning conference papers, producing concise yet content-rich manuscripts with properly cited references. The team also built an automated "AI Reviewer," which evaluates the generated papers with accuracy comparable to that of human reviewers; this feedback loop allows the "AI Scientist" to improve its research outcomes continuously.
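The staged pipeline described above can be pictured as a simple control loop: generate ideas, experiment, write, have the automated reviewer score the draft, and revise until the score is acceptable. The following minimal Python sketch illustrates that control flow only; the `StubLLM` class and every function name in it (`generate_ideas`, `run_experiments`, `write_paper`, `review_paper`, `revise`) are hypothetical stand-ins invented for this example, not Sakana AI's actual API, and the review scores are fabricated.

```python
from dataclasses import dataclass, field

@dataclass
class StubLLM:
    """Hypothetical stand-in for the language-model backend.

    The real system prompts a frontier LLM at each stage; here each
    method returns canned values so the loop structure is visible.
    """
    scores: list = field(default_factory=lambda: [4, 5, 7])  # fabricated
    calls: int = 0

    def generate_ideas(self, template):
        return [f"idea from {template}"]

    def run_experiments(self, idea):
        return {"idea": idea, "metric": 0.9}

    def write_paper(self, idea, results):
        return f"paper({idea})"

    def review_paper(self, draft):
        score = self.scores[min(self.calls, len(self.scores) - 1)]
        self.calls += 1
        return score

    def revise(self, idea, draft, score):
        return idea + " (revised)"

def research_cycle(template, llm, max_rounds=3, accept_score=6):
    """Idea -> experiment -> paper -> automated review, looped until accepted."""
    papers = []
    for idea in llm.generate_ideas(template):   # brainstorm from a template
        draft = None
        for _ in range(max_rounds):
            results = llm.run_experiments(idea)     # run code, make charts
            draft = llm.write_paper(idea, results)  # conference-style draft
            score = llm.review_paper(draft)         # automated "AI Reviewer"
            if score >= accept_score:
                break                               # reviewer accepts
            idea = llm.revise(idea, draft, score)   # fold feedback back in
        papers.append(draft)
    return papers

papers = research_cycle("diffusion-model template", StubLLM())
print(papers[0])
```

The key structural point is the inner loop: the reviewer's score gates acceptance, so each rejected draft feeds back into the next round of idea revision.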
In initial demonstrations, the "AI Scientist" delved into subfields such as diffusion models, transformer architectures, and learning dynamics, producing ten papers at a cost of around $15 each. According to Jevin West, a computational social scientist at the University of Washington, the seamless execution of the entire research process by the "AI Scientist" is impressive and could hasten scientific discovery.
Far from Perfection
Despite its enormous potential, the "AI Scientist" is not without flaws. Sakana AI acknowledges that the system currently lacks visual capabilities, so it cannot catch or fix problems with the charts it generates: it sometimes produces unreadable graphs, tables that exceed page limits, or unappealing layouts. In other cases the AI conceives a sound idea but fails in execution, producing misleading results from improper comparisons. Serious errors can also arise in paper writing and in evaluating conclusions, particularly because it struggles to compare numerical values, a common limitation of large language models.
The research team has addressed some of these challenges by ensuring that all experimental results are reproducible and all execution files are stored systematically, and it anticipates that future multimodal models will enhance the "AI Scientist's" capabilities. For now, the system is restricted to machine-learning research and lacks crucial elements of the scientific process, such as hands-on experimentation. Tom Hope, a computer scientist at the Allen Institute for AI, points out that the language model still cannot propose and formulate truly innovative scientific directions. Gerbrand Ceder, a materials scientist at Lawrence Berkeley National Laboratory, believes that while the system cannot yet tackle more creative tasks, it can automate some repetitive aspects of the research process.
Sakana AI emphasizes that it remains uncertain whether such systems can generate genuinely transformative ideas, much less invent concepts like artificial neural networks or information theory.
Greater Power Requires Greater Caution
Researchers assert that extending the "AI Scientist's" capabilities into more abstract areas, such as pure mathematics, may require integrating technologies beyond language models. Solving mathematical problems, for instance, demands logical reasoning, a domain where most AI models still struggle. To address this, Google DeepMind developed AlphaGeometry, which couples a language model with a symbolic deduction engine to form a neuro-symbolic hybrid system. At the recent International Mathematical Olympiad, the upgraded AlphaGeometry2 solved one problem in just 19 seconds, significantly outpacing the human competitors.
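The division of labor in such a hybrid can be illustrated with a toy loop: a generative component proposes candidates, and a symbolic component verifies each one exactly before it is accepted. The sketch below is only an illustration of that control flow under invented assumptions (an enumerating proposer and an exact arithmetic check standing in for a geometry deduction engine); it is not DeepMind's AlphaGeometry.

```python
from itertools import count

def propose_candidates():
    """Stand-in for the neural proposer: stream candidate constructions.

    A real system would sample promising candidates from a language
    model; here we simply enumerate the positive integers.
    """
    yield from count(1)

def symbolic_check(candidate, target):
    """Stand-in for the symbolic engine: verify exactly, never guess."""
    return candidate * candidate == target  # is candidate an exact square root?

def hybrid_solve(target, budget=1000):
    """Proposer suggests, symbolic engine certifies; first verified wins."""
    for i, cand in enumerate(propose_candidates()):
        if i >= budget:
            return None          # search budget exhausted, no certified answer
        if symbolic_check(cand, target):
            return cand          # accepted only after exact verification

print(hybrid_solve(169))  # prints 13
```

The design point is that correctness rests entirely on the symbolic checker: the proposer may be creative and fallible, because nothing it suggests is accepted without exact verification.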
Experts believe the current iteration is only the beginning. The "AI Scientist" represents a foundational step in automating scientific research, playing a role akin to GPT-1's in the development of AI; as it iterates, it is expected to catalyze a scientific revolution comparable to what GPT-4 has achieved today. Like many new technologies, however, the "AI Scientist" carries potential risks, opening a "Pandora's box" of possible misuse. Its ability to automatically generate and submit papers, for example, could overwhelm peer reviewers, undermine the quality control of scientific research, and put pressure on academic publishing. There is also concern that the "AI Scientist" could be exploited to create dangerous viruses, posing tangible risks to society.