Researchers have identified a significant drawback in the evolution of advanced chatbots. While AI models become more accurate over time, they also grow more willing to answer questions outside their expertise rather than admit uncertainty. Users then take these confident yet incorrect answers at face value, perpetuating a cycle of misinformation. “They are answering almost everything these days,” says José Hernández-Orallo, a professor at the Universitat Politècnica de València, Spain. “That means more correct answers, but also more incorrect ones.”
Hernández-Orallo, who led the study with colleagues at the Valencian Research Institute for Artificial Intelligence, explored three families of large language models (LLMs): OpenAI’s GPT series, Meta’s LLaMA, and the open-source BLOOM. The team studied a range of models within each family, from the relatively basic GPT-3 ada through the more advanced GPT-4, released in March 2023. Notably, the latest versions, GPT-4o and o1-preview, were not included in the analysis.
The researchers assessed each model with thousands of questions across various topics including arithmetic, geography, and science, as well as tasks like alphabetizing lists. They categorized prompts by their perceived difficulty. The results revealed that as the models advanced, the frequency of incorrect answers increased, indicating that more sophisticated chatbots resemble overconfident professors who believe they hold the answers to every query.
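The study’s real harness is far more elaborate, but its bookkeeping is simple to sketch. In the toy Python below, the `ask_model` callable, the sample questions, and the exact-match grading are illustrative assumptions rather than the researchers’ code; the point is the three-way tally of correct, incorrect, and avoidant replies per difficulty bin, which is what exposes wrong answers crowding out honest refusals as models scale.

```python
from collections import Counter

def evaluate(ask_model, benchmark):
    """Tally correct, incorrect, and avoidant replies per difficulty bin.

    ask_model: callable mapping a question string to the model's reply.
    benchmark: (question, reference_answer, difficulty) tuples, with
               difficulty assigned in advance, as in the study.
    """
    tallies = {}
    for question, reference, difficulty in benchmark:
        reply = ask_model(question)
        if reply is None or "don't know" in reply.lower():
            outcome = "avoidant"    # the model declined to answer
        elif reply.strip().lower() == reference.lower():
            outcome = "correct"
        else:
            outcome = "incorrect"   # confident but wrong
        tallies.setdefault(difficulty, Counter())[outcome] += 1
    return tallies

# Toy run with an invented model that, like the systems studied, never declines.
answers = {"What is 7 * 8?": "56",
           "What is the capital of Australia?": "Sydney",
           "What is the smallest prime greater than 100?": "103"}
benchmark = [("What is 7 * 8?", "56", "easy"),
             ("What is the capital of Australia?", "Canberra", "easy"),
             ("What is the smallest prime greater than 100?", "101", "hard")]
print(evaluate(answers.get, benchmark))
# {'easy': Counter({'correct': 1, 'incorrect': 1}), 'hard': Counter({'incorrect': 1})}
```

Run on these toy questions, the overconfident stand-in never lands in the avoidant bucket, which is the pattern the team observed growing stronger in the most advanced systems.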
Human interaction further complicates the problem. Volunteers tasked with evaluating the accuracy of the AI outputs often misclassified wrong answers as correct, with error rates ranging from 10 to 40 percent. Hernández-Orallo concluded, “Humans are not able to supervise these models effectively.”
To mitigate this issue, the research team suggests that AI developers focus on enhancing performance on easier tasks and program chatbots to decline more complex inquiries. “We need people to recognize: ‘I can use it in this area, and I shouldn’t use it in that area,’” Hernández-Orallo added.
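What might such a guardrail look like in practice? One simple shape is a wrapper that scores each question’s difficulty before answering and refuses anything above a cutoff. In the sketch below, the `estimate_difficulty` scorer is a hypothetical placeholder (a real system might use a calibrated classifier or the model’s own uncertainty signals); the refusal logic is the whole idea.

```python
def guarded_answer(question, ask_model, estimate_difficulty, cutoff=0.6):
    """Answer only questions judged easy enough; decline the rest.

    estimate_difficulty is assumed to return a score from 0.0 (trivial)
    to 1.0 (very hard). Building a trustworthy estimator is exactly the
    hard part the researchers leave open.
    """
    if estimate_difficulty(question) > cutoff:
        return "I am not confident I can answer that correctly."
    return ask_model(question)

# Toy demo: a crude stand-in estimator that treats longer questions as harder.
ask_model = lambda q: "56" if q == "What is 7 * 8?" else "(some answer)"
difficulty_by_length = lambda q: min(len(q) / 80, 1.0)

print(guarded_answer("What is 7 * 8?", ask_model, difficulty_by_length))
# -> 56
print(guarded_answer(
    "Derive the asymptotic distribution of the maximum-likelihood estimator.",
    ask_model, difficulty_by_length))
# -> I am not confident I can answer that correctly.
```

The cutoff makes the trade-off explicit: lower it and the chatbot looks less capable but errs less often, which is precisely the tension the next paragraph turns to.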
While this is a prudent suggestion, there may be little incentive for AI companies to adopt it. Chatbots that frequently admit to not knowing answers might seem less advanced or valuable, resulting in decreased usage and revenue for the developers. Consequently, we continue to see disclaimers indicating that “ChatGPT can make mistakes” or that “Gemini may display inaccurate information.”
Ultimately, it becomes our responsibility to scrutinize and verify the answers provided by chatbots to avoid disseminating incorrect information that could cause harm. For the sake of accuracy, always fact-check your chatbot’s responses.