OpenAI's GPT-4: Now Capable of Understanding Text and Image Inputs

Following Google's recent Workspace AI announcement and just before Microsoft's Future of Work event, OpenAI has unveiled GPT-4, the latest version of its generative pre-trained transformer system. Unlike GPT-3.5, which powers the popular ChatGPT and is limited to text input, GPT-4 introduces a multi-modal capability that enables it to generate text based on image inputs as well.

Although OpenAI acknowledged that GPT-4 is "less capable than humans in many real-world scenarios," it demonstrated human-level performance on several academic and professional benchmarks. After six months of tuning based on user feedback, the new model has improved significantly: GPT-4 scored in the top 10% of test takers on a simulated Uniform Bar Exam, where GPT-3.5 scored around the bottom 10%, and it also performed strongly on exams such as the LSAT and GRE. Additionally, GPT-4 surpassed other leading large language models in a range of evaluations, outperforming them on measures of factuality, steerability, and adherence to guardrails.

GPT-4 is available through both ChatGPT and the API. ChatGPT access requires a ChatGPT Plus subscription and is subject to a usage cap, while API access is managed through a waitlist. According to OpenAI, GPT-4 is more reliable and creative than its predecessor and better at following nuanced instructions.

The new multi-modal input feature lets users submit a range of documents, such as marketing reports, textbooks, and shop manuals, and receive concise text summaries. The system can also tailor its outputs to a style and tone that developers prescribe in the 'system' message, offering far more flexibility than the fixed personality of its predecessor.
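As a rough sketch of what that steering looks like in practice, a developer might pin the assistant's tone in the system message before passing along the user's document. The payload below follows the chat-style message format (roles `system` and `user`); the helper function, model identifier, and summarization task are illustrative assumptions, not details from the announcement.

```python
import json

def build_summary_request(document_text: str, tone: str) -> dict:
    """Assemble a chat-style request whose system message fixes the
    assistant's tone and style before the user supplies the document."""
    return {
        "model": "gpt-4",  # assumed model identifier
        "messages": [
            {
                # The system message sets behavior for the whole exchange.
                "role": "system",
                "content": f"You are a summarizer. Respond in a {tone} tone, "
                           "in at most three sentences.",
            },
            # The user message carries the actual content to summarize.
            {"role": "user", "content": f"Summarize:\n{document_text}"},
        ],
    }

request = build_summary_request("Q3 marketing report: revenue rose 4%...",
                                "formal")
print(json.dumps(request, indent=2))
```

Changing only the `tone` argument changes the whole character of the response, which is the flexibility the system message is meant to provide.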

GPT-4 also produces roughly 40% fewer factual "hallucinations" than GPT-3.5 on OpenAI's internal evaluations and is 82% less likely to respond to requests for disallowed content. To harden the model, OpenAI engaged more than 50 experts in fields ranging from cybersecurity to international security to adversarially test it and mitigate inaccuracies. Even so, users are urged to exercise caution, particularly in high-stakes scenarios, and to apply thorough review protocols when relying on language model outputs.
