OpenAI is reportedly facing a cash crunch, but that hasn’t slowed the company down. It continues to roll out new models and updates, including the recent introduction of GPT-4o Long Output, a variation of the original GPT-4o model released in May. The new model raises the maximum output size to 64,000 tokens, a 16-fold increase over the original's 4,000 tokens.
Tokens are the numerical representations of words, word fragments, and phrases that large language models (LLMs) process. For example, "Hello" and "hi" each count as a single token. Readers can explore tokens further through OpenAI's interactive Tokenizer or through Simon Willison’s token encoder/decoder.
With GPT-4o Long Output, OpenAI lets users and developers generate responses as long as 200 pages. The update responds to customer requests for a longer output context. An OpenAI spokesperson said, “We heard feedback from our customers that they’d like a longer output context. We are always testing new ways we can best serve our customers’ needs.” The alpha testing phase is expected to last a few weeks and will assess how well the enhanced capability meets user requirements.
The extended output is particularly useful for applications that need long, detailed results, such as code editing and writing improvement, where the extra room allows for more nuanced and comprehensive responses.
Context vs. Output
Since its launch, the original GPT-4o has provided a maximum context window of 128,000 tokens, which covers both input and output tokens. For the new GPT-4o Long Output, this context window remains unchanged. The jump to 64,000 output tokens matters because output tokens count against that shared window, so each model splits the 128,000 tokens differently:
- GPT-4o: up to 124,000 input tokens, with a maximum of 4,000 output tokens.
- GPT-4o Mini: up to 112,000 input tokens, with a maximum of 16,000 output tokens.
- GPT-4o Long Output: up to 64,000 input tokens when the full 64,000 output tokens are used, prioritizing longer responses while keeping the overall cap of 128,000 tokens.
This flexibility lets users trade input tokens for longer outputs when they need extensive answers.
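The token budgets listed above follow from simple subtraction against the shared 128,000-token window. The sketch below reproduces that arithmetic using only the limits quoted in this article. The dictionary keys are plain labels rather than official API model identifiers, and both helper functions are hypothetical names introduced for illustration.

```python
# Limits as quoted in the article. Keys are illustrative labels,
# not official API model identifiers.
CONTEXT_WINDOW = 128_000

MAX_OUTPUT_TOKENS = {
    "GPT-4o": 4_000,
    "GPT-4o Mini": 16_000,
    "GPT-4o Long Output": 64_000,
}


def max_input_at_full_output(model: str) -> int:
    """Input tokens left over when the model's full output allowance is reserved."""
    return CONTEXT_WINDOW - MAX_OUTPUT_TOKENS[model]


def available_output(model: str, input_tokens: int) -> int:
    """Output tokens still available after sending `input_tokens` of input."""
    if input_tokens < 0 or input_tokens > CONTEXT_WINDOW:
        raise ValueError("input_tokens must be between 0 and the context window")
    remaining = CONTEXT_WINDOW - input_tokens
    # The output is capped by both the model's limit and the space left in the window.
    return min(MAX_OUTPUT_TOKENS[model], remaining)


for name in MAX_OUTPUT_TOKENS:
    print(f"{name}: {max_input_at_full_output(name):,} input tokens at full output")
```

Under these assumptions, a 100,000-token prompt would still leave GPT-4o its full 4,000 output tokens, while GPT-4o Long Output would be limited to the 28,000 tokens remaining in the window.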
Pricing Structure
The pricing for the new GPT-4o Long Output model is set at:
- $6 per 1 million input tokens
- $18 per 1 million output tokens
In comparison, standard GPT-4o costs $5 per million input tokens and $15 per million output tokens, so the Long Output model carries a 20% premium on both. GPT-4o Mini costs $0.15 per million input tokens and $0.60 per million output tokens. This pricing reflects OpenAI's stated aim of keeping advanced AI affordable and accessible for developers.
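To see what these rates mean for a single request, the sketch below multiplies token counts by the per-million prices quoted in this article. The function name and the dictionary labels are hypothetical, and the rates are the alpha-period figures reported here, which may change.

```python
# (input price, output price) in USD per 1 million tokens, as quoted in the article.
# Keys are illustrative labels, not official API model identifiers.
PRICES_PER_MILLION = {
    "GPT-4o Long Output": (6.00, 18.00),
    "GPT-4o": (5.00, 15.00),
    "GPT-4o Mini": (0.15, 0.60),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in USD for one request at the quoted per-million rates."""
    input_price, output_price = PRICES_PER_MILLION[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# A request that uses the full 64,000 input and 64,000 output tokens.
cost = estimate_cost("GPT-4o Long Output", 64_000, 64_000)
print(f"Full-size Long Output request: ${cost:.2f}")
```

At these rates, a request that fills the entire window on GPT-4o Long Output comes to about $1.54, most of it from the output side.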
Currently, access to the experimental model is limited to a select group of trusted partners. An OpenAI spokesperson stated, “We’re conducting alpha testing with a few trusted partners to see if longer outputs help their use cases.” Depending on the feedback received, OpenAI may consider expanding access to a broader audience.
Future Directions
The ongoing alpha testing should show how useful the extended output capability is in practice. If initial partners respond positively, OpenAI may broaden access so that more users can take advantage of it.
With GPT-4o Long Output, OpenAI aims to meet a wider range of customer demands, particularly for applications that require long, detailed responses.