Anthropic has launched a public beta of prompt caching for its API, letting developers reuse context across API calls instead of re-sending the same material in every prompt. The feature is currently available for Claude 3.5 Sonnet and Claude 3 Haiku, with support for the largest model, Claude 3 Opus, expected soon.
Prompt caching, outlined in a 2023 paper, enables users to save frequently used context during a session and reuse it across requests. Because that context no longer has to be re-sent at full price, developers can include far more background information and example responses without a proportional increase in cost, which is especially valuable when sending extensive context that must be referenced across multiple conversations. The same mechanism lets developers steer model responses more precisely with long in-prompt examples.
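In practice, a developer flags a reusable block of the prompt as cacheable and sends it once; later calls that begin with the same prefix hit the cache. The sketch below is a minimal illustration using the anthropic Python SDK with the beta header from Anthropic's documentation; the model string, file name, and system prompt are illustrative assumptions, not code from the announcement.

    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

    # Illustrative placeholder: a large document the app reuses on every call.
    LARGE_DOCUMENT = open("knowledge_base.txt").read()

    response = client.messages.create(
        model="claude-3-5-sonnet-20240620",
        max_tokens=1024,
        # Beta feature flag from the launch-era documentation.
        extra_headers={"anthropic-beta": "prompt-caching-2024-07-31"},
        system=[
            {"type": "text", "text": "You answer questions about the attached document."},
            {
                "type": "text",
                "text": LARGE_DOCUMENT,
                # Marks this block as cacheable: the first call pays the
                # higher cache-write rate; later calls that start with the
                # same prefix pay the discounted cache-read rate instead.
                "cache_control": {"type": "ephemeral"},
            },
        ],
        messages=[{"role": "user", "content": "Summarize the key points."}],
    )
    print(response.content[0].text)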
Early adopters have reported significant speed and cost improvements with prompt caching across a range of applications: embedding a comprehensive knowledge base in the prompt, including 100-shot examples, and carrying conversation history across turns.
Potential use cases for prompt caching include reducing costs and latency for lengthy instructions, streamlining document uploads for conversational agents, enhancing code autocompletion, and embedding entire documents within prompts.
Pricing for Cached Prompts
One of the primary advantages of caching prompts is the reduced cost per token. Anthropic indicates that using cached prompts is "significantly cheaper" than the standard input token price.
For Claude 3.5 Sonnet, writing a prompt to the cache costs $3.75 per million tokens (MTok), while reading a cached prompt costs only $0.30 per MTok. With a base input price of $3/MTok, the one-time 25% premium on the caching call buys a 10x discount on every subsequent use of that context.
For Claude 3 Haiku, the cost is $0.30/MTok to cache prompts and $0.03/MTok for using stored prompts. Although prompt caching isn't yet available for Claude 3 Opus, pricing has been announced: caching will cost $18.75/MTok, while accessing cached prompts will be $1.50/MTok.
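The break-even arithmetic behind those numbers is straightforward. Here is a back-of-the-envelope sketch using the published Claude 3.5 Sonnet rates; the helper functions are illustrative, not part of any API:

    # Compare the cost of re-sending a large context versus caching it,
    # per MTok of cached content, at the published Claude 3.5 Sonnet rates.
    BASE_INPUT = 3.00    # $/MTok, standard input tokens
    CACHE_WRITE = 3.75   # $/MTok, first send that writes the cache (25% premium)
    CACHE_READ = 0.30    # $/MTok, each later read of the cached prefix (10% of base)

    def cost_without_cache(calls: int) -> float:
        # The full context is re-sent at the base rate on every call.
        return BASE_INPUT * calls

    def cost_with_cache(calls: int) -> float:
        # One cache write, then discounted reads for the remaining calls.
        return CACHE_WRITE + CACHE_READ * (calls - 1)

    for calls in (2, 10, 100):
        print(calls, cost_without_cache(calls), cost_with_cache(calls))
    # Caching already wins at two calls ($4.05 vs $6.00 per MTok of context)
    # and approaches the full 10x saving as reuse grows.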
Notably, as AI influencer Simon Willison pointed out, Anthropic’s cache has only a five-minute lifetime, with the clock refreshed each time the cached content is used.
Competitive Landscape
Anthropic is vying for a competitive edge in the AI field through aggressive pricing. Before releasing the Claude 3 family of models, the company cut token prices to stay competitive with rivals like Google and OpenAI, all of which are locked in a race to offer ever-lower prices to third-party developers.
A Highly Requested Feature
Prompt caching is not exclusive to Anthropic; other platforms offer it as well. Lamina, an LLM inference system, uses KV caching to reduce GPU costs, and a glance at OpenAI's developer forums turns up numerous requests for prompt caching.
It’s important to differentiate between cached prompts and large language model memory. OpenAI’s GPT-4o, for instance, offers a memory feature that retains user preferences and facts across sessions, but it does not store the literal prompts and responses the way prompt caching does.