Seattle-based OctoAI has launched OctoStack, a platform that lets enterprises deploy private generative AI models. The turn-key production solution runs in a virtual private cloud or on-premises and provides optimized inference, model customization, and asset management. OctoAI aims to give companies the flexibility to develop and operate generative AI applications tailored to their specific needs.
“Building viable and future-proof generative AI applications requires more than just affordable cloud inference,” said Luis Ceze, CEO of OctoAI. “Hardware portability, model onboarding, fine-tuning, and load-balancing are full-stack challenges that demand comprehensive solutions.”
OctoStack supports the fine-tuning and deployment of various open-source and commercial AI models, including Meta’s Llama family, Mistral’s Mixtral 8x7B, and Stable Diffusion. However, it does not include Anthropic’s Claude models, which are available only in the cloud. “We provide highly capable open-source models that clients can fully control and customize,” Ceze added.
Fully Managed vs. Self-Managed Solutions
This release follows OctoAI's earlier offering, a fully managed, self-optimizing infrastructure service. OctoStack, by contrast, is a self-managed solution. Ceze said demand for private deployments grew as client usage reached “billions of tokens per day.” He likened it to hosting a blog on a private server instead of a shared platform, underscoring the importance of data control for enterprises.
“As companies increasingly embrace AI, they become concerned about sending data via APIs outside their jurisdiction,” Ceze explained. “OctoStack allows clients to select and customize their models while providing a completely private API. We manage the infrastructure to ensure models are reliable and efficient across their GPUs.”
While hundreds of clients use OctoAI’s fully managed solution, Ceze did not disclose how many use OctoStack. He named companies already experimenting with generative AI tools, such as Apate.ai, Otherside AI, Latitude Games, and CapitalAI, as prime targets for the offering.
Growth Potential for Generative AI in Enterprises
The enterprise market presents significant opportunities for generative AI adoption. A report by Menlo Ventures found that enterprises spent $400 billion on cloud software last year, with AI accounting for $70 billion (about 18%) and generative AI for only $2.5 billion, less than 1% of the total.
“Current usage and interest in generative AI among enterprises is high, with over half of CIOs planning formal deployment,” noted Hyoun Park, CEO of Amalgam Insights. “However, the capabilities for model customization and fine-tuning remain limited.”
Ray Wang, founder of Constellation Research, observed that many organizations are optimizing for a multi-vendor landscape without a pure generative AI stack. He views OctoStack positively because it centralizes these capabilities and simplifies deployment.
OctoAI faces competition from startups and established players such as Nvidia, Databricks, and SambaNova Systems. However, Ceze remains confident in OctoAI's position. “This is a dynamic space, and while competition will intensify, our unique focus on cross-technology optimizations sets us apart. That’s the essence of our company’s foundation.”