Presented by Intel
Generative AI has the potential to significantly boost human productivity, yet only a few organizations currently have the expertise and resources required to develop and train essential foundation models from the ground up. The challenges are twofold: First, collecting the necessary training data is increasingly difficult due to stringent intellectual property rights held by content owners. Second, the cost of training can be prohibitive. Nonetheless, the societal benefits of making generative AI technologies widely accessible are substantial.
So, how can small businesses or individual developers integrate generative AI into their applications? The solution lies in creating and deploying customized versions of existing foundation models.
Given the considerable investment involved in developing new generative AI models, they must be versatile enough to accommodate a variety of applications—much like the numerous ways in which GPT-based models are currently utilized. However, a general-purpose model may not adequately meet the specific needs of different domains. Employing a large, general-purpose model for a niche application can also lead to unnecessary consumption of computing resources, time, and energy.
Therefore, most enterprises and developers are best served by starting with a large generative AI model as their foundation, adapting it to fit their specific needs with far less development effort. This approach also offers infrastructure flexibility by utilizing available CPUs or AI accelerators, sidestepping issues related to GPU shortages. The key is to concentrate on the specific use case, narrowing the project’s scope while optimizing flexibility through open, standards-based software and widely available hardware.
Taking the Use Case Approach for AI Application Development
In software development, a use case outlines the target user’s characteristics, the problem to be solved, and how the application will achieve this. This definition determines product requirements, influences software architecture, and provides a roadmap for the product lifecycle. Most importantly, it clarifies what is outside the project’s scope.
For generative AI projects, establishing a use case can reduce the model's size, computational needs, and energy consumption, while enhancing accuracy by focusing on a specific dataset. This targeted approach leads to lower development efforts and costs.
The factors defining a use case for generative AI may vary by project, but several guiding questions can help:
- Data Requirements: What type and amount of training data are necessary and available? Is the data structured (data warehouse) or unstructured (data lake)? What restrictions apply? How will the application handle data—through batch processing or streaming? What is the frequency of model updates? Training large language models (LLMs) from scratch is time-consuming, so if real-time knowledge is vital for your application (e.g., healthcare), alternative approaches may be necessary to ensure up-to-date data.
- Model Requirements: Considerations such as model size, performance, and transparency of results are crucial when selecting the right model. LLMs range in size from billions to trillions of parameters: Meta's Llama 2 offers versions from 7 to 70 billion parameters, while OpenAI's GPT-4 is reported to have 1.76 trillion. Larger models generally yield higher performance, but smaller models may align better with your needs. Open models allow for deeper customization, whereas closed models offer off-the-shelf solutions with API access. Tailoring a model to your data can be important for applications needing traceability, such as generating summaries of financial statements for investors, while an off-the-shelf model might suffice for creative tasks like generating advertising copy.
- Application Requirements: Identify necessary standards for accuracy, latency, privacy, and safety. How many concurrent users should it support? How will users interact with the application? For instance, whether your model operates on a low-latency edge device or in a high-capacity cloud environment will significantly influence implementation decisions.
- Compute Requirements: Once the above factors are clarified, ascertain the necessary computing resources. Do you need to parallelize data processing using Modin* (see the sketch after this list)? Do your fine-tuning and inference requirements warrant a hybrid cloud-edge setup? Even if you have the talent and data to develop a generative AI model from scratch, evaluate whether your budget can support the required overhaul of your compute infrastructure.
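To make the data-processing question concrete, here is a minimal sketch of scaling pandas-style preprocessing with Modin. Modin is a drop-in replacement for pandas, so only the import changes; the file name and column names below are hypothetical, and a Ray or Dask backend is assumed to be installed.

```python
# A minimal sketch: parallelizing pandas-style preprocessing with Modin.
# Only the import differs from plain pandas. Assumes a Ray or Dask engine
# is installed; the CSV path and column names are hypothetical.
import modin.pandas as pd  # instead of: import pandas as pd

df = pd.read_csv("training_corpus.csv")  # reads in parallel across cores
df = df.dropna(subset=["text"])          # same pandas API as before
df["text"] = df["text"].str.lower()
df.to_parquet("preprocessed.parquet")
```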
These considerations will guide discussions to define and scope your project requirements. Financial aspects—covering data engineering, upfront development expenses, and the business model supporting inference costs—also dictate the strategies for data, training, and deployment.
How Intel's Generative AI Technologies Can Help
Intel offers heterogeneous AI hardware solutions tailored to diverse computing needs. To maximize your hardware's potential, Intel supplies optimized versions of popular data analysis and end-to-end AI tools. Recently, Intel introduced an optimized model that ranked #1 among 7B-parameter models on the Hugging Face open LLM leaderboard (as of November 2023). These resources, along with those from Intel's AI developer ecosystem, can meet your applications' accuracy, latency, and security demands. Start with the hundreds of pre-trained models optimized for Intel hardware that are available on Hugging Face and GitHub. You can preprocess data using Intel tools like Modin, fine-tune foundation models with tools such as Intel® Extension for Transformers or Hugging Face Optimum, and automate model tuning with SigOpt, all built on optimizations Intel has contributed to open-source AI frameworks including TensorFlow, PyTorch, and DeepSpeed.
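Many of these models load through the standard Hugging Face API. The sketch below assumes Intel's neural-chat-7b-v3-1 checkpoint (the article does not name the leaderboard model, so the model ID is an assumption) and uses ordinary transformers calls:

```python
# A minimal sketch: loading an Intel-published model from Hugging Face
# with the standard transformers API. The model ID is an assumption;
# substitute whichever checkpoint fits your use case.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Intel/neural-chat-7b-v3-1"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Briefly explain retrieval-augmented generation."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=120)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```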
Generative AI Use Case Examples
1. Customer Service: Chatbot Use Case
LLM-based chatbots enhance service efficiency by providing immediate responses to common inquiries, freeing representatives to tackle more complex issues. General-purpose LLMs can converse in many languages but may lack business-specific knowledge, or may confidently “hallucinate” information that has no basis in fact. Two adaptation techniques address this: fine-tuning, which incrementally updates the model's weights on business-specific data, and retrieval methods such as retrieval-augmented generation (RAG), which fetch relevant passages at query time from an external database built from business-specific documents. Both approaches yield context-specific responses and can run on readily available CPUs like Intel® Xeon® Scalable processors.
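To make the retrieval step concrete, here is a minimal RAG sketch. It assumes the sentence-transformers library for embeddings; the documents, embedding model, and prompt format are illustrative, and the assembled prompt would be passed to whatever LLM serves the chatbot:

```python
# A minimal RAG sketch: embed business documents once, retrieve the most
# relevant ones per question, and build a grounded prompt for the LLM.
# Documents, model choice, and prompt wording are illustrative only.
from sentence_transformers import SentenceTransformer, util

docs = [
    "Our return policy allows refunds within 30 days of purchase.",
    "Support hours are 9am-5pm EST, Monday through Friday.",
]
embedder = SentenceTransformer("all-MiniLM-L6-v2")
doc_embeddings = embedder.encode(docs, convert_to_tensor=True)

def build_prompt(question: str, top_k: int = 1) -> str:
    # Retrieve the documents most similar to the question, then ask the
    # model to answer using only that retrieved context.
    q_emb = embedder.encode(question, convert_to_tensor=True)
    hits = util.semantic_search(q_emb, doc_embeddings, top_k=top_k)[0]
    context = "\n".join(docs[h["corpus_id"]] for h in hits)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

print(build_prompt("Can I get a refund after two weeks?"))
```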
2. Retail: Virtual Try-On Use Case
Generative AI can offer immersive online shopping experiences, such as virtual try-ons, enhancing customer satisfaction and supply chain efficiency. This application is built on image generation, and its scope should be narrowed to the retailer's specific clothing line. Fine-tuning image models like Stable Diffusion may require only a limited number of product images and can be processed on CPU platforms. To safeguard customer privacy, customer images should be stored locally, possibly on consumers' own devices.
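As an illustration of the inference side, the sketch below runs a Stable Diffusion pipeline on CPU with the diffusers library. The checkpoint and prompt are illustrative; a production try-on system would first fine-tune the model (for example, with LoRA- or DreamBooth-style techniques) on the retailer's catalog images:

```python
# A minimal sketch: CPU inference with a Stable Diffusion pipeline via
# diffusers. The checkpoint and prompt are illustrative only.
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipe = pipe.to("cpu")  # runs on readily available CPUs, no GPU required

image = pipe(
    "a customer wearing a navy linen blazer from the spring collection",
    num_inference_steps=25,
).images[0]
image.save("virtual_try_on_concept.png")
```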
3. Healthcare: Patient Monitoring Use Case
Combining generative AI with real-time patient monitoring can generate personalized reports and action plans. This use case requires multimodal AI to process varied input types, such as vital-sign streams and clinical notes, and generate reports. Training models on healthcare data raises privacy concerns, necessitating that patient data remain with providers. Federated learning addresses this by training the model locally at each site, so sensitive data is never transferred. While local inference is ideal, hybrid solutions involving both edge and cloud components may be necessary, potentially requiring optimization techniques such as quantization to fit edge constraints.
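To show the mechanics, here is a conceptual federated averaging (FedAvg) sketch on a toy linear model. Real deployments would use a federated learning framework such as OpenFL or Flower with secure aggregation; everything below (the model, the synthetic data, and the provider count) is illustrative:

```python
# A conceptual FedAvg sketch on a toy linear-regression model. Each
# "provider" trains locally on its private data; only model weights are
# shared and averaged. All data here is synthetic and illustrative.
import numpy as np

def local_step(weights, X, y, lr=0.1):
    # One local gradient-descent step on a provider's private data.
    grad = 2 * X.T @ (X @ weights - y) / len(y)
    return weights - lr * grad

def federated_round(global_weights, providers):
    # Each site updates the shared model locally; the server averages
    # the resulting weights (FedAvg). Raw patient data never moves.
    updates = [local_step(global_weights.copy(), X, y) for X, y in providers]
    return np.mean(updates, axis=0)

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
providers = [
    (X, X @ true_w + rng.normal(scale=0.1, size=50))
    for X in (rng.normal(size=(50, 2)) for _ in range(3))
]

w = np.zeros(2)
for _ in range(100):
    w = federated_round(w, providers)
print(w)  # converges toward true_w without raw data leaving any site
```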
How to Get Started
Begin by defining your use case using the guiding questions above to clarify data, compute, model, and application requirements. Next, explore relevant foundation models, reference implementations, and community resources available in the AI ecosystem. Identify and employ the fine-tuning and optimization techniques best suited to your project.
Pinning down your compute needs can take time, and those needs often evolve throughout the project. Intel® Developer Cloud offers a variety of CPUs, GPUs, and AI accelerators to assist you as you start developing.
Finally, to ease the transition between different compute platforms during development and deployment, choose AI tools and frameworks that are open, standards-based, and capable of optimal performance across various devices without requiring extensive code rewrites.
Learn More: Intel AI software, Intel Developer Cloud, Intel AI Reference Kits, oneAPI for Unified Programming
Jack Erickson is Principal Product Marketing Manager, AI Software at Intel.
Chandan Damannagari is Director, AI Software, at Intel.