Unlock GenAI Models on Your PC with Nvidia’s Latest Tool

Nvidia, keen to drum up interest in its latest GPUs, has released a tool that lets owners of GeForce RTX 30 Series and 40 Series graphics cards run an offline AI chatbot on their Windows PCs.

Called Chat with RTX, the tool lets users customize a generative AI model along the lines of OpenAI’s ChatGPT, connecting it to documents, files, and notes that it can then query.

According to Nvidia’s blog post, “Instead of sifting through notes or stored content, users can simply type their questions.” Ask, for instance, “What restaurant did my partner recommend while we were in Las Vegas?” and Chat with RTX will scan the local files you designate and return the answer along with relevant context.
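Nvidia hasn’t published the retrieval internals, but the behavior it describes is essentially retrieval-augmented generation: find passages in local files that are relevant to the question, then hand them to the model as context. Here is a minimal, purely illustrative Python sketch of the retrieval half; the folder name and search terms are made up, and a real system would use embeddings rather than substring matching:

```python
from pathlib import Path

def naive_search(folder: str, terms: list[str], window: int = 120) -> list[str]:
    """Crude stand-in for retrieval: return snippets of text surrounding
    any occurrence of the query terms in .txt files under `folder`."""
    snippets = []
    for path in Path(folder).rglob("*.txt"):
        text = path.read_text(errors="ignore")
        lower = text.lower()
        for term in terms:
            hit = lower.find(term.lower())
            if hit != -1:
                start, end = max(0, hit - window), hit + window
                snippets.append(f"{path.name}: ...{text[start:end]}...")
    return snippets

# Hypothetical usage mirroring Nvidia's example query:
for snippet in naive_search("./my_notes", ["restaurant", "Las Vegas"]):
    print(snippet)
```

The retrieved snippets would then be passed to the model alongside the user’s question, which is what lets the answer come back “with relevant context.”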

The tool defaults to Mistral’s open-source AI model but also supports other text-based models, including Meta’s Llama 2. Users should budget for significant storage, however: between 50GB and 100GB, depending on the model or models selected.

Currently, Chat with RTX works with text, PDF, .doc, .docx, and .xml files. Pointing the application at a folder of compatible files loads them into the model’s fine-tuning dataset. The tool can also take the URL of a YouTube playlist and load transcriptions of its videos, letting the selected model query their content (a pattern sketched below).
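Chat with RTX handles this ingestion through its own interface, but the first step of any pipeline like it is simply gathering the compatible files. A short sketch, assuming the extension list above; the folder path and helper name are hypothetical, and YouTube transcript fetching is omitted:

```python
from pathlib import Path

# Extensions Chat with RTX reportedly accepts.
SUPPORTED = {".txt", ".pdf", ".doc", ".docx", ".xml"}

def collect_documents(folder: str) -> list[Path]:
    """Recursively gather every supported file under `folder`."""
    return [p for p in Path(folder).rglob("*")
            if p.is_file() and p.suffix.lower() in SUPPORTED]

docs = collect_documents("./my_notes")  # hypothetical folder
print(f"{len(docs)} compatible files found")
```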

Nvidia’s guide outlines some limitations worth keeping in mind.

Chat with RTX can’t retain context, so follow-up questions don’t build on earlier ones. Ask “What’s a common bird in North America?” and then “What are its colors?”, and Chat with RTX won’t know that the second question refers to the first.
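The failure mode is easy to see if you think of each question as an isolated prompt. A toy sketch, where `ask_model` is a placeholder rather than Nvidia’s actual API:

```python
def ask_model(prompt: str) -> str:
    """Placeholder for a call to the locally hosted model."""
    return f"<answer to: {prompt!r}>"

# Stateless querying, as Chat with RTX reportedly behaves: the second
# prompt arrives on its own, so "its" has nothing to refer to.
print(ask_model("What's a common bird in North America?"))
print(ask_model("What are its colors?"))

# A context-aware client would prepend the earlier exchange instead:
history: list[str] = []

def ask_with_history(question: str) -> str:
    prompt = "\n".join(history + [question])
    answer = ask_model(prompt)
    history.extend([question, answer])
    return answer
```

With the history in the prompt, a model can resolve the pronoun; without it, each query stands alone.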

Nvidia also notes that the accuracy of responses can vary based on several factors, such as how questions are phrased, the chosen model's performance, and the size of the fine-tuning dataset. Generally, asking for straightforward facts from a few documents will produce better results than requesting a summary of multiple documents. The quality of responses tends to improve significantly with larger datasets; thus, focusing Chat with RTX on extensive content about a specific topic can enhance response accuracy.

While Chat with RTX may be seen more as a novelty than a serious tool for production use, it represents a growing trend toward locally running AI models.

A recent report from the World Economic Forum predicts rapid growth in affordable devices that can run generative AI models offline, including PCs, smartphones, IoT devices, and networking equipment. The advantages of offline models are clear: they are inherently more private, since the data they process never leaves the device, and they offer lower latency than cloud-hosted alternatives.

However, the democratization of AI tools also raises concerns about misuse, with many models available that have been fine-tuned on harmful content found online. Nonetheless, supporters of applications like Chat with RTX assert that the advantages can outweigh the potential risks. It remains to be seen how this balance will unfold in practice.
