Moemate’s AI Avatar Scans Your Entire Screen: Mixed Yet Fascinating Insights Unveiled

The decline of Cortana highlights a significant trend: many AI assistants from the past have failed to meet user expectations. As a result, innovation is on the rise. Amazon is developing a new large language model similar to OpenAI’s GPT-4 to enhance its Alexa voice assistant. Meanwhile, Google is preparing to “supercharge” Google Assistant with capabilities similar to its AI-powered chatbot, Bard.

This revolutionary shift is not confined to major tech companies; startups are also reimagining AI assistants to be more helpful and engaging. One standout example is Moemate, a versatile assistant compatible with macOS, Windows, and Linux. Designed with an anime-inspired avatar, Moemate leverages a combination of models, including GPT-4 and Anthropic’s Claude, to provide vocalized answers to user inquiries. The term “Moe” reflects a sense of cuteness often found in anime.

While this functionality is not new—platforms like ChatGPT, Bard, and Bing Chat offer similar features—Moemate distinguishes itself by going beyond text prompts. It can actively assess what is displayed on a PC screen.

However, there are legitimate privacy concerns. Webaverse, the company behind Moemate, claims to primarily store assistant logs and preferences locally. Still, its privacy policy suggests that data such as PC specifications and unique identifiers may be used for legal compliance and investigations. Allowing such software access to your screen and activity inevitably raises privacy risks.

Despite these concerns, my curiosity led me to install Moemate, currently in open beta, on my work Mac.

Moemate impresses as a free early-access product, offering extensive customization options. Users can personalize avatars, animations, synthetic voices, and even create custom character models that can be shared with others. The assistant’s “personality” is determined by a selected text-generating model, and I opted for ElevenLabs’ voice, which sounded the most natural.

To keep conversations on track and minimize erratic behavior, Moemate provides each avatar with a scripted bio at the conversation's start. For example:

"You will be acting as Nebula, a serene voyager personality, always exploring the cosmic realm of knowledge. Their calm demeanor captivates all who meet them while steering clear of strenuous debates. Nebula would rather ponder the mysteries of the universe."

Users can create and modify these bios, which adds room for creativity but also raises concerns about potential exploitation in the form of prompt injection attacks. Malicious users could easily create and share harmful avatars.

Moemate includes features beneficial for Twitch users, such as focusing the chat window and tracking subscriber counts. However, I was unable to test these functionalities during my exploration. The assistant is marketed as being capable of engaging users even when there are no chat messages, yet I remain skeptical of its efficiency in this regard.

For basic inquiries, Moemate is competent but not groundbreaking. Its effectiveness depends on the chosen text-generating model. For instance, Claude often identifies itself as Claude during interactions. Moemate can generate images using the open-source Stable Diffusion model, but given the plethora of image generation tools available, this feels somewhat redundant.

What truly stands out is its screen capture functionality. As described by Webaverse, Moemate "can see your screen." It interprets the content and allows you to ask questions about what you are currently viewing, eliminating the need for elaborate explanations.

Despite its limitations, I successfully utilized Moemate to summarize recipes and webpages without having to copy text. Once, I asked a question about the macOS System Settings, and it provided a detailed description of the tabs, adding context about the tab I was focused on—essentially functioning as a helpful background resource.

In another instance, with GPT-4 selected as the model, I asked Moemate to analyze my chaotic desktop, which was cluttered with dozens of Chrome tabs. The assistant fixated on Google Messages, even identifying specific contacts I frequently texted by name.

For gamers, Moemate could assist in gameplay decisions. In a demo video, it showcased its ability to recommend Dota 2 characters and suggest suitable weapons. However, its focus can sometimes be unpredictable, leading to odd responses or suggestions to step away from stressful topics.

Some of its commands are also clunky. For instance, while it can adjust its voice volume, it cannot alter the system-wide volume. It can perform web searches, but only occasionally yields useful results.

Considering it’s still in beta, there’s an acknowledgment of its experimental nature. Webaverse plans to add automation features for tasks like organizing spreadsheets and sending emails—potentially an alarming development.

Despite its quirks, Moemate offers an intriguing glimpse into the future of AI assistants. The concept of combining various types of media analysis—including text and images—could pave the way for more advanced PC assistants. It remains to be seen if future iterations, like Windows Copilot, will emulate Moemate's approach to enhancing productivity and streamlining workflows.

In summary, Moemate presents a fascinating, though somewhat flawed, vision of AI's future.

Most people like

Find AI tools in YBX