As anticipated, Grok-2, a new large language model (LLM) from xAI, the Elon Musk company that is a sister to the social network X, launched last night within the X mobile app. Available through the Premium ($7/month) and Premium+ ($14/month, ad-free) subscription tiers, it comes in two sizes: the full Grok-2 and Grok-2 mini. Grok-2 excels in diverse tasks such as chat, coding, reasoning, and vision applications, while Grok-2 mini is a compact, efficient version designed for faster responses to simpler text prompts.
Grok-2’s standout feature is its image generation capability, developed in partnership with Black Forest Labs and its impressive open-source diffusion AI model, Flux.1. Remarkably, Grok-2 surpasses AI models from major competitors, including OpenAI (GPT-4o), Anthropic (Claude 3.5 Sonnet), and Google (Gemini 1.5 Pro), in third-party benchmark tests.
Notably, Grok-2 and Grok-2 mini excelled across significant benchmarks such as GPQA, MMLU, MMLU-Pro, MATH, HumanEval, MMMU, MathVista, and DocVQA. The LMSYS Chatbot Arena community, where companies often test AI models under pseudonyms, congratulated xAI on this achievement.
AI influencer Ethan Mollick from the University of Pennsylvania noted the emergence of five models in the GPT-4 class: GPT-4o, Claude 3.5, Gemini 1.5, Llama 3.1, and Grok-2. Musk celebrated his "hardworking xAI team!" on the platform.
While Grok-2 performs robustly across various benchmarks, it is the integration with Flux.1 for image generation that has captured the most attention. Before Grok-2's release, Flux.1 was already renowned in AI art circles for producing strikingly photorealistic images. The integration lets users generate images from text prompts, much as OpenAI integrated DALL-E 3 into ChatGPT. Users of Grok-2 have reported that the image generation feature is notably permissive, producing controversial images of public figures like U.S. presidential candidates Kamala Harris and Donald Trump.
Other leading image generators such as Midjourney and DALL-E 3 restrict controversial content, whereas Grok-2's more liberal approach aligns with Musk's "free speech" philosophy for X. However, this raises concerns about the potential for deepfakes and misinformation online. As user @Omiron33 expressed, "Yes, we’ve had MJ and Flux, but this is the first to make it usable and quick. Advertising, propaganda, and everything good or bad that comes with that just happened (IMO, the good outweighs the bad)."