True to his word, billionaire entrepreneur Elon Musk’s startup, xAI, has released its first large language model (LLM), Grok, as open source.
The release, which Musk had promised would happen this week, gives entrepreneurs, programmers, companies, and individuals access to Grok’s weights (essentially the strengths of the connections between its artificial “neurons”) along with related documentation, letting them use the model for a wide range of purposes, including commercial applications.
“We are releasing the base model weights and network architecture of Grok-1, our large language model,” the company revealed in a blog post. “Grok-1 is a 314 billion parameter Mixture-of-Experts model trained from scratch by xAI.”
Tech enthusiasts can download Grok's code from its GitHub page or via a torrent link, with Hugging Face providing an expedited download option.
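For those going the Hugging Face route, here is a minimal sketch of what pulling the checkpoint might look like, assuming the weights are published under an xAI-managed repository on the Hugging Face Hub (the exact repository id is an assumption and should be confirmed against xAI’s GitHub page):

```python
# Hypothetical download sketch: fetch the Grok-1 checkpoint from the
# Hugging Face Hub. The repo_id below is an assumption; check xAI's
# GitHub page for the canonical source.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="xai-org/grok-1",      # assumed repository id
    local_dir="./grok-1-weights",  # destination for the checkpoint files
)
print(f"Checkpoint downloaded to {local_dir}")
```

The checkpoint is large (on the order of hundreds of gigabytes), so the torrent link may be the more practical option for many users.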
What Grok’s Open Sourcing Means
In machine learning, parameters are the weights and biases that determine a model's behavior; as a rule of thumb, more parameters mean a larger and potentially more capable model. With 314 billion parameters, Grok-1 is far larger than open-source rivals such as Meta's Llama 2 (70 billion) and Mistral's Mixtral 8x7B (about 47 billion in total, roughly 13 billion of which are active per token).
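To make the Mixture-of-Experts idea concrete, here is a minimal top-2 routing sketch in JAX. It illustrates the general technique, not Grok-1's actual routing code; the expert count and all dimensions are toy values chosen for the example.

```python
import jax
import jax.numpy as jnp

def moe_layer(x, gate_w, expert_ws, top_k=2):
    """Generic Mixture-of-Experts feedforward: route each token to its
    top-k experts and combine their outputs, weighted by the gate.
    Shapes are illustrative only, not Grok-1's real configuration."""
    logits = x @ gate_w                          # (tokens, n_experts)
    weights, idx = jax.lax.top_k(logits, top_k)  # pick top-k experts per token
    weights = jax.nn.softmax(weights, axis=-1)   # normalize the gate weights

    out = jnp.zeros_like(x)
    for k in range(top_k):
        # Gather each token's k-th chosen expert weight matrix and apply it.
        expert_out = jnp.einsum("td,tdh->th", x, expert_ws[idx[:, k]])
        out = out + weights[:, k:k + 1] * expert_out
    return out

# Toy sizes: 4 tokens, model dim 8, 8 experts with square weight matrices.
key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (4, 8))
gate_w = jax.random.normal(key, (8, 8))        # (d_model, n_experts)
expert_ws = jax.random.normal(key, (8, 8, 8))  # (n_experts, d_model, d_hidden)
print(moe_layer(x, gate_w, expert_ws).shape)   # (4, 8)
```

Because only a couple of experts run for each token, the parameters actually exercised per token are a fraction of the total, which is why a 314-billion-parameter MoE model is not directly comparable to a 70-billion-parameter dense model.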
Grok is released under the Apache License 2.0, which permits commercial use, modification, and distribution but grants no trademark rights. Users must retain the original license and copyright notice and state any significant changes they make.
Grok-1 was trained from scratch in October 2023 using a custom training stack built on JAX and Rust, and it employs a modern Mixture-of-Experts architecture. Only about 25% of its weights are active for any given token, which keeps per-token compute well below what the raw parameter count suggests.
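A quick back-of-envelope check of what that 25% figure implies, assuming the active fraction applies roughly across the whole parameter count:

```python
total_params = 314e9     # Grok-1's reported total parameter count
active_fraction = 0.25   # xAI's reported share of weights used per token

active_params = total_params * active_fraction
print(f"~{active_params / 1e9:.1f}B parameters active per token")  # ~78.5B
```

That still leaves roughly 78 billion parameters in play for each token, more than Llama 2's entire dense model.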
Initially launched as a proprietary model in November 2023, Grok was previously available only through Musk’s social network, X (formerly Twitter), specifically via the $16 per month or $168 per year X Premium+ subscription.
Limitations and Continued Access
It's important to note that Grok’s release does not include the full corpus of its training data. While this limitation does not hinder model usage—since it has already been trained—it prevents users from analyzing its learning sources, which likely include user text posts on X. The xAI blog vaguely indicates the model was "trained on a large amount of text data, not fine-tuned for any particular task."
Additionally, the open-source release does not include Grok’s access to real-time information from X, a feature Musk has previously touted as a unique offering. For real-time updates, users must still subscribe to X Premium+.
Strategic Positioning in AI Landscape
Grok is designed to compete directly with OpenAI’s ChatGPT; Musk co-founded OpenAI but distanced himself from it in 2018. The model’s name is slang for deep, intuitive understanding, a term coined by Robert A. Heinlein in his sci-fi novel “Stranger in a Strange Land,” and Musk has said the chatbot itself is modeled on Douglas Adams’ satirical sci-fi series, “The Hitchhiker’s Guide to the Galaxy.”
Musk has portrayed Grok as a more humorous and less censored alternative to ChatGPT, appealing to users concerned about AI censorship. This positioning gains added relevance amid criticism of Google’s Gemini AI, which drew backlash for historically inaccurate image generations and responses many users saw as ideologically slanted.
The open-sourcing of Grok also strengthens Musk’s stance in his ongoing lawsuit against OpenAI, in which he accuses the organization of straying from its original non-profit mission. OpenAI has countered by publishing emails suggesting Musk had earlier supported its move toward a for-profit structure.
The AI community on X has responded enthusiastically to Grok's release, with technical discussions emerging around its use of GeGLU in feedforward layers and its normalization techniques, such as the intriguing sandwich norm.
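For readers following those threads, here is a minimal JAX sketch of a GeGLU feedforward block of the kind being discussed. It illustrates the general GeGLU construction (a gated linear unit with a GELU gate) rather than Grok-1’s exact layer, and all dimensions are placeholders.

```python
import jax
import jax.numpy as jnp

def geglu_ffn(x, w_in, w_gate, w_out):
    """GeGLU feedforward: the input is projected twice, one projection is
    passed through GELU and used to gate the other, then projected back down.
    A generic illustration, not Grok-1's actual implementation."""
    return (jax.nn.gelu(x @ w_gate) * (x @ w_in)) @ w_out

# Toy dimensions: 4 tokens, model dim 8, hidden dim 16.
key = jax.random.PRNGKey(0)
k1, k2, k3, k4 = jax.random.split(key, 4)
x = jax.random.normal(k1, (4, 8))
w_in = jax.random.normal(k2, (8, 16))
w_gate = jax.random.normal(k3, (8, 16))
w_out = jax.random.normal(k4, (16, 8))
print(geglu_ffn(x, w_in, w_gate, w_out).shape)  # (4, 8)
```

The “sandwich” norm mentioned in those discussions generally refers to applying normalization both before and after a sublayer, rather than only before it.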
Implications for the AI Industry
As Grok gains traction, it is likely to pressure other LLM providers, particularly open-source competitors, to demonstrate how their offerings surpass Grok's capabilities.