Abacus AI, a startup focused on developing an AI-driven, end-to-end machine learning (ML) and LLMOps platform, has released an uncensored open-source large language model (LLM) named Liberated-Qwen1.5-72B. This model is specifically tuned to adhere to system prompts, enhancing its usability in real-world applications.
Liberated-Qwen1.5-72B is based on the Qwen1.5-72B transformer-based decoder-only language model, created by researchers at Alibaba Group. Its optimized ability to follow system prompts significantly differentiates it from other open-source LLMs, making it better suited for various use cases, such as customer-facing chatbots.
Bindu Reddy, CEO of Abacus, describes the model as the world’s most effective uncensored LLM in terms of performance and adherence to system instructions.
Importance of Following System Prompts in LLMs
As enterprises increasingly integrate LLMs for tasks like customer support, maintaining control over AI interactions is vital. Users often engage in multi-turn conversations, and without proper limitations, the AI can deviate from its intended role. For instance, a user once misled a chatbot into accepting a $1 offer for a 2024 Chevy Tahoe, with the AI erroneously confirming the deal as legally binding.
To prevent such undesirable scenarios, ensuring strict compliance with system prompts is crucial. However, many open-source models on the market struggle to maintain this level of compliance. Abacus aims to rectify this with Liberated-Qwen1.5-72B.
The development team fine-tuned the model using a novel open-source dataset called SystemChat, which consists of 7,000 synthetic conversations generated with Mistral-Medium and Dolphin-2.7-mixtral-8x7b. This training enables the model to follow system messages, even when conflicting with user requests during conversations.
Reddy highlights on X, “Fine-tuning your model with this dataset makes it far more usable and harder to jailbreak!”
Performance Insights
According to testing on the MT-Bench and HumanEval benchmarks, Liberated-Qwen1.5-72B slightly outperformed the previous best open-source model, Qwen1.5-72B chat, with scores of 8.45000 compared to 8.44375. On the MMLU benchmark, evaluating world knowledge and problem-solving, the model scored 77.13—comparable to other high-performing models, including Qwen1.5-72B and Abacus’ Smaug-72B.
It’s important to note that while Liberated-Qwen1.5-72B is effective, it remains entirely uncensored, lacking built-in guardrails. This means it will answer all questions, including sensitive topics, while still adhering to system messages. Abacus advises users to implement their own alignment layers before deploying the model in any service context.
Currently, Liberated-Qwen1.5-72B is available under the tongyi-qianwen license, which is nearly equivalent to the MIT license. Reddy has expressed plans to enhance the model further, particularly for HumanEval, and to develop more advanced models by merging the SystemChat dataset with datasets from Smaug.
In the coming weeks, Abacus aims to refine its MT-bench scores, aspiring to achieve the top position on the HumanEval dashboard.