Bilibili Releases Open-Source Lightweight Index-1.9B Series Models: Base, Control, Dialogue, and Role-Playing Versions Available

On June 20, Bilibili announced the open-sourcing of its lightweight Index-1.9B model series, which comprises four versions: a base model, a control model, a chat model, and a role-playing model.

Official Overview:

- Index-1.9B Base: This foundational model has 1.9 billion non-embedding parameters and was pre-trained on a 2.8TB corpus of predominantly Chinese and English data. It outperforms competing models of similar size across multiple evaluation benchmarks.

- Index-1.9B Pure: This control model matches the base version in parameter count and training strategy, but with all instruction-related data filtered from its corpus, serving as a control to measure how instruction data affects benchmark performance.

- Index-1.9B Chat: Built on the Index-1.9B base, this chat model is refined through Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO). Its training included extensive internet community data, resulting in a more engaging chat experience.

- Index-1.9B Character: Building on the SFT and DPO stages, this model adds Retrieval-Augmented Generation (RAG) to enable customizable few-shot role-playing. It was trained on the same 2.8TB corpus, with a 4:5 ratio of Chinese to English data and about 6% code. It ships with a built-in character named "San San," and users can also create characters of their own.
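The DPO step mentioned for the Chat and Character models optimizes a simple preference objective: given a chosen and a rejected response, it pushes the policy to prefer the chosen one more strongly than a frozen reference model does. As a minimal sketch (not Bilibili's actual training code; the log-probability values below are illustrative), the per-pair loss can be written as:

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for a single preference pair.

    Each argument is the summed log-probability of a full response
    under either the trainable policy or the frozen reference model;
    beta controls how far the policy may drift from the reference.
    """
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_ratio - rejected_ratio)
    # -log(sigmoid(margin)), computed stably as log1p(exp(-margin))
    return math.log1p(math.exp(-margin))

# If the policy favors the chosen response more than the reference
# does, the margin is positive and the loss drops below log(2).
loss = dpo_loss(-10.0, -14.0, -11.0, -13.0)
```

At initialization, when the policy equals the reference, every margin is zero and the loss sits at log(2); training drives it down by widening the chosen-versus-rejected gap.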

This new model series strengthens Bilibili's AI capabilities, enhancing user interaction and engagement with its advanced customizable features.
