Alibaba Insider Reveals Insights on Developing LLMs in China

Chinese tech companies are mobilizing extensive resources and talent to close the gap with OpenAI, creating intriguing parallels in the experiences of researchers on both sides of the Pacific. A recent post from Alibaba researcher Binyuan Hui offers valuable insight into life at the e-commerce giant, which, like many other Chinese internet leaders, is striving to match the capabilities of ChatGPT.

To show what daily life looks like for people building large language models (LLMs), Hui shared his schedule on social media, echoing a similar post from OpenAI researcher Jason Wei that gained widespread attention.

The comparison of their daily routines reveals remarkable similarities: both wake up around 9 a.m. and go to bed close to 1 a.m. Their mornings consist of meetings, followed by coding sessions, model training, and brainstorming with colleagues. After returning home, they continue working through experiments late into the night.

One key difference is how they spend their downtime. Hui said he uses his free time to read academic papers and browse social media for news from around the world. As one observer noted, he does not unwind with a glass of wine after getting home, as Wei does.

This intense work culture is emblematic of China’s LLM landscape, where highly educated tech professionals are flocking to companies that build cutting-edge AI models. Hui’s demanding schedule appears to reflect a personal drive to match, or at least to be seen as matching, Silicon Valley firms in AI. That sets it apart from the "996" schedule (9 a.m. to 9 p.m., six days a week) associated with more established Chinese internet businesses such as gaming and e-commerce.

My typical day as a Member of Technical Staff at Qwen (For reference):

[9:00 a.m.] Wake up, may linger in bed for an additional 15 minutes.

[9:30 a.m.] Commute via cab while catching up on social media and checking @_jasonwei’s latest updates.

— Binyuan Hui (@huybery) February 21, 2024

Kai-Fu Lee, a respected AI investor and computer scientist, has described the same relentless work ethic. When I interviewed him in November about his newly launched LLM unicorn, 01.AI, he said long hours were the norm but that employees embraced them willingly. That same day, one team member had texted him at 2:15 a.m. to share their excitement about being part of 01.AI’s mission.

Such visible displays of dedication reflect the urgency Chinese tech companies feel, and that urgency shows in how quickly they are developing and deploying LLMs.

For instance, Qwen has open-sourced several foundation models trained on both English and Chinese data. The largest has 72 billion parameters. Parameters are the values a model learns from its training data, and their count is a rough gauge of how much the model can capture and how capably it can respond. By comparison, OpenAI’s GPT-3 has 175 billion parameters, and GPT-4 is reported, though not confirmed by OpenAI, to have about 1.7 trillion. However, what an LLM is built to do may matter more in assessing its effectiveness than its total parameter count.
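To put those parameter counts side by side, here is a minimal Python sketch. It uses only the figures quoted above (the GPT-4 number is unconfirmed) and adds one assumption that is not in the article: weights stored as 16-bit floats, or 2 bytes per parameter. The labels and function names are illustrative, and the memory figures cover the weights alone, not what it takes to train or serve a model.

```python
# Parameter counts as quoted in the article; the GPT-4 figure is unconfirmed.
PARAMS = {
    "Qwen (largest)": 72e9,
    "GPT-3": 175e9,
    "GPT-4 (reported)": 1.7e12,
}

# Assumption (not from the article): weights stored as 16-bit floats.
BYTES_PER_PARAM_FP16 = 2


def weight_memory_gb(n_params: float, bytes_per_param: int = BYTES_PER_PARAM_FP16) -> float:
    """Approximate memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


def relative_size(name: str, baseline: str = "Qwen (largest)") -> float:
    """How many times larger `name` is than `baseline`, by parameter count."""
    return PARAMS[name] / PARAMS[baseline]


if __name__ == "__main__":
    for name, n_params in PARAMS.items():
        print(
            f"{name}: {n_params / 1e9:,.0f}B parameters, "
            f"~{weight_memory_gb(n_params):,.0f} GB of weights at 16-bit, "
            f"{relative_size(name):.1f}x the largest Qwen model"
        )
```

Under these assumptions the raw size gap is large, which is why the comparison says little by itself about how well a model performs at the task it was built for.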

Alibaba has also been quick to put Qwen to commercial use. In April 2023, it incorporated the model into its enterprise communication platform, DingTalk, as well as its online retail site, Tmall.

So far, no single leader has emerged in China’s LLM sector, prompting venture capitalists and corporate investors to diversify their investments among various competitors. Alongside developing its own LLM, Alibaba has actively invested in startups such as Moonshot AI, Zhipu AI, Baichuan, and 01.AI.

As competition heats up, Alibaba is working to establish a unique position in the market. Its recent launch of SeaLLM, an LLM capable of processing multiple Southeast Asian languages including Vietnamese, Indonesian, Thai, Malay, Khmer, Lao, Tagalog, and Burmese, could serve as a critical differentiator. With a significant presence in the region through its cloud computing initiatives and acquisition of the e-commerce platform Lazada, Alibaba is well-positioned to leverage SeaLLM in its future offerings.
