Mark Zuckerberg is making a significant investment in artificial intelligence (AI) as part of a broader mission to combat diseases. The Chan Zuckerberg Initiative (CZI), which he co-chairs with his wife, Priscilla Chan, aims to build one of the world's largest computing systems dedicated to nonprofit life science research. The planned high-performance computing cluster is projected to house over 1,000 graphics processing units (GPUs) suited to training AI systems and large language models (LLMs). With this computational power, scientists will gain access to predictive models of both healthy and diseased cells.
Patricia Brennan, Vice President of Science Technology at CZI, emphasizes the significance of this initiative: “Building this AI computing system is a crucial step towards curing, preventing, or managing all diseases by the end of the century. It will deepen the scientific community’s understanding of cells and their interactions within biological systems.”
The initiative plans to use the cluster to create a virtual biology simulator that helps researchers understand how cells contribute to the functioning of organs in the human body. One of the primary goals is to construct a "virtual cell" that lets scientists map cellular states in both health and disease.
Many universities and research institutions struggle to afford the infrastructure necessary to analyze large volumes of biomedical data. As Brennan notes, “The AI cluster will be one of the most powerful high-performance computing systems for nonprofit life science research in the world. While the private sector is heavily investing in AI biomedical projects, the robust infrastructure needed to develop digital models of cells is economically unfeasible for many.”
Data quality is paramount for successful AI modeling. The initiative plans to train these models on large datasets, drawing on resources such as CZ CELLxGENE, which contains over 50 million cell records. Additional data sources include OpenCell, an atlas of protein localization and interactions, and Tabula Sapiens, a human cell atlas.
David M. Truong, an Assistant Professor of Biomedical Engineering at New York University Tandon School of Engineering, highlights the importance of high-quality datasets: “Previous efforts have struggled with the quality of data input. While modern large biological datasets are quite reliable, many biomedical researchers find them challenging to navigate. AI systems could effectively summarize and organize the data for researchers.”
Zuckerberg’s biomedical AI initiative is part of a larger trend. Systems such as AlphaFold, a protein structure prediction system with an accompanying public database of predicted structures, and the ESM protein atlas are already advancing our understanding of human biology. Platforms such as Terra give biomedical researchers cloud-based access to data analysis and collaboration tools for large projects. Developed by the Broad Institute of MIT and Harvard in partnership with Microsoft and Verily, an Alphabet company, Terra is a platform-as-a-service (PaaS) that simplifies resource management for users.
Nvidia’s contributions include Parabricks, which uses GPUs to accelerate the analysis of genome sequencing data, reducing processing time and cost. Nvidia’s BioNeMo framework also supplies ready-to-use LLMs tailored for proteins and chemistry, streamlining model training and scaling.
The vision behind Zuckerberg’s AI project is ambitious. Chan, a former pediatrician, expressed the goal of curing, preventing, or managing all diseases by the century's end. Brennan elaborates on this mission, stating that the initiative seeks to help researchers monitor cellular changes throughout life, whether inherited or acquired.
Brennan adds, “Across our work, we seek opportunities to make a differentiated impact, recognizing the necessity of data, infrastructure, models, interfaces, and profound biological knowledge to construct comprehensive models of human cells and systems.”
Eduardo Abeliuk, CEO of the biotech software company Teselagen, acknowledges the enormity of the endeavor, stating, “This initiative suggests an effort of unparalleled scale, aiming to exceed past infrastructural projects in terms of powerful computational access.” However, he also points out that achieving these lofty objectives will require more than technological advancements. “Significant progress will depend on global collaboration, social efforts, and considerable advancements in basic science.”