S&P Global, a leading provider of financial intelligence, announced the launch of S&P AI Benchmarks by Kensho, an innovative solution aimed at establishing a new standard for evaluating large language models (LLMs) in complex financial applications.
Developed by Kensho, S&P Global’s AI-focused division, this benchmarking tool measures an LLM’s ability to perform tasks such as reasoning quantitatively, extracting data from financial documents, and applying domain-specific knowledge. The results are displayed on a leaderboard, offering a transparent view of each model’s performance.
S&P AI Benchmarks ranks LLMs across key financial and quantitative metrics, including domain knowledge, quantity extraction, and program synthesis. “S&P AI Benchmarks combines Kensho’s cutting-edge AI research with S&P Global’s financial intelligence,” said Bhavesh Dayalji, Chief AI Officer at S&P Global and CEO of Kensho. “We aim for this solution to become the industry standard for evaluating LLMs in complex financial reasoning, fostering innovation in the FinAI space.”
The timing of this launch is critical for the financial services industry, as institutions increasingly explore the potential of generative AI and LLMs to enhance operations and achieve a competitive advantage. The absence of standardized benchmarks has made it difficult for organizations to assess the right models for their specific needs.
“Benchmarking solutions like this are vital for helping institutions determine which LLMs to use for their specific applications,” Dayalji explained. “S&P AI Benchmarks will further innovation by guiding financial professionals on where each model excels and how it can provide the most value.”
The methodology for S&P AI Benchmarks was developed and validated by a diverse team of experts, including engineers, researchers, academics, and financial professionals across S&P Global. The evaluation consists of 600 questions designed to rigorously test LLM performance across three key categories: domain knowledge, quantity extraction, and program synthesis.
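To make the idea of a category-based leaderboard concrete, here is a minimal Python sketch of how per-category accuracy could be turned into a ranking. The category names come from the article; the function names, the toy data, and the unweighted averaging are assumptions made for illustration only, since the article does not say how S&P AI Benchmarks aggregates or weights its scores.

```python
from collections import defaultdict

# Category names are taken from the article; everything else is hypothetical.
CATEGORIES = ("domain knowledge", "quantity extraction", "program synthesis")


def category_accuracy(results):
    """Return accuracy per category from (category, is_correct) pairs."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for category, is_correct in results:
        total[category] += 1
        correct[category] += int(is_correct)
    return {c: correct[c] / total[c] for c in CATEGORIES if total[c]}


def overall_score(per_category):
    """Unweighted mean of category accuracies (an assumed aggregation)."""
    if not per_category:
        return 0.0
    return sum(per_category.values()) / len(per_category)


def leaderboard(model_results):
    """Rank models by overall score, highest first."""
    scores = {
        name: overall_score(category_accuracy(results))
        for name, results in model_results.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data: one graded question per category for two made-up models.
model_results = {
    "model_a": [
        ("domain knowledge", True),
        ("quantity extraction", True),
        ("program synthesis", False),
    ],
    "model_b": [
        ("domain knowledge", True),
        ("quantity extraction", False),
        ("program synthesis", False),
    ],
}

ranking = leaderboard(model_results)
```

In this toy run, model_a answers two of three questions correctly and model_b one of three, so model_a ranks first. A real evaluation would grade all 600 questions and might weight categories differently.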
Industry analysts consider the launch of S&P AI Benchmarks to be a significant milestone in AI adoption within the financial sector. As advanced AI technologies become more integrated into finance, having a reliable and transparent benchmarking tool will be essential for firms looking to make informed deployment decisions. S&P Global’s solution may accelerate the responsible adoption of LLMs and stimulate innovation in the FinAI space.
Looking forward, S&P Global envisions S&P AI Benchmarks playing a crucial role in the future of AI in financial services. “Our vision is for LLMs to adapt more effectively to the needs of our industries, and solutions like ours will facilitate that,” Dayalji said. “We also encourage all model providers to participate to help us continue evolving our framework.”
As the financial industry navigates the rapidly changing landscape of generative AI, tools like S&P AI Benchmarks by Kensho are positioned to become essential resources, enabling organizations to harness these technologies while ensuring accuracy, transparency, and responsible deployment.