SAP Enhances Datasphere: Revolutionizing Enterprise Data Lakes for Improved Accuracy and Utility

SAP put AI agents at the center of its annual TechEd conference, building on its generative AI copilot, Joule. While generative AI drew most of the attention, SAP also showcased its data innovations, outlining plans to give enterprises a comprehensive suite of tools for extracting value from their datasets without losing the data's original context.

SAP announced enhanced capabilities, including new data lake features, a knowledge graph engine, and the ability to accelerate real-time risk analysis. Although these features won't be available immediately, they are expected to launch within the coming months, enabling enterprises to store, process, and derive value from their data efficiently.

This initiative coincides with a broader industry shift, as many leading enterprise vendors are reworking their AI and data offerings to better address business needs. For example, Salesforce recently pivoted to Agentforce, a new ecosystem of AI agents designed to act on business information. Salesforce also expanded its Data Cloud capabilities and connectors to improve agent performance.

New Data Lake and Knowledge Graph Engine

Through its Business Technology Platform (BTP), SAP offers key features such as data management, analytics, AI, automation, and application development under one comprehensive solution. This integration provides teams with all the tools they need to create new applications, enhance existing ones, or integrate various systems within a cloud computing environment.

Central to this data framework is the 'Datasphere,' which enables enterprises to connect, store, and manage data from both SAP and non-SAP systems, ultimately linking it with SAP Analytics Cloud and other tools for downstream applications.

The Datasphere, powered by SAP HANA Cloud, will soon be enhanced with new data lake capabilities. Previously, SAP offered a separate service for hosting structured, semi-structured, and unstructured data, requiring users to assign a Datasphere space to access the data, which often compromised the original context.

With the introduction of an embedded data lake, SAP is expanding the data fabric architecture of the Datasphere, integrating an object store that allows for efficient storage of large data sets in their original form. This simplifies data management and scalability.

“As part of SAP Datasphere, the object store adds a new layer within the data stack to facilitate the onboarding of data products from SAP applications like SAP S/4HANA and SAP BW,” a company spokesperson explained. “This enables customers to leverage core capabilities such as analytics models, cataloging, and data integration for direct access that enhances decision-making.”

To process data within the object store, SAP is providing teams with Spark compute. Additionally, a feature called SQL on files will allow querying without data replication.

Moreover, SAP has unveiled a knowledge graph engine, built on the Resource Description Framework standard, to help enterprises uncover complex relationships among various data points—such as business entities and invoices—that might otherwise be overlooked through manual modeling.

“Each data point is organized into three distinct components: the subject, the related object, and the nature of their relationship,” the company stated. “This approach organizes data into a web of interconnected facts, facilitating the understanding of correlations between different information sets.”

This capability will empower enterprises to better utilize their data for AI applications, including context-aware insights. The knowledge graph engine also supports SPARQL semantic query language for efficient information extraction.
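The subject, relationship, object structure described above can be illustrated with a minimal sketch in plain Python. This is not SAP's engine or its API: the `Triple` type, the sample facts, and the `match` and `invoices_for_country` helpers are all hypothetical names invented for illustration. The wildcard matching loosely mirrors how a SPARQL basic graph pattern selects triples.

```python
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    """One fact: a subject, the relationship (predicate), and the related object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical sample facts linking invoices, customers, and countries.
TRIPLES = [
    Triple("invoice:1001", "issuedTo", "customer:acme"),
    Triple("invoice:1002", "issuedTo", "customer:acme"),
    Triple("invoice:1003", "issuedTo", "customer:globex"),
    Triple("customer:acme", "locatedIn", "country:DE"),
    Triple("customer:globex", "locatedIn", "country:US"),
]


def match(triples, subject: Optional[str] = None,
          predicate: Optional[str] = None, obj: Optional[str] = None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


def invoices_for_country(triples, country: str):
    """Follow two relationships: country -> customers -> invoices."""
    customers = {t.subject for t in match(triples, predicate="locatedIn", obj=country)}
    return sorted(
        t.subject for t in match(triples, predicate="issuedTo") if t.obj in customers
    )
```

Calling `invoices_for_country(TRIPLES, "country:DE")` walks from a country to its customers and then to their invoices, a connection that no single record states directly. A real RDF store would use IRIs rather than short strings and would express the same traversal as a query.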

Real-Time Risk Analysis

SAP also introduced Compass, a new feature in its Analytics Cloud that allows users to model complex risk scenarios and simulate their potential outcomes. This tool can assist organizations in preparing for challenges such as supply chain disruptions or rising commodity prices, minimizing their impact on operational expenses and revenue.

At the core of Compass is the Monte Carlo simulation method, which estimates the probability of various outcomes by running a model many times with randomly sampled inputs. The results are presented as probability distributions with upper and lower boundaries, through an interface intended to be usable by non-technical users.
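To make the Monte Carlo idea concrete, here is a minimal sketch using only Python's standard library. It does not reflect how Compass is implemented: the cost model, the parameter values (base cost, disruption probability, budget threshold), and the function names are all assumptions chosen for illustration.

```python
import random
import statistics


def simulate_annual_cost(rng: random.Random,
                         base_cost: float = 1_000_000.0,
                         disruption_prob: float = 0.1,
                         disruption_cost: float = 250_000.0) -> float:
    """Draw one possible year: a random commodity price swing plus a possible disruption."""
    # Assumed: price changes are roughly normal around today's level (std dev 8%).
    price_factor = rng.gauss(1.0, 0.08)
    cost = base_cost * price_factor
    # Assumed: a supply chain disruption occurs with fixed probability and fixed cost.
    if rng.random() < disruption_prob:
        cost += disruption_cost
    return cost


def run_simulation(trials: int = 10_000, seed: int = 42,
                   budget: float = 1_100_000.0) -> dict:
    """Repeat the random draw many times and summarize the resulting distribution."""
    rng = random.Random(seed)  # seeded so repeated runs give the same result
    outcomes = sorted(simulate_annual_cost(rng) for _ in range(trials))
    return {
        "mean": statistics.fmean(outcomes),
        "p5": outcomes[int(0.05 * trials)],    # lower boundary (5th percentile)
        "p95": outcomes[int(0.95 * trials)],   # upper boundary (95th percentile)
        "prob_over_budget": sum(c > budget for c in outcomes) / trials,
    }
```

`run_simulation()` returns the mean cost, a 5th-to-95th percentile band, and the estimated probability of exceeding the assumed budget. These are the kind of distribution and boundary figures the article describes, here computed from invented inputs.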

Timeline for New Features

The new data lake capabilities are set to be generally available by the end of Q4 2024. Meanwhile, the knowledge graph and Analytics Cloud Compass will launch in the first half of 2025. While the exact release dates are not yet determined, it is clear that SAP aims to create a cohesive ecosystem of capabilities, enhancing the relevance and context of data within the Datasphere to support critical business applications.

Regarding expected ROI from the new data lake capabilities, a spokesperson referenced a GigaOM case study demonstrating that a business data fabric enabled by SAP Datasphere achieved a three-year total cost of ownership that was 42% lower than that of do-it-yourself implementations.
