Capturing Weak Signals in Endpoint Security: The Role of Large Language Models
Capturing weak signals across endpoints and predicting potential intrusion patterns is a problem well suited to Large Language Models (LLMs). The primary objective is to analyze attack data, uncover new threat patterns, and make the models themselves more effective in the process.
Prominent vendors in endpoint detection and response (EDR) and extended detection and response (XDR) are rising to this challenge. Nikesh Arora, CEO of Palo Alto Networks, stated, "We collect the most endpoint data in the industry from our XDR—about 200 megabytes per endpoint. This is often 10 to 20 times more than our competitors. This raw data enhances our firewalls and supports our attack surface management through automation."
At CrowdStrike’s annual Fal.Con event, co-founder and CEO George Kurtz emphasized their innovation in linking weak signals from various endpoints to identify novel detections. “We’re now extending that capability to our third-party partners to examine weak signals across domains,” he noted.
The Impact of XDR on Cybersecurity
XDR has proven effective at reducing noise and making genuine signals easier to see. Leading XDR platform providers include Broadcom, Cisco, CrowdStrike, Fortinet, Microsoft, Palo Alto Networks, SentinelOne, Sophos, TEHTRIS, Trend Micro, and VMware.
LLMs: The Future of Endpoint Security
Enhancing LLMs with telemetry and human-annotated data is crucial for the future of endpoint security. According to Gartner’s latest Hype Cycle for Endpoint Security, innovations are focused on rapid, automated detection, prevention, and remediation of threats through integrated XDR systems that correlate data from various sources, including endpoint, network, web, email, and identity solutions.
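To make the idea of correlating weak signals across sources concrete, here is a minimal sketch in Python. It is not any vendor's method. The `Signal` fields, the `correlate` function, the five-minute window, and the score threshold are all illustrative assumptions. The sketch flags a host only when several low-severity events from at least two different telemetry sources fall inside one time window and their combined score crosses the threshold.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    """One low-severity event (hypothetical schema, for illustration only)."""
    host: str         # entity the event is attributed to
    source: str       # telemetry domain, e.g. "endpoint", "identity", "email"
    timestamp: float  # seconds since an arbitrary epoch
    score: float      # individual severity, low on its own


def correlate(signals, window=300.0, threshold=1.0, min_sources=2):
    """Return {host: best combined score} for hosts whose weak signals,
    taken together inside one sliding time window, look suspicious."""
    by_host = defaultdict(list)
    for s in signals:
        by_host[s.host].append(s)

    flagged = {}
    for host, events in by_host.items():
        events.sort(key=lambda s: s.timestamp)
        start = 0
        for end in range(len(events)):
            # Advance the left edge until the window spans at most `window` seconds.
            while events[end].timestamp - events[start].timestamp > window:
                start += 1
            in_window = events[start:end + 1]
            total = sum(s.score for s in in_window)
            sources = {s.source for s in in_window}
            # Require both enough combined weight and more than one data source.
            if total >= threshold and len(sources) >= min_sources:
                flagged[host] = max(flagged.get(host, 0.0), total)
    return flagged


if __name__ == "__main__":
    sample = [
        Signal("host-a", "endpoint", 0.0, 0.5),
        Signal("host-a", "identity", 100.0, 0.25),
        Signal("host-a", "email", 200.0, 0.5),
        Signal("host-b", "endpoint", 0.0, 0.5),
        Signal("host-b", "endpoint", 10.0, 0.5),
    ]
    print(correlate(sample))
```

In this sample, `host-a` is flagged because three individually minor events from three sources add up within the window, while `host-b` is not, because all of its events come from a single source. A production system would use far richer features and learned models rather than a fixed threshold.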
Investment in EDR and XDR is growing faster than the broader information security and risk management market. Gartner forecasts that the endpoint protection platform market will grow from $14.45 billion today to $26.95 billion by 2027, a compound annual growth rate (CAGR) of 16.8%. By comparison, the global information security and risk management market is projected to rise from $164 billion in 2022 to $287 billion by 2027, an 11% CAGR.
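For readers who want to check growth figures like these, the standard CAGR formula is (end / start) ^ (1 / years) - 1. The short Python sketch below applies it to the numbers quoted above. The four-year span for the endpoint protection forecast is an assumption, since the text gives only "today" as the starting point. The `cagr` helper is written for this illustration.

```python
def cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate as a fraction (0.10 means 10% per year)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Endpoint protection platforms: $14.45B to $26.95B.
# Assumption: the starting point is four years before 2027.
endpoint_rate = cagr(14.45, 26.95, 4)

# Information security and risk management: $164B (2022) to $287B (2027).
infosec_rate = cagr(164.0, 287.0, 5)

if __name__ == "__main__":
    print(f"Endpoint protection platforms: {endpoint_rate:.2%} per year")
    print(f"Information security overall:  {infosec_rate:.2%} per year")
```

Under that assumption, the formula lands close to the quoted endpoint figure and slightly above the quoted figure for the broader market. Gaps of this size can come from rounding or from a forecast computed on a different basis than the two end values shown here.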
CrowdStrike’s Insights on LLMs in Cybersecurity
Recently, Elia Zaitsev, CTO of CrowdStrike, discussed the impact of training LLMs with endpoint data on cybersecurity.
Interviewer: What prompted you to explore endpoint telemetry data for training LLMs?
Elia Zaitsev: "When we founded the company, we aimed to leverage AI and ML to tackle complex customer problems. Legacy technologies processed decisions at the edge, limiting information access. We believed that AI needed comprehensive data, which can only be achieved through cloud technologies, allowing us to train robust classifiers and deploy them effectively at the edge."
Interviewer: How do you perceive LLMs and generative AI tools in relation to cybersecurity professionals?
Zaitsev: "The goal is not to replace humans but to augment their capabilities. AI assists humans in enhancing workflows and decision-making, rather than taking over those roles. Quality of data is vital; high-quality, human-annotated datasets help in fine-tuning generative models for specific tasks, such as incident summarization."
Interviewer: How do automation technologies like LLMs reshape human roles in cybersecurity, especially with adversaries using AI?
Zaitsev: "Automation tools, including LLMs, typically handle basic tasks, freeing experts to focus on complex challenges. While adversaries may use AI to automate threats, defenders can employ AI to counteract these developments, highlighting the need for skilled human defenders."
Interviewer: What lessons have you learned from utilizing telemetry data to train LLMs?
Zaitsev: "It’s often more effective to train several specialized, smaller LLMs for specific tasks than rely on one large model. This approach leads to higher accuracy and fewer inaccuracies. Our strategy employs a mixture of expert models, optimizing success in targeted applications."
Interviewer: How vital are expert human teams in developing and training AI systems, particularly in your AI-assisted approach?
Zaitsev: "For effective training, we need a limited number of high-quality, human-annotated examples. Our investment in expert teams from the start allows us to build strong datasets, essential for creating generative AI tailored to cybersecurity applications."
Interviewer: How do advancements in training LLMs affect your current and future products?
Zaitsev: "Using a multi-modal approach, Charlotte integrates various technologies. LLMs excel at instruction following, translating natural language into structured tasks. Customer and vulnerability data inform output, ensuring privacy while instilling trust in our operations."
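Two ideas from the interview can be illustrated together: translating a natural-language request into a structured task, and handing that task to a small specialized model rather than one large general model. The Python sketch below is hypothetical and does not describe CrowdStrike's implementation. The specialist functions are plain stand-ins for fine-tuned models. The keyword table stands in for an instruction-following model that would do the routing. This is simple task dispatch, not the neural "mixture of experts" architecture that shares a similar name.

```python
from typing import Callable, Dict, Tuple


def summarize_incident(request: str) -> str:
    # Stand-in for a small model fine-tuned on human-annotated incident summaries.
    return f"[summary] {request}"


def build_query(request: str) -> str:
    # Stand-in for a small model that turns a request into a structured search.
    return f"[query] {request}"


def general_answer(request: str) -> str:
    # Fallback when no specialist matches the request.
    return f"[general] {request}"


# Hypothetical registry of task name -> specialist.
SPECIALISTS: Dict[str, Callable[[str], str]] = {
    "summarize": summarize_incident,
    "query": build_query,
}

# Hypothetical trigger words; a real router would be a learned classifier.
KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "summarize": ("summarize", "summary", "recap"),
    "query": ("find", "search", "list", "show"),
}


def route(request: str) -> str:
    """Map a natural-language request to a task name."""
    words = set(request.lower().split())
    for task, triggers in KEYWORDS.items():
        if words.intersection(triggers):
            return task
    return "general"


def handle(request: str) -> str:
    """Dispatch the request to the matching specialist, or to the fallback."""
    handler = SPECIALISTS.get(route(request), general_answer)
    return handler(request)


if __name__ == "__main__":
    for text in ("Summarize incident 42", "Find hosts with failed logins", "Hello"):
        print(route(text), "->", handle(text))
```

Keeping the router separate from the specialists means each specialist can be trained and evaluated on its own narrow dataset, which matches the interview's point about accuracy on targeted applications.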
By improving how LLMs are trained and applied in cybersecurity, organizations can strengthen their defenses while recognizing that human expertise remains essential.