A concerted cyber campaign has emerged, meticulously targeting and compromising misconfigured proxy servers to gain unauthorized access to commercial large language model (LLM) services and their proprietary content. This sophisticated operation represents a significant escalation in the threat landscape surrounding artificial intelligence, as attackers pivot from traditional data theft to leveraging the computational power and inherent capabilities of advanced AI.
Since late December, the most intensive phase of this activity has systematically probed more than 73 distinct LLM endpoints, generating well over 80,000 reconnaissance sessions. The perpetrators employ subtle, "low-noise" queries to interact with these endpoints, a deliberate tactic designed to identify the underlying AI model without triggering automated security alerts or drawing attention from monitoring systems. This methodical approach points to a long-term strategy of reconnaissance and staged exploitation rather than immediate, overt disruption.
The increasing integration of artificial intelligence across industries has inadvertently expanded the attack surface available to cyber adversaries. Large Language Models, at the forefront of this shift, are powerful computational engines capable of generating human-like text, translating languages, producing creative content, and answering questions across a wide range of domains. Access to these services, especially commercial offerings, often involves significant computational cost, proprietary data access, or advanced functionality. Unauthorized access therefore represents a valuable prize for threat actors, offering opportunities for financial gain, intellectual property theft, competitive advantage, or even the deployment of malicious AI agents.
Proxy servers, which act as intermediaries for requests from clients seeking resources from other servers, are crucial components in many network architectures. They can enhance security, manage network traffic, and facilitate access to specific services. However, when these servers are improperly configured – lacking robust authentication, exposing administrative interfaces, or failing to implement proper access controls – they transform from protective gateways into vulnerable entry points. Such misconfigurations can allow attackers to bypass network defenses, obscure their true origin, and establish persistent footholds within target environments, effectively creating a backdoor to sensitive services like LLMs.
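Proxy misconfiguration of this kind is straightforward to self-audit. Below is a minimal sketch, assuming you operate the proxy at the hypothetical address shown: it issues unauthenticated GET requests to paths commonly served by LLM backends (Ollama's /api/tags and the OpenAI-compatible /v1/models) and reports whether anything answers without credentials. The hostname, path list, and output format are illustrative assumptions, not details taken from the campaign.

```python
# Minimal self-audit sketch: does a proxy you operate answer LLM-backend paths
# without credentials? PROXY_BASE and PROBE_PATHS are illustrative assumptions.
import urllib.error
import urllib.request

PROXY_BASE = "https://proxy.example.internal"  # hypothetical proxy you control
PROBE_PATHS = [
    "/api/tags",   # Ollama: lists locally available models if exposed
    "/v1/models",  # OpenAI-compatible gateways: lists served models
]

def check_unauthenticated_exposure(base: str, paths: list[str]) -> None:
    for path in paths:
        url = base.rstrip("/") + path
        req = urllib.request.Request(url, method="GET")  # deliberately no auth header
        try:
            with urllib.request.urlopen(req, timeout=5) as resp:
                snippet = resp.read(200).decode("utf-8", errors="replace")
                print(f"[!] {url} answered without credentials (HTTP {resp.status}): {snippet!r}")
        except urllib.error.HTTPError as err:
            # A 401/403 here is the desired outcome: the proxy demands authentication.
            print(f"[ok] {url} rejected the request (HTTP {err.code})")
        except (urllib.error.URLError, TimeoutError) as err:
            print(f"[ok] {url} unreachable without credentials ({err})")

if __name__ == "__main__":
    check_unauthenticated_exposure(PROXY_BASE, PROBE_PATHS)
```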

According to a detailed analysis from the threat monitoring platform GreyNoise, two distinct campaigns have been identified as part of this broader offensive, collectively registering 91,403 attack sessions against the company's Ollama honeypot infrastructure over the past four months. Ollama, an open-source framework for running large language models locally, offers a convenient environment for testing and deployment, but also presents a potential attack vector if not adequately secured.
The first operation, which commenced in October and remains active, exhibits a particular focus on exploiting Server-Side Request Forgery (SSRF) vulnerabilities. SSRF allows an attacker to compel a server to make requests to an arbitrary domain, potentially including internal networks or attacker-controlled infrastructure. In this campaign, the adversaries have leveraged Ollama’s model pull functionality – designed to retrieve AI models from registries – to inject malicious registry URLs. Furthermore, they have manipulated Twilio SMS webhook integrations via the MediaURL parameter, a technique that could facilitate data exfiltration, command-and-control communications, or the unauthorized sending of messages. A notable spike in activity for this campaign was observed around the Christmas period, with 1,688 sessions recorded within a 48-hour window, suggesting a calculated attempt to exploit holiday-period staffing reductions.
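For defenders, one practical response to this pull-abuse technique is to review gateway or proxy request logs for pull requests that reference registries outside an allowlist. The sketch below assumes a hypothetical JSON-per-line request log and a single trusted registry; the field names and the default-registry fallback are assumptions for illustration, not a reconstruction of the attackers' traffic.

```python
# Hedged sketch: flag Ollama pull requests whose model reference points outside an
# allowlisted registry. The JSON-per-line log format and field names are assumptions.
import json

ALLOWED_REGISTRIES = {"registry.ollama.ai"}  # assumed trusted default registry

def registry_of(model_ref: str) -> str:
    # "host/namespace/model" references carry an explicit registry host;
    # bare names such as "llama3" implicitly use the default registry.
    first = model_ref.split("/", 1)[0]
    return first if ("." in first or ":" in first) else "registry.ollama.ai"

def flag_suspicious_pulls(log_lines: list[str]) -> list[str]:
    findings = []
    for line in log_lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue
        if entry.get("path") != "/api/pull":
            continue
        body = entry.get("body", {})
        model_ref = body.get("model") or body.get("name") or ""  # field name varies; assumption
        if model_ref and registry_of(model_ref) not in ALLOWED_REGISTRIES:
            findings.append(f"pull from untrusted registry: {model_ref!r} (src {entry.get('src_ip', '?')})")
    return findings

# Synthetic log entries for illustration:
sample = [
    '{"path": "/api/pull", "src_ip": "203.0.113.7", "body": {"model": "evil.example.net/x/model"}}',
    '{"path": "/api/pull", "src_ip": "10.0.0.5", "body": {"model": "llama3"}}',
]
for finding in flag_suspicious_pulls(sample):
    print(finding)
```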
Intriguingly, the initial assessment of this specific campaign suggests a "grey-hat" origin, possibly involving security researchers or bug bounty hunters. This hypothesis is primarily based on the observed use of ProjectDiscovery’s Out-of-band Application Security Testing (OAST) infrastructure, a toolkit commonly employed in legitimate vulnerability assessments. OAST callbacks are standard techniques for detecting blind vulnerabilities by triggering an outbound connection to a controlled server. While such methods are foundational to ethical hacking and security research, the sheer scale and timing of these operations, particularly around a major holiday, raise questions about the precise intent and ethical boundaries of the actors involved. GreyNoise researchers noted that "OAST callbacks are standard vulnerability research techniques. But the scale and Christmas timing suggest grey-hat operations pushing boundaries," indicating a potential blurring of lines between legitimate security testing and activities that verge on unauthorized access.
Telemetry data associated with this first campaign revealed that the activity originated from 62 unique IP addresses spread across 27 different countries. These IP addresses primarily exhibited characteristics consistent with Virtual Private Servers (VPS), rather than signs indicative of a large-scale botnet operation. The use of distributed VPS infrastructure suggests a deliberate effort to maintain anonymity and evade detection, making attribution challenging and highlighting the sophisticated nature of the actors.

A second, more pronounced campaign was detected starting on December 28, characterized by a high-volume enumeration effort aimed at identifying exposed or improperly configured LLM endpoints. Over an 11-day period, this activity generated 80,469 sessions, driven primarily by just two IP addresses systematically probing more than 73 different model endpoints. The attackers demonstrated versatility by using both OpenAI-compatible and Google Gemini API formats, indicating a broad-spectrum reconnaissance strategy.
The list of targeted models spanned the offerings of all major AI providers, including OpenAI, Google, Anthropic, Meta, Mistral, and Perplexity. This wide net suggests that the attackers are not fixated on a single platform but are instead seeking any accessible LLM service that can be exploited, regardless of its underlying provider or specific capabilities. To maintain stealth during this extensive scanning, the actors employed innocuous queries such as brief greetings, empty inputs, or simple factual questions. This tactic is crucial for remaining undetected during the initial reconnaissance phase, allowing them to map the infrastructure without triggering defensive measures.
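A defender-side heuristic for this kind of reconnaissance is to flag requests whose prompts are empty or trivially short. The sketch below, assuming the two request shapes described above (OpenAI-compatible chat completions and Gemini-style generateContent), classifies a payload as a likely low-noise probe; the probe strings and length threshold are illustrative assumptions.

```python
# Hedged sketch: classify an incoming request as a likely "low-noise" probe based on
# trivial or empty prompt content. Payload shapes mirror the two API styles mentioned
# above; the probe strings and length threshold are assumptions.
PROBE_STRINGS = {"", "hi", "hello", "test", "ping"}

def extract_prompt(payload: dict) -> str:
    # OpenAI-compatible: {"model": ..., "messages": [{"role": "user", "content": "hi"}]}
    if "messages" in payload:
        texts = [m.get("content", "") for m in payload["messages"] if isinstance(m, dict)]
        return " ".join(t for t in texts if isinstance(t, str)).strip()
    # Gemini-style: {"contents": [{"parts": [{"text": "hi"}]}]}
    if "contents" in payload:
        parts = [p.get("text", "") for c in payload["contents"] for p in c.get("parts", [])]
        return " ".join(p for p in parts if isinstance(p, str)).strip()
    return ""

def looks_like_probe(payload: dict) -> bool:
    prompt = extract_prompt(payload).lower()
    return prompt in PROBE_STRINGS or len(prompt) < 4

# Synthetic examples of the two request shapes:
openai_style = {"model": "gpt-4o", "messages": [{"role": "user", "content": "hi"}]}
gemini_style = {"contents": [{"parts": [{"text": ""}]}]}
print(looks_like_probe(openai_style), looks_like_probe(gemini_style))  # True True
```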
Significantly, the scanning infrastructure associated with this second campaign has been previously linked to widespread vulnerability exploitation activities. This crucial piece of intelligence shifts the balance away from the "grey-hat" hypothesis, strongly suggesting that the enumeration is not merely academic research but rather a calculated, organized reconnaissance effort by malicious actors to catalog accessible LLM services for future exploitation. The systematic nature, coupled with prior associations with illicit activities, points towards a well-resourced and determined adversary.
While GreyNoise did not report observing active exploitation, data theft, or direct model abuse following the discovery, the volume and persistence of the activity strongly suggest malicious intent. As the researchers put it, "Eighty thousand enumeration requests represent investment. Threat actors don’t map infrastructure at this scale without plans to use that map." That investment signals preparation for future attacks, with objectives ranging from using LLMs to generate convincing phishing content, spread misinformation, or bypass content filters, to extracting proprietary data or hijacking computational resources for illicit activities such as cryptocurrency mining. The long-term implications for organizations hosting or relying on these LLM services are substantial, ranging from financial losses due to unauthorized compute usage to severe reputational damage and intellectual property compromise.
To fortify defenses against this evolving threat, several proactive measures are strongly recommended. Organizations should restrict Ollama model pulls to trusted and verified registries and enforce strict allowlisting policies. Egress filtering is essential to prevent outbound connections from internal systems to unauthorized external destinations, thwarting data exfiltration and command-and-control communications. At the network level, blocking known OAST callback domains through DNS filtering can neutralize a common reconnaissance technique.
Defenses against high-volume enumeration include rate-limiting traffic from suspicious Autonomous System Numbers (ASNs) and continuously monitoring for JA4 network fingerprints linked to automated scanning tools. JA4 fingerprints characterize a client's TLS and related network behavior, enabling security teams to detect and block traffic associated with known malicious scanners.
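As one illustration of the DNS-filtering recommendation, the sketch below refuses to resolve hostnames under zones commonly associated with interactsh-style OAST infrastructure. The zone list is an assumption for demonstration and should be sourced and maintained from current threat intelligence.

```python
# Hedged sketch of a DNS-filtering rule: refuse to resolve hostnames under zones
# commonly associated with interactsh-style OAST callbacks. The entries below are
# assumptions for illustration; maintain the list from an authoritative source.
BLOCKED_OAST_ZONES = {"oast.fun", "oast.live", "oast.site", "oast.online", "oast.pro", "oast.me"}

def is_blocked(hostname: str) -> bool:
    labels = hostname.lower().rstrip(".").split(".")
    # Match the queried name against every parent zone, e.g. "x.y.oast.fun" -> "oast.fun".
    return any(".".join(labels[i:]) in BLOCKED_OAST_ZONES for i in range(len(labels)))

for name in ("abc123.oast.fun", "api.openai.com"):
    print(name, "-> blocked" if is_blocked(name) else "-> allowed")
```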
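A minimal sketch of how those two controls might combine, assuming the upstream network stack can supply an ASN and a JA4 string for each request: traffic matching a known-scanner fingerprint is dropped outright, and each ASN is held to a fixed per-minute budget. The fingerprint value and the threshold are placeholders, not indicators from the campaign.

```python
# Hedged sketch: per-ASN sliding-window rate limiting combined with a JA4 blocklist.
# The fingerprint string and the 100-requests-per-minute budget are placeholders;
# real deployments would source both from threat intelligence feeds.
import time
from collections import defaultdict, deque

KNOWN_SCANNER_JA4 = {"t13d1516h2_8daaf6152771_e5627efa2ab1"}  # placeholder fingerprint
MAX_REQUESTS_PER_MINUTE = 100

_windows: dict[int, deque] = defaultdict(deque)

def allow_request(asn: int, ja4: str, now: float | None = None) -> bool:
    if ja4 in KNOWN_SCANNER_JA4:
        return False                              # drop traffic matching scanner fingerprints
    now = time.monotonic() if now is None else now
    window = _windows[asn]
    while window and now - window[0] > 60:        # keep only the last 60 seconds
        window.popleft()
    if len(window) >= MAX_REQUESTS_PER_MINUTE:
        return False                              # this ASN exceeded its per-minute budget
    window.append(now)
    return True

# Example: the 101st request from one ASN inside a minute is rejected.
print(all(allow_request(64512, "benign_ja4", now=0.0) for _ in range(100)))  # True
print(allow_request(64512, "benign_ja4", now=1.0))                           # False
```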
Beyond these immediate tactical recommendations, a more comprehensive security posture is essential. This includes adopting a zero-trust architecture for AI infrastructure, where no entity, inside or outside the network perimeter, is trusted by default. Robust authentication mechanisms, including multi-factor authentication, must be enforced for all proxy and LLM access points. Regular security audits, penetration testing, and continuous vulnerability management are crucial for identifying and remediating misconfigurations before they can be exploited. Organizations must also prioritize secure configuration management, ensuring that default settings are hardened and that unnecessary services or ports are closed. Finally, investing in AI-specific security frameworks, such as those focusing on input/output validation for LLMs, protecting fine-tuning data, and monitoring for anomalous model behavior, will be vital in safeguarding the integrity and security of advanced AI systems against sophisticated and persistent cyber threats. The era of AI demands a proactive and adaptive cybersecurity strategy to protect these invaluable digital assets.
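To make the authentication point concrete, here is a minimal sketch of the kind of bearer-token check a proxy could enforce before forwarding anything to an LLM backend; the token store is a placeholder, and real deployments would add rotation, scoping, and multi-factor controls on top.

```python
# Hedged sketch: a minimal bearer-token check a proxy could enforce in front of an
# LLM backend. The token store is a placeholder; production systems need rotation,
# scoping, and multi-factor controls on top.
import hmac

VALID_TOKENS = {"example-rotate-me"}  # placeholder; use a real secret store in practice

def is_authorized(headers: dict) -> bool:
    auth = headers.get("Authorization", "")
    if not auth.startswith("Bearer "):
        return False
    presented = auth[len("Bearer "):].encode()
    # Constant-time comparison against each known token to avoid timing side channels.
    return any(hmac.compare_digest(presented, t.encode()) for t in VALID_TOKENS)

print(is_authorized({"Authorization": "Bearer example-rotate-me"}))  # True
print(is_authorized({}))                                             # False
```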