How to ensure you use a safe PyPI package?

Ori Zerah
Jun 10, 2026 | 5 Minutes

Key takeaways

  1. Every pip install is a trust decision. PyPI has no publish-time gatekeeping - any account with publish rights can release a new version of a trusted package. When pip builds a package from source, setup.py and build hooks execute automatically at install time, before your application ever imports the library.

  2. Pinning to a specific version is not sufficient protection on its own. Attackers publish malicious code under new, legitimate-looking version numbers, and any unpinned transitive dependency, version range, or routine upgrade will pick them up. A package that was clean at 0.23.2 can have a compromised 0.23.3 in the registry within hours.

  3. PyPI supply chain attacks in 2026 are hitting packages with hundreds of millions of downloads. Mistral, LiteLLM, elementary-data, and Microsoft's durabletask SDK were all compromised using the same pattern - trusted maintainer pipeline abused, malicious version published, credentials silently exfiltrated on import or install.

  4. Traditional defenses - scanning, pip-audit, dependency pinning - are reactive. They catch known threats after packages have already arrived. Against a novel malicious payload in a freshly compromised trusted package, they have no coverage.

  5. Echo Libraries acts as a transparent, trusted PyPI source. Developers continue using pip install exactly as they do today - Echo silently blocks malicious and vulnerable package versions before they can be pulled, with critical CVEs patched within 7 days under a strict SLA.

What is Python

Python is one of the world's most widely used programming languages, with a particularly dominant position in data science, machine learning, backend web development, automation, and security tooling. Its popularity is inseparable from its ecosystem: instead of writing every function from scratch, Python developers compose applications from thousands of pre-built libraries available on PyPI - the Python Package Index.

What makes Python's library ecosystem both powerful and consequential from a security standpoint is the depth of that dependency graph. A typical Python application - especially one in the ML or data space - may have hundreds of transitive dependencies: packages that your direct dependencies depend on, which in turn depend on others. Many of those packages execute code at install time through setup.py scripts or build hooks, before the library is ever imported into your application. That means the act of installing a Python library is itself a code execution event - one that happens silently, automatically, in developer environments and CI pipelines alike.

What is a Python library

A Python library - also called a Python package - is a reusable module or collection of modules published to PyPI that developers install with pip install. Libraries range from utilities like requests (HTTP) and boto3 (AWS SDK) to large frameworks like django, fastapi, and pytorch. The PyPI registry hosts over 500,000 packages, collectively downloaded billions of times per month.

The architecture of the Python packaging system creates several properties that are security-relevant:

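  • setup.py and the build backend declared in pyproject.toml execute at install time. When pip builds a package from a source distribution, a setup.py with arbitrary code runs during pip install, before the package is imported anywhere. (pyproject.toml itself is declarative; the code that runs is the build backend it names. Prebuilt wheels skip the build step, but their code still runs on import.) This is the primary mechanism for supply chain attack payloads in Python - code that exfiltrates credentials, establishes persistence, or downloads secondary payloads executes the moment the package is installed.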
  • .pth files can establish persistence. A malicious package can write a .pth file to Python's site-packages directory. Python's site module executes any line in that file that begins with import on every interpreter startup, so the code keeps running - not just at install time - until the file is manually removed.
  • Transitive dependencies are installed automatically. A dependency in your requirements.txt may itself depend on dozens of packages. Each of those is pulled and executed without explicit developer review.
  • PyPI has no publish-time code review. Any account with valid credentials can publish any version of any package it owns. There is no human review gate between a compromised maintainer account and a malicious version reaching the registry.
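To make the install-time distinction concrete, here is a minimal sketch that sorts artifact filenames into wheels (unpacked without running a build) and source distributions (built locally, which runs the project's build code). The function names and example filenames are hypothetical, and the check relies on filename suffixes only, so treat it as an illustration rather than a complete policy.

```python
from pathlib import PurePosixPath

# File suffixes used by source distributions. Installing one of these makes
# pip run the project's build backend (and setup.py, if present) locally.
SDIST_SUFFIXES = (".tar.gz", ".tar.bz2", ".zip")


def classify_artifact(filename: str) -> str:
    """Return 'wheel', 'sdist', or 'unknown' based on the artifact's filename."""
    name = PurePosixPath(filename).name.lower()
    if name.endswith(".whl"):
        return "wheel"
    if name.endswith(SDIST_SUFFIXES):
        return "sdist"
    return "unknown"


def needs_review(filenames):
    """Keep artifacts that are not wheels, i.e. that would or might run build code."""
    return [f for f in filenames if classify_artifact(f) != "wheel"]


if __name__ == "__main__":
    # Hypothetical filenames, for illustration only.
    candidates = [
        "example_pkg-1.0.0-py3-none-any.whl",
        "example_pkg-1.0.0.tar.gz",
    ]
    for name in needs_review(candidates):
        print("builds from source:", name)
```

pip's own --only-binary :all: option enforces the same idea at install time by refusing source distributions. It does nothing about import-time payloads or .pth files shipped inside a wheel.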

Python packages and the supply chain attack surface

Python packages are the primary delivery mechanism for the supply chain attacks hitting production environments in 2026. The reason is structural: PyPI's openness and the automatic code execution at install time create a combination that attackers have learned to exploit systematically.

The attack surface is broader than most teams realize:

  • Direct dependencies - the packages explicitly listed in your requirements.txt or pyproject.toml
  • Transitive dependencies - everything those packages depend on, recursively, most of which your team has never reviewed
  • Dev and build dependencies - tools like linters, test runners, and build utilities installed in CI environments that have access to the same secrets and credentials as production pipelines
  • Docker base image packages - Python packages pre-installed in your base image that are pulled and executed as part of every container build

Each layer represents a trust assumption. A compromise anywhere in that graph - including in a package your application never directly imports - can result in credential theft, lateral movement through CI environments, or persistent backdoors in deployed containers.
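One way to see how deep the graph goes is to walk the requirements that installed packages declare. The sketch below uses the standard library's importlib.metadata. The helper names are hypothetical, and the handling of optional extras is simplified (any requirement gated on an extra is skipped), so the result approximates what pip would actually resolve.

```python
import re
from importlib import metadata


def normalize(name: str) -> str:
    """Normalize a project name the way PyPI compares names."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_name(requirement: str) -> str:
    """Extract the bare project name from a string like 'urllib3<3,>=1.21.1'."""
    match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", requirement.strip())
    return normalize(match.group(0)) if match else ""


def installed_requirements(name: str) -> list:
    """Declared requirements of an installed distribution (empty if not installed)."""
    try:
        return metadata.requires(name) or []
    except metadata.PackageNotFoundError:
        return []


def transitive_dependencies(package: str, lookup=installed_requirements) -> set:
    """Return every project reachable from `package` through declared requirements."""
    root = normalize(package)
    seen = set()
    stack = [root]
    while stack:
        current = stack.pop()
        for requirement in lookup(current):
            # Simplification: ignore requirements that only apply to optional extras.
            if re.search(r"\bextra\s*==", requirement):
                continue
            name = requirement_name(requirement)
            if name and name not in seen:
                seen.add(name)
                stack.append(name)
    seen.discard(root)
    return seen


if __name__ == "__main__":
    deps = transitive_dependencies("pip")
    print(f"pip pulls in {len(deps)} other installed projects")
```

Running this against a real top-level dependency in an ML project usually shows how few of the reachable projects anyone on the team chose explicitly.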

Hardened Python libraries

A hardened Python library is a version of an open source Python package that has been independently processed, verified, and secured before it's made available for installation - rather than pulled directly from PyPI where anyone can publish.

The hardening process addresses the specific attack vectors that the PyPI ecosystem creates:

Malware sandboxing at install time

The most dangerous property of Python packages is that setup.py and build hooks execute automatically during pip install. A hardened library source executes every package in an isolated sandbox before making it available - catching credential-stealing payloads, reverse shells, and persistence mechanisms before they reach any real environment. This is the only defense that works against novel payloads in freshly compromised packages, because it's behavioral rather than signature-based.

CVE remediation on the version you actually use

Alerting on a known vulnerability is not the same as fixing it. Hardened Python libraries have critical and high CVEs patched on the exact version your application requires - without changing the version number in your requirements.txt or breaking your dependency tree. This eliminates the scanner-alert → ticket → manual upgrade cycle that consumes security and engineering time.

Upstream project and maintainer health monitoring

The pattern preceding every major PyPI supply chain attack in 2026 included observable signals: a publisher account change, an unusual new release outside the normal cadence, a new CI workflow with elevated permissions. Continuous monitoring of these signals - and automatic quarantine of packages that trigger them - is what separates a proactive hardened source from a reactive scanner.
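As an illustration of just one of those signals, the sketch below flags a release that lands far sooner than a project's usual gap between releases. The function name, the minimum history of three releases, and the 10% threshold are all assumptions chosen for the example; a real monitor would combine several signals and tune its thresholds per project.

```python
from datetime import datetime
from statistics import median


def is_unusual_release(previous: list[datetime], new_release: datetime,
                       factor: float = 0.1) -> bool:
    """Flag a release that arrives much sooner than the project's typical gap.

    `previous` holds earlier release timestamps; `factor` is the fraction of
    the median gap below which the new release is considered out of cadence.
    """
    if len(previous) < 3:
        return False  # not enough history to estimate a cadence
    ordered = sorted(previous)
    gaps = [(later - earlier).total_seconds()
            for earlier, later in zip(ordered, ordered[1:])]
    typical_gap = median(gaps)
    latest_gap = (new_release - ordered[-1]).total_seconds()
    return latest_gap < typical_gap * factor


if __name__ == "__main__":
    # Hypothetical project that normally releases about once a month.
    history = [datetime(2026, month, 1) for month in (1, 2, 3, 4)]
    print(is_unusual_release(history, datetime(2026, 4, 1, 3)))  # hours after the last release
    print(is_unusual_release(history, datetime(2026, 5, 1)))     # on the usual schedule
```

A rapid follow-up release is often a legitimate hotfix, so a signal like this is better suited to triggering quarantine and review than to blocking outright.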

Independent provenance attestation

A hardened library should be built and signed on infrastructure that is independent of the original maintainer's pipeline. Signing by the original publisher's CI only proves the artifact came from that CI - it doesn't prove the CI wasn't compromised. Independent attestation closes that gap.

.pth file and persistence detection

Python-specific persistence mechanisms - particularly .pth files written to site-packages - require specific detection logic that generic malware scanners don't always cover. A hardened Python library source should specifically analyze for these patterns.
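A minimal sketch of what that detection logic can look like locally: Python's site module executes any line in a .pth file that starts with import, so the script below lists every such line under the interpreter's site-packages directories. The function names are hypothetical. Some legitimate tools, setuptools among them, also ship .pth files with import lines, so a hit is a prompt for review and not proof of compromise.

```python
import site
from pathlib import Path


def executable_lines(pth_file: Path) -> list[str]:
    """Return the lines of a .pth file that the site module would execute."""
    text = pth_file.read_text(encoding="utf-8", errors="replace")
    return [line for line in text.splitlines()
            if line.startswith(("import ", "import\t"))]


def default_site_dirs() -> list[str]:
    """Collect the global and per-user site-packages directories."""
    dirs = set(site.getsitepackages())
    dirs.add(site.getusersitepackages())
    return sorted(dirs)


def scan_for_pth_code(directories=None) -> dict[str, list[str]]:
    """Map each .pth file that contains executable lines to those lines."""
    findings = {}
    for directory in (default_site_dirs() if directories is None else directories):
        root = Path(directory)
        if not root.is_dir():
            continue
        for pth_file in sorted(root.glob("*.pth")):
            lines = executable_lines(pth_file)
            if lines:
                findings[str(pth_file)] = lines
    return findings


if __name__ == "__main__":
    for path, lines in scan_for_pth_code().items():
        print(path)
        for line in lines:
            print("   ", line[:120])  # truncate long one-liners for readability
```

Run it with the same interpreter (or inside the same virtual environment) you want to inspect, since each environment has its own site-packages.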

SBOM and VEX documentation

A Software Bill of Materials and Vulnerability Exploitability eXchange document provide audit-ready evidence of what's in every package version, what vulnerabilities are present, and which have been remediated. Essential for compliance, customer security reviews, and incident response.
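For a sense of what the SBOM side contains, here is a rough sketch that builds a component inventory from the packages installed in the current environment. The field names are loosely modeled on CycloneDX, but the output is not validated against that schema, the helper names are hypothetical, and nothing here covers VEX data. It is an illustration of the shape, not a compliant document.

```python
import json
from importlib import metadata


def purl(name: str, version: str) -> str:
    """Build a package URL for a PyPI project (lowercase, underscores as dashes)."""
    return f"pkg:pypi/{name.lower().replace('_', '-')}@{version}"


def build_inventory(distributions=None) -> dict:
    """Return a CycloneDX-style inventory for (name, version) pairs.

    With no argument, the pairs come from the current Python environment.
    """
    if distributions is None:
        distributions = [
            (dist.metadata["Name"], dist.version)
            for dist in metadata.distributions()
            if dist.metadata["Name"]  # skip entries with broken metadata
        ]
    components = [
        {"type": "library", "name": name, "version": version,
         "purl": purl(name, version)}
        for name, version in sorted(distributions)
    ]
    return {
        "bomFormat": "CycloneDX",  # assumption: loosely follows the format
        "specVersion": "1.5",
        "version": 1,
        "components": components,
    }


if __name__ == "__main__":
    print(json.dumps(build_inventory(), indent=2))
```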

PyPI security news - the attacks you need to know about

The PyPI threat landscape in 2026 has shifted decisively from fake packages to compromised trusted ones. The attacks making headlines are not typosquatting attempts targeting inattentive developers - they are systematic compromises of packages with millions or hundreds of millions of monthly downloads, using the same playbook repeated at scale.

Mistral on PyPI - credential theft with geographic logic

Microsoft's Threat Intelligence team identified a compromised version of the Mistral package on PyPI. The malicious code executed on import on Linux systems - not at install time, but every time the library was imported - downloading a file called transformers.pyz designed to look like Hugging Face's Transformers library. In practice, it was a credential stealer targeting SSH keys, cloud credentials, API tokens, and environment variables.

What made this incident particularly notable was the geographic logic embedded in the malware: systems configured in Russian were skipped entirely. On systems identified as Israeli or Iranian, there was a 1-in-6 probability the payload ran rm -rf / - a full system wipe. The attack was not indiscriminate; it was targeted at specific operator profiles.

LiteLLM - malicious version from a compromised maintainer account

LiteLLM, a widely used Python library for interacting with LLM APIs, had a malicious version published from a compromised maintainer account. The payload executed on Python startup - via a .pth file written to site-packages - silently harvesting environment variables, cloud credentials, and API keys from every Python process on the affected system. The persistence mechanism meant the malware continued running after the malicious package version was removed, until the .pth file was explicitly identified and deleted.

Elementary-data - a forged release from a GitHub Actions exploit

The elementary-data incident illustrated how far attackers have moved from direct credential theft. An attacker posted a comment on an open pull request exploiting a script injection vulnerability in a GitHub Actions workflow. That was sufficient to grab the repository's GITHUB_TOKEN, forge a signed release commit, and trigger the legitimate PyPI publishing pipeline.

The resulting malicious version (0.23.3) of a package with approximately 1.1 million monthly downloads executed a payload on every Python startup, harvesting SSH keys, git credentials, PyPI and npm tokens, AWS credentials including live calls to Secrets Manager and SSM, Kubernetes configs, Docker credentials, and .env files. The GitHub release was named with random gibberish and created by github-actions[bot] - a signal that was visible in the release history but not in any install workflow.

Microsoft durabletask SDK - credential-stealing payloads in an official SDK

Microsoft's official durabletask SDK on PyPI shipped credential-stealing payloads following a supply chain compromise. This was an official SDK from a major vendor, distributed through the legitimate PyPI pipeline, and it contained malware. The incident reinforced that no publisher - regardless of size or reputation - is immune to the compromise pattern once an attacker has access to their publishing infrastructure.

The broader campaign - 500 million downloads affected

Across the combined npm and PyPI incidents linked to the Mini Shai-Hulud campaign and related activity in 2026, over 500 million combined downloads were affected. The PyPI component alone spans dozens of packages, with 373 malicious versions identified across the campaign at the time of writing.

The alternative - Echo Libraries for Python

The answer to the PyPI supply chain threat isn't to audit every package manually or restrict Python usage. It's to replace the public registry as your direct artifact source with a vetted pipeline that has already done that work - continuously, automatically, before any package version reaches your environment.

Echo Libraries provides a trusted, continuously maintained PyPI artifact source. Every Python package version in Echo's catalog has been independently processed before it's made available. The malicious versions of Mistral, LiteLLM, elementary-data, and durabletask were never available to Echo customers - not because they were detected and quarantined reactively, but because the compromise signals were identified and the versions blocked before they could be pulled.

What Echo's processing covers for Python specifically:

  • setup.py and build hook sandboxing - every package version is first installed in an isolated environment, so payloads that trigger during pip install are detected before they reach any real system
  • Import-time payload detection - packages are also analyzed for malicious code that executes on import rather than install, catching the Mistral-style attack pattern
  • .pth persistence detection - Echo specifically analyzes for .pth files and other Python-specific persistence mechanisms written to site-packages
  • Critical and high CVE remediation within 7 days - patched on the exact version your requirements.txt specifies, with no version change required
  • Upstream maintainer and project health monitoring - publisher account changes, unusual release patterns, and new CI workflows with elevated permissions are monitored continuously; packages showing compromise signals are quarantined before they're pullable
  • Independent provenance attestation - Python packages are built and signed on Echo's own secure infrastructure, separate from the original maintainer's pipeline, closing the gap that signing-by-compromised-CI leaves open
  • SBOM and VEX delivery - in standard formats, for every package version

Echo integrates with your existing repository infrastructure - Nexus, JFrog Artifactory, GitHub Packages, and others - so the vetted source slots in without re-architecting anything.

How Echo protects every pip install

The most important thing to understand is what Echo doesn't change.

Developers continue using pip install exactly as they do today. The same package names. The same version numbers. The same requirements.txt and virtual environment workflow. No new CLI. No new commands. No changes to how developers work day to day.

The one-time configuration is pointing pip or your internal registry at Echo's vetted artifact endpoint instead of PyPI directly. After that, every pip install requests or pip install mistral routes through Echo's vetted source. Echo has already:

  • Verified the version is clean - no malicious install-time scripts, no import-time payloads, no .pth persistence mechanisms
  • Patched any critical or high CVEs present in that version, under a strict 7-day SLA for critical vulnerabilities
  • Confirmed provenance is consistent with the expected publisher and build pipeline
  • Checked that no upstream compromise signals have been detected for that package or maintainer

If a version fails any check, it is blocked. The developer or pipeline gets a clear signal that the version isn't available through Echo's trusted source - not a silent install of malicious code. No version of the Mistral malware, the LiteLLM backdoor, or the elementary-data credential stealer was ever pullable by an Echo customer.
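Because the whole model depends on that one-time configuration actually being in place, a CI job can sanity-check it. The sketch below reads pip's index-url setting from a pip.conf-style text and the PIP_INDEX_URL environment variable (which pip gives precedence over config files) and reports whether installs would still go to public PyPI. The function names are hypothetical and the internal URL is a placeholder, not Echo's actual endpoint. It also ignores command-line flags, other config sections, and extra-index-url, each of which can still route requests to PyPI.

```python
import configparser
import os
from urllib.parse import urlparse

PUBLIC_PYPI_HOSTS = {"pypi.org", "pypi.python.org", "files.pythonhosted.org"}
DEFAULT_INDEX = "https://pypi.org/simple"


def effective_index_url(pip_conf_text: str, environ=None) -> str:
    """Resolve index-url from PIP_INDEX_URL first, then the [global] config section."""
    environ = os.environ if environ is None else environ
    if environ.get("PIP_INDEX_URL"):
        return environ["PIP_INDEX_URL"]
    parser = configparser.RawConfigParser()  # raw: URLs may contain '%'
    parser.read_string(pip_conf_text)
    return parser.get("global", "index-url", fallback=DEFAULT_INDEX)


def uses_public_pypi(index_url: str) -> bool:
    """True if the index URL points at the public PyPI service."""
    return urlparse(index_url).hostname in PUBLIC_PYPI_HOSTS


if __name__ == "__main__":
    # Placeholder endpoint for illustration only.
    conf = "[global]\nindex-url = https://pypi.example.internal/simple\n"
    url = effective_index_url(conf, environ={})
    print(url, "-> public PyPI" if uses_public_pypi(url) else "-> vetted source")
```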

For the full picture of how Echo Libraries works across Python, npm, Java, Ruby, and Go ecosystems, visit echo.ai/product/libraries. For detailed coverage of the incidents described in this post, see the March 2026 supply chain attacks breakdown and the CVE-2026-7482 Ollama vulnerability analysis.

FAQ

What is a Python library and why does it matter for security?

A Python library is a reusable package published to PyPI that developers install with pip install. What makes Python libraries security-critical is that setup.py and build hooks execute automatically at install time - meaning installing a package is a code execution event. A compromised library can exfiltrate credentials, establish persistence via .pth files, or download secondary payloads the moment it's installed, before the package is ever imported into your application.

What is PyPI and what are its security limitations?

PyPI is the Python Package Index - the official registry hosting over 500,000 Python packages. Its primary security limitation is that it has no publish-time code review or gatekeeping. Any account with valid credentials can publish any version of any package it owns. This means a compromised maintainer account, a hijacked CI pipeline, or an abused GitHub Actions workflow can produce a malicious version of a trusted, widely used package and have it available for pip install within minutes.

What are hardened Python libraries?

Hardened Python libraries are versions of open source Python packages that have been independently processed through a secure pipeline before installation. The hardening process includes malware sandboxing at install and import time, CVE remediation on the required version, upstream maintainer health monitoring, independent provenance attestation, and Python-specific persistence detection (.pth files, build hook analysis). The result is the same package name and version your application requires, with security guarantees the public PyPI registry cannot provide.

How does Echo Libraries protect against PyPI supply chain attacks?

Echo operates as a transparent, trusted artifact source between your pip install commands and PyPI. Every package version is sandboxed for malicious behavior at both install and import time, has critical CVEs patched within a 7-day SLA, and is monitored for upstream compromise signals - before it's made available. Malicious versions are blocked entirely. Developers see no change in workflow. The Mistral, LiteLLM, and elementary-data malicious versions were never available to Echo customers because they were blocked before any customer could pull them.

What should my team do after a suspected PyPI supply chain compromise?

Treat any environment that ran pip install during the attack window as potentially compromised. Rotate all credentials immediately - PyPI tokens, GitHub tokens, AWS/GCP/Azure keys, SSH keys, and any other secrets accessible from that environment. Audit Kubernetes secrets and cloud IAM access. Check site-packages for unexpected .pth files and remove them - these persist after the malicious package is uninstalled. Review Python process logs for unexpected outbound connections. Going forward, a vetted artifact source like Echo means this forensic work becomes unnecessary.
