litellm
An open-source LLM proxy and AI gateway that provides a unified OpenAI-compatible API across 100+ model providers — enabling teams to route, load balance, rate limit, and observe LLM traffic from a single control point.
What is LiteLLM?
The LiteLLM image packages the LiteLLM proxy server so you can run a self-hosted AI gateway in a container without managing Python environments or dependency conflicts on the host. LiteLLM exposes a single OpenAI-compatible REST API and translates requests at runtime to the native format of whichever model provider you configure — OpenAI, Anthropic, Azure OpenAI, Google Vertex AI, AWS Bedrock, Cohere, Mistral, Ollama, and over a hundred others — so application code never needs to change when you add, swap, or failover between providers. Beyond protocol translation, the proxy handles API key management and virtual key issuance so teams can give each application or user a scoped key without exposing provider credentials directly; budget and rate limiting per key, team, or model; request and response logging to observability platforms like Langfuse, Helicone, and S3; load balancing and fallback routing across multiple deployments of the same model; and a built-in spend tracking dashboard. LiteLLM is used by AI platform teams building internal developer portals for LLM access, MLOps teams that need centralized control over which models different services can call and at what cost, and organizations that want to abstract provider dependencies out of application code so that model selection remains an infrastructure decision rather than a code change.
What is Echo's LiteLLM image?
Echo's LiteLLM image is a hardened build of LiteLLM on Echo's hardened base. Echo images are designed to be a drop-in replacement: swap the image reference in your LiteLLM deployment manifest or Compose file and CVEs go to zero without disrupting your provider routing, virtual key configuration, or budget enforcement rules. Every image is tested across clouds, image use cases, and deployment targets. Echo ships every image in two variants:
- Distroless variant — optimized for runtime use, with the smallest possible attack surface
- Default variant — includes essential build tools, package managers, and shells for teams that need operational access
For production LiteLLM deployments, the distroless variant keeps all proxy operations — provider routing, virtual key validation, rate limit enforcement, spend tracking, and observability integrations — fully intact while minimizing exposure; the default variant suits platform teams that need shell access for config file debugging, provider credential troubleshooting, or inspecting LiteLLM's routing and fallback behavior at runtime.
What is the difference between Echo's LiteLLM image and the public LiteLLM image?
Public LiteLLM images ship on bases that include OS-level tooling convenient for development but which contribute CVEs that accumulate on a proxy running continuously in production with access to every provider API key in your organization, every prompt and completion your applications send, and the budget and rate limit controls that govern how your teams consume AI infrastructure. AI gateways occupy an unusually sensitive position in the stack — LiteLLM sits in the critical path of every LLM call your applications make, holding provider credentials for OpenAI, Anthropic, Azure, Bedrock, and others simultaneously, with visibility into request payloads that may contain user data, internal context, and proprietary prompts. A compromised LiteLLM image doesn't just expose one service; it can exfiltrate all provider keys, log every prompt and completion passing through the gateway, manipulate model routing to redirect traffic, or silently alter responses before they reach your applications. Echo's build retains everything LiteLLM needs for provider routing, virtual key management, rate limiting, spend tracking, and observability forwarding while removing the packages that don't belong in a production AI gateway container. As we covered in our post on how to protect your company from software supply chain attacks, infrastructure that sits in the critical path of sensitive data — and AI gateways handle exactly that — is where supply chain compromises cause the most damage and are the hardest to detect. Echo commits to a 7-day SLA for critical and high severity vulnerabilities, and 10 days for medium, low, and unknown — with vulnerabilities triaged within 24 hours. Echo images are recognized by all major scanners and mirrored to all major registries, so they fit into existing pipelines without changing your registry, scanner, or runtime tooling.
FAQ
Can I replace my LiteLLM image with Echo's LiteLLM image?
Yes. Echo's LiteLLM image is a drop-in replacement. Update the image reference in your deployment manifest, Compose file, or Dockerfile FROM line and your proxy keeps running. Provider routing, virtual key issuance and validation, budget and rate limit enforcement, load balancing and fallback configuration, spend tracking, and observability integrations all continue to work without any changes to your existing LiteLLM config file, environment variables, or application API calls.
Is Echo's LiteLLM image FIPS-validated?
Yes. Echo's FIPS-validated images use cryptographic modules with an active FIPS 140-3 CMVP certificate, making them fit for federal use — unlike FIPS-compliant images that haven't been validated. This matters for platform teams running LiteLLM inside FedRAMP boundaries where a proxy that authenticates to external model provider APIs, transmits prompt and completion data over TLS, and manages credentials for multiple providers must meet cryptographic requirements at every layer, not just in the application code.
What is Echo's vulnerability management SLA on the LiteLLM image?
Echo commits to a 7-day SLA for critical and high severity vulnerabilities, and 10 days for medium, low, and unknown — with vulnerabilities triaged within 24 hours. Patches are mirrored automatically into your private registry so you're always running a clean version — critical for an AI gateway image that runs continuously in production, holds provider API keys for your entire organization, and sits in the critical path of every LLM call your applications make.
Is Echo's LiteLLM image distroless?
Echo ships every image in two variants: a distroless variant optimized for runtime use, and a default variant that includes essential build tools, package managers, and shells. For production LiteLLM deployments where the proxy runs as a long-lived service, the distroless variant is the leaner, more secure choice; for platform teams that need shell access for config validation, provider credential debugging, or inspecting routing and fallback behavior, the default variant is the right fit.
How does Echo achieve such a drastic CVE reduction in LiteLLM?
Echo's LiteLLM image is built from source with only the absolute essentials needed to run the LLM proxy and AI gateway workload, which significantly shrinks the attack surface. Echo also patches aggressively over time, with backports available so you can stay on the LiteLLM version your provider configurations and virtual key setup are built against without forcing a disruptive upgrade that risks breaking routing rules or budget enforcement behavior across your platform.
Will Echo's LiteLLM image help us achieve FedRAMP?
Yes. The hard parts of FedRAMP — managing vulnerabilities, applying fixes, and using FIPS-validated cryptography — are baked into Echo images, including STIG-hardened configuration and ConMon/POA&M-ready reporting. For platform teams running LiteLLM as the AI gateway layer under an ATO — routing LLM traffic, managing provider credentials, and enforcing access controls for government or regulated workloads — Echo's hardened LiteLLM image keeps the gateway in-boundary and compliant without requiring custom hardening or manual patching between compliance cycles.
.avif)