The LiteLLM AI Gateway Attack: A Supply Chain Cautionary Tale
In March 2026, a supply chain attack targeted LiteLLM, a popular Python library used as a multifunctional gateway for AI agents. Attackers injected malicious code into two versions of the library on the PyPI repository, aiming to steal sensitive data from servers. The incident highlights the growing risk of compromised open-source components. Below, we answer key questions about how the attack unfolded, what data was targeted, and the technical methods used, with practical takeaways for developers and security teams.
How did the attackers compromise the LiteLLM package?
The compromise targeted the distribution channel, PyPI. On March 24, 2026, attackers uploaded trojanized versions of LiteLLM (litellm==1.82.7 and litellm==1.82.8) to the official registry. The tampering was in the distribution packages themselves rather than in the library's source code. In version 1.82.7, the harmful code was embedded in proxy_server.py, while in version 1.82.8, an extra file named litellm_init.pth was added. Both versions reached the registry without being flagged by typical checks, underscoring how easily compromised packages can enter trusted repositories.
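As a first triage step, an environment can be checked for the two versions named above. The following is a minimal sketch, assuming the package is installed under the distribution name litellm; the helper names installed_version and is_compromised are hypothetical and chosen for illustration only.

```python
from importlib import metadata

# Versions this article names as trojanized.
COMPROMISED_VERSIONS = {"1.82.7", "1.82.8"}


def installed_version(package="litellm"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def is_compromised(version, bad_versions=COMPROMISED_VERSIONS):
    """True if the given version string is one of the known-bad releases."""
    return version is not None and version.strip() in bad_versions


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("litellm is not installed in this environment")
    elif is_compromised(version):
        print(f"WARNING: litellm {version} is a known-compromised release")
    else:
        print(f"litellm {version} is not one of the listed releases")
```

A clean result only says that the current environment does not hold one of these two releases; it does not show that the environment never ran one.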

What was the primary goal of the attackers?
Technical analysis revealed that the attackers were primarily after confidential data from servers. Their focus included credentials and configurations for AWS, Kubernetes, NPM, and various databases such as MySQL, PostgreSQL, and MongoDB. In the case of databases, they specifically sought configuration files. Additionally, the malware had functionality to steal data from crypto wallets and to establish a persistent foothold inside Kubernetes clusters. This multi-target approach shows the attackers aimed to harvest high-value infrastructure secrets.
How was the malicious code executed in each version?
Both LiteLLM versions contained identical malicious code, but execution differed. In version 1.82.7, the code ran only when the proxy functionality was imported, making it conditional and harder to detect. In version 1.82.8, the attackers used a .pth file (litellm_init.pth). Python's site module processes .pth files found in site-packages at interpreter startup and executes any line in them that begins with import. This technique ensured the malicious code ran every time the interpreter started, without requiring any specific import. The .pth method is particularly insidious because the code runs even if the main library is never imported.
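Because the site module only executes .pth lines that start with import, those lines are a reasonable thing to audit. The sketch below lists them for the interpreter's site directories. The function names are hypothetical, and a hit is not proof of compromise, since some legitimate tools also rely on import lines in .pth files.

```python
import site
from pathlib import Path


def executable_pth_lines(pth_path):
    """Return the lines of a .pth file that the site module would execute."""
    text = Path(pth_path).read_text(encoding="utf-8", errors="replace")
    # site executes lines beginning with "import" followed by a space or tab;
    # all other non-comment lines are treated as paths to add to sys.path.
    return [ln for ln in text.splitlines() if ln.startswith(("import ", "import\t"))]


def scan_site_dirs(dirs=None):
    """Map each .pth file containing executable lines to those lines."""
    if dirs is None:
        dirs = set(site.getsitepackages()) | {site.getusersitepackages()}
    findings = {}
    for directory in dirs:
        directory = Path(directory)
        if not directory.is_dir():
            continue
        for pth in sorted(directory.glob("*.pth")):
            hits = executable_pth_lines(pth)
            if hits:
                findings[str(pth)] = hits
    return findings


if __name__ == "__main__":
    for path, lines in scan_site_dirs().items():
        print(path)
        for line in lines:
            print("   ", line[:120])
```

Each reported file should be traced back to the package that installed it; an unexplained entry such as the litellm_init.pth described above deserves a closer look.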
What steps did the malware take after infection?
Once executed, the malicious script decoded a Base64-encoded payload embedded in the file. It first saved this decoded code as p.py alongside itself and immediately executed it. The p.py script then launched the main payload, another Base64-encoded script, without saving it to disk, which reduced the forensic evidence left behind. Before writing any output, the malware encrypted the results using AES-256-CBC and stored them in a file in the launch directory. This encryption concealed the stolen data, making it harder for defenders to determine what had been taken.
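Embedded Base64 payloads of this kind tend to appear as unusually long runs of Base64 characters inside a source file. The following is a rough detection sketch, not a reconstruction of the malware; the 200-character threshold and the function name find_base64_blobs are assumptions made for illustration.

```python
import base64
import binascii
import re

# Assumed threshold: runs of 200+ Base64 characters are rare in ordinary code.
B64_RUN = re.compile(r"[A-Za-z0-9+/]{200,}={0,2}")


def find_base64_blobs(text):
    """Return (offset, decoded_bytes) for each long, valid Base64 run in text."""
    blobs = []
    for match in B64_RUN.finditer(text):
        try:
            decoded = base64.b64decode(match.group(0), validate=True)
        except (binascii.Error, ValueError):
            # Not valid Base64 (for example, a long identifier or hash run).
            continue
        blobs.append((match.start(), decoded))
    return blobs


if __name__ == "__main__":
    import sys

    for path in sys.argv[1:]:
        with open(path, encoding="utf-8", errors="replace") as fh:
            for offset, decoded in find_base64_blobs(fh.read()):
                print(f"{path}: offset {offset}, {len(decoded)} decoded bytes")
```

Decoded bytes should only be inspected, never executed. Legitimate files such as embedded certificates or test fixtures can also trigger this check, so results need manual review.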

Why are supply chain attacks like this particularly dangerous?
Supply chain attacks exploit trust in widely used components. LiteLLM is integrated into numerous AI services and developer tools, so a single compromised version can propagate malware to thousands of downstream systems. The consequences range from compromising a developer’s local machine to infiltrating entire cloud infrastructures if the malicious code reaches a production service. The growing frequency of such incidents, whether through fake libraries, delayed payloads, or account takeovers, makes this one of the most impactful vectors in modern cybersecurity.
How can organizations defend against similar attacks?
To mitigate risks, organizations should implement supply chain security measures. These include using package integrity verification (e.g., checking hashes), conducting dependency audits, and maintaining an internal curated repository of approved packages. Runtime monitoring for unusual behavior—such as unexpected network connections or file writes—can detect active malware. Additionally, limiting permissions for containerized workloads (e.g., Kubernetes pods) reduces the blast radius. Finally, staying informed about disclosed vulnerabilities and quickly updating or pinning safe versions is critical. No single step is foolproof, but a layered defense significantly reduces exposure.
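The hash check mentioned above can be sketched in a few lines: compute the SHA-256 digest of a downloaded package file and compare it with a digest recorded from a trusted source. The function names and the way the expected digest is obtained are assumptions for illustration; in practice, pip's hash-checking mode (pip install --require-hashes -r requirements.txt) performs an equivalent check at install time.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """True if the file's digest equals the expected (trusted) hex digest."""
    actual = sha256_of_file(path)
    return hmac.compare_digest(actual, expected_hex.strip().lower())


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): python verify.py <package-file> <expected-sha256>
    file_path, expected = sys.argv[1], sys.argv[2]
    print("OK" if matches_expected(file_path, expected) else "MISMATCH")
```

A hash check only helps if the expected digest was recorded before the compromise, for example in a lock file committed when the dependency was last reviewed; a digest fetched alongside a trojanized package would match it.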