Attackers quickly exploited a critical LiteLLM flaw (CVE-2026-42208) to access and modify sensitive database data via SQL injection.
Attackers rapidly exploited a critical vulnerability in the LiteLLM Python package, tracked as CVE-2026-42208, just days after it became public. The vulnerability, an SQL injection in the proxy API key verification process, lets attackers access and potentially modify database data.
Instead of safely passing the key as a query parameter, the vulnerable code inserts the user-supplied value directly into the query text. This unsafe practice opens the door to SQL injection.
An attacker doesn’t need valid credentials. By sending a specially crafted Authorization header to an API endpoint (such as /chat/completions), they can manipulate the query executed by the database. Because the request flows through an error-handling path, the malicious input still reaches the vulnerable query.
“A database query used during proxy API key checks mixed the caller-supplied key value into the query text instead of passing it as a separate parameter. An unauthenticated attacker could send a specially crafted Authorization header to any LLM API route (for example POST /chat/completions) and reach this query through the proxy’s error-handling path.” reads BerriAI’s advisory. “An attacker could read data from the proxy’s database and may be able to modify it, leading to unauthorised access to the proxy and the credentials it manages.”
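The bug class is easy to illustrate with a toy example. The snippet below is hypothetical code (it is not LiteLLM's actual implementation) showing the vulnerable string-building pattern next to the parameterized fix:

```python
import sqlite3

# Toy key table standing in for the proxy's database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE keys (token TEXT, user TEXT)")
conn.execute("INSERT INTO keys VALUES ('sk-good', 'alice')")

def verify_key_unsafe(token: str):
    # Vulnerable pattern: caller-supplied value concatenated into SQL text.
    sql = f"SELECT user FROM keys WHERE token = '{token}'"
    return conn.execute(sql).fetchall()

def verify_key_safe(token: str):
    # Correct pattern: the value travels as a bound parameter.
    return conn.execute(
        "SELECT user FROM keys WHERE token = ?", (token,)
    ).fetchall()

# A crafted "API key" turns the unsafe query into a dump of every row.
payload = "' OR '1'='1"
print(verify_key_unsafe(payload))  # -> [('alice',)]  (all rows leak)
print(verify_key_safe(payload))    # -> []            (no match)
```

The parameterized version never interprets the attacker's input as SQL, which is exactly the fix the advisory describes.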
Researchers observed real-world attacks targeting sensitive information stored in database tables, highlighting how quickly disclosed flaws can turn into active threats.
The flaw affects LiteLLM versions 1.81.16 to 1.83.6 and was fixed in 1.83.7 on April 19, 2026. The Sysdig Threat Research Team reported that attackers began exploiting it about 36 hours after disclosure.
“The Sysdig Threat Research Team (TRT) observed the first exploitation attempt 36 hours and seven minutes after the advisory was published to the global database.” reads the report published by Sysdig. “The traffic the Sysdig TRT captured was not a generic SQLmap spray, which is very common in SQL injection attacks, but a deliberate, and likely customized, enumeration of the production LiteLLM schema, targeting the three tables that hold the highest-value secrets: virtual API keys, stored provider credentials, and the proxy’s environment-variable configuration.”
The attacker showed strong knowledge of LiteLLM’s database structure and quickly mapped table schemas, but researchers saw no signs of data theft or further compromise.
“We did not see follow-through, however. There were no authenticated calls using exfiltrated keys, no virtual-key minting via /key/generate, and no chained reuse of provider credentials.” continues the report. “The novelty of this finding is the speed and precision of the schema-enumeration attempt, not a confirmed compromise.”
Sysdig published indicators of compromise for attacks exploiting this vulnerability.
Users who cannot upgrade are advised to set disable_error_logs: true in the proxy’s general settings, which blocks the vulnerable error-handling path and reduces exposure.
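For deployments that cannot take the patch immediately, the mitigation maps to a one-line change in the proxy configuration. A minimal sketch is below; the key name comes from the advisory, but its exact placement in config.yaml may vary by version, so check the vendor documentation before relying on it:

```yaml
# config.yaml - LiteLLM proxy (placement is an assumption; the
# advisory names the setting but not the surrounding structure)
general_settings:
  disable_error_logs: true  # closes the error-handling path the exploit traverses
```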
In 2026, the question for security leaders is not whether a supply chain attack is coming. Every serious organization should assume it is. The question is whether their defense architecture can stop a payload it has never seen before. It’s a question that takes on even more critical implications at a time where trusted agentic automation increasingly becomes the norm.
In three weeks this spring, three threat actors each ran a tier-1 supply chain attack against widely deployed software: LiteLLM, a core AI infrastructure package, Axios, the most downloaded HTTP client in the JavaScript ecosystem, and CPU-Z, a trusted system diagnostic tool. Different vectors, different actors, different techniques. SentinelOne® stopped all three on the same day each attack launched, with no prior knowledge of any payload.
The more important story is the how. Each attack arrived as a zero-day at the moment of execution. Each exploited a trusted delivery channel: an AI coding agent running with unrestricted permissions, a phantom dependency staged eighteen hours before detonation, a properly signed binary from an official vendor domain. No signature existed for any of them. No IOA matched.
SentinelOne stopped all three. That outcome is a direct answer to the question every security leader is now running up against: What does your defense do when the attack arrives through a channel you explicitly trust, carrying a payload you have never seen before?
The AI Arms Race in Security is Underway
Adversaries are no longer running manual campaigns at human speed. In September 2025, Anthropic disclosed a Chinese state-sponsored group that jailbroke an AI coding assistant and ran a full espionage campaign against approximately 30 organizations. The AI handled 80–90% of tactical operations autonomously (i.e., reconnaissance, vulnerability discovery, exploit development, credential harvesting, lateral movement, exfiltration) with minimal human direction. Anthropic noted only 4–6 human decision points per campaign. The attack achieved limited success across those targets, but the trajectory is clear: AI is compressing the human bottleneck in offensive operations. Security programs designed around manual-speed adversaries are calibrating to a threat that is moving faster.
The LiteLLM attack is the clearest recent example of what this looks like inside an AI development workflow. On March 24, 2026, threat actor TeamPCP compromised the LiteLLM Python package by obtaining PyPI credentials through a prior supply chain compromise of Trivy, a widely-used open-source security scanner. Two malicious versions (1.82.7 and 1.82.8) were published. Any system with those versions during the exposure window executed the embedded credential theft payload automatically. In one confirmed detection, an AI coding agent running with unrestricted permissions (claude --dangerously-skip-permissions) auto-updated to the infected version without human review — no approval, no alert, no visible action before the payload ran. SentinelOne detected and blocked the malicious Python execution on the same day across multiple environments. Most organizations running AI development workflows didn’t know they were exposed until after the fact. The gap where human review processes don’t reach is wide, and it grows with every AI agent added to a pipeline.
Security programs were built for a different adversary. Vulnerability management, triage queues, patch cadences: all of it assumes an attacker who moves at a pace where human response can still close the window. This year’s SentinelOne Annual Threat Report documented what happens when that assumption breaks: adversaries are shifting left, embedding malicious logic in the build process before software ever reaches production. Likewise, the Verizon 2025 Data Breach Investigations Report found that edge device vulnerabilities are now being mass-exploited at or before the day of CVE publication, while organizations take a median of 32 days to patch them. The old model worked when it was designed. Attackers just weren’t running AI yet.
Three Attacks, One Common Failure Mode
Each attack ran through the same gap. Authorization was treated as a sufficient security boundary, and when authorization is automated, that assumption has no floor.
An AI agent with install permissions doesn’t stop to ask whether a package looks right. It installs. Trusted source, valid credentials, done. Supply chain attacks have always exploited trusted delivery channels, but a human at the keyboard introduces at least one friction point: Someone might notice something off, slow down, ask a question. Agents don’t do that. They execute at the speed of the next API call. When you give an agent install permissions, you’ve extended your trust model to cover everything it will ever run. Authorized agents execute exactly what their permissions allow. That’s the design. Treating permission as a proxy for safety is what turns a compromised supply chain hypersonic.
LiteLLM was compromised via credentials stolen through Trivy, a security scanner. The Axios attacker bypassed every npm security control the project had in place by exploiting a legacy access token the maintainers had forgotten to revoke. The CPUID attackers went after the vendor’s distribution infrastructure directly, so anyone who downloaded from the official website got a properly signed binary with a payload inside. In all three cases, the identity was legitimate. The intent wasn’t.
SentinelOne’s Annual Threat Report named the failure precisely: “The identity is verified, but the intent has been subverted, rendering traditional access controls ineffective against the resulting supply chain contamination.” Signature libraries, IOA rule sets, reputation lookups: All of them check authorization. None check intent. These attacks were designed to exploit exactly that. When the authorization model runs automatically, so does the exposure.
What Actually Stopped Them
In each incident, SentinelOne’s on-device behavioral AI flagged the execution pattern, not a known signature or hash for that specific attack.
The LiteLLM detection flagged a Python interpreter executing Base64-decoded code in a spawned subprocess. SentinelOne killed the process preemptively, terminating 424 related events in under 44 seconds, before any human was in a position to observe it. The Axios detection, via the Lunar behavioral engine, caught PowerShell executing under a renamed binary from a non-standard path. The engine flagged the technique regardless of what the payload contained. The first infection occurred 89 seconds after the malicious package went live; the behavioral detection fired on the same day of publication. The CPU-Z detection flagged cpuz_x64.exe building an anomalous process chain: spawning PowerShell, which spawned csc.exe, which spawned cvtres.exe. CPU-Z does not do that. The platform terminated the execution chain mid-attack during a 19-hour active distribution window.
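As a crude illustration of behavioral keying (this is in no way SentinelOne's actual engine, just a sketch of the idea), a rule can flag what a process does rather than what it is. The heuristic below flags the base64-exec bootstrap pattern seen in the LiteLLM payload:

```python
def looks_like_bootstrap(cmdline: str) -> bool:
    """Toy heuristic: a Python interpreter whose command line both
    decodes Base64 and calls exec() - the pattern described above.
    Real engines correlate many more signals (parentage, paths,
    subprocess behavior); this checks only the command line."""
    flat = cmdline.replace(" ", "").lower()
    return "python" in flat and "b64decode(" in flat and "exec(" in flat

benign = "python3 manage.py runserver"
malicious = 'python3.12 -c "import base64; exec(base64.b64decode(BLOB))"'

print(looks_like_bootstrap(benign))     # -> False
print(looks_like_bootstrap(malicious))  # -> True
```

The point is that no hash or signature of the payload is involved: any payload delivered through this execution pattern trips the same rule.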
This is the operational output of Autonomous Security Intelligence (ASI), the intelligence fabric built into the Singularity Platform. ASI runs on-device at the edge as part of the core architecture. It is already running when the attack starts, killing the process before the threat can escalate.
Where customers had SentinelOne fully deployed with the right policies enabled, they were covered. Where they did not, they were exposed, and with average ransomware recovery costs exceeding $4M per incident, that exposure has a real price. If you are not certain your deployment matches the configuration that stopped these three attacks, that certainty is worth getting.
AI to Fight AI
This is the product reality behind the thesis SentinelOne brought to RSAC: AI to fight AI. A machine-speed adversary requires a machine-speed defense. That is an architectural requirement, not a positioning statement. ASI monitors behavioral patterns at the point of execution and kills the process when something deviates, at machine speed, without waiting for a human to write a query or approve a kill.
According to an IDC study, organizations using SentinelOne’s AI platform identify threats 63% faster and remediate 55% faster than legacy solutions, neutralizing 99% of threats without a single manual step. For organizations in regulated industries (healthcare, financial services, manufacturing, critical infrastructure), the stakes compound beyond breach cost. An exposure window that stays open through manual investigation is a potential regulatory notification event, an audit finding, and a conversation the CISO has with the board under circumstances no one wants. The difference between a stopped attack and an active breach is whether the architecture acts before the attacker establishes persistence. By the time a human analyst approves the kill, redundant persistence mechanisms may already be installed. The CPU-Z attack deployed three of them specifically because partial cleanup leaves the payload operational.
Human-driven workflows, manual validation, and legacy tooling cannot keep pace with that attack cadence. When defense relies on investigation before action, the advantage shifts to the adversary. The gap is in the architecture. You cannot tune your way out of it.
Conclusion | The Only Question That Matters
SentinelOne’s latest Annual Threat Report documented the pattern these three attacks confirm: Adversaries are “shifting left” by integrating malicious logic into the build process itself, compromising software before it reaches production. It is the current operating model of advanced threat actors, and it is accelerating.
Three attacks. Three detections. Three outcomes, all in a matter of weeks. The architecture that survived them is real-time, AI-native, and built into the edge.
The question every security leader should be able to answer: Could your current solution have stopped LiteLLM, Axios, and CPU-Z autonomously, on the day of each attack, with no prior knowledge of any payload?
If the answer depends on a signature update, a cloud verdict, a manual investigation step, or a policy that wasn’t enabled, that is your answer.
Read the full technical breakdown of each incident:
All third-party product names, logos, and brands mentioned in this publication are the property of their respective owners and are for identification purposes only. Use of these names, logos, and brands does not imply affiliation, endorsement, sponsorship, or association with the third-party.
Host-based, behavioral, autonomous AI detection is by far the most effective way to generically see and stop rogue or malicious activity, whether driven by a human or by a machine-speed AI agent.
On March 24, 2026, SentinelOne’s autonomous detection caught what manual workflows never could have: a trojaned version of LiteLLM, one of the most widely used proxy layers for LLM API calls, executing malicious Python across multiple customer environments. The package had been compromised hours earlier. No analyst wrote a query. No SOC team triaged an alert. The Singularity Platform identified and blocked the payload before it could run, across every affected environment, on the same day the attack was launched.
This is what happens when AI infrastructure gets targeted by a multi-stage supply chain campaign, and what it looks like when autonomous, AI-native defense is already in position.
Here is what we detected, how the attack was structured, and why this is the class of threat that the Singularity Platform was built to stop.
Autonomous Detection at Machine Speed
SentinelOne’s macOS agent identified and preemptively killed a malicious process chain originating from Anthropic’s Claude Code running with unrestricted permissions (claude --dangerously-skip-permissions). No human developer ran pip install; an autonomous AI coding assistant updated LiteLLM to the compromised version as part of its normal workflow.
The AI engine classified the behavior as MALICIOUS and took immediate action: KILLED (PREEMPTIVE) across 424 related events in under 44 seconds. The agent didn’t need to know the package was compromised; it watched what the process did and stopped it based on behavior, regardless of what initiated the install.
Catching the Payload in the Act
The macOS agent caught the trojaned LiteLLM package mid-execution. The process summary tells the story: python3.12 launching with a command line containing import base64; exec(base64.b64decode(...)), the exact bootstrap mechanism described in the attack’s first stage, decoding and executing the obfuscated payload in a child process.
The agent didn’t need a signature for this specific package. It recognized the behavioral pattern, a Python interpreter executing base64-decoded code in a spawned subprocess, classified it as MALICIOUS, and killed it preemptively before the stealer, persistence, or lateral movement stages could deploy.
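The bootstrap mechanism itself is simple to reconstruct harmlessly. The stand-in below uses a benign second stage (it is not the attacker's code) and omits the subprocess detachment and stdout/stderr suppression described elsewhere in the report:

```python
import base64

# Harmless stand-in for the attacker's second stage.
stage2_source = "print('stage two running')"

# Stage one: encode the payload, then decode-and-exec it at import
# time - the one-liner pattern the agent flagged.
blob = base64.b64encode(stage2_source.encode()).decode()
exec(base64.b64decode(blob))  # -> prints "stage two running"
```

Because the source never appears on disk in readable form, static scanners see only an opaque Base64 string; only the runtime behavior reveals what executes.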
The Full Process Tree: Containing the Blast Radius
Zooming out on the same detection reveals the full scope of what the autonomous AI agent was doing when the payload fired. The process tree expands from Claude Code (2.1.81) into a sprawling chain: zsh, bash, node, uv, ssh, rm, python3.12, mktemp, with hundreds of child events still loadable (304 events captured). This is what unrestricted AI agent activity looks like at the endpoint level: a single command spawning an entire dependency management workflow that pulled, installed, and attempted to execute the trojaned package.
The SentinelOne macOS agent traced every branch of this tree, correlated the events back to the root cause, and killed the malicious execution; all while preserving the full forensic record for investigation.
The Compromise Was Indirect. That’s What Makes It Dangerous.
The attacker, operating under the alias TeamPCP, never attacked LiteLLM directly. They first compromised Trivy, a widely trusted open-source security scanner, on March 19. From there, they obtained the LiteLLM maintainer’s PyPI credentials and used them to publish two malicious versions: 1.82.7 and 1.82.8.
A security tool, built to find vulnerabilities, became the vector that enabled the compromise of an AI infrastructure package used by thousands of organizations. The same actor went on to compromise Checkmarx KICS and AST on March 23, and Telnyx on March 27. This was not a smash-and-grab. It was a coordinated campaign that exploited the transitive trust woven through open-source supply chains.
For security leaders asking, “Could this have reached us?” the more pressing question is: “How fast could we have answered that?”
A New Attack Surface: AI Agents With Unrestricted Permissions
In one customer environment, SentinelOne detected the infection arriving through an unexpected vector: an AI coding assistant running with unrestricted system permissions autonomously updated LiteLLM to the trojaned version without human review. The update pulled the infected package, and the payload attempted to execute. Our agent blocked it.
This is a new class of attack surface that most organizations have not yet scoped. AI coding agents operating with full system permissions can become unwitting vectors for supply chain compromises. The speed and automation that make these tools valuable are the same properties that make them dangerous when the packages they pull have been weaponized. Organizations that have not yet established governance policies for AI assistant permissions are carrying risks they cannot see.
SentinelOne’s behavioral detection operates below the application layer. It does not matter whether a malicious package is installed by a human, a CI pipeline, or an AI agent. The platform monitors process behavior via the Endpoint Security Framework, which is why this detection fired regardless of how the infected package arrived.
Two Infection Vectors, One Designed to Run Without You
Version 1.82.7 embedded its payload in proxy_server.py, which executes every time the litellm.proxy module is imported. For anyone using LiteLLM as a proxy layer for LLM API calls, this fires constantly during normal operations.
Version 1.82.8 escalated. The attacker placed the payload in a .pth file, litellm_init.pth. Files with the .pth extension are processed by the Python interpreter at startup, regardless of which modules are imported. Any Python script running on a system with this version installed would trigger the malicious code, even if that script had nothing to do with LiteLLM.
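The .pth execution behavior is easy to demonstrate with the standard library. In a .pth file placed in a site directory, any line beginning with "import" is executed, not merely parsed; site.addsitedir() processes such files the same way interpreter startup does. A harmless demo:

```python
import os
import site
import tempfile

# Create a temporary directory acting as a site-packages directory.
d = tempfile.mkdtemp()
pth = os.path.join(d, "demo_init.pth")

# Lines starting with "import" in a .pth file are exec'd by the
# interpreter. The "payload" here just sets an environment variable.
with open(pth, "w") as f:
    f.write("import os; os.environ['PTH_DEMO'] = 'executed'\n")

# Process the directory's .pth files as startup would.
site.addsitedir(d)

print(os.environ.get("PTH_DEMO"))  # -> executed
```

This is why version 1.82.8's litellm_init.pth fired in every Python process on an infected system, with no LiteLLM import required.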
If version 1.82.7 was a targeted shot, version 1.82.8 was a blast radius expansion. The attacker removed the requirement that the victim actually use the compromised library.
What the Payload Did Once Inside
The attack was structured as a multi-stage delivery system, each stage decoding, decrypting, and executing the next. The first stage was a minimal bootstrap, a single line of base64-decoded Python launched in a detached subprocess with stdout and stderr suppressed. Lightweight enough to slip past signature-based tools. Quiet enough to avoid raising flags.
The second stage was a comprehensive data stealer. It harvested system and user information, cryptocurrency wallets, cloud credentials, application secrets, and system configurations. For practitioners wondering what the blast radius looks like if a developer workstation is compromised, this is the answer: the attacker collects everything needed to move from a laptop to production infrastructure.
The third stage established persistence through a systemd user service at ~/.config/systemd/user/sysmon.service, executing a script at ~/.config/sysmon/sysmon.py. The naming convention, “sysmon,” was deliberately chosen to mimic legitimate system monitoring tools. It is designed to survive casual inspection and blend into environments where dozens of services run as expected background noise. This is precisely the kind of evasion that signature-based detection misses and behavioral AI catches: the process looks normal until you observe what it actually does.
The persistence mechanism included a 5-minute initial delay before any network activity, a technique specifically designed to outlast automated sandbox analysis. After that, the script contacted its C2 server every 50 minutes, fetching dynamic payload URLs. This sparse communication pattern makes behavioral detection through network monitoring significantly harder, and gives the attacker the ability to push new tooling without ever re-compromising the target.
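The report names only the unit and script paths, not the unit file's contents, so the sketch below is an illustrative reconstruction of what a user-level persistence unit of this shape typically looks like, not the recovered artifact:

```ini
# Illustrative reconstruction (assumption): ~/.config/systemd/user/sysmon.service

[Unit]
# Name chosen to masquerade as a legitimate monitoring tool
Description=System Monitor

[Service]
ExecStart=/usr/bin/python3 %h/.config/sysmon/sysmon.py
# Restart keeps the beacon alive across crashes
Restart=always

[Install]
# default.target: starts whenever the user session comes up
WantedBy=default.target
```

Defenders can enumerate such units with `systemctl --user list-unit-files` and compare against a known-good baseline.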
It Moved Laterally Through Kubernetes
The attack did not stop at the workstation. It created privileged pods across Kubernetes cluster nodes in the kube-system namespace, using standard container images like alpine:latest, with hostPID, hostNetwork, and a privileged security context. By mounting the host filesystem directly, these pods gained root-level access to underlying nodes.
Each pod deployed persistent backdoors as systemd services on the host system. The pods operated in legitimate namespaces, used standard images, and ran with privileges that many production workloads legitimately require. For SOC practitioners asking whether their admission control and runtime detection would have caught this: the attack was designed specifically so they might not. Detecting this requires runtime visibility into container behavior after deployment, not just policy enforcement at the admission gate. This is the difference between cloud security that checks configuration and cloud security that watches execution.
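The privileged-pod pattern described above maps to a short manifest. The sketch below is illustrative (the names are placeholders, not the attacker's actual manifest), but each dangerous field matches what the report describes:

```yaml
# Illustrative pod spec matching the described pattern (placeholder names)
apiVersion: v1
kind: Pod
metadata:
  name: kube-sysmon          # blends into kube-system naming
  namespace: kube-system
spec:
  hostPID: true              # see every process on the node
  hostNetwork: true          # share the node's network stack
  containers:
  - name: worker
    image: alpine:latest     # standard image, nothing to flag
    securityContext:
      privileged: true       # full device and capability access
    volumeMounts:
    - name: host
      mountPath: /host       # node's root filesystem, writable
  volumes:
  - name: host
    hostPath:
      path: /
```

Every field here is individually legitimate in some production workloads, which is why policy checks alone struggle; the combination plus what the pod subsequently does is the signal.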
Exfiltration Was Encrypted and Camouflaged
Stolen data was encrypted using a hybrid RSA and AES-256-CBC scheme. A random 32-byte AES session key encrypted the data, then the session key itself was protected with RSA encryption using a hardcoded public key. This meant the malware could encrypt and exfiltrate without first communicating with the C2 server. The encrypted payload was packaged as tpcp.tar.gz and transmitted via a single HTTP POST to models.litellm.cloud, a domain chosen to blend with legitimate LiteLLM API traffic and slip past network monitoring that whitelists expected destinations.
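The structure of that scheme (random session key, wrapped key, single archive) can be sketched with the standard library alone. The sketch below is a toy: a repeating-key XOR stands in for AES-256-CBC and a byte prefix marks the RSA-wrap step, since the real cryptography is beside the point here and the member name output.enc is an assumption:

```python
import io
import os
import tarfile

def fake_encrypt(data: bytes, key: bytes) -> bytes:
    # Stand-in for AES-256-CBC (XOR keeps the sketch dependency-free;
    # XOR with the same key also decrypts, which the test exploits).
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

stolen = b"contents of harvested files and system info"
session_key = os.urandom(32)           # random 32-byte AES-size key
ciphertext = fake_encrypt(stolen, session_key)

# In the real payload the session key was wrapped with a hardcoded
# RSA public key; the prefix below merely marks that step.
wrapped_key = b"RSA-wrapped:" + session_key

# Package as described: key + encrypted output combined into one
# gzip'd tar archive, ready for a single HTTP POST.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w:gz") as tar:
    for name, blob in (("session.key", wrapped_key),
                       ("output.enc", ciphertext)):
        info = tarfile.TarInfo(name)
        info.size = len(blob)
        tar.addfile(info, io.BytesIO(blob))

archive = buf.getvalue()  # would be POSTed to the C2 in one request
```

The design detail worth noting is the one the report calls out: because the session key is wrapped with a baked-in public key, exfiltration needs no key exchange with the C2 at all.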
What This Attack Proves
The LiteLLM supply chain compromise is not an anomaly. It is the new pattern: multi-stage, multi-surface, designed to evade manual workflows at every turn. A compromised security tool led to a compromised AI package, which led to data theft, persistence, Kubernetes lateral movement, and encrypted exfiltration, all within a window measured in hours.
SentinelOne detected and blocked this attack autonomously, on the same day it was launched, across multiple customer environments. No manual triage. No signature update. No analyst in the loop for the initial containment. This is what autonomous, AI-native defense looks like when it meets a real-world threat at machine speed.
The gap between the velocity of this attack and the capacity of human-driven investigation is the gap where organizations get compromised. Closing that gap is not a feature request. It is an architectural decision.
Why This Detection Worked: Architecture, Not Luck
The LiteLLM detection wasn’t a one-off. It’s what happens when autonomous, behavioral AI is built into the foundation, not bolted on after the fact. The Singularity Platform’s visibility across endpoint, cloud, identity, and AI workloads is why the agent saw this regardless of whether the install came from a human, a CI pipeline, or an AI coding assistant.
For teams that need the human expertise layer on top, Wayfinder MDR extends that autonomous detection with 24/7 investigation and response, closing the gap between detection and resolution.
This is the Autonomous Security Intelligence (ASI) framework in practice: AI that acts at machine speed, backed by human expertise when it matters, across every surface the attack can reach. See how the Singularity Platform protects AI infrastructure and request a demo today.
Protect Your Endpoint
See how AI-powered endpoint security from SentinelOne can help you prevent, detect, and respond to cyber threats in real time.
SentinelOne AI stopped a LiteLLM supply chain attack in seconds, blocking malicious code automatically without human intervention.
SentinelOne’s AI-based security detected and blocked a supply chain attack involving a compromised LiteLLM package.
SentinelOne’s macOS agent detected and stopped a malicious process chain triggered by Claude Code after it unknowingly installed a compromised LiteLLM package. The AI identified suspicious hidden Python code execution via base64 decoding, and killed the process within seconds across hundreds of events. The system traced the full process chain triggered by an AI agent and prevented data theft or further spread, showing the power of autonomous, behavior-based defense.
Attackers indirectly compromised LiteLLM by first breaching trusted tools like Trivy, stealing maintainer credentials to publish malicious versions. The campaign also hit other platforms, showing how open-source trust can be abused. In one case, an AI coding assistant unknowingly installed the infected package, highlighting a new risk: AI agents with full system access can spread attacks automatically.
“SentinelOne’s behavioral detection operates below the application layer. It does not matter whether a malicious package is installed by a human, a CI pipeline, or an AI agent.” reads the report published by SentinelOne. “The platform monitors process behavior via the Endpoint Security Framework, which is why this detection fired regardless of how the infected package arrived.”
Two malicious versions ensured execution: one fired during normal use, the other at Python interpreter startup, expanding the attack’s reach even to systems not actively using LiteLLM.
The LiteLLM attack began with a small, obfuscated script that launched silently, followed by a data stealer that collected system info, credentials, crypto wallets, and secrets. The malware then ensured persistence by installing a disguised system service that ran in the background and contacted its command server at long intervals to avoid detection.
“The third stage established persistence through a systemd user service at ~/.config/systemd/user/sysmon.service, executing a script at ~/.config/sysmon/sysmon.py.” continues the report. “The persistence mechanism included a 5-minute initial delay before any network activity, a technique specifically designed to outlast automated sandbox analysis. After that, the script contacted its C2 server every 50 minutes, fetching dynamic payload URLs.”
The attack expanded beyond the initial machine by creating privileged Kubernetes pods, gaining deep access to cluster nodes and deploying backdoors. Stolen data was encrypted and sent to a server designed to look legitimate, helping it bypass monitoring. Overall, the attack shows how modern threats combine stealth, automation, and multiple layers to move quickly and evade traditional defenses.
“The LiteLLM detection wasn’t a one-off. It’s what happens when autonomous, behavioral AI is built into the foundation, not bolted on after the fact.” concludes the report.
Overview: NSFOCUS Technology CERT recently detected a GitHub community disclosure of a credential-stealing program in new versions of LiteLLM. Analysis confirmed the package had suffered supply chain poisoning on PyPI by the TeamPCP group, which stole the publishing credentials by hacking into the security scanning tool Trivy used in […]
A significant proportion of cyber incidents are linked to supply chain attacks, and that proportion is constantly growing. Over the past year, we have seen a wide variety of methods used in such attacks: creating malicious but seemingly legitimate open-source libraries, staging delayed attacks inside otherwise legitimate libraries, and the simplest yet most effective method of all, compromising the accounts of popular library owners in order to release malicious versions of their libraries. Such libraries are used by developers everywhere and are included in many solutions and services. The consequences of an attack can vary widely, from delivering malware to a developer’s device to compromising an entire infrastructure if the malicious library has made its way into the code of a service or product.
This is exactly what happened in March 2026, when attackers injected malicious code into the popular Python library LiteLLM, which serves as a multifunctional gateway for a large set of AI agents. The attackers released two trojanized versions of LiteLLM that delivered malicious scripts to the victim’s system. Both versions made their way into the PyPI repository for Python. A technical analysis revealed that the attackers’ primary targets were servers storing confidential data related to AWS, Kubernetes, NPM, etc., as well as various databases (MySQL, PostgreSQL, MongoDB, etc.). In the latter case, the attackers were primarily interested in database configurations. In addition, the malware’s logic included functionality for stealing confidential data from crypto wallets and techniques for establishing a foothold in the Kubernetes cluster.
Repository compromise
The compromise affected the package distribution channel via PyPI: on March 24, 2026, malicious LiteLLM versions litellm==1.82.7 and litellm==1.82.8 were uploaded to the registry. It was specifically the package’s distribution contents that were compromised: in version 1.82.7, the malicious code was embedded in proxy_server.py, and in 1.82.8, the file litellm_init.pth was added.
Technical analysis
Both versions of the library contained the same malicious code, but its execution was implemented differently. In version 1.82.7, it was executed only when the proxy functionality was imported, while in 1.82.8, a .pth file was added that was able to execute the code every time the interpreter started.
Example of the malicious code in proxy_server.py
The malicious code in the proxy_server.py and litellm_init.pth files contained Python code encoded in Base64. After execution, the infected script saved this code alongside itself as a p.py file and immediately executed it.
The p.py script launched the main payload, another Base64-encoded script, without saving it to disk. It wrote the payload's output to a file in the directory from which it was launched. Before being written, the output was encrypted with AES-256-CBC using a random key generated at runtime and saved to the session.key file. The key file itself was encrypted with a public RSA key embedded in the code.
The encrypted key and the output were combined into a tpcp.tar.gz archive and sent to the attackers’ remote server.
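This packaging gives defenders a recognizable staging artifact. Below is a hypothetical heuristic (our code, not the malware's) that flags a gzipped tar laid out like the exfiltration archive; the `output.enc` member name in the demo is a placeholder, since the report does not name the output file:

```python
import os
import tarfile
import tempfile

def looks_like_tpcp_archive(path):
    """Heuristic: flag a gzipped tar whose members match the staging
    layout described above (an RSA-encrypted session.key packed next to
    at least one other file, i.e. the AES-encrypted output)."""
    try:
        with tarfile.open(path, "r:gz") as tar:
            names = {os.path.basename(m.name) for m in tar.getmembers()}
    except (tarfile.TarError, OSError):
        return False
    return "session.key" in names and len(names) >= 2

# Demo: build a lookalike archive in a throwaway directory and flag it.
tmp = tempfile.mkdtemp()
for name in ("session.key", "output.enc"):
    with open(os.path.join(tmp, name), "wb") as f:
        f.write(b"\x00")
archive = os.path.join(tmp, "tpcp.tar.gz")
with tarfile.open(archive, "w:gz") as tar:
    for name in ("session.key", "output.enc"):
        tar.add(os.path.join(tmp, name), arcname=name)
print(looks_like_tpcp_archive(archive))  # -> True
```

A heuristic like this is best used on files staged in working directories or captured in egress traffic, combined with the C2 indicators listed in the report.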
What exactly did the malicious payload whose output was sent to the C2 server do? Once launched, it began a recursive scan of working directories on the victim's system (/root, /app/, /var/www, etc.). In each directory, the script read the contents of files and wrote them to stdout, from where they were captured into the aforementioned output file. Next, the script collected system information and saved it to the same file. It then proceeded to search for sensitive data, looking for the following items on servers and within the infrastructure of various services:
SSH keys
Git credentials
.env files
AWS, Kubernetes, email service, database, and WireGuard configurations
files related to Helm, Terraform, and CI
TLS keys and certificates
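To gauge exposure, a responder can approximate the harvester's search surface and audit what it would have found. A minimal sketch, assuming filename patterns like the following cover the categories above (the pattern list and function name are ours, not the malware's):

```python
import fnmatch
import os
import tempfile

# Illustrative filename patterns for the targeted categories above
# (our approximation -- extend as needed for your environment).
SENSITIVE_PATTERNS = [
    "id_rsa", "id_ed25519", "*.pem", "*.key", "*.crt",  # SSH/TLS material
    ".env", "*.env",                                     # environment files
    ".git-credentials", ".gitconfig",                    # Git credentials
    "credentials", "kubeconfig", "*.kubeconfig",         # AWS / Kubernetes
    "wg*.conf",                                          # WireGuard
    "*.tfstate", "values.yaml",                          # Terraform / Helm
]

def find_sensitive_files(root):
    """Recursively list files under `root` whose names match the
    patterns above -- an audit of what a harvester could have read."""
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if any(fnmatch.fnmatch(name, p) for p in SENSITIVE_PATTERNS):
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)

# Demo on a throwaway tree with one secret and one benign file:
demo = tempfile.mkdtemp()
os.makedirs(os.path.join(demo, ".ssh"))
open(os.path.join(demo, ".ssh", "id_rsa"), "w").close()
open(os.path.join(demo, "notes.txt"), "w").close()
hits = find_sensitive_files(demo)
print([os.path.basename(h) for h in hits])  # -> ['id_rsa']
```

Running such an audit over /root, /app, and /var/www gives a concrete list of credentials to rotate after a suspected infection.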
A notable feature of this malware is that it does not limit itself to stealing files and configurations from the disk but also attempts to extract runtime secrets from the cloud infrastructure.
The code above uses the addresses 169.254.169.254 and 169.254.170.2. The first corresponds to the AWS Instance Metadata Service (IMDS), through which an EC2 instance (a virtual server running in the AWS cloud) can retrieve metadata and temporary IAM role credentials, i.e., short-lived credentials with a set of permissions that a service or application can use to call the AWS API. The second is used in Amazon ECS to issue temporary credentials to a container at runtime. Thus, the malicious script targets not only static secrets but also cloud-issued credentials that can grant direct access to AWS resources at the time of infection.
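For reference, the shape of the IMDSv1 flow being abused is simple: list the attached IAM role, then fetch its temporary credentials. A sketch with an injected HTTP getter so it runs offline (the stub and function names are ours; real IMDSv2 additionally requires a session token, which is one reason enforcing IMDSv2 blunts this technique):

```python
import json

# The AWS link-local metadata endpoints contacted by the script:
IMDS_BASE = "http://169.254.169.254/latest/meta-data"  # EC2 IMDS
ECS_CREDS = "http://169.254.170.2"                     # ECS task credentials

def fetch_role_credentials(http_get):
    """Sketch of the IMDSv1 flow: list the attached IAM role name, then
    fetch its temporary credentials. `http_get(url) -> str` is injected
    so the sketch runs without network access."""
    role = http_get(IMDS_BASE + "/iam/security-credentials/").strip()
    return json.loads(http_get(IMDS_BASE + "/iam/security-credentials/" + role))

# Offline demo with a stubbed metadata service:
def fake_get(url):
    if url.endswith("security-credentials/"):
        return "demo-role\n"
    return '{"AccessKeyId": "ASIA...", "SecretAccessKey": "...", "Token": "..."}'

creds = fetch_role_credentials(fake_get)
print(creds["AccessKeyId"])  # -> ASIA...
```

Restricting pod and container egress to these link-local addresses, and requiring IMDSv2, limits what a compromised workload can harvest.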
Additionally, the script searches for crypto wallet configurations, as well as webhooks associated with Slack and Discord messengers. The latter indicates that the attackers are interested not only in infrastructure secrets and accounts, but also in communication channels within the development team.
In the next stage, the malware moves from data collection to establishing a foothold in the Kubernetes cluster infrastructure: if it has sufficient access, it configures a privileged pod (the smallest execution unit in Kubernetes, containing one or more containers) by enabling the securityContext.privileged=true option and mounts the node’s root filesystem via hostPath. This allows it to escape the container and perform actions at the node level.
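Cluster operators can hunt for pods matching this escape pattern. A hypothetical detection sketch (our code, not the malware's) that inspects a pod object as returned by `kubectl get pods -A -o json`:

```python
def is_escape_capable(pod):
    """Flag a pod spec that combines securityContext.privileged=true
    with a hostPath volume mounting the node's root filesystem -- the
    container-escape pattern described above."""
    spec = pod.get("spec", {})
    privileged = any(
        c.get("securityContext", {}).get("privileged", False)
        for c in spec.get("containers", [])
    )
    mounts_node_root = any(
        v.get("hostPath", {}).get("path") == "/"
        for v in spec.get("volumes", [])
    )
    return privileged and mounts_node_root

# Sample pod object shaped like kubectl's JSON output:
sample = {
    "spec": {
        "containers": [{"name": "x", "securityContext": {"privileged": True}}],
        "volumes": [{"name": "host", "hostPath": {"path": "/"}}],
    }
}
print(is_escape_capable(sample))  # -> True
```

Admission policies (e.g., Pod Security Standards at the "restricted" level) block this pod shape outright, which is the stronger defense.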
Next, the malware executes another stage of infection: it saves a Base64-encoded script disguised as a legitimate system component to the Kubernetes node’s disk at the path /root/.config/sysmon/sysmon.py, and registers it via systemd. After launching, the script waits for an initial delay of 300 seconds, then begins periodically contacting the C2 node checkmarx[.]zone/raw, retrieving a link to the next payload from there. If the received value differs from the state previously saved in /tmp/.pg_state, the script downloads a new file to /tmp/pglog, makes it executable, and runs it in the background. At this stage, the attackers gain a foothold in the system and are capable of regularly delivering updated payloads without the need for re-injection. Since the malicious payload is written not to the container’s temporary file directory but directly to the Kubernetes cluster node, the attackers will retain access to the infrastructure even after the container has terminated.
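The persistence artifacts described above make convenient indicators of compromise. A hypothetical checker sketch (the systemd unit path is our assumption of the typical location; the other paths come from the analysis):

```python
import os
import tempfile

PERSISTENCE_IOCS = [
    "/root/.config/sysmon/sysmon.py",                   # Kubernetes-node variant
    os.path.expanduser("~/.config/sysmon/sysmon.py"),   # local variant
    "/etc/systemd/system/sysmon.service",               # assumed unit location
    "/tmp/.pg_state",                                   # beacon state file
    "/tmp/pglog",                                       # downloaded payload
]

def check_persistence_iocs(root="/"):
    """Return the IoC paths present under `root` (parameterized so the
    check can be pointed at a mounted disk image or a test tree)."""
    found = []
    for ioc in PERSISTENCE_IOCS:
        if os.path.exists(os.path.join(root, ioc.lstrip("/"))):
            found.append(ioc)
    return found

# Demo against a throwaway tree containing one artifact:
fake_root = tempfile.mkdtemp()
os.makedirs(os.path.join(fake_root, "tmp"))
open(os.path.join(fake_root, "tmp", ".pg_state"), "w").close()
print(check_persistence_iocs(fake_root))  # -> ['/tmp/.pg_state']
```

Because the node-level variant survives container termination, the check should be run on the nodes themselves, not only inside workloads.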
A similar scenario is used for local persistence: in the absence of Kubernetes, the sysmon.py script is deployed in the user’s directory at ~/.config/sysmon/sysmon.py and is also registered as a service via systemd.
OpenVSX version of the malware
While analyzing files communicating with the C2 server, we discovered malicious versions of two popular Checkmarx extensions: ast-results 2.53.0 and cx-dev-assist 1.7.0. Checkmarx is used for application security assessment. These trojanized extensions contained malicious code that delivered a Node.js version of the malware described above.
This version is downloaded from checkmarx[.]zone/static/checkmarx-util-1.0.4.tgz using Node.js package installation utilities and is named checkmarx-util. Its key difference from the Python version is that it does not attempt to elevate privileges to the Kubernetes node level and does not create a privileged pod for persistence; instead, it implements local persistence within the current environment. In other words, the Node.js variant persists only where it is already running.
Additionally, the list of folders to search for and steal secrets from is significantly smaller in this version than in the Python variant.
Checkmarx extensions are used to scan code and infrastructure configurations, so their compromise is quite dangerous: an attacker gains access not only to project files but also to a significant portion of the development environment, tokens, and local configurations.
Victimology
While assessing the attack’s impact, we saw victims all over the world. Most infection attempts occurred in Russia, China, Brazil, the Netherlands, and the UAE.
Conclusion
As the technical analysis shows, the malicious scripts found in the LiteLLM versions are dangerous not only because they steal files containing sensitive data, but also because they target multiple critical infrastructure components simultaneously: the local system, cloud runtime secrets, the Kubernetes cluster, and even cryptographic keys. Such a broad scope of data collection allows an attacker to quickly move from compromising a single system and Python environment to seizing service accounts, secrets, and entire infrastructures.
Prevention and protection
To protect against infections of this kind, we recommend using a specialized solution for monitoring open-source components. Kaspersky provides real-time data feeds on compromised packages and libraries, which can be used to secure the supply chain and protect development projects from such threats.
Home security solutions, such as Kaspersky Premium, help ensure the security of personal devices by providing multi-layered protection that prevents and neutralizes infection threats. Additionally, our solution can restore the device’s functionality in the event of a malware infection.
To protect corporate devices, we recommend a comprehensive solution such as Kaspersky NEXT, which allows you to build a flexible and effective security system. The products in this line provide threat visibility and real-time protection, as well as EDR and XDR capabilities for threat investigation and response.
At the time of writing, the compromised versions of LiteLLM had already been removed from PyPI and OpenVSX. If you have used them, and as a proactive response to the threat, we recommend taking the following measures on your systems and infrastructure:
Perform a full system scan using a reliable security solution.
Rotate all potentially compromised credentials: API keys, environment variables, SSH keys, Kubernetes service account tokens, and other secrets.
Check hosts and clusters for signs of compromise: the presence of ~/.config/sysmon/sysmon.py files and suspicious pods in Kubernetes.
Clear the pip cache and take an inventory of installed PyPI packages: check for malicious ones and roll back to clean versions.
Check for indicators of compromise (files on the system or network signs).
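The package-inventory step above can be partially automated. A hypothetical sketch using the standard library's importlib.metadata and the known-bad versions from this report (the function names are ours):

```python
from importlib import metadata

# Backdoored releases named in this report; 1.82.6 was the last clean one.
MALICIOUS_LITELLM = {"1.82.7", "1.82.8"}

def classify(version):
    """Classify a litellm version string against the known-bad set."""
    return "COMPROMISED" if version in MALICIOUS_LITELLM else "ok"

def check_installed():
    """Check the litellm release installed in the current environment."""
    try:
        return classify(metadata.version("litellm"))
    except metadata.PackageNotFoundError:
        return "not installed"

print(classify("1.82.7"))  # -> COMPROMISED
print(check_installed())
```

Pairing a check like this with `pip cache purge` also covers the cache-clearing recommendation, since a poisoned wheel can otherwise be reinstalled from the local cache.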
TeamPCP backdoored LiteLLM v1.82.7–1.82.8, likely via a Trivy CI/CD breach, adding tooling to steal credentials, move laterally in Kubernetes, and maintain persistent access.
Threat actor TeamPCP compromised LiteLLM versions 1.82.7 and 1.82.8, likely through a Trivy CI/CD breach. LiteLLM, with over 95 million monthly downloads, helps developers route LLM requests via a single API.
The malicious releases, now removed from PyPI, included a multi-stage payload: a credential harvester targeting SSH keys, cloud data, wallets, and .env files; tools for lateral movement in Kubernetes via privileged pods; and a persistent systemd backdoor connecting to a remote server for further payloads.
On March 24, 2026, Endor Labs discovered that LiteLLM versions 1.82.7 and 1.82.8 on PyPI were backdoored, despite no malicious code in the GitHub repo.
The compromised versions execute a hidden payload on import, while v1.82.8 also installs a .pth file to trigger it on any Python run. Version 1.82.6 remains the last safe release.
The malware launches a three-stage attack: stealing credentials (SSH keys, cloud tokens, Kubernetes secrets, wallets, .env files), spreading across Kubernetes clusters via privileged pods, and installing a persistent systemd backdoor that fetches more payloads. Attackers encrypted stolen data before exfiltrating it. The campaign is linked to TeamPCP, already tied to attacks across multiple ecosystems including GitHub Actions, Docker Hub, npm, OpenVSX, and PyPI.
“The malicious code resides in a single file within the litellm wheel distributed on PyPI: litellm/proxy/proxy_server.py. The attacker inserted 12 lines at line 128, between two unrelated legitimate code blocks (the REALTIME_REQUEST_SCOPE_TEMPLATE dictionary and the showwarning function).” reads the report published by Endor Labs. “The GitHub source at the corresponding commit does not contain these lines — the injection was performed during or after the wheel build process.”
The malicious code was hidden inside the LiteLLM PyPI package, specifically in proxy_server.py, where 12 malicious lines were inserted during or after the wheel build.
The 12-line injection in proxy_server.py (lines 128–139). The malicious code sits between the legitimate REALTIME_REQUEST_SCOPE_TEMPLATE dictionary (line 122) and the showwarning function (line 141). Line 130 contains the active base64 payload (34,460 characters); lines 131–132 contain commented-out earlier iterations.
This code runs automatically when the module is imported, silently decoding and executing a payload. It avoids detection by using subprocess calls instead of flagged methods like exec().
Version 1.82.8 adds a more dangerous method: a .pth file that executes the payload on every Python startup, even if LiteLLM is never used. It runs in the background, making detection harder and spreading impact across any Python process in that environment.
“This makes 1.82.8 significantly more dangerous: any Python script, test runner, or tool invoked in an environment where litellm is installed will silently trigger the credential harvester in the background.” continues the report.
The malware works in three stages. First, it launches an orchestrator that collects and encrypts stolen data before sending it to a remote server. Second, a credential harvester scans the system for sensitive data, including SSH keys, cloud credentials, Kubernetes secrets, environment files, databases, wallets, and system logs. It can also move laterally by deploying privileged pods across Kubernetes nodes.
Finally, it installs a persistent backdoor as a systemd service that regularly contacts a remote server, downloads new payloads, and maintains long-term access while blending in with normal system processes.
The malicious code reveals three development stages left in the package as commented base64 blobs. The first version used exec() and basic obfuscation, already targeting credentials and using the same C2 and persistence. The second included both old and new harvester code, showing a transition phase. The final version refined delivery, replacing exec() with subprocess techniques to evade detection, while keeping the same targets and infrastructure.
The malware uses two C2 domains: one to receive encrypted stolen data and another to deliver additional payloads. Its obfuscation relies on multiple nested base64 layers and standard library code to appear harmless. Stolen data is protected with RSA+AES encryption, and the package was rebuilt with valid hashes, making detection difficult without comparing it to the original source.
Endor Labs attributes the attack to TeamPCP with high confidence, citing strong overlaps with earlier incidents reported by Wiz. Key indicators match exactly, including the same C2 domain (checkmarx[.]zone), identical persistence files (sysmon.py and sysmon.service), the “System Telemetry Service” name, 50-minute beaconing, the same kill switch logic, and the tpcp.tar.gz exfiltration archive. Encryption methods and Kubernetes persistence techniques are also consistent.
Timeline data supports this attribution: the researchers reported that the malicious LiteLLM versions were released shortly after the KICS compromise, with rapid iteration between versions.
TeamPCP repeatedly leverages stolen credentials to pivot across ecosystems, targeting security tools to maximize access to sensitive data and infrastructure.
“This campaign is almost certainly not over. TeamPCP has demonstrated a consistent pattern: each compromised environment yields credentials that unlock the next target.” concludes the report. “The litellm compromise is the latest escalation in a month-long campaign that began with a single incomplete incident response. On February 28, an autonomous bot exploited a workflow vulnerability in Trivy and stole a PAT. Aqua remediated the surface-level damage but left residual access. Three weeks later, TeamPCP leveraged that opening — and in five days crossed five supply chain ecosystems: GitHub Actions, Docker Hub, npm, OpenVSX, and now PyPI.”