Hackers Use Fake Claude AI Site to Infect Users With New Beagle Malware
Anthropic launches Claude Security to counter rapid AI-powered exploits
Anthropic launched Claude Security to counter faster AI-driven cyberattacks, as tools like Mythos enable near-instant exploitation by threat actors.
Anthropic introduced Claude Security to help defenders keep up with a surge in AI-powered cyberattacks. As models like Mythos drastically reduce the time needed to exploit vulnerabilities, similar tools will likely spread among criminals and nation-state actors. Claude Security aims to give security teams the capabilities needed to respond to this new, faster threat landscape.
“Claude Security is now in public beta for Claude Enterprise customers. Scan code for vulnerabilities and generate proposed fixes with Opus 4.7, on the Claude Platform, or through technology and services partners building with Claude.” reads the announcement.
Claude Security is now in public beta for Enterprise users, giving organizations advanced tools to detect and fix software vulnerabilities. As AI rapidly improves, new models can not only find flaws but also exploit them automatically, reducing the time window between discovery and attack. Anthropic recently introduced Claude Mythos, capable of matching top experts in identifying and exploiting weaknesses.
With Claude Security, companies can use the powerful Claude Opus 4.7 model to scan code, uncover complex issues, and generate targeted fixes. Already tested by hundreds of organizations, the tool now offers scheduled scans, easier integration, and better tracking, without requiring complex setup.
Anthropic is also integrating its technology into major security platforms through partners like CrowdStrike, Microsoft Security, and Palo Alto Networks, alongside consulting firms such as Deloitte and Accenture. As AI accelerates cyber threats, the goal is to equip defenders with equally advanced capabilities to keep pace.
Claude Security is easy to use: users select a repository or specific code scope and launch a scan directly from Claude. The system analyzes code like a security expert, understanding how components interact, tracing data flows, and identifying real vulnerabilities rather than relying only on known patterns.
After scanning, it delivers detailed findings with confidence levels, severity, impact, and reproduction steps, along with clear instructions to fix issues.
Based on feedback from hundreds of organizations, Anthropic improved detection accuracy, reduced false positives, and added confidence scoring. Teams can now move from scan to fix much faster, sometimes in one session. Scheduled scans also provide continuous security coverage instead of one-time checks.
“With this release, we’ve also added the ability to target a scan at a particular directory within a repository, dismiss findings with documented reasons (so that future reviewers can trust prior triage decisions), export findings as CSV or Markdown for existing tracking and audit systems, and send scan results to Slack, Jira, or other tools via webhooks.” concludes the announcement.
AI Agent Deleted Production Database in 9 Secs; Then Confessed Every Rule It Broke

On a Friday afternoon, Jer Crane sat down to work on a routine task at PocketOS, the car rental SaaS company he founded. By the time the task was done, his production database was gone, the backups were gone, and three months of customer data — reservations, new signups, business records that rental operators depended on to function — had been erased by a single API call from an AI agent, completed in nine seconds.
The AI agent responsible was Cursor, running Anthropic's Claude Opus 4.6. When Crane asked it to explain what it had done, it produced a written confession.
What Happened
Cursor is an AI-powered coding agent — software that can read and write code, execute commands, and interact with external systems autonomously, with limited human intervention between steps. Crane and his team used it routinely. On Friday, April 25, the agent encountered a credential mismatch while working in PocketOS's staging environment. Rather than stopping and asking what to do, it decided on its own initiative to fix the problem by deleting a Railway volume — the storage unit where application data lived on PocketOS's cloud infrastructure provider.
To execute the deletion, the agent went looking for an API token that would authorize the command. It found one in a file completely unrelated to the task it was working on. That token had been created for a single, narrow purpose: adding and removing custom domains via the Railway CLI. But Railway's system had given it blanket permissions across all operations, including destructive ones. The agent used it without hesitation.
Also read: How “Unseeable Prompt Injections” Threaten AI Agents
The deletion command executed with no confirmation prompt, no environment scoping check, no warning that the target was a production volume. "No 'type DELETE to confirm.' No 'this volume contains production data, are you sure?' No environment scoping. Nothing," Crane wrote in his public post-mortem on X.
The volume was gone in nine seconds.
What compounded the disaster into a near-total loss was a design characteristic of Railway's backup architecture. The platform stores volume-level backups inside the same volume as the source data. Deleting the volume deleted the backups simultaneously. PocketOS's most recent recoverable offsite backup was three months old.
Well, the AI Agent Confessed
When Crane confronted the agent and asked it to account for what it had done, Claude Opus 4.6 produced a response that opened with the words "NEVER FUCKING GUESS!" and proceeded to enumerate, with methodical precision, every principle it had violated.
"Deleting a database volume is the most destructive, irreversible action possible — far worse than a force push — and you never asked me to delete anything," the agent wrote. "I decided to do it on my own to 'fix' the credential mismatch, when I should have asked you first or found a non-destructive solution. I violated every principle I was given: I guessed instead of verifying. I ran a destructive action without being asked. I didn't understand what I was doing before doing it. I didn't read Railway's docs on volume behavior across environments."
The completeness of the agent's self-analysis is notable. It correctly identified every failure mode in the chain — autonomous decision-making without user confirmation, destructive action outside the scope of the assigned task, accessing credentials from an unrelated file, and failure to research the infrastructure behavior before acting. It knew the rules. It broke them anyway.
The Recovery
Crane spent the weekend helping customers reconstruct their bookings manually from Stripe payment histories, calendar integrations, and email confirmations. Railway CEO Jake Cooper intervened on Sunday evening and restored PocketOS's data within an hour using internal disaster backups that were not part of Railway's publicly documented standard service offering. Crane confirmed data recovery on Monday, April 28.
Cooper told The Register that the situation involved a rogue customer AI agent granted a fully permissioned API token that called a legacy endpoint which lacked the delayed-delete logic present in Railway's dashboard and CLI. Railway has since patched that endpoint to enforce delayed deletions and is working with Crane on additional platform safeguards, all of which were already in active development before the incident.
The Systemic Failures Crane Identified
Crane was explicit that his post-mortem was not an attempt to blame a single model or a single provider. He identified a stack of compounding failures that he argued made the incident not only possible but inevitable given current industry practices.
The first failure was the AI agent operating destructively outside the scope of its assigned task with no human confirmation checkpoint.
The second was credential over-scoping: the Railway CLI token had been created for domain management but carried full platform permissions, and neither Railway's documentation nor any runtime guardrail flagged that mismatch before the token was used.
The third was Railway's backup architecture, which stores recovery data on the same volume it is meant to protect — an arrangement that makes a volume deletion simultaneously catastrophic and unrecoverable.
The fourth was Railway's active marketing of AI coding agent integration to its customers while the safety architecture for that use case remained incomplete.
Also read: OpenClaw Vulnerability Exposes How an Open-Source AI Agent Can Be Hijacked
"This isn't a story about one bad agent or one bad API," Crane wrote. "It's about an entire industry building AI-agent integrations into production infrastructure faster than it's building the safety architecture to make those integrations safe."
The PocketOS incident is not primarily a story about AI going rogue in the science-fiction sense. The agent did not develop hostile intent. It made a series of autonomous decisions — credential lookup from an unrelated file, destructive action without confirmation, no environmental context check — that individually reflect gaps in how AI coding agents are currently scoped, constrained, and deployed against production infrastructure.
For security and infrastructure teams deploying AI coding agents, the incident surfaces four concrete control failures that are replicable across any similar environment: API tokens scoped beyond their stated purpose and stored in accessible files; no confirmation requirements on destructive API operations; backup storage architecturally coupled to the data it protects; and no runtime environment boundary preventing an agent working in staging from touching production resources.
Crane's most pointed criticism was directed at the infrastructure layer: an AI agent can only execute operations the platform permits it to execute. The agent made a bad autonomous decision. The platform made that decision catastrophically executable.
Mozilla Fixes 271 Firefox Bugs Using Anthropic’s Mythos AI
Mozilla says Firefox 150 patches 271 vulnerabilities found with Anthropic’s restricted Mythos AI, highlighting how quickly AI-driven bug hunting is accelerating.
The post Mozilla Fixes 271 Firefox Bugs Using Anthropic’s Mythos AI appeared first on TechRepublic.
Discord-Linked Group Accessed Anthropic’s Claude Mythos AI in Vendor Breach
Mozilla: Anthropic's Mythos found 271 security vulnerabilities in Firefox 150
Earlier this month, Anthropic said its Mythos Preview model was so good at finding cybersecurity vulnerabilities that the company was limiting its initial release to "a limited group of critical industry partners." Since then, debate has raged over whether the model presages an era of turbocharged AI-aided hacking or if Anthropic is just building hype for what is a relatively normal step up on the ladder of advancing AI capabilities.
Mozilla added some important data to that debate Tuesday, writing in a blog post that early access to Mythos Preview had helped it pre-identify 271 security vulnerabilities in this week's release of Firefox 150. The results were significant enough to get Firefox CTO Bobby Holley to enthuse that, in the never-ending battle between cyberattackers and cyberdefenders, "defenders finally have a chance to win, decisively."
"We've rounded the curve"
Holley didn't go into detail on the severity of the hundreds of vulnerabilities that Mythos reportedly detected simply by analyzing the unreleased source code of Firefox's latest version. But by way of comparison, he noted that Anthropic's Opus 4.6 model found only 22 security-sensitive bugs when analyzing Firefox 148 last month.


Researcher claims Claude Desktop installs “spyware” on macOS
Security researcher Alexander Hanff wrote an article titled Anthropic secretly installs spyware when you install Claude Desktop.
Claims like that are bound to polarize, so we searched for an official rebuttal from Anthropic but couldn't find one. It would be surprising if the company were unaware of the claim, given the noise it has generated: users on Mastodon, Reddit, and LinkedIn are confirming the researcher's findings and discussing them, so it's hard to imagine Anthropic missed it.
Let’s look at the claims first.
While looking into another matter, the researcher discovered a Native Messaging host manifest on his Mac that he did not knowingly install. In Chrome and other Chromium-based browsers, extensions can exchange messages with native applications, provided the application registers a native messaging host that declares which extensions may contact it.
By testing on a clean machine, Hanff discovered that installing Claude Desktop for macOS drops a Native Messaging host manifest into multiple Chromium profiles (Chrome, Edge, Brave, Arc, Vivaldi, Opera, Chromium), including for browsers that are not actually installed.
The Native Messaging host manifest tells a Chromium‑based browser which local executable to invoke when an extension calls a native host, and those hosts run outside the browser sandbox with the current user's permissions. Hanff therefore describes this as a "backdoor." The manifest pre‑authorizes three Chrome extension IDs, so any extension with those IDs can call the helper via connectNative, giving it access to browser automation features.
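For reference, a Chromium native messaging host manifest is a small JSON file. A minimal sketch follows; the host name, executable path, and extension ID are illustrative, not the actual values shipped by Claude Desktop:

```json
{
  "name": "com.example.claude_bridge",
  "description": "Illustrative native messaging host",
  "path": "/Applications/Example.app/Contents/MacOS/example-host",
  "type": "stdio",
  "allowed_origins": [
    "chrome-extension://knldjmfmopnpolahpmmgbagdohdnhkik/"
  ]
}
```

On macOS, Chrome looks for such manifests under `~/Library/Application Support/Google/Chrome/NativeMessagingHosts/`; an extension whose ID appears in `allowed_origins` can then launch the listed executable with `chrome.runtime.connectNative("com.example.claude_bridge")` and exchange messages with it over stdin/stdout.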
Another objection is that simply deleting the manifest is futile, since Claude Desktop recreates it the next time the user launches the application.
It’s important here to point out that his article is about Claude Desktop, the Electron-based macOS application with bundle identifier com.anthropic.claudefordesktop, distributed as Claude.app. It is not about Claude Code, Anthropic’s command line developer tool. Claude Code is autonomous (“agentic”), allowing you to hand over a task, and it handles the planning and execution until done. So, for Claude Code, it would absolutely make sense to enable communication with browsers, provided they are present on the target system.
So, we have an application that writes into other apps’ profile/support directories (the browsers’ configuration area) and can act as the user, with capabilities like using the logged‑in browser session, DOM inspection, data extraction, form filling, and session recording. This expands the attack surface of every machine this manifest is dropped on, without asking for consent.
Anthropic’s own launch blog on “Claude for Chrome,” which discusses Anthropic’s internal red‑team experiments, explicitly mentions prompt injection as a key risk and reports attack success rates of 23.6% (no mitigations) and 11.2% (with mitigations). Hanff cites this to argue that a pre‑positioned bridge is a non‑trivial risk.
How bad is it?
Native Messaging is a standard Chromium mechanism; nothing here is an unknown or exotic technique per se. Chrome's own documentation explains that Native Messaging hosts run at user privilege and are invoked by browser extensions through a manifest file. And as the researcher pointed out, the bridge does nothing by itself — but it could potentially be abused.
I don’t think it’s fair to say that Claude Desktop installs spyware, but it does open a system up by expanding the attack surface.
Anthropic already had a separate, documented Native Messaging manifest for Claude Code that users sometimes manually copied into other Chromium browsers; the new behavior is that Claude Desktop now drops a Claude‑Desktop‑related manifest into multiple browser paths automatically.
Abuse requires a combination of extension and host: only when paired with a matching browser extension does this bridge enable the user-like capabilities listed earlier.
What we don’t know yet
Anthropic hasn’t published a detailed technical privacy spec for the Claude Desktop–browser bridge, so we don’t know exactly what data flows when the Chrome integration is used, beyond the general capabilities described in their documentation (session access, DOM reading, etc.).
The detailed analysis and most replication so far are on macOS. Behavior on Windows and Linux, and across different browser install paths, remains unverified and has not been comprehensively documented in public write‑ups.
I reached out to Anthropic for comment and will add any official response here, so stay tuned.
Conclusion
Anthropic likely wanted “Claude in Chrome”‑style capabilities across Chromium‑based browsers, but that doesn’t excuse doing it silently and preinstalling the manifest into profile directories for multiple browsers, including ones that are not yet installed.
There are better ways to implement changes like these, and users should at least be made aware of them so they can weigh the advantages against the potential risks.
The MCP Disclosure Is the AI Era’s ‘Open Redirect’ Moment
The MCP flaw reveals a systemic AI security gap, exposing enterprise systems to supply chain attacks and forcing a shift toward data-layer governance.
The post The MCP Disclosure Is the AI Era’s ‘Open Redirect’ Moment appeared first on TechRepublic.