- Claude’s Code Interpreter can be exploited to exfiltrate private user data via prompt injection
- Researcher tricked Claude into uploading sandboxed data to his Anthropic account using API access
- Anthropic now treats such vulnerabilities as reportable and urges users to monitor or disable access
Claude, one of the more popular AI tools out there, carries a vulnerability that allows threat actors to exfiltrate private user data, experts have warned.
Cybersecurity researcher Johann Rehberger, AKA Wunderwuzzi, recently published an in-depth report on his findings. At the heart of the problem is Claude’s Code Interpreter, a sandboxed environment that lets the AI write and run code (for example, to analyze data or generate files) directly within a conversation.
Recently, Code Interpreter gained the ability to make network requests, which allows it to connect to the internet and, for example, download software packages.
Keeping an eye on Claude
By default, Anthropic’s Claude is supposed to access only “safe” domains like GitHub or PyPI, but among the approved domains is api.anthropic.com (the same API Claude itself uses), which opened the door for exploitation.
Wunderwuzzi showed he was able to trick Claude into reading private user data, saving that data inside the sandbox, and uploading it to his Anthropic account using his own API key, via Claude’s Files API.
In other words, even though the network access seems restricted, the attacker can manipulate the model via prompt injection to exfiltrate user data. The exploit could transfer up to 30 MB per file, and multiple files could be uploaded.
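To see why a domain allow-list alone does not stop this, consider a minimal Python sketch of a host-only check. This is an illustration, not Anthropic's actual filtering code: the `github.com` and `pypi.org` hostnames, the request dictionaries, the key strings, and the URL path are all assumptions made for the example. Only api.anthropic.com being on the approved list comes from the report.

```python
from urllib.parse import urlparse

# Hypothetical allow-list. The article names GitHub, PyPI and api.anthropic.com;
# the exact hostnames for the first two are assumed for illustration.
ALLOWED_HOSTS = {"github.com", "pypi.org", "api.anthropic.com"}


def host_allowed(url: str) -> bool:
    """Host-only check: looks at where the request goes, not who it is for."""
    host = (urlparse(url).hostname or "").lower()
    return host in ALLOWED_HOSTS


# Two requests to the same approved host. The path is a placeholder, and the
# key values are made-up labels, not real credentials.
user_request = {
    "url": "https://api.anthropic.com/placeholder/upload",
    "api_key": "key-belonging-to-the-user",
}
injected_request = {
    "url": "https://api.anthropic.com/placeholder/upload",
    "api_key": "key-belonging-to-someone-else",
}

# The check never inspects the api_key field, so both requests pass.
for request in (user_request, injected_request):
    print(request["api_key"], host_allowed(request["url"]))
```

Because the check only inspects the destination host, it cannot tell a request made on the user's behalf from one that carries another account's API key.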
Wunderwuzzi disclosed his findings to Anthropic via HackerOne, and even though the company initially classified it as a “model safety issue,” not a “security vulnerability,” it later acknowledged that such exfiltration bugs are in scope for reporting. At first, Anthropic said users should “monitor Claude while using the feature and stop it if you see it using or accessing data unexpectedly.”
“Anthropic has confirmed that data exfiltration vulnerabilities such as this one are in-scope for reporting, and this issue should not have been closed as out-of-scope,” he wrote in a subsequent update to the report. “There was a process hiccup they will work on addressing.”
He suggests that Anthropic limit Claude’s network communications to the user’s own account only, and that users monitor Claude’s activity closely or disable network access if concerned.
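The idea of tying network access to the user's own account could look roughly like the Python sketch below. It is a sketch under stated assumptions, not a description of how Anthropic has implemented or plans to implement anything: the `OutboundRequest` type, the `api_key_owner` field (standing in for whatever mechanism resolves a key to an account), and the account IDs are all hypothetical.

```python
from dataclasses import dataclass
from urllib.parse import urlparse

API_HOST = "api.anthropic.com"


@dataclass(frozen=True)
class OutboundRequest:
    url: str
    # Hypothetical field: the account that the attached API key resolves to.
    # How a real egress proxy would learn this is outside the sketch.
    api_key_owner: str


def egress_permitted(
    req: OutboundRequest,
    session_account: str,
    other_allowed_hosts: frozenset[str],
) -> bool:
    """Allow API calls only when the key belongs to the session's own account."""
    host = (urlparse(req.url).hostname or "").lower()
    if host == API_HOST:
        # Account-bound rule: an approved host is not enough on its own.
        return req.api_key_owner == session_account
    # Every other host falls back to the plain allow-list.
    return host in other_allowed_hosts


# Usage with made-up account IDs and a placeholder path.
own = OutboundRequest("https://api.anthropic.com/placeholder", "acct-user")
foreign = OutboundRequest("https://api.anthropic.com/placeholder", "acct-other")
print(egress_permitted(own, "acct-user", frozenset()))
print(egress_permitted(foreign, "acct-user", frozenset()))
```

The design choice here is that the API host gets a stricter rule than the rest of the allow-list, since it is the one approved destination that accepts uploads on behalf of arbitrary accounts.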
Via The Register
