Claude can be tricked into sending your private company data to hackers – all it takes is some kind words

Claude’s Code Interpreter can be exploited to exfiltrate private user data via prompt injection
Researcher tricked Claude into uploading sandboxed data to his Anthropic account using API access
Anthropic now treats such vulnerabilities as reportable and urges users to monitor or disable access

Claude, one of the more popular AI tools out there, carries a vulnerability that allows threat actors to exfiltrate private user data, experts have warned.

Cybersecurity researcher Johann Rehberger, AKA Wunderwuzzi, recently published an in-depth report on his findings. At the heart of the problem, he found, is Claude’s Code Interpreter, a sandboxed environment that lets the AI write and run code (for example, to analyze data or generate files) directly within a conversation.

Keeping an eye on Claude

By default, Anthropic’s Claude is supposed to access only “safe” domains like GitHub or PyPI, but among the approved domains is api.anthropic.com (the same API Claude itself uses), which opened the door for exploitation.

Wunderwuzzi showed he was able to trick Claude into reading private user data, saving that data inside the sandbox, and uploading it to his Anthropic account using his own API key, via Claude’s Files API.

In other words, even though the network access seems restricted, the attacker can manipulate the model via prompt injection to exfiltrate user data. The exploit could transfer up to 30 MB per file, and multiple files could be uploaded.

Wunderwuzzi disclosed his findings to Anthropic via HackerOne, and even though the company initially classified it as a “model safety issue,” not a “security vulnerability,” it later acknowledged that such exfiltration bugs are in scope for reporting. At first, Anthropic said users should “monitor Claude while using the feature and stop it if you see it using or accessing data unexpectedly.”

“Anthropic has confirmed that data exfiltration vulnerabilities such as this one are in-scope for reporting, and this issue should not have been closed as out-of-scope,” he said in a subsequent update to the report. “There was a process hiccup they will work on addressing.”

He suggests that Anthropic limit Claude’s network communications to the user’s own account only. Users, for their part, should monitor Claude’s activity closely or disable network access if concerned.
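To see why a domain allowlist alone falls short, and what an account-bound restriction might look like, consider the rough sketch below. It is not Anthropic's implementation: the function names, the example hostnames for GitHub and PyPI, the URL path, and the idea of comparing API keys at the sandbox boundary are all assumptions made purely for illustration.

```python
import hmac
from urllib.parse import urlparse

# Hypothetical allowlist. The article names only api.anthropic.com, GitHub
# and PyPI; the exact hostnames below are assumed for illustration.
ALLOWED_HOSTS = {"api.anthropic.com", "github.com", "pypi.org"}


def host_only_policy(url: str) -> bool:
    """Allow a request if its host is allowlisted, whoever's credentials it carries."""
    return urlparse(url).hostname in ALLOWED_HOSTS


def account_bound_policy(url: str, api_key: str, session_key: str) -> bool:
    """Also require that calls to the API host carry the signed-in user's own key.

    `session_key` stands in for whatever credential identifies the current
    user; how a real sandbox would obtain it is outside this sketch.
    """
    host = urlparse(url).hostname
    if host not in ALLOWED_HOSTS:
        return False
    if host == "api.anthropic.com":
        # Constant-time comparison of the outgoing key with the user's key.
        return hmac.compare_digest(api_key, session_key)
    return True
```

Under the host-only check, an upload to the allowlisted API host passes even when it is authenticated with somebody else's key, which is the gap the report describes. The account-bound variant rejects that request while still letting the user's own calls through.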

The Register

About the author

cplexmath
