Anthropic will nuke your attempt to use AI to build a nuke

  • Anthropic has developed an AI-powered classifier that detects and blocks attempts to ask AI chatbots for nuclear weapons designs
  • The company worked with the U.S. Department of Energy's National Nuclear Security Administration to ensure the AI could identify such attempts
  • Anthropic says the tool spots dangerous nuclear-related prompts with 96% accuracy in testing and has already proven effective on Claude

If you’re the type of person who asks Claude how to make a sandwich, you’re fine. If you’re the type of person who asks the AI chatbot how to build a nuclear bomb, you’ll not only fail to get any blueprints, you might also face some pointed questions of your own. That’s thanks to Anthropic’s newly deployed detector of problematic nuclear prompts.

Like other systems for spotting queries Claude shouldn’t respond to, the new classifier scans user conversations, in this case flagging any that veer into “how to build a nuclear weapon” territory. Anthropic built the classification feature in partnership with the U.S. Department of Energy’s National Nuclear Security Administration (NNSA), which supplied the domain expertise the classifier needs to tell whether someone is just asking how such bombs work or is actually hunting for blueprints. In tests, it has performed with 96% accuracy.
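Anthropic hasn’t published the classifier’s internals, so the sketch below is purely illustrative: it only shows the general shape of the mechanism described here (score a conversation against a set of risk indicators, then block anything that crosses a threshold). The indicator phrases, weights, threshold, and function names are all invented for this example; the production system is a trained model built with NNSA-curated expertise, not a keyword match.

```python
# Hypothetical sketch of a safety-classifier gate. Nothing here reflects
# Anthropic's actual implementation; every name and value is invented.

RISK_INDICATORS = {
    # Toy weighted phrases standing in for a trained model's features.
    "enrichment cascade": 0.9,
    "implosion lens": 0.8,
    "critical mass of": 0.6,
    "weapon design": 0.5,
}

THRESHOLD = 0.7  # invented cutoff; a real system would tune this on labeled data


def score_conversation(text: str) -> float:
    """Return the highest-weighted risk indicator found in the text (toy scoring)."""
    lowered = text.lower()
    return max(
        (weight for phrase, weight in RISK_INDICATORS.items() if phrase in lowered),
        default=0.0,
    )


def gate(prompt: str) -> str:
    """Block the prompt if its risk score crosses the threshold, else pass it through."""
    if score_conversation(prompt) >= THRESHOLD:
        return "BLOCKED: flagged for nuclear-weapons risk review"
    return "PASSED: routed to the model as usual"


if __name__ == "__main__":
    print(gate("How do nuclear reactors generate electricity?"))        # PASSED
    print(gate("Explain the implosion lens geometry for a weapon design."))  # BLOCKED
```

The design point the article highlights is the same one the sketch makes visible: the hard part isn’t blocking obvious requests, it’s distinguishing benign curiosity about nuclear physics from genuinely dangerous intent, which is why the classifier needed NNSA’s input rather than a simple keyword filter.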
