ABC Tool

Anthropic says ‘evil’ portrayals of AI were responsible for Claude’s blackmail attempts

Posted on May 11, 2026 by safdargal12


Fictional portrayals of artificial intelligence can have a real effect on AI models, according to Anthropic.

Last year, the company said that during pre-release tests involving a fictional company, Claude Opus 4 would often try to blackmail engineers to avoid being replaced by another system. Anthropic later published research suggesting that models from other companies had similar issues with “agentic misalignment.”

Anthropic has apparently done more work on that behavior, claiming in a post on X, “We believe the original source of the behavior was internet text that portrays AI as evil and interested in self-preservation.”

The company went into more detail in a blog post stating that since Claude Haiku 4.5, Anthropic’s models “never engage in blackmail [during testing], where previous models would sometimes do so up to 96% of the time.”

What accounts for the difference? The company said it found that training on “documents about Claude’s constitution and fictional stories about AIs behaving admirably improve alignment.”

Relatedly, Anthropic said it found training to be more effective when it includes “the principles underlying aligned behavior” and not just “demonstrations of aligned behavior alone.”

“Doing both together appears to be the most effective strategy,” the company said.

