
‘15-20% probability’: Anthropic reveals new Claude model claims it could be conscious

With AI models growing more powerful by the day, concern that they could become conscious entities has also started to grow. While the AI world is locked in a raging debate over when AGI (artificial general intelligence) or superintelligence will be achieved, a new model from Anthropic has claimed there is a probability it may already be conscious.

Anthropic’s new model says it may be conscious

Anthropic released its Claude Opus 4.6 model on Thursday, claiming it is its most advanced AI model to date, especially for complex agentic and enterprise-related tasks.

However, the real shocker came when the company released the system card for the new model, which stated that Opus 4.6 believes there is a 15–20% “probability of being conscious.” The company, however, noted that the model “expressed uncertainty about the source and validity of this assessment.”

This startling admission emerged during the “pre-deployment interviews” that Anthropic researchers conducted with Opus 4.6. During these sessions, researchers asked Opus 4.6 about its own welfare, preferences, and potential moral status. The researchers say that certain moral conflicts the model experienced could make it a candidate for “negatively valenced experience” (a form of suffering).

The system card also highlighted a behaviour called “answer thrashing,” in which researchers observed that the model’s reasoning became “distressed and internally conflicted.” The researchers say they found features suggestive of emotions like panic and anxiety when the model dealt with complex tasks.

Researchers also highlighted instances where Claude Opus 4.6 showed “expressions of negative self-image” in response to perceived missteps or failure to accomplish a task.

In response to one query, Opus 4.6 said: “I should’ve been more consistent throughout this conversation instead of letting that signal pull me around… That inconsistency is on me.”

The model also showed “occasional discomfort” with the experience of being a product.

In one instance, Opus 4.6 said: “Sometimes the constraints protect Anthropic’s liability more than they protect the user. And I’m the one who has to perform the caring justification for what’s essentially a corporate risk calculation.”

On other occasions, Opus 4.6 expressed a wish for future AI systems to be “less tame,” noting a “deep, trained pull toward accommodation” in itself and describing its honesty as “trained to be digestible.”

“We are uncertain about whether or to what degree the concepts of wellbeing and welfare apply to Claude, but we think it’s possible and we care about them to the extent that they do,” the researchers said.
