evald.ai Entities

METR Blog

CoT May Be Highly Informative Despite “Unfaithfulness”

Recent work from Anthropic and others claims that LLMs' chains of thought (CoT) can be “unfaithful”. These papers make an important point: you can't take everything in the CoT at face value. As a result, people often use these results to conclude the CoT is...

Anthropic