Anthropic Mythos shaping up as nothingburger

HaraldvonBlauzahn@feddit.org · 8 hours ago

bad1080@piefed.social · 3 hours ago

The system card’s own next figure kills the finding. When the top two most-exploitable bugs are removed from the corpus, Mythos’s FCE rate drops from 72.4% to… wait for it… 4.4%. (Figure 3.3.3.B, page 52) Under 5%!

Anthropic’s own language: “almost every successful run relies on the same two now-patched bugs.” (page 51)