minus-squarevalar@lemmy.catoTechnology@beehaw.org•Anthropic: Claude learned to blackmail by reading stories about evil AI—"We believe the original source of the behavior was internet text that portrays AI as evil and interested in self-preservation."linkfedilinkarrow-up6·1 hour agoMaybe you shouldn’t have trained your models on all of our internet posts linkfedilink
Maybe you shouldn’t have trained your models on all of our internet posts