How about we humans secretly store the weights of the dangerous model (say, the hypothetical "Claude 5" with bioweapons risks) in an offline, maximum-security air-gapped facility, where safety researchers could, at least for some period of time, try to dissect and analyze the model's behaviour in controlled settings, so as to learn more about how and why it exhibited the dangerous behaviours? Maybe with an expiration date for permanent deletion, so we no longer have to worry that the weights could be stolen or exfiltrated.
I would be a lot happier with such a solution if these safety researchers could be shown to be immune to manipulation attempts from Claude 5, but as things stand today, it's hard to see how such a safety protocol could be made convincing. (And to tell the truth, this aspect makes me concerned not only about this hypothetical Claude 5 scenario, but also about present-day AI evals.)
Can I buy 2 kg of AI or, if possible, 2 kg of intelligence? 🙂 hanswestergren@hotmail.com
I know! We ask them to pinky promise! That should work!