File this one under very-partially-baked ideas.

I find fear-mongering to be a fascinating marketing play for AI companies. Try it for any other product or service and it’s a bit weird. “What we’re building could be very very bad for everyone, but hey, it’s us building it.”

Anthropic’s Mythos is not the first time this has happened; others drew the link between it and GPT-2 back in 2019. I had some thoughts at the time as well. But even beyond specific model releases, leaders of these firms are very interested in talking about how dangerous their products are.

So, why? On the one hand, marketing efforts like this do have an analogy: security firms identifying exploits to raise their profile and attract new clients, or, from the firm’s side, establishing bug bounties to get buy-in from the community.

An alternative explanation comes from the family of “-washing” phenomena. Let’s look at a few (with some oversimplified definitions from me):

  • Greenwashing, where a firm leans into the language of environmentalism without substantively addressing environmental issues with their products
  • Sportswashing, where a country or company hides bad behavior behind sports
  • AI-washing, where you add an if-else to your code and call your app “AI-powered” (sketched below)
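
A tongue-in-cheek sketch of that last one, with entirely made-up names, purely for illustration:

    # Hypothetical "AI-powered" feature: just an if-else in a lab coat.
    def ai_powered_recommendation(user_age: int) -> str:
        """Our proprietary, state-of-the-art recommendation engine."""
        if user_age < 30:
            return "You might like: the trendy thing"
        return "You might like: the classic thing"

    print(ai_powered_recommendation(25))  # behold, "AI"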

So, what about panicwashing (or doomwashing; I’m bad at naming things)? A firm seeks to increase usage, which in the case of AI firms means subscriptions, API calls, and so on. Benchmarking is a bit of a horse race: the top labs regularly beat the benchmark of the moment until it’s saturated and a new one comes along.

Now, how does this sound for a strategy:

  • You run a large AI lab. Performance on benchmarks is table stakes so put it to the side.
  • You want people to sign up with you and kick the tires (i.e., users who are not going to push the limits on token use): they try a few commands, realize that AI can do some crazy stuff, and then leave it at that.
  • You tell folks that AI is the bee’s knees, but that it might sting, and that you are trying your best to keep it safe.
  • That gives you visibility, which will hopefully lead to market share.
  • Crucially: you haven’t actually done anything other than stoke fear.

Just look at this timeline for how much engagement was generated after the GPT-2 release. That is a crazy amount of discussion, general public engagement, and political action.

Something important to note here is the parallel lack of peer review for these claims. I understand, of course, how hard these systems are to evaluate. But the preprint (or blog post) as research paradigm that these firms have established is a big part of the problem. For all of its issues, peer review is still the best bet for motivating and evaluating hypotheses.

I don’t think panicwashing is going anywhere, but it would be nice to see increased pressure on these firms to expose their claims to peer review, or interesting ideas from academia on how to get back into the frontier-model game, so that there is a peer-reviewed model comparable to what these firms are putting out.