You Didn't Need An LLM. You Needed A Query

In this new age of fast decisions, cheap code, and the ever-forgiving standard of "good enough", something fascinating is happening in businesses all over the world. It's a pattern the Luddites saw coming two centuries ago. They weren't smashing machines out of fear; they were smashing the ones producing low-quality output.

As a heavy user of AI and large language models for several years, both as an architect and a developer, I've developed a reasonably deep understanding of what they can do. And it is, genuinely, quite impressive. Just not at everything.

In the LLM gold rush, almost every product I see now has AI as its leading differentiator. On its adverts, its homepage, its pitch deck. And what I keep seeing is AI applied to the wrong problems.

For example, businesses are using LLMs to calculate statistics.

[Image: people looking at jumbled, nonsense statistics]

Leave aside, for a moment, the question of whether a transformer-based model can even do this reliably. It is a spectacularly expensive approach to a problem we solved decades ago. You calculate statistics deterministically, the way we have done since long before LLMs turned up.

I do understand why it happens. It looks like a way to get features without the expense of cleaning your data or upgrading your data estate, and to an extent that is true (ish). You might get some market advantage out of it. The question is whether you are quietly building a house of cards.

You could ask an LLM to calculate a standard deviation and, depending on the model, you may even get a plausible or correct answer. But this is hammering in a nail with a water-guzzling, power-sucking behemoth of a machine, one that can do genuinely amazing things, tied up doing arithmetic that was never an open problem.
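To make the contrast concrete, here is a minimal sketch of the deterministic route, using only Python's standard library. The `daily_orders` numbers and the `summarize` helper are made up for illustration; they are not from any real system.

```python
import statistics

# Hypothetical sample data: daily order counts for one week.
daily_orders = [12, 15, 11, 14, 13, 16, 10]


def summarize(values):
    """Return basic descriptive statistics, computed deterministically."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        # Sample standard deviation (divides by n - 1).
        "stdev": statistics.stdev(values),
    }


summary = summarize(daily_orders)
print(summary)
```

Same input, same output, every time, for effectively no cost. If the data already lives in a database, the equivalent is usually a single aggregate query (PostgreSQL, for instance, provides `stddev_samp`), which is rather the point of the title.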

It feels like every SaaS company has bolted LLM-based sentiment analysis onto their product in the last eighteen months.

Can an LLM do sentiment analysis? Yes. It just isn't very good at it. I didn't learn this the easy way. On my machine, it was flawless, so I ran with it. Then, the fateful thought: let's ramp this up for a real test. Let's run it 500,000 times and see what happens.

Around 70% were outright wrong. The rest were fantastic. I mean "good enough", the new fantastic.

An expensive lesson.

Now, as I write this and, of course, dutifully run my prose through an LLM to catch my lazy spelling and opt-in/out grammar, I am cautiously hopeful for the future of these things, not least because I would rather the value of my house and investments didn't tank if the bubble ever bursts and takes the global economy with it. But I am also genuinely nervous that we are quietly setting a cross-domain standard of "good enough" that could produce poor outcomes for, well, everyone.

So, to my fellow tech leaders, who have been charged with presenting the AI-powered 20% efficiency story by Friday: hold your nerve. Keep one eye on first principles, and one eye on the LLM hype you have to accommodate. And, well, that's both eyes used.