LLMs as decision-making systems: a new paradigm and its limitations

Building systems with LLMs means changing the paradigm of how decision-making systems are developed.

In the old paradigm of traditional software development, I tell the computer what to do to achieve a result, step by step, making sure to account for every possible circumstance. I know that every time the computer follows my instructions it will arrive at the same result. The system is deterministic.
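That fixed, inspectable mapping from inputs to outputs is the whole point. A minimal sketch, reusing the time-zone theme from later in this post (the function is purely illustrative, not code from any real system):

```python
def to_utc_minus_5(utc_hour: int, utc_minute: int) -> tuple[int, int]:
    """Convert a UTC clock reading to UTC-5, wrapping around midnight.

    Deterministic: the same inputs always produce the same output,
    so a bug here can be found, fixed, and regression-tested.
    """
    return ((utc_hour - 5) % 24, utc_minute)
```

Call it twice with the same arguments and you get the same answer twice. That guarantee, and everything built on it (debuggers, unit tests, regression suites), is exactly what the new paradigm gives up.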

In the new paradigm, I tell the LLM what to do in plain English and it simply goes and does it, without me explaining in detail how to get there.

The problem is that an LLM can and does make mistakes. I can ask an LLM-based system what time it is, and instead of using the current-time service I taught it to use, it can just make up a random time.
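A minimal sketch of that failure mode, with a hypothetical stub standing in for the model (no real LLM API is used, and the `hallucinate` flag is just a stand-in for the model's unpredictability): nothing in the architecture forces the model to emit the tool call it was taught.

```python
import json

def lookup_time() -> str:
    """Placeholder for the real current-time service."""
    return "12:00"

def fake_llm(prompt: str, hallucinate: bool = False) -> str:
    """Hypothetical model stub: it may emit the tool call it was taught,
    or it may just make an answer up. Nothing forces either choice."""
    if hallucinate:
        return "It is 3:47 PM."  # fabricated; no clock was consulted
    return json.dumps({"tool": "current_time", "args": {}})

def answer(prompt: str, hallucinate: bool = False) -> str:
    reply = fake_llm(prompt, hallucinate)
    try:
        call = json.loads(reply)
        if isinstance(call, dict) and call.get("tool") == "current_time":
            return lookup_time()  # the path we taught it to take
    except json.JSONDecodeError:
        pass
    return reply  # whatever the model said, errors and all
```

The harness can only route a tool call if the model chooses to produce one; when it free-associates an answer instead, the made-up time flows straight through to the user.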

In the old paradigm I would sigh, pull up the source code, isolate the code responsible for calculating the time, find the error (it’s UTC-5, you IDIOT), correct it, test it, run regression tests to make sure I didn’t break anything that worked before, deploy it, and that would be the end of it.

None of this is possible in the new paradigm. The LLM is a neural network, and despite the fact that I know how it works (ok, I don’t, but the people who built it definitely do), I cannot explain how it came up with the answer it did. I can re-run the inference, even do it by hand for all the billions of artificial neurons, only to conclude a few decades later: ok, I see that the answer was correct according to the current weights and biases, while knowing that it was the wrong answer, because the correct answer was to make a call to the time service, not to respond with a random number. And I would have absolutely no idea what to change to make this thing give the correct answer the next time it faces the same question. There are too many parameters, and any change will affect how the system behaves in other circumstances. My brain simply does not have enough capacity to figure out which weights or biases to change. Neural networks are not “programmed” or “coded” by humans, by design.

This inability to fix issues is a fundamental difference between the old paradigm and the new one. An LLM-based system will always make mistakes. You just have to learn to live with that.

The only way it can be addressed (that I am aware of) is to put a human in the middle. That drastically reduces the usefulness of the system, de facto prohibiting the creation of autonomous decision-making systems.
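The human-in-the-middle pattern amounts to a gate in front of every action the model proposes. A sketch, with all names illustrative rather than taken from any real framework:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Proposal:
    """An action the model wants to take, plus its stated reasoning."""
    action: str
    rationale: str

def execute_with_review(proposal: Proposal,
                        approve: Callable[[Proposal], bool],
                        run: Callable[[str], str]) -> str:
    """Nothing the model proposes runs until a human reviewer approves it."""
    if approve(proposal):
        return run(proposal.action)
    return "rejected: escalated to a human operator"
```

The bottleneck is `approve`: every single decision waits on a person, which is precisely the loss of autonomy described above. Remove the gate and you are back to unreviewed, unexplainable mistakes.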

Which makes sense. If you have a system that can make a right or wrong decision, and you don’t know how often wrong decisions are being made, nor can you do anything to improve the odds of getting the right decision, do you really want this system to be in charge of… anything?

That’s not to say that these systems are useless; they clearly are not. They produce output that used to be produced only by humans. But since that output can be riddled with errors by design, it has to be checked and re-checked by a human, and not just any human, but an expert in the field.

Sure, have an LLM write the code from a spec. But have a software developer with years of experience in that particular language and framework, plus a security expert, review it.

Have an LLM write an article to summarize events or people’s opinions on a particular topic. But have a journalist and a subject matter expert review the end product and make revisions.

Have an LLM translate a text from one language to another. But have a bilingual person and a subject matter expert review the end result and make corrections.

Have a neural network synthesize a picture to be used in a marketing campaign. Then have a graphic designer and a marketing expert review the end result and correct what’s needed.

Otherwise, be ready for your app to be constantly breaking and getting hacked, your data to be stolen, your readers to abandon your articles or translations as low-quality LLM slop, and your clients to abandon your marketing agency over severe damage to their brand image.

What am I missing?
