It was more than 55 years ago that most Americans first became aware of AI via HAL 9000 in Stanley Kubrick’s hit film, 2001: A Space Odyssey.
Of course, 2001: A Space Odyssey is science fiction, and it wouldn’t be until several decades later that the public began discussing AI as a matter of actual science. With the turn of the millennium came technological advancements that, we were told, would pave the way for world-changing AI. Now we're more than 20 years beyond the future imagined by Kubrick and, for the most part, we’re still waiting.
Or are we?
Recently the hopes for HAL-like artificial intelligence have been reignited with the introduction of tools like ChatGPT and Midjourney. And if the hype is to be believed, we're in the midst of a veritable AI revolution. If that's the case though, that revolution hasn't made it to the business world. To the water coolers, sure. But not to the operations in any meaningful way.
Yes, it's true that some media brands are experimenting with generative AI while simultaneously downsizing staff. But those experiments are as much PR stunts as genuine business strategies, and for the most part they've already backfired.
At the same time, as anyone who's used ChatGPT or a similar tool will attest, it really is quite impressive and should, ostensibly, be a powerful resource for business. But so far that's not been the case. So what gives? Why aren't the same capabilities wowing consumers translating into real business value?
To answer that question, we need to first explain a little about how these generative AI tools work.
At the most basic level, AI is technology that can receive and carry out tasks without being given explicit instructions on how those tasks should be accomplished. It’s that ability to work out a solution on its own that allows AI to resemble natural intelligence. And it's that same ability that makes AI so much more flexible and unbounded than traditional technologies.
The current crop of generative AI technologies consists of two main types: text generators (e.g. ChatGPT and the like) and image generators (e.g. Midjourney and the like). Here we'll focus on text generators, which boast a much larger range of potential applications, especially where business is concerned.
These tools aim to achieve artificial general intelligence and pursue that objective chiefly through large language models. Large language models (LLMs) use statistical analysis around defined parameters to analyze huge datasets, learning the patterns and relationships between words and phrases.
Training happens in stages. In pre-training, the model ingests a large amount of unlabeled data. In fine-tuning, it ingests smaller labeled datasets, and its outputs are subjected to human review and scoring to help the model refine performance. Only once performance consistently meets the standard set by those human reviewers is the model considered ready for use.
After training, the LLM can be used to generate new content based on a prompt, mimicking the learned patterns that fit the context of the prompt. If you have a cell phone or use Google, you're already familiar with predictive text technology. You can think of it like that, but on steroids. It doesn't stop at the next word or the rest of the sentence; it produces whole paragraphs.
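To make the predictive-text comparison concrete, here is a minimal sketch of the idea at its smallest scale: a bigram model that counts which word follows which, then extends a prompt by sampling a likely next word over and over. It is a toy, not how ChatGPT or any real LLM is built. The corpus and the names `train_bigram_model`, `generate`, and `TOY_CORPUS` are all hypothetical, made up for illustration.

```python
import random
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    follower_counts = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        follower_counts[current_word][next_word] += 1
    return follower_counts


def generate(model, prompt_word, length=8, seed=None):
    """Extend a one-word prompt by repeatedly sampling a likely next word."""
    rng = random.Random(seed)
    output = [prompt_word]
    for _ in range(length):
        followers = model.get(output[-1])
        if not followers:  # word never seen with a follower: stop early
            break
        candidates = list(followers.keys())
        weights = list(followers.values())
        output.append(rng.choices(candidates, weights=weights, k=1)[0])
    return " ".join(output)


# Hypothetical toy corpus, invented for illustration only.
TOY_CORPUS = (
    "the model predicts the next word "
    "the model learns patterns in text "
    "the next word depends on the previous word"
)

model = train_bigram_model(TOY_CORPUS)
print(generate(model, "the", length=6, seed=1))
```

Note that nothing in this sketch checks whether the output is true. It only checks which word tended to follow which in the training text, and that is the point of the comparison.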
Generally speaking though, we don't approach these tools with the expectations of souped-up T9 technology. We expect an oracle. And that's the hitch. When we ask questions, we want the AI to act like a researcher, investigator, or logician — digging in to uncover the truth. But that's not how LLMs work. They don't "think" in terms of finding the right answer. They think in terms of producing a reply that resembles other content on the topic.
According to AI expert Rodney Brooks, "What large language models are good at is saying what an answer should sound like, which is different from what an answer should be."
Looking to this technology, therefore, to fill the role of fact-finder, logical arbiter, researcher, or analyst is deeply problematic. Here are just a few reasons why:
It's designed to faithfully simulate language, not actual thoughts or opinions. It can at once sound exactly like an expert opinion while giving a totally non-expert take. This is why the shortcomings of this AI become more apparent the deeper you dive. At the surface everything tends to look A-okay, but when you get specific and touch on areas of more focused expertise, the wheels quickly come off. (You can test this by asking ChatGPT a pointed and nuanced question on any topic about which you are considerably more knowledgeable than the public.)
When asked about something novel (i.e. not contained in its training), it will draw on old content to form its reply. Users of ChatGPT will have surely encountered this when the tool gets facts wrong, citing dead or defeated rulers as the leaders of their countries. But more insidious than wrong trivia is wrong logic. LLMs show a marked inability to shift their "thinking" in the face of shifting situations. A user might laugh at the obvious absurdity when ChatGPT cites Queen Elizabeth II as today's Sovereign of the UK. Logical errors are far less conspicuous, so users are more likely to accept a faulty premise and build on its shoddy foundations.
And though this liability can be combatted to some degree at the fine-tuning stage, the larger the model becomes (bearing in mind that largeness is part of the core concept), the harder it becomes to impose any semblance of order over the unsupervised training.
Worse still, after training, LLMs do not generally retain an exact copy of their training materials. Instead, they store abstractions of the data they trained on. They "make sense" of their training by forming and holding onto key takeaways — definitive textual patterns, contextual markers, and salient fragments.
This can be seen as the AI equivalent of how the human mind works — reconciling our experiences subject to a modeled understanding of the world. It's why we hold on more tightly to stories than to details and it's why human memory is notoriously unreliable. It's fundamentally derivative. Which is also why LLMs can be so dangerous and why they so frequently output information as fact that is nothing of the sort.
Complicating matters is the fact that LLMs don't actually model the world; they have no connection to the world. They model language and language alone. The result can look a lot like it's based on a real understanding of the larger world, but it's not. It's based on something (contextualized language patterns) that is based on an understanding of the world. There will always be a degree of separation. Sometimes that won't make a big difference. But a lot of times, it will make a huge difference.
In a recent Futurism article, Victor Tangermann distilled the point saying, "current language models aren't able to logically infer meaning, despite making it seem like they can."
What's more, these AI models are not designed to output replies like "I don't know." They are made to be helpful, so for every question in, they deliver one answer out. Even when they don't have the answer. And because they don't hold a copy of their training dataset, they're terrible when asked to deliver precise quotes, numbers, and sources. It's just not how they're meant to work.
And while it's true that LLMs typically cannot precisely recall excerpts from their training sets (let alone anything outside of their training set), that doesn't stop them from trying to reproduce a close approximation, or a fabricated alternative, when prompted. They are GENERATIVE AI tools after all. That's what they're meant to do. In these cases, they're not malfunctioning so much as they're being used incorrectly. Incidentally, this is why the preferred term for their invented statistics, quotes, and sources is "hallucinations." As advocates like to explain, they're not falsifying so much as they're "misremembering."
Though the exact cause is still unknown, this may also have something to do with the problem of drift. Drift refers to the change in a model's ability to successfully carry out a given task over time. For it to be drift, those changes need to be unrelated to any known or intentional system updates. An example of this can be seen in the precipitous decline in ChatGPT's basic math skills.
Many AI enthusiasts excuse these shortcomings, arguing that with further technological refinement these problems can be solved. And while it's probably true that with time technologists will find ways to lessen these problems, they certainly will not go away. That's because they're fundamental to how large language models work.
With all the above-mentioned shortcomings in mind, imagine trying to work with such AI as a core component of your professional workflows. It would be chaos. The same exact task — subject to the same exact circumstances and details — would produce different results every time. Adding insult to injury, none of those results would necessarily stand up in the face of their real world applications.
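One source of that run-to-run variation is that text generators typically sample each next word from a probability distribution rather than always taking the top candidate. The sketch below illustrates the mechanism under stated assumptions: the candidate words and their scores are invented, and `sample_next_word` is a hypothetical helper, not any vendor's API.

```python
import math
import random


def sample_next_word(scores, temperature, rng):
    """Sample one word from raw scores using softmax with a temperature."""
    if temperature <= 0:
        # Greedy decoding: always return the single highest-scoring word.
        return max(scores, key=scores.get)
    scaled = {word: score / temperature for word, score in scores.items()}
    highest = max(scaled.values())  # subtract the max for numerical stability
    weights = {word: math.exp(value - highest) for word, value in scaled.items()}
    words = list(weights)
    return rng.choices(words, weights=[weights[w] for w in words], k=1)[0]


# Hypothetical scores a model might assign to candidate next words.
CANDIDATE_SCORES = {"rose": 2.0, "fell": 1.8, "stalled": 1.5, "doubled": 0.5}

rng = random.Random()  # unseeded, so repeated runs can differ
sampled = [sample_next_word(CANDIDATE_SCORES, 1.0, rng) for _ in range(5)]
greedy = [sample_next_word(CANDIDATE_SCORES, 0, rng) for _ in range(5)]
print("sampled:", sampled)
print("greedy: ", greedy)
```

In this toy, a temperature of 0 makes the choice repeatable, while any positive temperature lets the same input yield "rose" on one run and "fell" on the next. Repeatable is not the same as correct, though: greedy decoding only guarantees the same answer, not the right one.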
It's no wonder that so much of the business world regards generative AI as a sort of gimmick — interesting and fun to play with, but of no essential utility.
Of course, generative large language models aren't the only AI game in town. Lots of businesses are already familiar with generative AI's less sexy cousin: anomaly detection engines (ADEs). And while the value of ADEs is real, it's relatively modest and far from the transformational AI we were promised.
Often, detected anomalies will have no actual bearing on the business. And when they do, they'll just as often not be actionable. And even when they're both relevant and actionable, without the why and the what now, just knowing about the anomaly will be of limited value.
This is why intensive management is usually required to extract good business value from a general AI tool, regardless of whether it's LLM- or ADE-based. Both suffer the same fatal flaw: a lack of context-awareness. To compensate for that flaw, users need to parameterize correlations, adjust thresholds, select features, (somehow) integrate their own business logic, and generally refine performance.
On top of that, they'd need to manually investigate outputs to validate their significance, both mathematically and in terms of the business. With so much manual overhead required of a supposed automation tool, it just doesn't make a lot of sense.
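The threshold-tuning burden is easy to see in even the simplest detector. The following is a minimal sketch of a z-score check, one common textbook approach and not a description of any particular product. The revenue figures and the name `find_anomalies` are hypothetical.

```python
from statistics import mean, stdev


def find_anomalies(values, threshold=2.0):
    """Return (index, value, z_score) for points far from the mean."""
    average = mean(values)
    spread = stdev(values)
    if spread == 0:  # all values identical: nothing can stand out
        return []
    anomalies = []
    for index, value in enumerate(values):
        z_score = (value - average) / spread
        if abs(z_score) > threshold:
            anomalies.append((index, value, round(z_score, 2)))
    return anomalies


# Hypothetical daily revenue figures, invented for illustration.
DAILY_REVENUE = [100, 102, 98, 101, 99, 103, 97, 100, 160, 101]

print(find_anomalies(DAILY_REVENUE, threshold=2.0))
print(find_anomalies(DAILY_REVENUE, threshold=3.0))
```

With these made-up numbers, the spike at index 8 is flagged at a threshold of 2.0 and missed entirely at 3.0. Someone has to pick that threshold, and even when the spike is flagged, the code has no idea whether it reflects a successful campaign, a billing error, or nothing of consequence.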
To unlock the full business potential of AI, we'd need to overcome these limitations on the technology side rather than the user side. We'd need AI that can independently make sense of the world within which it operates. It would need to not only model language and number patterns, but data relationships, operational flows, impact lines, specific situations and applications, as well as the business context.
But that's simply not possible when aiming for artificial general intelligence. Specific problems cannot be solved with a general approach. This is where specialized AI, sometimes referred to as vertical AI, comes into play.
Artificial specialized intelligence is designed to achieve deeper intelligence predicated on a more focused pre-training, training, and refinement process. In addition, specialized AI technologies are enhanced with built-in operational logic, category insights, and context-specific relational mapping. In this way, specialized AI is able to carry out complex tasks with expert-level intelligence and precision. With specialized AI, performance is not only quick and clever, but proficient, consistent, scalable, and, critically, "hallucination"-free.
Accordingly, this brand of AI offers significantly greater and more immediate ROI. Here's the rub though: vertical AI cannot be achieved without first acquiring the specialized subject matter expertise and operational know-how. That's a slow and expensive process that requires AI companies to build entirely new teams to work hand-in-hand with their engineers. And for the most part, the big tech brands and AI leaders aren’t too keen on taking on such a grueling process for niche audiences.
But that doesn't mean that others haven't picked up the torch in pursuit of artificial specialized intelligence. oolo AI is one such torch-bearer. When we set out to develop our Deep Monitoring™ platform, we knew we had to take a specialized approach. So we focused on what we knew, not just in terms of the AI but in terms of the business too.
Having worked for many years in the digital media and AdTech space, we had a really good understanding of the obstacles faced by marketers and monetizers. We created oolo focusing on those obstacles and drawing on our experience in anomaly detection, predictive analysis, natural language processing, real-time classification, and entity recognition.
It’s no easy thing to bake context-awareness into technology, but that’s what’s needed to create a business solution that actually delivers on the promise of AI. I can personally attest to the fact that, for oolo, it was a long, expensive, painful process. And we kept pouring our time and money into the project because we believed in it and we wanted to push the boundaries of what AI could do for business.
It took two years for us to develop a minimum viable product and another two for us to get to where we are now. But it was worth it. We did exactly what we set out to do and we've brought our vision of high-value artificial specialized intelligence to reality.
While it's easy to dismiss the AI yay-sayers, the truth is they're on to something. There is an AI revolution afoot — just not exactly the one everyone's talking about. The revolution is not in artificial general intelligence, but in specialized AI.
Businesses need AI that's capable of modeling more than just language patterns. They need technology that "thinks" in terms of business patterns; AI that can differentiate the pattern makers and breakers that are trivial from those that actually matter. And when something significant is found, specialized AI should be able to trace the issue to its data points of origin, revealing the root cause. To further reinforce the value, this type of AI will even quantify the business impact and model follow-up options to recommend an ideal course of action.
Don't get me wrong. For what they are, the new wave of general generative AI technologies really are quite amazing. They've been able to do something that no other AI has. They've reignited the popular imagination and given the public its first taste of science-based AI that resembles the AI of science fiction.
Sure, there are lots of examples of consumer AI in everyday life, but those things aren't sexy and don't resemble Iron Man’s Jarvis or Westworld’s Dolores. They can be helpful but basic, like Siri and Alexa or Netflix’s content recommendations. Or they can be straight-up boring, like the technology that lets you deposit a physical check through a mobile app. IBM's Watson came close in the 2010s, but it wasn't really available to the public in any meaningful way. You could watch it battle and beat Ken Jennings at a game of trivia, but you couldn't play with it.
Now with ChatGPT, Midjourney and the like, that's finally changed. These tools have given the world a highly appreciable hands-on way to build excitement around AI. And that excitement matters. It creates demand, directs investment, and rekindles imagination. Three vital requirements for the growth and embrace of the technology.
Still, the current crop of high-hype AI tools is a victim of its own design. To achieve mass appeal and maximum flexibility, these tools were built in pursuit of general intelligence. But the intelligence they demonstrate is so general, generic, and without depth that it's not really intelligent in any meaningful or useful way. When the tech cannot relate to the business context, it has no way of turning data distinctions into business directives.
At the same time, despite their limitations, ChatGPT and Midjourney have still delivered a potent proof of concept. They've shown that machines are able to go beyond pre-programmed logic to "make sense" of messiness and operate inductively rather than computationally. And all the same principles that are used in large language models can be used in other AI models — developing artificial understanding in domains beyond language.
Which is exactly what vertical AI innovators are doing. They're combining business understanding with data pattern recognition. From core algorithms to category insights and business logic, specialized AI is being crafted and refined for unique industry and use case datascapes and performance factors.
Most anomaly detection solutions are academic tools — requiring devoted data science and operations personnel to filter out the noise and translate outputs into appreciable business problems/opportunities. Not so oolo. Our technology delivers anomaly detection with a built-in business brain — serving up expert analysis and pertinent context along with every alert.
It's that specialized intelligence that makes ALL the difference — successfully delivering on the promise of transformational AI. And while it's still relatively early days, I'm happy to report that the AI revolution has in fact officially made it to the world of business. It may not be HAL 9000, but if you've seen the movie, you know that's probably for the best 😉.