Picture a corporate conference room with an executive leadership team gathered around the board table, some in person, some on screen. Amidst the usual chatter, there’s an agenda item mysteriously titled “ChatGPT”.

Taking the lead, the CEO notes that key competitors have once again got the jump on them with press releases proclaiming their adoption of the new artificial intelligence-based chatbot for customer service, software development, marketing, legal advice and sales, with the implication being that they’re making substantive cost savings. Market analysts have greeted this news well by upgrading their stocks across the board.

The CTO plugs her laptop into the screen and begins her presentation with Mitch Ratcliffe’s famous quote from Technology Review in 1992: “A computer lets you make more mistakes faster than any other invention with the possible exceptions of handguns and Tequila.”

There are several puzzled and disappointed faces around the boardroom table…

___

OpenAI’s ChatGPT has received significant attention since its launch in November 2022. The response has been varied. Many people have been amazed by its ability to rapidly complete tasks like crafting professional prose, academic essays and even product strategies. On the other hand, as interest groups and professionals collectively stress-tested ChatGPT, its limitations became clear. For every enthusiastic tale of ChatGPT’s capabilities, there are just as many that emphasize a need for restraint and caution.

In this two-part series, we draw on 黑料门’s experience in delivering AI systems to demystify the technology behind ChatGPT. We’ll cover its possible applications, its limitations and corollary implications for organizations that are thinking of building business capabilities around generative language models. But first, let’s take a closer look at how it actually works.


How ChatGPT works

ChatGPT is a type of large language model (LLM for short, sometimes referred to as a generative language model). Before discussing the strengths and limitations of LLMs, it helps to briefly describe how they work.

Much of the razzle-dazzle of LLMs boils down to next “token” prediction, where a token is roughly a word or a fragment of a word. LLMs are trained on large amounts of text data (what’s commonly called a “corpus”). Through this training process, the model builds up an internal numerical representation of words, or “embeddings”, so it can predict the most probable next word for any given word or phrase.
For example, if we give a trained language model a prompt, “Mary had a”, and ask it to complete the sentence, we are essentially asking it: what words are most likely to follow the sequence, “Mary had a…”? Anyone with even the slightest familiarity with nursery rhymes will know that the answer is “little lamb”.
Figure 1: LLMs are able to generate sentences and whole essays through their ability to predict the next highest-probability candidate word for any given word or phrase. They learn to do so through techniques such as masked language modeling.
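
To make the “Mary had a” example concrete, here is a minimal sketch of next-word prediction. It is not how ChatGPT is implemented: it swaps the neural network for simple word-pair counts (a bigram model), and the three-sentence corpus and all function names are assumptions made up for illustration. The principle it shows is the same one described above: pick the word most likely to follow what came before.

```python
from collections import Counter, defaultdict


def train_bigram_model(corpus):
    """Count, for each word, which words follow it in the corpus."""
    follow_counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.lower().split()
        for current, following in zip(tokens, tokens[1:]):
            follow_counts[current][following] += 1
    return follow_counts


def next_word_probabilities(model, word):
    """Turn raw follow-counts for `word` into probabilities."""
    counts = model.get(word.lower(), Counter())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {candidate: n / total for candidate, n in counts.items()}


def complete(model, prompt, max_new_tokens=2):
    """Greedily append the most probable next word, one word at a time."""
    tokens = prompt.lower().split()
    for _ in range(max_new_tokens):
        probs = next_word_probabilities(model, tokens[-1])
        if not probs:
            break  # the model has never seen anything follow this word
        tokens.append(max(probs, key=probs.get))
    return " ".join(tokens)


# Toy corpus (an assumption for illustration; real corpora are vastly larger).
TOY_CORPUS = [
    "mary had a little lamb",
    "mary had a little lamb whose fleece was white as snow",
    "the farmer had a big dog",
]

model = train_bigram_model(TOY_CORPUS)
print(next_word_probabilities(model, "a"))
print(complete(model, "Mary had a"))
```

A real LLM conditions on the whole preceding sequence rather than just the last word, and it learns its probabilities with a neural network rather than by counting, but the generation loop (predict, append, repeat) has the same shape.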

ChatGPT is different from earlier LLMs in that it introduced an additional mechanism: reinforcement learning from human feedback (RLHF). This means human annotators further train the LLM to mitigate undesirable outputs.


This train doesn’t stop at every station: Limitations of large language models

ChatGPT’s uncanny text generation abilities have no doubt sparked the imagination of many: how can I leverage or recreate a tool like ChatGPT for creating value in my business? However, recent high-profile missteps highlight the limitations of LLMs; these need to be properly understood in order to set expectations and avoid disappointment when we invest money and time in generative AI projects. With that in mind, let’s look at three key limitations of LLMs.


1. They are sophisticated but probabilistic auto-complete machines

LLMs are essentially “auto-complete” machines that operate using sophisticated pattern recognition methods. They parrot and reconstruct prose that they have been trained on, but they do so probabilistically; that’s why they will often produce content that is inaccurate or fabricated. These so-called “hallucinations” are a behavior that’s inherent to any generative system.
As our colleague David Johnston notes, an LLM cannot, and should not be expected to, complete tasks that require logical reasoning.
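
The probabilistic behavior described above can be sketched in a few lines. This is an illustration under stated assumptions, not ChatGPT’s actual decoding code: the candidate words and their scores are hypothetical, and “temperature” is one common sampling control used here to show how the same scores can yield either near-deterministic or highly varied output.

```python
import math
import random


def softmax_with_temperature(scores, temperature):
    """Convert raw scores to probabilities; lower temperature sharpens them."""
    scaled = [score / temperature for score in scores.values()]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return {word: e / total for word, e in zip(scores, exps)}


def sample_next_word(scores, temperature, rng):
    """Draw one word at random, weighted by its probability."""
    probs = softmax_with_temperature(scores, temperature)
    words = list(probs)
    weights = [probs[word] for word in words]
    return rng.choices(words, weights=weights, k=1)[0]


# Hypothetical scores for words that might follow "Mary had a little".
HYPOTHETICAL_SCORES = {"lamb": 3.0, "dog": 1.5, "dragon": 0.5}

rng = random.Random(0)  # fixed seed so repeated runs give the same draws
low_temp_draws = [sample_next_word(HYPOTHETICAL_SCORES, 0.1, rng) for _ in range(1000)]
high_temp_draws = [sample_next_word(HYPOTHETICAL_SCORES, 2.0, rng) for _ in range(1000)]

print("low temperature, share of 'lamb':", low_temp_draws.count("lamb") / 1000)
print("high temperature, share of 'lamb':", high_temp_draws.count("lamb") / 1000)
```

Note that nothing in the sampling step checks whether “dragon” is true or sensible; it is simply a lower-probability candidate that will sometimes be chosen. That is the sense in which plausible-sounding but wrong output is inherent to the mechanism rather than an occasional bug.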


2. Their text generation abilities are unsuitable in contexts with low fault tolerance

The corollary of the first limitation is that LLMs’ text generation capabilities are less suitable for tasks with low “fault tolerance.” For example, legal writing and tax advice have low fault tolerance and are highly specific use cases that warrant expertise, accountability and trust, not just words on pages. Furthermore, such scenarios typically warrant factually accurate, not probabilistically composed, prose.

On the other hand, tasks such as creative writing may in some cases have high fault tolerance, and the high variance of LLMs may actually be a feature rather than a bug, in that it can help generate creative and valuable options that humans hadn’t considered before. Even then, human moderation is absolutely necessary to avoid exposing your customers to uninhibited and potentially hallucinatory chatbot responses.

Having said that, LLMs have other useful capabilities beyond the free-form, prompt-based text generation that ChatGPT is known for. These include classification, sentiment analysis, summarization, question-answering and translation, among others. Since 2018, LLMs such as BERT and GPT-2 could already achieve strong results on these Natural Language Processing (NLP) tasks. These capabilities are narrower, more testable, and can be applied to solve business problems such as knowledge management, document classification and customer support, to name just a few.
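
One way to see why these narrower tasks are “more testable” is that their outputs can be scored against labeled examples. The sketch below assumes a hypothetical support-ticket classification task; `classify_ticket` is a keyword-based stand-in for a fine-tuned model, and the labeled examples are invented for illustration. The point is the evaluation harness, which works the same whatever sits behind the classifier.

```python
def classify_ticket(text):
    """Hypothetical stand-in for a fine-tuned model: simple keyword rules."""
    lowered = text.lower()
    if "refund" in lowered or "charge" in lowered:
        return "billing"
    if "password" in lowered or "login" in lowered:
        return "account"
    return "other"


# Invented labeled examples; in practice these come from your own domain data.
LABELED_EXAMPLES = [
    ("I was charged twice this month", "billing"),
    ("Please refund my last order", "billing"),
    ("I forgot my password", "account"),
    ("Login page keeps failing", "account"),
    ("Do you ship to Canada?", "other"),
    ("My invoice looks wrong", "billing"),
]


def accuracy(classifier, examples):
    """Fraction of examples where the predicted label matches the expected one."""
    correct = sum(1 for text, label in examples if classifier(text) == label)
    return correct / len(examples)


def misclassified(classifier, examples):
    """Return (text, expected, predicted) for every wrong prediction."""
    return [
        (text, label, classifier(text))
        for text, label in examples
        if classifier(text) != label
    ]


print("accuracy:", accuracy(classify_ticket, LABELED_EXAMPLES))
print("errors:", misclassified(classify_ticket, LABELED_EXAMPLES))
```

Because the label set is fixed, a wrong answer is unambiguous and countable, which is much harder to say about a paragraph of free-form generated text.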


3. Data security and privacy risks
There are likely to be legal and regulatory constraints on the movement and storage of your organization’s and your customers’ data. The widespread use of ChatGPT among knowledge workers introduces a serious risk of exposing confidential data to a third-party system. This data may be used to further train the model, which means it could potentially be regurgitated to the public.


So, where does that leave us?

ChatGPT’s limitations don’t mean that we should write off the potential of generative AI in solving important problems. However, by being aware of these limitations we can ensure we have realistic expectations about what it can actually deliver. Here are some high-level guidelines when considering using generative language models to solve business problems:

Start with business use cases with a high fault tolerance.

Fine-tune pre-trained LLMs on domain-specific data (what’s also known as transfer learning). Do this using a secure cloud infrastructure or approved PaaS or MLaaS technologies with the necessary data security and privacy controls to create models that are relevant to your business domain.

Ensure that LLMs are always monitored by a human who can verify and moderate their content.

Even with those guidelines in mind, it’s important to remember that training high-quality generative language models is costly and challenging in terms of compute, testing, deployment and governance. For that reason we should be careful not to fall into the trap of a tech-first approach: we have a shiny hammer, what can we hit with it? Instead, organizations need to zoom out and (i) start with finding a high-value problem worth solving, and (ii) find safe, lightweight, cost-effective ways to solve it iteratively.

In our experience, the delivery of AI solutions must be undergirded by foundational practices in agile engineering, Lean delivery and product thinking, in order to deliver value rapidly, reliably and responsibly. Without those elements in place, organizations are likely to get trapped in tedious AI development, take on avoidable delivery risks, or even release products that users didn’t want or need. The result is frustration, disappointment and botched investments.

We’ll explore that in part two of this blog, where we share five recommendations that we’ve distilled from our experience delivering AI solutions at industry-leading clients.
Disclaimer: The statements and opinions expressed in this article are those of the author(s) and do not necessarily reflect the positions of 黑料门.