Navigating the ‘Magic’ (and Caution) of ChatGPT

It’s not an overstatement to say that ChatGPT has taken the world by storm over the last few months, captivating a wide and diverse audience across academic, technology, and business circles alike. Customer experience (CX) leaders and professionals have taken particular notice, because there has never been a chatbot that seems so smart and capable. That stands in contrast to the broader experience with chatbots to date, which has been disappointing and frustrating, with limited success in elegantly automating customer service work in place of humans. Customers have been underwhelmed as well: a recent study by UJET1 found that 72% of people consider chatbots to be a ‘waste of time.’

So is there new hope on the horizon, despite the decade-long graveyard of broken chatbots and dashed automation and self-service promises? Indeed, there is – but some context and clarity are needed to ensure we as a CX community don’t retrace the steps and mistakes of the previous chatbot chapters. A CX leader at a large financial services company recently told me in passing that he had just “asked his team to look into how they could use ChatGPT in their business.” We’re guessing many CX leaders are asking this same question, and we thought this post would be helpful in kick-starting new journeys toward much more elegant and impactful CX self-service and automation.

Let’s start with demystifying ChatGPT and the core technology enabling it…

CHATGPT 101

ChatGPT (Chat Generative Pre-Trained Transformer) is the ‘branded’ chatbot developed by OpenAI and launched in November 2022. In layman’s terms – the ‘ChatGPT for Dummies’ version we all appreciate – ChatGPT is built on top of OpenAI’s GPT-3 family of large language models (LLMs). It is THIS bit of core technology – the LLMs, not ChatGPT itself – that is most important and game-changing as it relates to brand-specific CX utility and impact. So let’s briefly dive into the world of LLMs and what makes them so powerful.

LLMS: WHAT ARE THEY, AND WHY SHOULD WE CARE?

By definition, a large language model (LLM) is a deep learning algorithm that can recognize, summarize, translate, predict, and generate text and other content based on knowledge gained from ingesting massive datasets.  (For those new to the concept of LLMs in general, this article from Kyle Wiggers offers an excellent overview)2.  In practice, ChatGPT’s chatbot is a perfect example of this in action, as the LLMs powering it were trained using massive text databases sourced from the internet.  How massive?…

  • 570 GB worth of data obtained from books, webtexts, Wikipedia, articles, and other pieces of writing/content on the internet
  • 300 billion words fed into the model
  • 175 billion parameters

By ingesting all of this data (with ongoing supervision and tuning – more on this later), the LLMs behind ChatGPT were effectively pre-trained (hence, the ‘P’ in ‘GPT’) to recognize and respond to context, intents, and varied speech and language patterns, among many other things. For CX leaders and practitioners, especially those who have been wrestling with time-consuming legacy approaches to building chatbots and other types of self-service/automation, THIS is the absolute game-changing bit.

Pre-trained LLMs come with thousands of recognized intents ‘out-of-the-box’ – no more intent identification and bot training, one intent at a time.  Pre-trained LLMs recognize variations in speech and language ‘out-of-the-box,’ welcoming natural ‘human’ conversational ways of speaking – no more rigid, ‘robotic’ language within your chatbots and self-service protocols.
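
To make that concrete, here is a minimal sketch of what ‘out-of-the-box’ intent recognition can look like when a general-purpose LLM is simply asked to map a free-form customer utterance onto a brand’s intent labels. The model name, prompt wording, and intent list are illustrative assumptions, not a prescribed implementation; the sketch assumes the OpenAI Python client with an API key in the environment.

```python
# Minimal sketch: zero-shot intent classification with a general-purpose LLM.
# The model, intent labels, and prompt wording below are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

INTENTS = ["check_balance", "report_lost_card", "dispute_charge", "update_address", "other"]

def classify_intent(utterance: str) -> str:
    """Ask the LLM to map a free-form customer utterance onto one known intent label."""
    prompt = (
        "Classify the customer's message into exactly one of these intents: "
        f"{', '.join(INTENTS)}.\n"
        f'Message: "{utterance}"\n'
        "Answer with the intent label only."
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return response.choices[0].message.content.strip()

# Natural, varied phrasing resolves to the same intent with no per-intent training data:
print(classify_intent("hey, I can't find my card anywhere, I think I left it in a cab"))
# expected: report_lost_card
```

The point of the sketch is the absence of per-intent training data: the same call handles paraphrases, slang, and typos that a legacy intent model would typically need many labeled utterances to cover.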

Given the above, pre-trained LLMs move fast – no more waiting for months to deploy even a single, simple use-case chatbot. Straightforward self-service use cases alone are driving double-digit percentage increases in customer self-service across industries, as well as double-digit percentage decreases in contact center volume and associated operational costs – and this is before we even approach the generative (‘G’ in ‘GPT’) capabilities that can impact revenue-generating use cases like sophisticated cross-sell, upsell, and loyalty programs.

Now that we understand the value-driving capabilities of LLMs, it’s also critical to understand the watch-outs and challenges associated with applying this technology to brand-specific CX strategies and programs.

LLMS FOR CX & SELF-SERVICE AUTOMATION: WITH GREAT POWER COMES GREAT RESPONSIBILITY – BE CAREFUL AND PURPOSEFUL

Shortly after ChatGPT’s release, and amidst all the immediate excitement surrounding it, OpenAI Chief Executive Sam Altman tweeted the following public service announcement:

“It’s a mistake to be relying on it for anything important right now. We have lots of work to do on robustness and truthfulness.”

This type of warning might seem counter-intuitive, if not downright confusing, coming from the CEO of the most-talked-about technology in recent memory. But he has good reason for extending such caution at the moment. One of the biggest strengths and challenges of LLMs is that they’re typically trained without much supervision, per our brief mention earlier. In the context of AI, ‘supervision’ is the component of training where an external entity, usually human, provides “correct” answers to the AI as it learns its way forward. On one hand, supervision is very expensive, so removing it can greatly increase the amount of data the AI can be trained with and shorten time to launch/value. On the other hand, removing supervision also means that the AI might learn incorrect behavior. There are already numerous articles highlighting ChatGPT’s inaccuracy, and in some cases its ‘creepiness.’ Following Mr. Altman’s lead and guidance above, OpenAI and Microsoft (its primary investor) continue to iteratively install more guardrails and limits on ChatGPT3 as they learn more from the market and its usage.

For business and CX purposes, there are two primary areas where brands need to be cautious and thoughtful when considering LLMs in their CX strategy. The first is supervision – specifically, the lack of supervision of LLMs can lead to many mistakes if/when the AI directly generates responses. Some examples include:

  • The AI produces responses that appear to be correct or reasonable but are, in fact, false. LLMs understand the relationships between words and context, but they do not have true logical reasoning capabilities. This is problematic because the AI might reuse a sequence of reasoning it found in training, but in an illogical context. For example, if you ask the AI for details about the credit card with a 3% interest rate, it might splice credit card details with savings account details to invent a fictional credit card that has a 3% interest rate.
  • The AI might ignore important context in the user’s question, or answer with content that only applies in a specific context. For example, if a brand offers a basic tier with 9-5 support hours and a premium tier with 24/7 support hours, the AI might respond to a question like “Are you open 24/7?” with “Yes, we are open 24/7 to support our members,” when the correct answer actually depends on the user’s specific plan (see the sketch after this list).
  • The AI provides answers that violate compliance requirements or even contain inappropriate content. Since the AI is trained on massive corpora, potentially including online discussions, it may have learned to respond in ways that are not acceptable in a business setting.
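
As a hedged illustration of the second failure mode above, the sketch below shows one common mitigation: passing the user’s own plan and the brand-approved policy text into the prompt, so the model answers from that context rather than from a generic pattern. The policy text, account tiers, model name, and prompt wording are hypothetical, for illustration only.

```python
# Sketch: grounding a support-hours answer in user-specific, brand-approved context.
# The policy text, tiers, model, and prompt wording are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

SUPPORT_POLICY = {
    "basic": "Support hours for Basic members are 9am-5pm, Monday through Friday.",
    "premium": "Premium members have 24/7 support.",
}

def answer_with_context(question: str, user_tier: str) -> str:
    """Answer from the caller's actual plan details instead of a generic reply."""
    policy = SUPPORT_POLICY[user_tier]
    prompt = (
        "Answer the customer's question using only the policy below. "
        "If the policy does not cover it, say you are not sure.\n"
        f"Policy: {policy}\n"
        f"Question: {question}"
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return response.choices[0].message.content.strip()

# A Basic-tier member asking "Are you open 24/7?" now gets the tier-correct answer
# rather than a confident but wrong "Yes, we are open 24/7."
print(answer_with_context("Are you open 24/7?", user_tier="basic"))
```

The design choice is simply that the model is never asked to answer from its general training data alone; without that grounding, the generic ‘Yes, we are open 24/7’ reply is exactly the kind of mistake an unsupervised model tends to make.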

The second area of caution, which has been somewhat surprising to us, is that some brands are attempting to use LLMs as a development tool rather than a customer-facing one. Specifically, their intent is to leverage LLMs to generate content and training data in service of their traditional/legacy conversational AI solutions – the hypothesis being that they can significantly reduce the amount of effort that goes into building a chatbot (a minimal sketch of this pattern follows the list below). However, brands attempting this approach still give up much of the benefit of LLMs and inherit new problems, including:

  • Data biases. Content generated by an LLM reflects the terminology the model was trained on and will use certain terms more often than others. Previous-generation models learn from surface patterns, so if a specific term shows up frequently in the generated data, the model might incorrectly learn that the term implies a specific intent or meaning.
  • Lack of deep understanding. What an LLM produces is only text, however fluent – the conversational context and deep understanding of what the user needs across multi-turn conversations do not transfer with it. While the LLM has these understanding capabilities, they will not be passed down to previous-generation models trained on its output. Any contextual understanding will still need to be manually engineered by the development team, which defeats the purpose of using LLMs in the first place.
  • Intractable data requirements for complex conversational flows. Multi-turn conversations, particularly ones where the user is trying to complete an action, present an innumerable set of potential conversational flows. While an LLM could theoretically produce data for all of these flows, the time required to produce the data and train the previous-generation model would be prohibitive.
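
For clarity on the pattern being described, here is a rough sketch of the ‘LLM as a data generator’ approach: for each intent in a legacy NLU model, the LLM is asked to produce paraphrased training utterances. The intent names, counts, model, and prompt wording are illustrative assumptions, not any particular vendor’s workflow.

```python
# Sketch of the "LLM as a development tool" pattern: generating synthetic training
# utterances, one legacy intent at a time. Names and wording are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

LEGACY_INTENTS = ["report_lost_card", "check_balance", "dispute_charge"]

def generate_training_utterances(intent: str, n: int = 20) -> list[str]:
    """Produce n synthetic example utterances for a single legacy-model intent."""
    prompt = (
        f"Write {n} short, varied things a banking customer might say when they want to "
        f"'{intent.replace('_', ' ')}'. One per line, no numbering."
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.8,  # encourage lexical variety in the synthetic data
    )
    return [line.strip() for line in response.choices[0].message.content.splitlines() if line.strip()]

# Still one intent (and one conversational flow) at a time, and the legacy classifier
# only ever sees the generated text, never the LLM's contextual understanding.
training_data = {intent: generate_training_utterances(intent) for intent in LEGACY_INTENTS}
```

Even in this best case, the build still proceeds intent by intent and flow by flow, which is exactly the effort the approach was meant to eliminate.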

‘CAUTION’ DOES NOT MEAN ‘STOP’:  THE LLM-POWERED KNOWBL PLATFORM

While the cautions and challenges noted above should not be ignored, neither should the benefits of applying LLMs to your CX strategy in the very near term – especially if improved self-service, chatbots, and automation are on your strategic agenda. It’s the very limitations discussed above, along with the ongoing challenges of building chatbots and automation in general, that led to the start of Knowbl. We set out to leverage the massive advantages LLMs have over traditional conversational AI technologies for brands where CX, NPS/CSAT, compliance, and brand image are critical. By designing and building a transformer-first platform that solves the hardest parts of conversational AI management (few-shot extraction, automated context management), we are able to utilize the full, robust understanding and generative capabilities of LLMs, while also avoiding the limitations of minimal supervision, content inaccuracy, and lack of brand control over the AI.

While the Knowbl platform offers all of the functionality seen in traditional conversational AI platforms, its LLM-based intelligence unlocks new features, capabilities, and business controls that drive vastly improved CX outcomes. Three specific highlights include:

  • Ingestion of brand-approved content to automatically generate conversational flows, responses, and intents.  Using few-shot learning, the content alone provides a reasonably robust intent training set out-of-the-box.
  • Identification and diagnostics for misunderstandings. One of the key challenges when using LLMs is that they are large, black-box models. As experienced practitioners, we recognize the need to quickly identify and resolve issues in the AI’s responses. To this end, we enable a fast development cycle through model explainability and quick, incremental model training processes.
  • Transactional experience support. For brands, transactional experiences can represent critical, revenue-generating functionality in chatbots. The Knowbl platform enables these experiences by using LLMs to handle slot/entity extraction far more efficiently than a traditional conversational AI platform. Rather than requiring thousands of training examples to build a transactional experience, as a traditional platform would, LLMs allow the Knowbl platform to learn from just a couple of examples (a brief sketch follows this list).
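
As a hedged illustration of that last point, the sketch below performs slot/entity extraction for a funds-transfer request from just two in-context examples. The slot schema, the examples, the model, and the prompt wording are illustrative assumptions, not the Knowbl platform’s actual implementation.

```python
# Sketch: few-shot slot/entity extraction with an LLM, in contrast to the thousands of
# labeled utterances a traditional NLU pipeline typically needs for the same task.
# Slot schema, examples, model, and prompt wording are illustrative assumptions.
import json

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

FEW_SHOT_EXAMPLES = """\
Utterance: "Move $200 from checking to savings tomorrow"
Slots: {"amount": "200", "from_account": "checking", "to_account": "savings", "date": "tomorrow"}

Utterance: "send fifty bucks to my savings account on Friday"
Slots: {"amount": "50", "from_account": null, "to_account": "savings", "date": "Friday"}
"""

def extract_slots(utterance: str) -> dict:
    """Extract funds-transfer slots from one utterance using two in-context examples."""
    prompt = (
        "Extract the slots for a funds-transfer request as JSON with keys "
        "amount, from_account, to_account, date (use null when a value is missing).\n\n"
        f"{FEW_SHOT_EXAMPLES}\n"
        f'Utterance: "{utterance}"\n'
        "Slots:"
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    # The prompt asks for JSON only; a production system would validate this output.
    return json.loads(response.choices[0].message.content)

print(extract_slots("could you shift 75 dollars over to checking next Monday?"))
```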

Using these building blocks and more, Knowbl is enabling brands to deliver on their CX promises and conversational AI/automation objectives – with brand-approved, compliant content – in a fraction of the time required by traditional models, and with increased breadth of capabilities, accuracy, and control.

1. UJET Research Reveals Chatbots Increase Frustration for 80% of Customers, UJET Research / Businesswire, December 6, 2022

2. The emerging types of language models and why they matter, Kyle Wiggers, TechCrunch, April 28, 2022

3. Microsoft Considers More Limits for its New AI Chatbot, Karen Weise & Cade Metz, New York Times, February 16, 2023