Width.ai

Turbocharge Dialogflow Chatbots With LLMs and RAG

Matt Payne
·
August 27, 2024
example of dialogflow chatbot with LLMs

Dialogflow was a popular technology for several years because it enabled businesses to upgrade from limited, rule-based chatbots to chatbots capable of natural language processing (NLP) that could extract useful structured data from user conversations. Its NLP capabilities weren't outstanding, but they were good for the time.

But we now have much more powerful large language models (LLMs) capable of function-calling, agents, and much more.

Many businesses and developers will be thinking about replacing Dialogflow with LLMs. We believe Dialogflow still provides several benefits to businesses and recommend a hybrid approach where LLMs are integrated into Dialogflow agents instead of replacing them.

In this article, we'll explain why you should continue with Dialogflow chatbots in 2024 but also show how modern LLMs can turbocharge their capabilities.

Is Google Dialogflow Still Useful to You in 2024?

Dialogflow is still useful because of its versatile integration and deployment capabilities. Every Dialogflow chatbot supports the following channels and integrations right out of the box:

  • Telephone conversations with customers
  • Support for several messengers like Google Chat, Slack, Facebook Messenger, and others
  • Voice-based conversations through any channel that supports voice
  • Rich text and images in the user interface to improve online shopping and business procurement use cases
  • Built-in scalability (especially with the Dialogflow CX solution)
  • Easier guardrails around chat workflows, since Dialogflow keeps conversations in a box (always a concern with LLMs)

All these features are time-consuming to implement from scratch. Dialogflow's built-in support for them, with a proven track record and solid performance, provides compelling reasons to continue with Dialogflow.

But Dialogflow has its shortcomings. We'll explain them in more detail later but they include:

  • Very limited natural language understanding (NLU) of training phrase variants
  • Clumsy handling of multiple-turn conversations
  • An inflexible approach that works for simple chatbots but is difficult to adapt to complex domains or to integrate with anything more than a basic knowledge base
  • Time-consuming to configure and test even when using the visual dashboard
  • Application programming interfaces (APIs) that are rather chatty, and documentation that's difficult to navigate — two aspects that add to any custom development effort

We'll explain these issues and solutions for them in more detail in the sections below.

Integrate Advanced RAG and LLMs With Dialogflow

A common problem with Dialogflow is that it's quite limited for use cases that involve complex queries. For example, in fields like medical insurance, conversations may take many unexpected turns and users may provide unexpected inputs that are specific to that domain, such as names of medical conditions and prescription drugs.

A user may start with questions about insurance plan benefits, then switch to coverage of their medical conditions, list the drugs they've been prescribed to check for coverage, ask about copay conditions, and inquire about pre-existing condition exclusions.

Dialogflow's intent-based framework is just not flexible enough to handle such unpredictable context switches. In fields like medical insurance or online shopping, the information that must be given to users is too vast to handle via Dialogflow parameters and the possible turns a conversation may take are too many to fit into Dialogflow intents. The trigger-based intent identification that Dialogflow does, as illustrated below, is not suitable for such domains.

static dialogflow with intent classification

In contrast, LLMs trained for instruction following and human alignment — like GPT-4 or Mistral — are extremely good at handling diverse turns in conversations. They're able to organically use all the previous information given by the user without having to convert it into structured information. However, switching to LLMs means losing all the benefits of Dialogflow we outlined earlier.

Luckily, it's possible to retain all the integration and deployment benefits of Dialogflow while combining it with LLMs to handle complex conversations.

LLM for Conversation Handling

For our clients in complex domains like financial services and e-commerce, we integrate LLMs into Dialogflow as outlined below.

First, apart from the default welcome intent and the default fallback intent, we don't configure any intents at all.

Instead, we configure a webhook URL for the Dialogflow agent as shown below:

dialogflow insurance example

This URL points to a custom webhook service hosted on our client's server. We can also implement it using cloud functions. Either way, the webhook enables us to get notified by Dialogflow whenever it executes an intent and allows us to modify or override the configured response as well.

We then set up the default fallback intent to call this webhook as shown below.

webhook for dialogflow

This way, every user input immediately reaches our webhook service to process and respond as we please. It's in the webhook that we plug in the LLMs and their reasoning capabilities.
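A minimal sketch of such a webhook handler is shown below. The `call_llm` function is a placeholder for whatever LLM provider you use, and the field names follow the Dialogflow ES webhook request/response format:

```python
# Minimal sketch of a fallback-intent webhook that hands the conversation to
# an LLM. call_llm is a placeholder for your LLM provider of choice.

def call_llm(user_text: str, history: list[str]) -> str:
    """Placeholder: send the conversation so far to an LLM, return its reply."""
    return "LLM-generated answer for: " + user_text

def handle_webhook(request_json: dict, history: list[str]) -> dict:
    # Dialogflow ES puts the raw user message under queryResult.queryText.
    user_text = request_json["queryResult"]["queryText"]
    history.append(user_text)
    # Returning fulfillmentText overrides the intent's configured response.
    return {"fulfillmentText": call_llm(user_text, history)}
```

In production, this function sits behind an HTTPS endpoint (for example, a Flask route or a cloud function) registered as the agent's webhook URL.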

Why LLMs?

LLMs provide us the flexibility we need to have informative multi-turn conversations with users and fulfill their exact information needs.

Normal Dialogflow chatbots tend to handle simple, straightforward conversations well, but once a conversation involves more criteria and complex information needs, users are either switched to a live agent or asked to call customer service. End-to-end fulfillment automation may be only around 15%, with the remaining 85% of conversations handed over to live agents or abandoned by the user.

In contrast, generative AI techniques like retrieval-augmented generation (RAG) enable our chatbots to handle about 70-80% of conversations end-to-end and switch to live agents only 20-30% of the time. This is an enormous improvement in workforce productivity, customer experience, and user retention.

LLMs enable the following:

  • Complex reasoning in order to satisfy a large number of criteria that the users specify
  • Custom intent identification that, unlike Dialogflow's, isn't restricted to a small set of phrases
  • Techniques like RAG, reasoning and acting (ReAct), planning and executable actions for reasoning over long documents (PEARL), and LLM agents enable us to include knowledge from external sources in the response
  • Routing of conversations based on built-in semantic matching without having to add training phrases
  • Simultaneous processing where the same query is handled by multiple specialist LLMs (for example, one converts tables into structured information, others query different document sets simultaneously, yet others query different databases at the same time, and so on, and finally a more powerful LLM combines all that information into a single answer)
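The simultaneous-processing pattern from the last point can be sketched as a simple fan-out. The specialist functions here are stand-ins for LLM calls against different document sets or databases:

```python
# Sketch of fanning a query out to several specialist handlers in parallel,
# then combining their partial answers. Each specialist is a stand-in for an
# LLM call against a different source (documents, databases, tables, etc.).
from concurrent.futures import ThreadPoolExecutor

def query_plan_docs(q: str) -> str:
    return f"plan-docs answer to {q!r}"

def query_drug_db(q: str) -> str:
    return f"drug-db answer to {q!r}"

SPECIALISTS = [query_plan_docs, query_drug_db]

def fan_out(query: str) -> str:
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(lambda f: f(query), SPECIALISTS))
    # In production, a stronger LLM would synthesize the partial answers
    # into one response; here we simply join them.
    return "\n".join(partials)
```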

In the next section, we demonstrate one such LLM-based Dialogflow chatbot.

Medical Insurance Case Study

We've already seen the complexity of a field like medical insurance. When we try to serve such a business using Dialogflow, we quickly run into several difficult problems:

  • Is the user asking about plan details? How do we cover all the possible words and phrases that users may use in such a wide field?
  • There are so many names involved in the field. Names of medical conditions, treatments, drugs, and health care providers are frequently used. How can the chatbot correctly identify this practically infinite set of names?
  • How do we handle the complex set of conditions that some users' information needs involve? They may be looking for plans using criteria based on monthly premiums, exclusions, copay amounts, out-of-pocket limits, covered drugs, allowed treatments, and so on. Again, the number of possibilities is practically infinite.

In reality, Dialogflow by itself — even with its built-in machine learning features — isn't cut out for such domains. It's closer to an old-school phone-based interactive voice response system, except that instead of a numeric keypad, a slightly larger set of spoken phrases is available.

Our RAG-based LLM architecture shown below is much more powerful.

Width.ai RAG pipeline

Its responses are based on information available in your:

  • Business documents
  • Frequently asked questions
  • Knowledge bases
  • External databases like product inventories and enterprise resource planning (ERP) systems
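To illustrate the retrieval step of such a pipeline, here is a toy sketch that scores document chunks by word overlap. A production pipeline would use vector embeddings and a vector store instead; the scoring here is deliberately simplistic:

```python
# Toy sketch of the retrieval step in a RAG pipeline. Chunks are scored by
# word overlap with the query purely to illustrate the flow; real systems
# use embeddings and approximate nearest-neighbor search.

def score(query: str, chunk: str) -> int:
    q = set(query.lower().split())
    return len(q & set(chunk.lower().split()))

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    # Return the k highest-scoring chunks for this query.
    return sorted(chunks, key=lambda c: score(query, c), reverse=True)[:k]

def build_prompt(query: str, chunks: list[str]) -> str:
    # The retrieved context is prepended to the question before the LLM call.
    context = "\n---\n".join(retrieve(query, chunks))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```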

For example, insurance information from reputable insurers often looks like this, with anywhere from 30 to 500+ pages of documents to go through:

medical document example

Instead of forcing users to wade through all this complexity, LLM-based chatbots can surface exactly the information users looking for medical insurance plans need. A chatbot capable of such conversations is useful for a variety of users, such as:

  • Individuals looking for good medical insurance plans
  • Customer service agents of medical insurance companies
  • Human resource personnel looking for the best medical insurance plans for their employees
  • Medical billing and coding specialists in health care facilities who have to discuss cases with their organization's insurance partners
  • Social workers who help patients resolve their insurance problems
  • Insurance aggregator services that help users find the best insurance deals from different providers

In the example below, notice how the LLM is able to handle a wide-ranging conversation with multiple turns and satisfy a user's information need.

First, the user asks if a medical condition is covered:

insurance example in dialogflow

Then the user switches to asking about coverage for a drug that's been prescribed for another condition. The LLM crunches through a 300-page document about covered prescription drugs in seconds and gives them the exact information they need:

more insurance example

At this point, the user probably favors going with this plan. But they have some more important criteria for coverage as shown below:

more of a Q&A example for dialogflow

Again, the LLM provides extremely detailed information for their query, including coverage limits, copayment terms, allowed procedures, and so on.

This case study demonstrates the ability of LLMs to reduce the chances of user frustration and abandonment by answering their complex questions in great detail.

In the following sections, we explain some more techniques we frequently use to quickly create Dialogflow-based agents.

Develop Dialogflow ES Chatbots Faster

In Dialogflow ES, intents are the primary means of configuring a chatbot's behavior. The screenshot below shows the intents bundled with the predefined online shopping chatbot.

intent recognition with dialogflow

Every user message is mapped to one of these intents by matching it against each intent's training phrases. The intent that matches best is then executed. This involves:

  • Identifying entities: Custom entities (or parameters) are identified and assigned values. For example, in the user message, "I want to buy shoes," the entity "product" will be assigned the value "shoes."
  • Generating responses: Dialogflow generates a response that's suitable for the channel. For example, for voice chatbots, it'll generate text-to-speech (TTS) audio.
  • Invoking webhooks: These are optional application callbacks that Dialogflow will notify after the intent is executed. Through them, the application gets a chance to modify the chatbot's behavior.

If Dialogflow can't find a suitable intent, it sends the query to the default fallback intent.

Creating a large number of intents using the Dialogflow GUI is time-consuming:

  • Each intent requires at least 10-20 training phrases. Additionally, they must be annotated with parameters and values.
  • Additional settings like responses, events, and webhooks must also be configured for each intent.
  • Neither this process nor its GUI is layperson-friendly. Technical terms like "entities" and "webhooks" are used extensively throughout the GUI.

Speeding Up Chatbot Development

We can greatly streamline the above process and make it layperson-friendly by automating the intent creation using an LLM.

First, a developer can describe the chatbot's behavior in plain English. The LLM then rapidly generates the JSON configuration data required to create intents using Dialogflow APIs.
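Before pushing LLM output to the Intents API, it pays to validate it. A sketch of that step, assuming the LLM is asked to return intents as a JSON list (the prompt wording and the minimum-phrase check are our own conventions, not a Dialogflow requirement):

```python
import json

# Sketch: ask an LLM (call not shown) for Dialogflow ES intent definitions as
# JSON, then validate the result before pushing it through the Intents API.
# The prompt wording and validation rules below are illustrative conventions.

PROMPT = (
    "Return a JSON list of intents for an online-shopping chatbot. Each intent "
    'needs "display_name" and "training_phrases" (a list of strings).'
)

def parse_intents(llm_output: str) -> list[dict]:
    intents = json.loads(llm_output)
    for intent in intents:
        assert intent["display_name"], "every intent needs a name"
        # Dialogflow's docs recommend a healthy number of training phrases.
        assert len(intent["training_phrases"]) >= 10, "add more training phrases"
    return intents
```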

Next, the developer can quickly generate a large set of training phrases. Here, LLMs help in three ways.

1. Generate Large Set of Training Phrases

LLMs can generate a wide set of conversational variants. Since LLMs are extensively trained on internet datasets, the number of available semantic variants for a conversation can be extremely large.

query examples
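A sketch of how this generation can be scripted. The prompt wording, the `[product]` placeholder convention, and the deduplication step are our own; the LLM call itself is omitted:

```python
# Sketch of prompting an LLM for training-phrase variants. The prompt wording
# and near-duplicate filtering are illustrative; the LLM call is a placeholder.

def variant_prompt(intent_desc: str, n: int = 20, language: str = "English") -> str:
    return (
        f"Generate {n} distinct ways a customer might say: {intent_desc!r}. "
        f"Write them in {language}, one per line, and mark the product "
        "with a [product] placeholder."
    )

def dedupe(variants: list[str]) -> list[str]:
    # Drop near-duplicates that differ only in case or whitespace before
    # uploading them as Dialogflow training phrases.
    seen, out = set(), []
    for v in variants:
        key = " ".join(v.lower().split())
        if key not in seen:
            seen.add(key)
            out.append(v)
    return out
```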

2. Generate Training Phrases in Multiple Languages

LLMs can be instructed to generate the training phrases in multiple languages. In the example below, a developer who doesn't know German has generated product search trigger phrases in German, enabling a business to easily expand their customer service features to a new market:

multiple examples

3. Annotate Training Phrases for More Robust Intent Identification

Dialogflow training phrases can be annotated with parameters and entity types to help the Dialogflow engine convert unstructured conversations into neat structured data that can be easily processed. The annotated training phrases for a pre-built online shopping intent are shown below:

Since LLMs understand code-oriented data formats like JSON, they can be instructed to generate entity annotations along with the training phrases. In the phrases generated earlier, the LLM has already done this partially by using "[product]" and "[produkt]" as placeholders. From there, it's just a matter of converting that placeholder, as well as other placeholders provided by the developer — like SKUs, dates, and times — into annotated parts of training phrases.

This produces a larger and more varied set of training phrases, making the chatbot smarter and more robust. Your business can reduce the percentage of cases you have to hand over to live agents.

In addition, we use custom Python code to generate annotated training phrases using Dialogflow APIs:

python example of generating add phrase
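As an illustrative sketch (not our production code), placeholder-annotated phrases can be converted into the part lists that the Dialogflow v2 Intents API expects. The `[entity:value]` placeholder syntax here is our own convention:

```python
import re

# Sketch: convert LLM-generated phrases with [entity:value] placeholders into
# part lists. Each dict maps onto a dialogflow.Intent.TrainingPhrase.Part, so
# the phrases can be attached to an Intent and pushed with
# IntentsClient.create_intent (google-cloud-dialogflow; API call not shown).

PLACEHOLDER = re.compile(r"\[(\w+):([^\]]+)\]")

def to_parts(phrase: str) -> list[dict]:
    parts, pos = [], 0
    for m in PLACEHOLDER.finditer(phrase):
        if m.start() > pos:
            parts.append({"text": phrase[pos:m.start()]})
        entity, value = m.groups()
        parts.append({"text": value, "entity_type": f"@{entity}",
                      "alias": entity, "user_defined": True})
        pos = m.end()
    if pos < len(phrase):
        parts.append({"text": phrase[pos:]})
    return parts
```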

The example below shows the behavior of Dialogflow's built-in online shopping agent and its "product.search" intent when LLMs and webhooks are not used:

The agent can identify that "shoes" is the product, as seen above. This is also evident from the Dialogflow console:

However, stock Dialogflow's lack of robustness becomes quickly evident when the user just switches from "shoes" to "socks" as shown below!

The intent fails to identify "socks" as the product parameter as shown in the development console below:

Instead of relying only on Dialogflow, we now enhance the intent with our webhook that uses an LLM to identify the product in the query as shown below:

After intercepting the query with an LLM, the chatbot can easily identify the products it previously stumbled on:

It can even understand multiple products in the query as shown below:

Create Dialogflow CX Agents Quickly from Conversational Design Images

The last technique we explain here is how to convert conversational flow diagrams prepared by conversation designers directly into Dialogflow agents using LLMs and image recognition.

The diagram below is a conversational flow diagram showing conversational states and transitions between them:

Dialogflow CX Agents

We use multimodal vision-language models like GPT-4V to convert these states and transitions directly into Dialogflow CX pages and routes.

We prompt GPT-4V with this instruction: "This is an image of a conversational flow. All the text boxes are conversational states. All the arrows are transitions or routes between them. Extract all the text in the text boxes and list them."

It generates the following list of conversational states:

We then ask it to convert the blue and red connections into a list of transitions between these conversational states using this prompt: "In the image, identify all the connections between the textboxes shown by the blue and red arrows. If they are one-way arrows, list them using the format 'From state => To state'. If they are double-ended arrows, list them using 'From state <=> To state'."

The LLM can convert them to a format suitable for processing easily. We then use the Google Cloud Dialogflow CX APIs to convert them into page and route objects.
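A sketch of that parsing step is shown below. Mapping the resulting pairs onto Dialogflow CX page and transition-route objects (via google.cloud.dialogflowcx_v3) is only indicated in the comments:

```python
# Sketch: turn the LLM's "From state => To state" lines into directed route
# pairs. Each state then becomes a Dialogflow CX Page, and each pair becomes
# a TransitionRoute between pages (google.cloud.dialogflowcx_v3; not shown).

def parse_routes(lines: list[str]) -> list[tuple[str, str]]:
    routes = []
    for line in lines:
        if "<=>" in line:
            # Double-ended arrow: a route in both directions.
            a, b = [s.strip() for s in line.split("<=>")]
            routes += [(a, b), (b, a)]
        elif "=>" in line:
            a, b = [s.strip() for s in line.split("=>")]
            routes.append((a, b))
    return routes
```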

Robust Dialogflow AI Chatbots With LLMs

In this article, we demonstrated how we develop Dialogflow chatbots better and faster. By combining the two technologies, we offset the shortcomings of both: LLMs and modern conversational AI make Dialogflow smarter, while Dialogflow's versatile integration and deployment options let us deploy B2B and B2C chatbots quickly on multiple channels.

Contact us for high-quality Dialogflow chatbots for your business!