Hacking on Platforms to Support Conversational Components

Published by:

Chen Buskilla

Conversational components are a new concept that enables modularity and content reuse in chatbots. The underlying idea is that reusing content cuts development costs, improves conversation quality, and speeds up deployment.

Modularity prevails in all areas of computer science, but not when it comes to chatbot content. Building chatbots is a complex endeavor. A modular architecture reduces this complexity by breaking the system into parts with varying degrees of interdependence and independence. Or, as Baldwin and Clark put it, “hide the complexity of each part behind an abstraction and interface”.[1]

There are many approaches to reusing content in chatbots. One of the most common is sharing datasets: today it’s easy to find conversation datasets tagged with intents, slots, speakers, sentiment and more. The disadvantage of this kind of reuse is the loss of flexibility. Adapting these datasets ranges from editing each datapoint manually to algorithmic methods like domain adaptation. Both approaches are brittle and require extra data and manual work.


Conversational components offer a different approach: agent-based reuse. Each component is an agent with specific goals. It takes control of the conversation, solves specific sub-goals, and returns control to the calling bot. It does this over multiple turns, and part of its API allows customization so conversation designers can match the tone and style to the calling agent’s personality. The component hides the complexity and implementation details of the specific problem it solves. For example, adding a “get name” component to your bot frees you (the developer) from maintaining lists or models of first names, last names and titles. You can simply delegate all the work of getting (or updating) the user’s name to the component.
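To make the idea concrete, here is a minimal sketch of what a “get name” component’s surface could look like. The class name, the `respond` method, and the `style` customization knob are all illustrative assumptions, not an actual SDK API:

```python
# Hypothetical sketch of a "get name" conversational component.
# The class name, respond() signature, and style knob are illustrative,
# not a real SDK interface.

class GetNameComponent:
    """Multi-turn component that holds the conversation until it has a name."""

    def __init__(self, style="friendly"):
        # Customization point: match the calling bot's tone/personality.
        self.style = style
        self.name = None

    def respond(self, user_input):
        """Return (reply, done). The calling bot keeps routing turns here
        until done is True, then takes control of the conversation back."""
        if user_input and user_input.strip():
            self.name = user_input.strip().title()
            return f"Nice to meet you, {self.name}!", True
        if self.style == "friendly":
            return "What's your name?", False
        return "Please state your name.", False
```

The key design point is the `done` flag: the component, not the host bot, decides when its sub-goal is solved and control should return to the caller.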

As mentioned before, conversational components operate over multiple turns. It’s not enough for a component to take a single input from the user and return a single response, as personal assistants usually do; it needs to wait for the next input and evaluate it. This exposes a problem with most of the platforms on the market today: they are built for the single-turn use case, with extensions bolted on for multi-turn cases. Usually this means the integration point to other systems sits at the level of a single intent/response pair. We call it “the continuations problem”. So adding conversational components requires some tinkering.
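The continuations problem can be sketched in a few lines. When the platform’s integration point is a single intent-to-response mapping, the host bot itself must remember that a component currently holds the conversation and keep routing turns to it until the component returns control. Everything below is a toy illustration, not a real platform API:

```python
# Toy sketch of "the continuations problem": the host bot tracks an
# active component and keeps delegating turns to it until it signals
# that it is done. All names here are hypothetical.

def make_name_component():
    """Toy multi-turn component: asks for a name, then finishes."""
    state = {"asked": False}

    def step(user_input):
        if state["asked"] and user_input:
            return f"Hi {user_input}!", True   # done -> return control
        state["asked"] = True
        return "What's your name?", False      # keep control next turn
    return step

def make_router(intent_handlers, component):
    """Route each turn to an intent handler, or to the active component."""
    delegating = {"on": False}

    def route(intent, user_input):
        if delegating["on"]:
            reply, done = component(user_input)
            delegating["on"] = not done        # control returns when done
            return reply
        if intent == "get_name":               # start delegating
            delegating["on"] = True
            reply, _ = component(user_input)
            return reply
        return intent_handlers[intent](user_input)
    return route
```

Note that while `delegating` is on, the recognized intent is ignored entirely; single-turn platforms have no built-in place to keep that bit of state, which is exactly why the workarounds below are needed.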

A context for every input/output

For example, Dialogflow’s model of multi-turn conversation gives each intent “input context” and “output context” keys. Each intent can activate context keys in its output, and while a context key is active, only intents with a matching input context key will fire. This allows a kind of extended follow-up.
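The gating rule can be reduced to a few lines. This is a deliberately simplified model of Dialogflow-style context matching (the field names are shorthand, not the exact API schema): an intent is eligible to fire only when all of its input contexts are currently active, and firing an intent would activate its output contexts for later turns.

```python
# Simplified model of Dialogflow-style context gating (field names are
# shorthand, not the exact Dialogflow schema).

intents = [
    {"name": "ask_name",
     "input_contexts": [],                     # always eligible
     "output_contexts": ["awaiting_name"]},
    {"name": "give_name",
     "input_contexts": ["awaiting_name"],      # gated behind a context
     "output_contexts": []},
]

def eligible(active_contexts):
    """Return the intents allowed to fire given the active context keys."""
    return [i["name"] for i in intents
            if set(i["input_contexts"]) <= set(active_contexts)]
```

With no active contexts, only `ask_name` can fire; once `awaiting_name` is active, `give_name` becomes eligible too, which is what makes the extended follow-up possible.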


In Rasa, however, the situation is different: Rasa Core lets developers define a policy that operates over multiple turns. A policy usually has a training step in which it processes conversations turn by turn and, for each action (an action is usually just emitting a response), tries to learn its relation to the preceding intents and actions. At run time, it predicts actions by looking at the previous intents and actions. This is a good start, but things can go wrong when there are multiple policies with different goals. Rasa’s model for this case is simply to give each policy a priority, which means that at any point a different policy can emit an action (a response) without regard to the stage of the conversation.
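A toy ensemble makes the hazard visible. This is not Rasa’s exact selection rule; it is a deliberately simplified sketch of what goes wrong when priority dominates: a high-priority policy can emit a response in the middle of another policy’s multi-turn flow.

```python
# Toy priority-based policy ensemble. NOT Rasa's exact selection rule;
# it only illustrates the hazard of letting priority dominate.

def choose_action(predictions):
    """predictions: (priority, confidence, action) tuples. Highest
    priority wins; confidence only breaks ties between equal priorities."""
    _, _, action = max(predictions)
    return action

predictions = [
    (1, 0.9, "ask_next_form_slot"),  # multi-turn policy, mid-flow, confident
    (3, 0.3, "utter_fallback"),      # higher-priority policy interrupts
]
```

Here the form-like policy is confident and mid-flow, yet the higher-priority policy’s action is emitted anyway, cutting the multi-turn sequence short.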

So, how did we do it?

To solve these issues in the CoCo SDKs, we had to resort to non-intuitive workarounds, or hacks. In Dialogflow, we use the fulfillment API and activate a special reserved context key on each response. We also instruct our users to go through an installation procedure in which they add a catch-all intent with the special context key as its input context. This makes it possible to continue the delegation after the first turn.
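The Dialogflow side of the hack amounts to attaching the reserved context to every webhook response. Below is a hedged sketch of such a fulfillment response; the payload is a simplified version of the Dialogflow v2 webhook response shape, and `coco_context` is an illustrative name for the reserved key, not the actual one:

```python
# Hedged sketch of a fulfillment webhook response that re-activates a
# reserved context key on every turn, so the catch-all intent (whose
# input context matches it) stays eligible and the delegation continues.
# "coco_context" is an illustrative name; the payload is a simplified
# version of the Dialogflow v2 webhook response shape.

def fulfillment_response(session, reply_text):
    return {
        "fulfillmentText": reply_text,
        "outputContexts": [{
            "name": f"{session}/contexts/coco_context",  # reserved key
            "lifespanCount": 1,  # active for exactly the next turn
        }],
    }
```

Keeping the lifespan at 1 means the context must be re-activated on every response, which is precisely what keeps the catch-all intent firing only while the component holds the conversation.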
For Rasa, the solution was to build on top of the Form Policy, which faces the same continuations problem; its own fix is an awkward and confusing one that relies on a magic-number priority.

There is one more challenge when creating modular components: how to transfer data between components at run time. We call it context transfer, and I will discuss it in my next post.

To summarize, agent-based reuse is still poorly supported in the leading industry platforms. We believe it’s a step forward for the chatbot industry, and we hope that one day soon the leading platforms will recognize that other forms of reuse exist and provide better support for them.

[1] Baldwin, C.Y.; Clark, K.B. (2000). “Chapter 3: What Is Modularity?”. Design Rules: The power of modularity. MIT Press. pp. 63–92.

Have a Dialogflow or Rasa bot? Now you can call a conversational component to make your bot EVEN more conversational.

Do it in DialogFlow: https://www.youtube.com/watch?v=9iGWVK7CcjQ

Do it in Rasa: https://www.youtube.com/watch?v=Wiwx0881iCc