A lot has been written over the past year about the differences between visual interfaces and conversational interfaces, but there are some things we’ve learned at Talla that I haven’t seen covered anywhere, so I want to address them.
Most companies get started by building a minimum viable product. The way you do this for a conversational interface may be quite different from the way you do this for a traditional application. The complexity of a traditional application is often on the backend, but the complexity of a conversational interface is usually in the natural language part — taking what someone has entered and translating it into intent. So, to build a conversational minimum viable product, you may skimp on your NLP initially.
What does this mean? Well, take something like scheduling a meeting as an example. Rather than try to parse a single sentence like “schedule a meeting with Rob for Friday to talk about Marketing,” and the 100 different ways I could make that request, you might take it in steps and make your chatbot ask specific questions.
Me: Schedule a meeting
Chatbot: Who would you like to meet with?
Me: Rob
Chatbot: When would you like to meet with Rob?
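The stepwise flow above can be sketched as a simple slot-filling loop. To be clear, this is a minimal illustration of the general technique, not Talla's actual implementation; the slot names and prompts are hypothetical:

```python
# Minimal slot-filling sketch: instead of parsing one rich sentence,
# the bot asks one question per missing piece of information.
# (Hypothetical slots/prompts -- not Talla's actual flow.)

SLOTS = [
    ("person", "Who would you like to meet with?"),
    ("time", "When would you like to meet?"),
    ("topic", "What is the meeting about?"),
]

def next_prompt(collected):
    """Return the question for the first slot still missing, or None when done."""
    for name, question in SLOTS:
        if name not in collected:
            return question
    return None

def handle_reply(collected, reply):
    """Fill the first empty slot with the user's raw reply -- no NLP required."""
    for name, _ in SLOTS:
        if name not in collected:
            collected[name] = reply.strip()
            break
    return collected
```

The appeal of this design is exactly the trade-off described above: each reply maps to a known slot, so you never have to guess intent from free-form text.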
And so on, until you have all the information you need to make this meeting happen. The first problem here is that a minimum viable chatbot product that works this way may be a pretty crappy experience. In that sense, it may not actually meet the “minimum viable” definition, because you can’t get to product-market fit with something so conversationally cumbersome. As a result, you may have to build better, deeper, more accurate natural language processing before you can really have an MVP. This means a chatbot MVP, while seemingly inexpensive, could actually require more capital than a traditional web or mobile app, where an abundance of frameworks can help you get something pretty good to market quickly.
But there is an even bigger problem here. One of the things we have learned at Talla is that much of the time, users don’t fully read the things a bot says. This is evidenced by the surprising number of times a bot asks an either/or question and the user responds with “yes” or “no,” both of which are invalid options for the question.
Now, put these two together and you get a very interesting problem. You’ve built an MVP that follows a certain conversational flow. And you know people don’t always read the entire conversation before they respond. But now you’ve been in market for a year and you have a bunch of ideas about how to improve the conversational flow. What happens when you introduce a new conversational experience?
I will tell you, because it just happened to me last week: it can be very jarring. I was using the Talla Task Assistant functionality, which manages a to-do list in natural language. We had deployed some significant changes to make it much easier to use, but I typed in something entirely wrong because I had become so conditioned to the existing conversational flow.
The problem with chatbot MVPs is that they may change pretty dramatically over time, like all MVPs do, but slight modifications to a conversational flow can be much harder to pick up on than slight modifications to a visual flow, and that can make for a jarringly bad experience.
We don’t yet have good ideas on how to fix this. Like most things conversational, it’s new, and there aren’t established best practices, so we will have to experiment. One option could be to tell the user, “By the way, I’ve changed how I respond to that command; here is the new flow.” But users often dislike extra information they didn’t ask for, and that may actually get annoying if you are changing quite a bit about your application. Another option might be to always vary the flow slightly, so users are trained to pay more attention to what specifically was said. It’s an interesting problem, and my goal here was just to share a learning with the broader conversational UI community. If you have good ideas on how to approach it, I would love to hear them.
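For what it’s worth, the second idea, always varying the flow slightly, is cheap to sketch. This is purely illustrative (the prompt variants and structure are hypothetical, not anything we ship): keep several phrasings per prompt and pick one at random, so users never learn to pattern-match a single fixed string.

```python
import random

# Hypothetical sketch of "always vary the flow slightly": several phrasings
# per prompt, chosen at random, so users read the bot instead of
# pattern-matching one fixed sentence.
PROMPT_VARIANTS = {
    "ask_person": [
        "Who would you like to meet with?",
        "Who should I invite to this meeting?",
        "Who is this meeting with?",
    ],
    "ask_time": [
        "When would you like to meet?",
        "What day and time works for you?",
    ],
}

def prompt(key, rng=random):
    """Return a randomly chosen phrasing for the given prompt key."""
    return rng.choice(PROMPT_VARIANTS[key])
```

Whether the extra attention this buys is worth the added inconsistency is exactly the kind of thing that would need experimentation.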