“Hello, I’d Like To Make An Appointment”: Advancing AI Agents

Artificial Intelligence (AI) is everywhere. It permeates modern science fiction shows like Westworld, Black Mirror, and Humans. And who can forget HAL in the 1968 classic, 2001: A Space Odyssey? Fast forward to this week’s unveiling of Google Duplex, which represents a new leap in AI agent capabilities. Listen to the following audio clips and try to guess how each person feels during the conversation.

Scheduling a hair appointment:

Making a dinner reservation:

Making a dinner reservation #2:

Now that you’ve given some thought to how each person feels in each of the conversations, you might be surprised to learn that three of the six people you just heard aren’t people at all. Three of the “people” are actually computer-generated AI agents. Can you guess which three? The answer: the three callers.

This is an exciting development because it means that computer-based AI agents are able to interact with real people, in the real world, in a completely natural way. It is an important evolutionary step that takes state-of-the-art AI agent solutions, such as Amy, which could schedule meetings on your behalf via email, or Georgia Tech’s Jill Watson, which serves as a teaching assistant in its Knowledge-Based AI course, to the next level. Notice that the AI agent initiated each conversation. Also notice how the human recipient of each call was able to ask the AI agent questions, and the AI agent responded appropriately. This type of seamless, natural experience opens the door for AI agents to do a lot more on behalf of their principals.

Google Duplex is important because it means that someone can have an experience with a computer without first having to “learn” how to interact with that computer. Contrast the conversations you just heard with one of your recent phone calls into a customer support line where you were prompted with: “Say 1 for billing, 2 for sales, 3 for directions, or stay on the line for the next available operator.” I’m sure I’m not the only one who would repeatedly press zero so that I could speak to a real person.

There are clear benefits to consumers and businesses from leveraging this technology. First, it cuts training time to zero. The people who answered the calls didn’t need to learn how to talk with the agent or be prompted with a set of responses that the agent would understand. Second, it means that agents can do more. They are no longer limited to interacting with people via a computer; they can interact via phone. I can already imagine a future where you interact with a visual agent using augmented or mixed reality. Third, it frees up valuable time and allows you to deploy your people on more value-added tasks and activities.

Of course, we must also consider the broader implications of this technological advance. What happens if an AI agent incorrectly makes an expensive purchase on your behalf? Should an AI agent be required to disclose itself as one when it interacts with people? What are the implications of believing we are dealing with a person when we aren’t? These are questions we’ll need to answer as we start to build solutions that leverage this new approach.

I’m excited about this technology. In fact, while recently talking to a colleague about AI Agent–Oriented Operating Systems, I cited the movie HER as a futuristic example of what might be possible. While Artificial General Intelligence (AGI) solutions with capabilities like Samantha are still futuristic, Google’s solution represents a significant step forward in some very exciting areas.

