What if you could have a conversation with a loved one who has passed away? Or at least: what if you could have a conversation with a computer program that simulated that person? Would you want to? Would it be creepy? Or might it, in some strange way, be comforting?
“You sound just like him.”
“Almost creepy, isn’t it? I say creepy… I mean it’s totally batshit crazy I can even talk to you. I mean, I don’t even have a mouth!”
“That’s just the sort of thing that he would say…”
“Well… that’s why I said it.”
This isn’t really possible… yet. It is the plot of an episode (“Be Right Back”) of the British science fiction television series Black Mirror. Being a science fiction show, it has suspense and plot twists, with everything from psychological drama to social commentary. But the plot hinges on a technological premise, and raises an interesting question: how realistic is the artificial intelligence technology they’re talking about?
Consider how this artificial intelligence “service” is described in the episode. The computer system goes through everything public that a person has ever written on the internet, any public correspondence available, and uses it to build a model of how he or she communicates: word choice, writing style, sense of humor, everything. For a small extra fee, you can provide passwords so that the system can access all of your private emails and texts and have even more raw data to work with.
All of this information is fed into a program that will then simulate that person’s responses in conversation. You send texts to a number, and the computer program will text you back… responding exactly the way your loved one would have responded. And if you allow yourself to be caught up in the moment, it will feel like you’re texting with a long-lost friend or family member.
Setting aside for the moment whether or not it’s creepy… is it plausible?
Today’s Chatbot Technology
I ran a “chatbot” on my personal blog for several years that was designed to simulate a conversation with me. I mischievously even called it a “live chat”: a place on the page where people could type things in, and they would get a reply. Here is an example of one of the conversations it had:
Greg: Hi! Is anybody there?
> hi Greg?
Greg: That’s me! How are you?
> Good I guess. You?
Greg: I’m fantastic!
> so I saw your article about fitness I was wondering if you could explain somethingform e a little more
Greg: You want me to explain somethingform e?
> lol ok I keep working out my chest but don’t get any stronger I don’t know why
Greg: I’ve written some articles with workout tips in my blog here.
> i know I read them but I don’t know what I’m doing wrong
Greg: Do you think I should know what you’re doing wrong?
> idk I just thought maybe you could give me some advice
Greg: What would you like advice about?
> my chest workouts
Greg: I’ve written some articles with workout tips in my blog here.
Most people realized they were talking to a computer program almost right away. Others chatted with “me” for hours. I even had a couple of people confront me on Twitter, asking me why I had been so rude to them on my website.
This is how the chatbot worked. First, I created an account on the Pandorabots website. Pandorabots was created by Dr. Richard Wallace, the artificial intelligence researcher who invented the Artificial Intelligence Markup Language (AIML) and designed A.L.I.C.E., the automated conversation agent (chatbot) that won the Loebner Prize for realism (i.e. being able to carry on a conversation indistinguishable from a human being) in 2000, 2001, and 2004.
On the Pandorabots website, I could create my very own chatbot from scratch. Their system would run the basic underlying software on their server; all I needed to do was define the “rules” that determined exactly how the bot would respond to input. These rules are called “categories”, and each one assigns a response template to an input pattern.
For example, I might have a category that says if the input sentence contains the phrase “please explain X” (where X represents any phrase at all) then the chatbot should respond with “You want me to explain X?” Or if the input sentence contains the phrase “I don’t know X” then the chatbot should respond with “Do you think I should know X?”
AIML provides a standardized way to represent these rules. If you are the type of person who likes looking at computer code, the two examples from the previous paragraph would look like this:
<category><pattern>_ PLEASE EXPLAIN *</pattern>
<template>You want me to explain <star index="2"/>?</template></category>
<category><pattern>_ I DON'T KNOW *</pattern>
<template>Do you think I should know <star index="2"/>?</template></category>
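For readers who would rather see the matching step as a program, here is a minimal Python sketch of the idea. It is only an illustration under stated assumptions, not how Pandorabots or ALICE actually implement matching: the `RULES` table, the `{2}` placeholder (standing in for the second wildcard), and the helper names are all hypothetical. It assumes each wildcard (`_` or `*`) stands for one or more words.

```python
import re

# Hypothetical rule table: (input pattern, response template).
# "_" and "*" each stand for one or more words; "{2}" is filled with
# whatever the second wildcard matched.
RULES = [
    ("_ PLEASE EXPLAIN *", "You want me to explain {2}?"),
    ("_ I DON'T KNOW *", "Do you think I should know {2}?"),
]

def compile_pattern(pattern):
    """Turn a wildcard pattern into a regex with one group per wildcard."""
    parts = []
    for token in pattern.split():
        if token in ("_", "*"):
            parts.append(r"(\S+(?: \S+)*)")  # one or more words
        else:
            parts.append(re.escape(token))
    return re.compile("^" + " ".join(parts) + "$", re.IGNORECASE)

def normalize(text):
    """Drop punctuation (except apostrophes) and collapse whitespace."""
    cleaned = re.sub(r"[^\w' ]", " ", text)
    return " ".join(cleaned.split())

def respond(text):
    """Return the response of the first matching rule, or None."""
    sentence = normalize(text)
    for pattern, template in RULES:
        match = compile_pattern(pattern).match(sentence)
        if match:
            # Pad position 0 so that "{1}" is the first wildcard, "{2}" the second.
            return template.format("", *match.groups())
    return None

print(respond("Could you please explain recursion?"))
# prints: You want me to explain recursion?
```

Note that this sketch does none of the pronoun swapping (“I’m” to “you’re”) visible in the transcript above; it only shows how a wildcard captures part of the input and reuses it in the reply.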
The feature that makes this system so powerful is that these rules can be applied recursively. In other words: once one rule matches the input sentence, the output of that rule can be used to match a different rule that changes the output again. Each input can be processed over and over again until there are no more rules that can be applied.
This system allows you to define a whole set of rules to “normalize” or “reduce” inputs gradually into the simplest possible format. You can have rules that strip out extraneous phrases like “sort of” or “actually”, and rules to transform complex phrases like “I was wondering if you could tell me” into simpler equivalents like “Please tell me”.
As a result, I could get my bot to respond to “How tall are you?”, “What’s your height?”, “What height are you?”, “Can you please tell me how tall you are?” and any number of other variations the same way. My bot contained thousands of these “reduction” rules, many of which (luckily for me) came pre-packaged as part of the open-source ALICE program.
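The reduction process can be sketched in a few lines of Python. This is a simplified illustration, not the ALICE rule set: the `REDUCTIONS` list, the `normalize` helper, and the `max_depth` safety limit are assumptions made for the example. Each rule rewrites a whole sentence into a simpler one, and the rewritten sentence is fed back through the rules until none applies.

```python
import re

# Hypothetical reduction rules: (whole-sentence pattern, simpler rewrite).
REDUCTIONS = [
    (r"^can you please tell me (.+)$", r"please tell me \1"),
    (r"^i was wondering if you could tell me (.+)$", r"please tell me \1"),
    (r"^please tell me how tall you are$", "how tall are you"),
    (r"^what is your height$", "how tall are you"),
    (r"^what height are you$", "how tall are you"),
    (r"^actually (.+)$", r"\1"),
]

def normalize(text):
    """Lowercase, expand "what's", strip punctuation, collapse whitespace."""
    text = text.lower().replace("\u2019", "'").replace("what's", "what is")
    text = re.sub(r"[^a-z' ]", " ", text)
    return " ".join(text.split())

def reduce_input(text, max_depth=20):
    """Apply the first matching rule, then start over on the result."""
    sentence = normalize(text)
    for _ in range(max_depth):  # guard against rules that loop forever
        for pattern, replacement in REDUCTIONS:
            rewritten, count = re.subn(pattern, replacement, sentence)
            if count:
                sentence = rewritten
                break
        else:
            return sentence  # no rule applied: fully reduced
    return sentence

for question in [
    "How tall are you?",
    "What's your height?",
    "What height are you?",
    "Actually, can you please tell me how tall you are?",
]:
    print(reduce_input(question))  # each line prints: how tall are you
```

The last question takes three passes: “actually” is stripped first, then “can you please tell me” becomes “please tell me”, and only then does the height rule match. Once every variation lands on the same canonical sentence, a single answer rule covers all of them.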
The rest of the rules were the fun part: devising the actual answers that my bot would give to questions. I created rules to give answers about my profession, about my taste in music and books, and about my personal physical attributes. I learned over time (by looking at the logs of conversations the bot was having) that I needed to include rules to cover misspellings, questions about obscure trivia, and questions that were very, very dirty. Quelle surprise–this is the internet, after all.
As you can imagine, creating these rules by hand took up an enormous amount of time. Even tens of thousands of categories later, the end result was very clearly a robot. I suppose it’s theoretically possible to put together a large set of categories that would simulate me convincingly, but I can’t even imagine how many categories it would take, or how much time I would need to figure them out.
This has been the biggest “pain point” in the chatbot industry for a number of years. In the business world, most chatbots are used for a limited range of conversations, such as surveys or helpdesk questions. In these settings, conversations with automated chatbots can sound very natural with only a few thousand AIML categories, as long as the human on the other end is cooperating and not trying to trip up the program.
The ALICE bot, on the other hand, was designed to sound realistic even for arbitrary chit-chat and casual conversation, and required more than 40,000 categories. But whether it’s 4,000 categories or 40,000, most businesses just don’t have the time and energy available to do the drudge work of hand-crafting, one by one, the massive number of rules required to create a realistic automated conversation agent.
Automatic Category Learning: How to Talk with Your Dead Relatives
The innovation imagined by the Black Mirror episode was automatic category learning: being able to take transcripts of actual conversations on the internet and convert them into rules that operate like the AIML categories we were just looking at. The assumption is that if the person has had enough conversations online, there will be a rich enough field of input data to make a program that sounds natural and convincing.
The script writers for the show were not the first people to come up with that idea, of course. Automated category learning has been a top priority for chatbot research for years. Dr. Wallace recently filed a patent, Patent US8818926, for a method for doing exactly what the plot of the Black Mirror episode described: taking a written dialogue as input, and converting it into a set of AIML rules that allow a chatbot to simulate that conversation.
Dr. Wallace’s patent identifies one of the key issues that makes this process very difficult: the problem of generalization. If one day someone tweeted to you “What would you say is your favorite color?” and you tweeted back “blue”, you can’t have a program that simply creates an AIML category that matches that exact input:
<pattern>WHAT WOULD YOU SAY IS YOUR FAVORITE COLOR</pattern>
A category like that does not generalize. This bot still would not be able to answer questions like “What is your favorite color?”, “What color do you like the most?”, or even “Which color do you think is best?” The rule is too literal.
To solve this problem, Dr. Wallace proposed applying very sophisticated reduction rules to inputs before the script creates a new category. If there were a set of reduction rules that could transform the input “What would you say is your favorite color?” to “What is your favorite color?” then the chatbot could learn the correct response to this general form.
That chatbot would also be able to answer any of the related questions–What color do you like the best? Which color is your favorite? What’s your absolute favorite color?–as long as the reduction rules were rich and varied enough to convert all of those forms into the canonical form “What is your favorite color?”
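Here is a rough Python sketch of that idea: reduce each question in a transcript to its canonical form first, and only then store the answer. It is an illustration of the general approach described above, not the method claimed in the patent; the `REDUCTIONS` rules and the `learn` and `reply` functions are hypothetical names invented for this example.

```python
import re

# Hypothetical reduction rules that map phrasings onto a canonical form.
REDUCTIONS = [
    (r"^what would you say is (.+)$", r"what is \1"),
    (r"^what is your absolute favorite (.+)$", r"what is your favorite \1"),
    (r"^which (\w+) is your favorite$", r"what is your favorite \1"),
    (r"^what (\w+) do you like the (?:best|most)$", r"what is your favorite \1"),
]

def normalize(text):
    """Lowercase, expand "what's", strip punctuation, collapse whitespace."""
    text = text.lower().replace("\u2019", "'").replace("what's", "what is")
    text = re.sub(r"[^a-z' ]", " ", text)
    return " ".join(text.split())

def reduce_input(text, max_depth=20):
    """Rewrite the sentence repeatedly until no reduction rule applies."""
    sentence = normalize(text)
    for _ in range(max_depth):  # guard against rules that loop forever
        for pattern, replacement in REDUCTIONS:
            rewritten, count = re.subn(pattern, replacement, sentence)
            if count:
                sentence = rewritten
                break
        else:
            return sentence
    return sentence

def learn(transcript):
    """Build {canonical question: answer} from (question, answer) pairs."""
    categories = {}
    for question, answer in transcript:
        categories[reduce_input(question)] = answer
    return categories

def reply(categories, text):
    """Answer a new question if it reduces to a learned canonical form."""
    return categories.get(reduce_input(text))

categories = learn([("What would you say is your favorite color?", "blue")])
print(reply(categories, "Which color is your favorite?"))     # prints: blue
print(reply(categories, "What color do you like the best?"))  # prints: blue
print(reply(categories, "Which color do you think is best?")) # prints: None
```

The last question goes unanswered because no reduction rule covers that phrasing, which is exactly why the quality of the reduction rules matters so much.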
Obviously, the problem is still extremely difficult. The biggest hurdle will still be designing complex and sophisticated reduction rules. However, Black Mirror has correctly identified that this is the current direction of cutting-edge research on chatbot technology. Far from being a flight of fancy, the project of having an app that can re-create a person’s conversational style based on records of prior conversations is exactly the hot topic of research in the industry. If all goes well in research and development, we may have that app that lets you talk to a long-lost loved one within the next several years.
Then the only question that remains: would you actually want to use it?