MIT is making a device that can ‘hear’ the words you say silently

It's like having Siri listen to your internal commands.

Students from MIT have created a prototype device, dubbed AlterEgo, that can recognize the words you mouth when silently talking to yourself—and then take action based on what it thinks you’re saying.

Arnav Kapur, a master’s student at the MIT Media Lab (a division of the Massachusetts Institute of Technology that focuses on the intersection of people and technology) and author of the paper, stresses that the device doesn’t read thoughts or the random, stray words that just happen to pass through your mind. “You’re completely silent, but talking to yourself,” he says. “It’s neither thinking nor speaking. It’s a sweet spot in between, which is voluntary but also private. We capture that.”

The prototype system, as it exists right now, looks like a white headset a telemarketer might wear. But instead of a mic hovering in front of the lips, it sticks to the face and neck, where a handful of electrodes pick up the minuscule electrical signals generated by the subtle internal muscle motions that occur when you silently talk to yourself. The device connects via Bluetooth to a computer, which then communicates with a server that interprets the signals to determine what words the wearer is articulating.

It’s very much in the prototype stage, though it represents a fascinating departure from the norm. We most often interact with our devices by touching them—typing on a smartphone, pressing on an app, or double tapping the side of Apple’s AirPods to pause or play music. Or, we talk to our gadgets or smart speakers by engaging with digital helpers like Siri, Alexa, or the Google Assistant. Those services require more from you than speaking silently to yourself. Put another way: this type of tech is like having a simpler version of Siri hear your silent whispers.

The goal of all this? To further “combine humans and computers,” Kapur says. The more tightly we interact with computers, the more we can take advantage of their strengths, such as quickly getting help with a math problem or a translation, without having to look up from our work and click, tap, or type.

Or a user could simply change the channel on the Roku (those remotes are so small and easily lost!) in total silence. The AlterEgo also seems promising for people with disabilities or paralysis, but Kapur says they haven’t been able to study that application yet.

To be sure, the tech is still in its early stages, so each app only has the capacity to learn about 20 different words. The system can’t understand every word a person says, just the ones it has been taught. Talking to yourself deliberately, but not saying anything out loud, is an easy practice to learn, Kapur says. When training someone to use the device, the researchers start by asking the person to read a passage aloud. “After that, we ask them to not voice the words” as they read, he says. “It’s more comfortable than speaking out loud.”

To build the system, Kapur and his colleagues used a common artificial intelligence tool called a neural network, which can learn from example data. They trained the neural network to recognize how different electrical signals correspond to the different words a person could say to themselves.
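
For readers curious what "training a network to map signals to words" can look like in practice, here is a minimal Python sketch. It is not the AlterEgo team's code, and nearly everything in it is an assumption made for illustration: the four-word vocabulary, the seven signal channels, the made-up signal patterns, and the choice of a single-layer softmax network (about the simplest kind of neural network) trained with gradient descent. The "signals" are synthetic numbers, not electrode recordings.

```python
import math
import random

# Hypothetical vocabulary (the article says each app handles about 20 words).
VOCAB = ["up", "down", "select", "back"]
N_CHANNELS = 7  # assumed number of signal features; not stated in the article


def prototype(label):
    """Made-up signal pattern for a word: +1 on some channels, -1 on the rest."""
    return [1.0 if j % len(VOCAB) == label else -1.0 for j in range(N_CHANNELS)]


def make_synthetic_data(rng, samples_per_word=40, noise=0.3):
    """Build toy examples: each word's pattern plus Gaussian noise."""
    data = []
    for label in range(len(VOCAB)):
        for _ in range(samples_per_word):
            x = [p + rng.gauss(0, noise) for p in prototype(label)]
            data.append((x, label))
    rng.shuffle(data)
    return data


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_probs(weights, biases, x):
    """One weighted sum per word, followed by softmax."""
    scores = [sum(w * xi for w, xi in zip(row, x)) + b
              for row, b in zip(weights, biases)]
    return softmax(scores)


def train(data, epochs=50, lr=0.1):
    """Fit the weights with stochastic gradient descent on cross-entropy loss."""
    weights = [[0.0] * N_CHANNELS for _ in VOCAB]
    biases = [0.0] * len(VOCAB)
    for _ in range(epochs):
        for x, label in data:
            probs = predict_probs(weights, biases, x)
            for k in range(len(VOCAB)):
                grad = probs[k] - (1.0 if k == label else 0.0)
                biases[k] -= lr * grad
                for j in range(N_CHANNELS):
                    weights[k][j] -= lr * grad * x[j]
    return weights, biases


def classify(weights, biases, x):
    """Return the most probable word for one signal vector."""
    probs = predict_probs(weights, biases, x)
    return VOCAB[probs.index(max(probs))]


rng = random.Random(0)  # fixed seed so the run is repeatable
data = make_synthetic_data(rng)
train_set, test_set = data[:120], data[120:]
weights, biases = train(train_set)
correct = sum(classify(weights, biases, x) == VOCAB[y] for x, y in test_set)
accuracy = correct / len(test_set)
print(f"held-out accuracy: {accuracy:.2f}")
```

The sketch also shows why the system only knows the words it has been taught: the network has one output per vocabulary word, so anything outside that list can only be mistaken for one of the words on it.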

While it’s easy to see military applications of such a device—a professor from Georgia Tech’s College of Computing, Thad Starner, said in a statement on MIT’s website that he could envision “special ops” using such a device—Kapur says that’s not their intended goal for the system.

“This is more about how we could bridge the gap between computers and humans,” he says. The ideal scenario is one in which people can augment themselves with the smarts of an artificial intelligence system smoothly and in real time.

The next step: work on the device’s form, so it’s a bit “more invisible.” It’s all about that seamless integration—so ideally future versions won’t look like a taped-on telemarketer’s headset.