Your Voice Is Not Your Own

Written by: Mike Yang

The security industry has been trying to replace usernames and passwords since, well, forever, with little success. The rush to employ biometrics has produced plenty of options, some of which can be defeated by gummy bears, and no clear winner.
Voice recognition has recently emerged as one of the leaders in the clubhouse for some applications, including building access and authentication over the phone. The idea behind voice authentication is simple and mirrors that of biometrics in general: use a trait that’s unique to each person. However, recent research from the University of Alabama at Birmingham shows that defeating authentication systems based on voice printing is a simple task.
The attack developed by the researchers exposes some of the inherent weaknesses in these systems. They found that by recording just a few minutes of a person’s speech, building a model of the target’s voice, and then using voice-morphing software, an attacker could impersonate a target with a high success rate.
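To see why a close-enough imitation can get through, it helps to picture how a voice check might decide. The sketch below is a simplified illustration, not the researchers' method or any vendor's system: it assumes a voice is reduced to a short list of numbers (a "voiceprint") and that a sample is accepted when it is similar enough to the enrolled one. The function names, the feature values, and the 0.9 threshold are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def verify_speaker(enrolled_voiceprint, sample_features, threshold=0.9):
    """Accept the sample if it is close enough to the enrolled voiceprint.

    The threshold is an assumed value chosen for illustration only.
    """
    return cosine_similarity(enrolled_voiceprint, sample_features) >= threshold


# Made-up feature vectors; real systems use far richer representations.
enrolled = [0.9, 0.1, 0.4]     # the legitimate user's stored voiceprint
stranger = [0.1, 0.9, 0.2]     # an unrelated speaker
morphed = [0.85, 0.15, 0.42]   # an imitation that lands near the target

print(verify_speaker(enrolled, stranger))  # rejected: too far from the voiceprint
print(verify_speaker(enrolled, morphed))   # accepted: close enough to pass
```

Under these assumptions, the check never asks who is speaking, only whether the sample falls within a tolerance of the stored voiceprint, which is the gap a converted voice aims for.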

“As a result, just a few minutes’ worth of audio in a victim’s voice would lead to the cloning of the victim’s voice itself,” said Nitesh Saxena, the director of the Security and Privacy in Emerging computing and Networking Systems lab and associate professor of computer and information sciences at UAB. “The consequences of such a clone can be grave. Because voice is a characteristic unique to each person, it forms the basis of the authentication of the person, giving the attacker the keys to that person’s privacy.”

Recording the target’s voice is the key to the success of the attack, but that can be done easily, especially now that everyone carries a high-quality voice recorder around in their phone. Holding an iPhone in your hand with the recorder running while standing next to someone during a casual conversation would be enough to gather the audio needed. With that done, the attackers can build an accurate model of the target’s voice, which then can be used to attack voice-recognition and authentication systems.
Banks, financial services companies, credit card companies, and others have begun adopting voice recognition as a primary authentication method, something that the UAB researchers show opens these companies up to potential impersonation attacks.
“Our research showed that voice conversion poses a serious threat, and our attacks can be successful for a majority of cases,” Saxena said. “Worryingly, the attacks against human-based speaker verification may become more effective in the future because voice conversion/synthesis quality will continue to improve, while it can be safely said that human ability will likely not.”
Aside from the security implications, there are potential personal consequences as well. The attack could also be used to impersonate a target to her friends or colleagues in a phone conversation, or to fabricate voice messages. That opens up an entirely different set of problems, most of which consumers are ill-equipped to deal with.