The Threat of Voice Aging in Voice Biometrics Security

Written by: Mike Yang

As voice biometrics have become increasingly popular as a form of identification and authentication, researchers face the challenge of determining how users’ voices change over time. New research shows that voices age significantly, even in the short term, making positive authentication more difficult with voice biometrics alone.
One obstacle making the measurement of voice aging difficult is that every speaker’s voice ages uniquely and at a different rate. There is no universally accepted factor that can be applied to a known authentic recording to compensate for aging.
“Voice biometrics aren’t accurate enough on their own. You have to add other factors like spoofing detection and phoneprinting,” said Dr. Elie Khoury, a principal research scientist at Pindrop, who has conducted a long-term study on voice aging. Khoury delivered an eye-opening presentation on his results at the RSA Conference on February 17.
Biometrics have gained popularity in both consumer and enterprise applications for a number of reasons, chief among them their trusted persistence. Most fingerprints and irises don’t change much over time, so these traits can serve as accurate long-term identifiers. But voice is different. Small changes in a user’s voice can have a direct impact on scoring models and result in false acceptances or rejections.
In a two-year study of 122 people (native speakers of English, Dutch, French, German, Spanish, and Italian), Khoury found that the equal error rate (EER) for identifying a given speaker, the operating point at which false acceptances and false rejections occur at the same rate, increased significantly over time. In fact, the EER nearly doubled over the two years of the study. And it’s not just one trait that changes in a speaker’s voice, either.
“There’s a change in the pitch and the speed of the speech. When you compute the score, it will decrease slowly over time,” Khoury said. “That’s what’s risky for voice biometrics. The score should remain as high as possible for a match. Aging can make false detection or rejection go up over time. And the pitch will change multiple times during a lifetime.”
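To make the EER figure concrete, here is a minimal sketch of how an equal error rate can be estimated from two lists of match scores. The function name and the score values are hypothetical and chosen only for illustration; they are not data from Khoury's study, and production systems typically interpolate between thresholds rather than picking the nearest one.

```python
def equal_error_rate(genuine, impostor):
    """Estimate the EER from genuine and impostor match scores.

    Sweeps every observed score as a candidate threshold and returns
    (eer, threshold) at the point where the false-accept rate (FAR)
    and false-reject rate (FRR) are closest to each other.
    """
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        # Impostors scoring at or above the threshold are wrongly accepted.
        far = sum(s >= t for s in impostor) / len(impostor)
        # Genuine speakers scoring below the threshold are wrongly rejected.
        frr = sum(s < t for s in genuine) / len(genuine)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Hypothetical scores: the same speakers shortly after enrollment and
# again later, when aging has pulled the genuine scores downward.
impostor_scores = [0.6, 0.5, 0.4, 0.3, 0.2]
genuine_fresh = [0.9, 0.85, 0.8, 0.75, 0.7]
genuine_aged = [0.8, 0.7, 0.6, 0.55, 0.45]

eer_fresh, _ = equal_error_rate(genuine_fresh, impostor_scores)
eer_aged, _ = equal_error_rate(genuine_aged, impostor_scores)
print(f"EER shortly after enrollment: {eer_fresh:.2f}")
print(f"EER with aged voices: {eer_aged:.2f}")
```

In this toy data the impostor scores stay fixed while the genuine scores drift lower, so the two distributions begin to overlap and the EER rises. That is the mechanism Khoury describes when he says the score "will decrease slowly over time."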
There are also a number of additional factors, besides age, that can contribute to variances over time, including the emotional state, stress levels, health, and vocal effort of the speaker, all of which can have an effect on accurate identification, Khoury said. Compensating for these factors is the challenge for researchers looking to improve the accuracy of voice models.
One way to improve accuracy is to change the threshold for acceptance based on the amount of time elapsed between tests. Khoury said updating a model frequently can also help account for voice aging. He studied more than 400 recordings of Barack Obama’s public speeches from the beginning of Obama’s first term through the end of the second and found that recalibrating the biometric model significantly reduced the effect voice aging had on the score.
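The idea of a time-dependent acceptance threshold can be sketched as follows. This is only an illustration under stated assumptions: the linear relaxation rule, the `decay_per_year` and `floor` parameters, and the function names are all hypothetical, and the article does not describe the calibration Khoury actually used.

```python
def aged_threshold(base_threshold, days_elapsed, decay_per_year=0.05, floor=0.5):
    """Relax the acceptance threshold as the enrollment recording gets older.

    Hypothetical linear rule: subtract `decay_per_year` for every year since
    enrollment, but never drop below `floor`, so an old enrollment cannot
    make the system accept arbitrarily low scores.
    """
    relaxed = base_threshold - decay_per_year * (days_elapsed / 365.0)
    return max(floor, relaxed)


def accept(score, base_threshold, days_elapsed):
    """Accept a match score against the time-adjusted threshold."""
    return score >= aged_threshold(base_threshold, days_elapsed)


# A score of 0.65 fails against a fresh enrollment with a 0.7 threshold,
# but passes once the enrollment is two years old and the threshold relaxes.
print(accept(0.65, base_threshold=0.7, days_elapsed=0))
print(accept(0.65, base_threshold=0.7, days_elapsed=730))
```

Lowering the threshold reduces false rejections of aged voices but also makes it easier for an impostor to be accepted, which is one reason a floor is included here and why the article pairs voice biometrics with other factors.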
“You can update the model with each new recording, but that’s risky if someone is able to attack the system and compromise the model,” Khoury said.
In an age of such aggressive attacks, voice biometrics alone will not offer the multi-layer approach organizations should implement to fully secure their call center. Phoneprinting provides universal protection for all incoming calls to the contact center, allowing contact center agents to identify unknown attackers on their very first call while also creating a robust intelligent blacklist of known attackers. Contact centers are empowered with the technology necessary to stop fraud loss, reduce operations costs, protect brand reputation and compliance, and improve the customer’s overall experience.