Voice Biometrics: The ‘Thumbprint’ of the Future | Pindrop

Biometrics range from fingerprinting and DNA matching to iris matching and voice recognition, with many techniques in between. Together, these technologies make up the field of biometrics, which can be defined as the measurement and analysis of a person’s unique physical and behavioral characteristics. As we move into an age defined by voice rather than touch, voice biometrics can be seen as the thumbprint of the future.

Chapter 1: Voice Authentication and Recognition Biometrics

What is voice biometrics?

Voice biometrics is the technique used to recognize a speaker by the characteristics of his or her voice. Every caller’s voice presents unique acoustic characteristics and behavioral features over the wire that can be analyzed invisibly to authenticate or identify the caller. Voice biometrics can be broken down into two categories: speaker identification (SI) and speaker verification (SV). Speaker identification examines the problem of identifying a speaker from a given set (a 1:N problem), whereas speaker verification determines whether a speaker is who he or she claims to be (a 1:1 problem).

What is Identification?

Identification examines the problem of identifying a speaker from a given set of speakers. Two scenarios are typically considered: a closed-set identification scenario, where the real speaker is one of N known speakers, and an open-set identification scenario, where the real speaker could be one of the N known speakers or an unknown speaker. Open-set identification is typically the case in blacklist fraud detection.
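
The two scenarios can be sketched in a few lines. This is only an illustration, assuming each voiceprint has already been reduced to a fixed-length vector and that similarity is scored with cosine similarity. The speaker names, the vectors, and the 0.8 threshold are hypothetical values, not what any particular engine uses.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def identify_closed_set(sample, enrolled):
    """Closed set: the speaker is assumed to be one of the N enrolled
    speakers, so the best-scoring one is always returned."""
    return max(enrolled, key=lambda name: cosine_similarity(sample, enrolled[name]))

def identify_open_set(sample, enrolled, threshold=0.8):
    """Open set: return the best match only if it clears the threshold,
    otherwise report the speaker as unknown (None)."""
    best = identify_closed_set(sample, enrolled)
    if cosine_similarity(sample, enrolled[best]) >= threshold:
        return best
    return None

# Hypothetical 3-number voiceprints (assumption: real ones are far longer).
known_fraudsters = {
    "fraudster_a": [0.9, 0.1, 0.2],
    "fraudster_b": [0.1, 0.8, 0.3],
}
caller_like_a = [0.85, 0.15, 0.25]
unknown_caller = [0.1, 0.1, 0.9]

print(identify_closed_set(unknown_caller, known_fraudsters))  # forced to pick someone
print(identify_open_set(caller_like_a, known_fraudsters))     # close match to a listed voice
print(identify_open_set(unknown_caller, known_fraudsters))    # no match above threshold
```

In the blacklist case, the open-set result matters most: a caller who matches nobody on the list should come back as unknown rather than be forced onto the nearest fraudster.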

What is authentication? 

Authentication, also known as verification, is the process of verifying that a person is who they claim to be, which can be used to combat fraud. It is a 1:1 problem. For example, voice biometrics can be used as a method of authentication by validating a speaker’s voice to confirm that the speaker is who they say they are. Because speech is contactless, voice biometrics has moved to the top of the list of convenient biometrics integrated into authentication.

What is voice recognition? / What is voice recognition software?

Voice recognition, on the other hand, can be a confusing term: in the industry it can refer to either speaker recognition (i.e., voice biometrics) or speech recognition.

Voice recognition is a main component of voice biometrics, usually in the form of software that takes a voiceprint, which can then be used to authenticate the speaker. For example, Pindrop’s Deep Voice™ biometric engine extracts the unique characteristics from a caller’s voice, which are used to create a credential for each unique caller.

Voiceprints can be taken passively, while a speaker is talking freely, or actively, while a speaker is instructed to say a specific phrase. Either way, voice recognition is defined by the print taken from a speaker’s voice and is independent of the speech used to create the voiceprint. Voiceprints are used widely within call centers to strengthen authentication processes, providing an additional layer of security.

What is speech recognition? / What is speech recognition software?

Speech recognition, by contrast, aims to recognize what is spoken. Automatic speech recognition (ASR) is much more familiar to consumers. Also known as voice command, ASR is found in voice assistants such as Siri, Amazon Alexa, and Google Home, and in other user interface technology. Whereas voice recognition can be used for authentication, speech recognition lets users interact with technology via voice.

Chapter 2: How does Voice Biometrics Work?

How does voice recognition work? / How does voice recognition biometrics work? 

Voice recognition begins with a voiceprint or template taken from a segment of speech, which is then stored in order to authenticate the speaker during other instances. The voiceprint is created with the help of technology or software that breaks down the speech into different frequencies and identifies other behavioral characteristics that work together to collectively make up the print.

Voiceprints are like fingerprints: each is unique and linked to an individual. Voice recognition systems therefore work by storing these prints in databases, which are used later to identify or authenticate speakers. These systems focus on determining the similarity between the stored voiceprints of individuals and unfamiliar speech.
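
As a rough sketch of that store-then-compare flow, the toy class below enrolls a voiceprint and later checks a new sample against it. Everything here is an assumption made for illustration: the `VoiceprintStore` name, the in-memory dictionary standing in for a database, averaging the enrollment samples, cosine similarity as the score, and the 0.8 threshold. It is not a description of how any specific product works.

```python
import math

def _cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

class VoiceprintStore:
    """Toy in-memory voiceprint database keyed by account ID (hypothetical)."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold
        self._prints = {}

    def enroll(self, account_id, samples):
        """Store the element-wise average of several enrollment vectors."""
        count = len(samples)
        self._prints[account_id] = [sum(col) / count for col in zip(*samples)]

    def verify(self, claimed_id, sample):
        """1:1 check of a new sample against the claimed account's voiceprint."""
        stored = self._prints.get(claimed_id)
        if stored is None:
            return False  # nothing enrolled for this account
        return _cosine(stored, sample) >= self.threshold

store = VoiceprintStore()
store.enroll("account-42", [[0.9, 0.1, 0.2], [0.8, 0.2, 0.2]])

print(store.verify("account-42", [0.85, 0.15, 0.25]))  # sample close to the stored print
print(store.verify("account-42", [0.1, 0.9, 0.3]))     # sample far from the stored print
print(store.verify("account-99", [0.85, 0.15, 0.25]))  # no voiceprint on file
```

The threshold is the main design choice: raising it rejects more impostors but also more genuine speakers whose voices differ from the enrolled samples.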

However, even if a speaker has a voiceprint on file, their voice will change over time. Factors like age, health, and stress create differences in the sound and characteristics of the voice, which may result in an inability to verify the speaker.

How does voice biometrics work? 

Voice biometric systems analyze characteristics of a speaker’s voice to create a unique voiceprint, or “vocal fingerprint,” to verify that a speaker is who they claim to be. The assumption is that no two individuals sound identical because everyone’s voice-producing organs are different. In addition to physical differences among individuals, each speaker has their own manner of speaking, including accent, rhythm, intonation, pronunciation, choice of vocabulary, and more.

There are two major modes of voice biometrics: one to verify legitimate speakers and the other to identify repeat fraudsters. There are also two types of voice biometric engines: text-dependent and text-independent. A text-dependent biometric engine requires a speaker to enroll a fixed phrase and prompts the speaker to repeat that phrase during verification, whereas enrollment and verification in a text-independent biometric engine are based on unconstrained natural speech.
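
The difference between the two engine types can be shown with a deliberately simplified sketch. It assumes a voice-match score between 0 and 1 has already been computed elsewhere, and it models the fixed phrase as an explicit text comparison, which is an assumption made for clarity: a real text-dependent engine may handle the phrase quite differently. The function names and the 0.8 threshold are hypothetical.

```python
def text_dependent_verify(voice_score, spoken_phrase, enrolled_phrase, threshold=0.8):
    """Fixed-phrase mode: the speaker must repeat the enrolled phrase
    and the voice must match."""
    phrase_ok = spoken_phrase.strip().lower() == enrolled_phrase.strip().lower()
    return phrase_ok and voice_score >= threshold

def text_independent_verify(voice_score, threshold=0.8):
    """Free-speech mode: only the voice match counts, whatever was said."""
    return voice_score >= threshold

enrolled_phrase = "my voice is my password"  # hypothetical enrollment phrase

# Same good voice score, but the speaker says something else.
print(text_dependent_verify(0.93, "I'd like to check my balance", enrolled_phrase))
print(text_independent_verify(0.93))

# Same good voice score, and the speaker repeats the enrolled phrase.
print(text_dependent_verify(0.93, "My voice is my password", enrolled_phrase))
```

This is why text-independent engines suit passive use during a natural conversation, while text-dependent engines need the speaker to cooperate with a prompt.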

Additionally, voice authentication is different from speech recognition. Whereas the goal of speech recognition is to identify what is spoken, voice authentication compares vocal characteristics against those associated with the enrolled user to verify an individual.

 

Chapter 3: Voice Authentication Biometrics Security: Is Voice ID Safe?

Voice Biometrics Security | Voice Authentication Security

Biometric technologies provide a basis for identity corroboration that makes use of inherent biological and behavioral traits unique to each user, and they offer better user experience, customer experience, and accountability than commonplace credential-based methods.

Voice biometric technology offers a higher level of security than other solutions because it relies on more than physical characteristics. Voice biometric engines take into consideration the behavior of the voice, what is said, and countless other voice features.

These features are measured indirectly, from data that is generated by how you speak and is not limited to physical characteristics. The information these features carry is hidden in the audio. The job of voice biometrics is to match enrollment samples that contain that hidden information against trial samples that contain it too.

Additionally, there are things you can change about your voice and things you cannot change. For example, you cannot change the length of your vocal tract or the natural range of your voice’s fundamental frequency, but you can change the pitch you speak at, your accent, your speed, and your pacing. The qualities you can change are what humans use to recognize and identify voices, and voices depend on both inherent and non-inherent contributors. Text-independent voice biometrics correspond to the inherent contributors, and text-dependent voice biometrics to the non-inherent ones.

This leads us to the question: is voice ID safe?

The security of voice identification depends on several factors that you should be asking about.

Additionally, safety is a vague term that can have different meanings, each of which needs to be addressed:

Why are people worried about it being safe?

People are concerned about security and may ask themselves: “Will someone be able to take advantage of my voice? Will my voice be attached to something I would be embarrassed by? Will my voice be used in mass surveillance? Will someone use my voice to spoof or to act like me?”

Can someone spoof my voice, use it in a replay attack, or synthesize it?

If your voice protection can’t detect these attacks, then no, it is not safe. Speech production carries more than your identity: voice biometrics distills out the identity portion of your speech, and the remaining data isn’t stored. To recreate a voice, the pitch, word choice, pace, and other behavioral characteristics would need to be known. Far more is required to produce a model of someone’s speech than to identify someone based on their voice.

Can a voiceprint be converted back to a real person’s voice?

In addition to these factors, time also has to be considered. Time-varying information is removed, and 60 to 500 numbers that represent your voice identity are extracted from a recording. There is only so much that 60 to 500 numbers can represent, and recreating someone’s speech would require generating every millisecond of it; therefore, a voiceprint is not reversible.
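
A small sketch can make the loss of time information concrete. It assumes, purely for illustration, that a voiceprint is formed by averaging per-frame feature vectors over time; real engines use more elaborate extraction, and the `mean_pool` helper and all numbers below are hypothetical. The point is only that different frame sequences can collapse to the same fixed-length vector, so the sequence cannot be read back out of it.

```python
def mean_pool(frames):
    """Collapse a variable-length list of per-frame feature vectors into one
    fixed-length vector by averaging over time."""
    count = len(frames)
    return [sum(col) / count for col in zip(*frames)]

# Hypothetical per-frame features: 4 frames, 3 numbers each.
utterance = [[1.0, 0.0, 2.0], [3.0, 2.0, 2.0], [1.0, 4.0, 0.0], [3.0, 2.0, 4.0]]
reordered = list(reversed(utterance))  # the same frames in reverse order

print(mean_pool(utterance))
print(mean_pool(utterance) == mean_pool(reordered))  # frame order is not preserved

# Size comparison, assuming telephone-style audio at 8,000 samples per second
# for 10 seconds (both figures are assumptions).
samples_in_recording = 8_000 * 10
largest_voiceprint = 500
print(samples_in_recording // largest_voiceprint)  # raw samples per voiceprint number
```

Under these assumptions, even the largest voiceprint holds far fewer numbers than the recording it came from, and the order in which the sounds occurred is gone.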

Chapter 4: Who is Using Voice Biometrics?

Voice biometrics is used widely in the call center. Enterprises are taking note of how easily fraudsters can break into accounts using passwords and PINs, which are commonly leaked in data breaches. Voice biometrics offers an alternative to traditional security measures: call centers can use it to identify account holders and, on the other hand, the fraudsters attempting to break into others’ accounts.

From 2013 to 2017, voice fraud increased by 350%, with some industries experiencing larger increases. For example, retail has seen a major increase in fraud calls, from 1 in 1,000 calls being fraudulent in 2014 to 1 in 427 calls in 2017.
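
To put the “1 in N” figures and the percentage on the same footing, the short calculation below converts between them. The helper names are made up for this example; the input numbers are the ones quoted above.

```python
def fold_change(before_one_in, after_one_in):
    """How many times more frequent an event became, given two '1 in N' rates."""
    return before_one_in / after_one_in

def multiplier_from_percent_increase(percent):
    """A p% increase multiplies the starting value by 1 + p/100."""
    return 1 + percent / 100

retail = fold_change(1000, 427)
print(round(retail, 2))                       # fold change in the retail fraud-call rate
print(round((retail - 1) * 100))              # the same change as a percent increase
print(multiplier_from_percent_increase(350))  # multiplier implied by a 350% increase
```

Going from 1 in 1,000 to 1 in 427 works out to roughly 2.34 times the rate, or about a 134% increase, while a 350% increase corresponds to 4.5 times the starting level. The two figures cover different periods and populations, so they are not directly comparable.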

Additionally, voice biometrics is not limited to the call center; it is making its way into other markets and industries, such as voice assistants. As Amazon Alexa, Google Home, and Microsoft Cortana continue to grow, consumers may become more interested in using voice biometrics to unlock their personal devices.

Chapter 5: Advantages & Features of Voice Biometric Authentication

Advantages of Biometric Authentication | Advantages of Biometric Security 

Advances in natural language understanding have enabled technology providers to deliver seamless and intuitive experiences for various digital and physical interfaces. Voice biometrics takes this one step further, letting the technology fade into the background by making identification as easy as speaking.

There are several benefits and advantages of biometric security and authentication, including the increased security resulting from multi-factor authentication measures. As previously mentioned, voice biometrics involves not only someone’s voice, but behavior and other factors surrounding voice. With the increased security comes a decrease in fraud – voice biometrics and authentication can help identify fraudsters within the phone channel.

Secondly, biometric authentication and biometric security allow for a smoother customer experience, as passwords, KBAs (knowledge-based authentication questions), and other legacy security tactics often add friction to the interaction. Voice biometrics can occur passively, behind the scenes, keeping the customer’s experience the top priority.

Known for their ability to reduce call time, voice biometric solutions can also reduce operating costs, saving companies millions of dollars by eliminating steps involved in traditional authentication methods.

Lastly, employing voice biometrics can protect brand reputation. In an age when there are several ways to obtain a product or service, consumers are able to switch to competitors with ease. Consumers trust companies to protect their account information and their privacy – if the company encounters a breach or does not offer a seamless customer experience, consumers will be able to easily take their business elsewhere.

Chapter 6: Voice Biometrics & The Future of Authentication

In the future of authentication, face, fingerprint, and voice biometric modalities will be built on sensors integrated into mobile devices, which will bring higher business value and faster adoption. These biometric technologies will become more refined and advanced as mobile devices grow in popularity. Another factor in this growth is that IoT manufacturers are building biometric technology natively into mobile applications and IoT firmware.

Voice biometrics and the future of authentication will depend on:
