
Written by: Elie Khoury

VP Research

The FTC (Federal Trade Commission) held its sixth Voice Cloning Challenge under the America COMPETES Act to drive innovation in voice cloning detection for the protection of consumers. Pindrop was announced as the only winner in the large organization category for its real-time Voice Cloning Detection technology. This challenge and the diversity of its winners, including Pindrop’s Recognition Award, underscore the severity of the challenges that voice clones pose to our financial and media landscape. They also validate liveness detection technology as part of a multi-disciplinary approach to solving this problem.

Why does liveness detection matter?

The US and many other countries have already been targeted by voice clones used to spread misinformation and commit fraud. Pindrop’s own research found that more than 90% of US consumers have concerns about voice cloning, with a significant portion of them having already witnessed voice clones made of them, with or without their consent. The rapid proliferation and increased believability of voice cloning technology, along with its reduced cost, have exacerbated the threat. Liveness detection is a critical asset at our disposal to help stop it.

How exactly will liveness detection help against voice cloning? 

Liveness detection can work on several fronts to help detect the presence of voice clones or non-live voices: 

  • Protect conversations in real time: Pindrop Pulse Liveness Detection can evaluate each incoming phone call in real time, in 2-second chunks, and compute a liveness score for each chunk. If the score is lower than a predefined threshold, the audio is flagged as a potential deepfake (a.k.a. “cloned voice” or “spoofed voice”).
  • Authenticate genuine callers in real time: Liveness detection, as part of a multi-factor ensemble, can perform a speaker-specific liveness check, which can be a vital input for authenticating the caller. This can be done in real time without adding caller friction or delaying the verification process.
  • Detect voice clones for media and digital channels: Liveness detection can help evaluate social media video and audio clips by splitting the audio into 4-second segments and analyzing and scoring each segment individually. If a segment’s score is lower than a predefined threshold, that segment is considered “non-live”.
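
The chunk-and-threshold logic described in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the `score_fn` callback, and the default threshold of 0.5 are hypothetical and are not part of any Pindrop API. The actual liveness model is replaced here by whatever scoring function the caller supplies.

```python
from typing import Callable, List, Sequence, Tuple


def split_into_chunks(
    samples: Sequence[float], sample_rate: int, chunk_seconds: float
) -> List[Sequence[float]]:
    """Split audio samples into consecutive fixed-duration chunks.

    The last chunk may be shorter than the others.
    """
    chunk_len = int(sample_rate * chunk_seconds)
    if chunk_len <= 0:
        raise ValueError("chunk length must be at least one sample")
    return [samples[i:i + chunk_len] for i in range(0, len(samples), chunk_len)]


def flag_non_live(
    samples: Sequence[float],
    sample_rate: int,
    score_fn: Callable[[Sequence[float]], float],
    chunk_seconds: float = 2.0,   # 2 s for calls, 4 s for media clips (per the text)
    threshold: float = 0.5,       # hypothetical value; the source only says "predefined"
) -> List[Tuple[int, float, bool]]:
    """Score each chunk and flag it when the score falls below the threshold.

    Returns one (chunk_index, liveness_score, is_flagged) tuple per chunk.
    """
    results = []
    chunks = split_into_chunks(samples, sample_rate, chunk_seconds)
    for index, chunk in enumerate(chunks):
        score = score_fn(chunk)  # stand-in for a real liveness model
        results.append((index, score, score < threshold))
    return results


if __name__ == "__main__":
    # Toy scorer (an assumption for the demo only): loud chunks count as "live".
    def toy_score(chunk: Sequence[float]) -> float:
        return 0.9 if max(abs(s) for s in chunk) > 0.5 else 0.1

    audio = [1.0] * 8 + [0.0] * 8
    for row in flag_non_live(audio, sample_rate=4, score_fn=toy_score):
        print(row)
```

Passing `chunk_seconds=4.0` gives the media-clip variant from the third bullet. A lower score means "less likely to be live", so the comparison is `score < threshold`, matching the description above.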

 

Why does liveness detection work?

Just as voice clones leverage generative AI, liveness detection leverages breakthroughs in Deep Neural Networks (DNNs) designed for deepfake detection. Pindrop’s liveness detection uses a DNN trained on audio from over 120 Text-To-Speech (TTS) and voice cloning systems. The system filters out the non-speech frames (e.g., silence, noise, music), extracts low-level spectro-temporal features, runs them through a series of neural layers, and finally outputs a “Fakeprint”, a mathematical representation that distinguishes machine-generated speech from generic human speech. This specialized, in-depth training of the neural model to differentiate between live and non-live voices is what enables the liveness detection system to identify voice clones.
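
To make the shape of that pipeline concrete, here is a toy Python sketch of its stages: frame the audio, drop non-speech frames, compute per-frame features, and pool them into one fixed-length vector. Everything in it is an assumption for illustration. A simple energy threshold stands in for the non-speech filter, log energy and zero-crossing rate stand in for the spectro-temporal features, and mean pooling stands in for the neural layers. It is not Pindrop's Fakeprint and has no detection ability of its own.

```python
import math
from typing import List, Sequence


def frame_signal(samples: Sequence[float], frame_len: int) -> List[Sequence[float]]:
    """Cut the signal into non-overlapping frames, dropping a partial tail."""
    usable = len(samples) - len(samples) % frame_len
    return [samples[i:i + frame_len] for i in range(0, usable, frame_len)]


def frame_energy(frame: Sequence[float]) -> float:
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)


def frame_features(frame: Sequence[float]) -> List[float]:
    """Placeholder features: log energy and zero-crossing rate."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    zcr = crossings / max(len(frame) - 1, 1)
    return [math.log(frame_energy(frame) + 1e-10), zcr]


def embed(
    samples: Sequence[float], frame_len: int = 4, min_energy: float = 1e-4
) -> List[float]:
    """Pool features of the retained frames into one fixed-length vector.

    Frames below min_energy are treated as non-speech and skipped
    (a crude stand-in for a real speech/non-speech filter).
    Returns a zero vector when no frame is retained.
    """
    kept = [f for f in frame_signal(samples, frame_len)
            if frame_energy(f) >= min_energy]
    if not kept:
        return [0.0, 0.0]
    feats = [frame_features(f) for f in kept]
    # Mean pooling here replaces the learned neural layers of a real system.
    return [sum(col) / len(feats) for col in zip(*feats)]
```

In a real system the pooled vector would come from a trained network and would be compared against learned decision boundaries; the sketch only shows how variable-length audio reduces to a fixed-length representation.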

Why is the result of the FTC challenge important for the wider industry?

The scope and judging criteria for the FTC challenge include not only innovation but also resilience, administrability, feasibility to execute, increased company responsibility, and reduced consumer burden. The FTC is concerned with how the wider industry can align on a shared goal of consumer protection. Voice clone technology creators, liveness detection providers, telecommunication network providers, financial institutions, and media companies all have a role to play in ensuring the ethical use of AI and protecting customers at every stage of the process. A multi-disciplinary approach that combines products, policies, and workflows and addresses multiple intervention points is required to deal effectively with the voice cloning threat.

Pindrop has proven its leadership in deepfake detection through independent third-party evaluations and industry benchmark challenges. Pindrop is engaged with broader government efforts on security for GenAI and has also partnered with Respeecher to enhance its voice clone detection technology and to promote the ethical use of AI. By winning the FTC’s large organization category for its real-time Voice Cloning Detection technology, Pindrop continues to demonstrate its commitment to fighting deepfakes and to helping ensure a more secure digital environment.

 

1. Based on research by Pindrop Labs of non-live calls in contact centers; AP News, “Chatbots’ inaccurate, misleading responses about US elections threaten to keep voters from polls,” Feb 2024; Vice.com, “How I Broke Into a Bank Account With an AI-Generated Voice,” Feb 2023
2. Voicebot.ai Deepfake and Voice Clone Consumer Sentiment Report, Oct 2023
3. https://www.wbur.org/npr/1165146797/it-takes-a-few-dollars-and-8-minutes-to-create-a-deepfake-and-thats-only-the-sta
4. https://www.ftc.gov/system/files/ftc_gov/pdf/Voice-Cloning-Challenge-Rules-2024-01-02.pdf
5. https://www.isca-archive.org/odyssey_2020/chen20_odyssey.pdf
6. https://www.schumer.senate.gov/newsroom/press-releases/statements-from-the-eighth-bipartisan-senate-forum-on-artificial-intelligence
7. https://www.pindrop.com/blog/pindrop-and-respeecher-join-forces-to-help-keep-voice-cloning-away-from-harmful-uses
