Often, technological advances in the healthcare industry are viewed in a positive light. Faster, more accurate diagnoses, non-invasive procedures, and better treatments support this view. More recently, artificial intelligence (AI) has improved diagnostics and patient care by assisting in the early detection of diseases like diabetic retinopathy. But these same technologies have opened the door to a new, alarming threat: deepfakes.

As GenAI becomes more accessible, deepfakes in healthcare are increasingly prevalent, posing a threat to patient safety, data security, and the overall integrity of healthcare systems.

What are deepfakes in the healthcare industry? 

“Deepfakes in healthcare” refers to the application of AI technology to create highly realistic synthetic data in the form of images, audio recordings, or video clips within the healthcare industry.

Audio deepfakes that reproduce someone’s voice are emerging as a specific threat to healthcare because of the industry’s dependence on phone calls and verbal communication. Whether used to steal patient data or disrupt operations, audio deepfakes represent a real and growing danger.

AI deepfakes are a growing threat to healthcare

The use of deepfake technology to steal sensitive patient data is one of the biggest fears at the moment, but it is not the only risk. Tampering with medical results is another: it can lead to incorrect diagnoses and incorrect treatment, a danger heightened by how difficult deepfakes are for humans to spot.

A 2019 study generated deepfake CT scans, either adding tumors that were not there or removing tumors that were present. Radiologists were then shown the scans and asked to diagnose the patients.

Of the scans with added tumors, 99% were deemed malignant. Of those with tumors removed, 94% were diagnosed as healthy. As a follow-up, researchers told the radiologists that the CT scans contained an unspecified number of manipulated images. Even with this knowledge, doctors misdiagnosed 60% of the added tumors and 87% of the removed ones.

Attackers can also use GenAI to mimic the voices of doctors, nurses, or administrators—and potentially convince victims to take actions that could compromise sensitive information.

Why healthcare is vulnerable to deepfakes

While no one is safe from deepfakes, healthcare is a particularly vulnerable sector because of its operations and the importance of the data it works with.

Highly sensitive data is at the core of healthcare units and is highly valuable on the black market. This makes it a prime target for cybercriminals who may use deepfake technology to access systems or extract data from unwitting staff.

The healthcare industry relies heavily on verbal communication, including phone calls, verbal orders, and voice-driven technology. Most people consider verbal interactions trustworthy, which sets the perfect stage for audio deepfakes to exploit this trust.

Plus, both healthcare workers and patients place deep trust in medical professionals. Synthetic audio can convincingly imitate the voice of a doctor, potentially deceiving patients, caregivers, or administrative staff into taking harmful actions.

How deepfakes can threaten healthcare systems

Deepfakes, especially audio-based ones, pose various risks to healthcare systems. Here are four major ways these sophisticated AI fabrications can threaten healthcare.

1. Stealing patient data

Healthcare institutions store sensitive personal data, including medical histories, social security numbers, and insurance details. Cybercriminals can use audio deepfakes to impersonate doctors or administrators and gain unauthorized access to these data repositories. 

For example, a deepfake of a doctor’s voice could trick a nurse or staff member into releasing confidential patient information over the phone, paving the way for identity theft or medical fraud.

2. Disrupting operations

Deepfakes have the potential to cause massive disruptions in healthcare operations. Imagine a fraudster circulating a deepfake of a hospital director that instructs staff to delay treatment or change a protocol.

Staff might question the order, but even that questioning causes disruption. In emergencies, slight hesitations can lead to severe delays in care.

3. Extortion

Scams using deepfake audio are sadly no longer uncommon. Someone could create a fraudulent recording that makes it sound like a healthcare professional is involved in unethical or illegal activities.

They can then use the audio file to blackmail the professionals or organizations into paying large sums of money to prevent the release of the fake recordings.

4. Hindered communication and trust

Healthcare relies on the accurate and timely exchange of information between doctors, nurses, and administrators. Deepfakes that impersonate these key figures can compromise this communication, leading to a breakdown of trust. 

When you can’t be sure the voice you’re hearing is genuine or the results you’re looking at are real, it compromises the efficiency of the medical system. Some patients might hesitate to follow medical advice, while doctors might struggle to distinguish between legitimate communications and deepfakes.

Protecting healthcare systems from deepfakes

Healthcare deepfakes are a threat to both patients and healthcare professionals. So, how can we protect healthcare systems? Here are a few important steps.

Taking proactive measures

Catching a deepfake early is better than dealing with the consequences of a deepfake scam, so taking proactive measures should be your first line of defense. One of the most useful tools in combating deepfakes is voice authentication technology like Pindrop® Passport, which can analyze vocal characteristics such as pitch, tone, and cadence to help verify a caller.

Investing in AI-powered deepfake detection software is another effective mitigation option. Systems like Pindrop® Pulse™ Tech can analyze audio content to identify pattern inconsistencies, such as unnatural shifts in voice modulation. AI-powered tools learn from newly developed deepfake patterns, so they can help protect you against both older and newer technologies.
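To make the idea concrete, here is a minimal sketch of the kind of vocal-feature analysis such systems build on, using the open-source librosa library. It is purely illustrative and not Pindrop's implementation; the feature choices are our own assumptions.

```python
# Minimal sketch: extracting the kinds of vocal features (pitch, cadence)
# that voice analysis systems examine. Illustrative only; not Pindrop's method.
import numpy as np
import librosa

def extract_voice_features(path: str) -> dict:
    # Load audio at its native sample rate
    y, sr = librosa.load(path, sr=None)

    # Estimate fundamental frequency (pitch) with the pYIN algorithm
    f0, _, _ = librosa.pyin(
        y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C7"), sr=sr
    )
    voiced_f0 = f0[~np.isnan(f0)]  # keep only voiced frames

    # Rough cadence proxy: rate of detected speech onsets per second
    onsets = librosa.onset.onset_detect(y=y, sr=sr)
    duration = len(y) / sr

    return {
        "mean_pitch_hz": float(np.mean(voiced_f0)) if voiced_f0.size else 0.0,
        "pitch_variability": float(np.std(voiced_f0)) if voiced_f0.size else 0.0,
        "onsets_per_second": len(onsets) / duration if duration else 0.0,
    }

# Features like these could feed a classifier trained to separate
# genuine from synthetic speech.
print(extract_voice_features("caller_sample.wav"))
```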

Remember to train your staff. While humans are not great at detecting synthetic voices or images, when people are aware of the risks deepfakes pose, they can better spot potential red flags. 

These include unusual delays in voice interactions, irregular visual cues during telemedicine appointments, or discrepancies in communication. You can also conduct regular phishing simulations to help staff identify and respond to suspicious communications.

Implementing data security best practices

Proactive measures are the first lines of defense, but you shouldn’t forget about data protection.

Multifactor authentication (MFA) is a simple but strong data protection mechanism that can help confirm that only authorized individuals access sensitive healthcare systems. With it, a person needs more than one form of verification, so if someone steals one set of credentials or impersonates someone's voice, there is still a second line of defense.
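As a concrete illustration, here is a minimal sketch of one common MFA factor, time-based one-time passwords (TOTP), using the open-source pyotp library. This is a generic example under our own assumptions, not a recommendation of a specific healthcare MFA product.

```python
# Minimal sketch of time-based one-time passwords (TOTP), one common
# second factor for MFA, using the open-source pyotp library.
import pyotp

# Enrollment: generate and store a per-user secret. In a real system,
# keep this in a secure credential store, never in plain text.
secret = pyotp.random_base32()
totp = pyotp.TOTP(secret)

# The user's authenticator app computes the same 6-digit code
print("Current code:", totp.now())

# Login: a stolen password alone is not enough; the user must also
# present a valid, short-lived code.
def verify_second_factor(user_code: str) -> bool:
    return totp.verify(user_code)

print(verify_second_factor(totp.now()))  # True only within the time window
```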

Encrypting communication channels, and even stored data, is another vital aspect of data security. In healthcare, sending voice, video, and data across networks is common, so encrypting communication is a must. Protecting stored data adds an extra layer of security: even if a third party gains access, they still need a key to unlock it.
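For illustration, here is a minimal sketch of encrypting a record at rest with the open-source cryptography package's Fernet recipe; real deployments would add proper key management (for example, a KMS or HSM), and the record contents here are invented for the example.

```python
# Minimal sketch of symmetric encryption at rest using the `cryptography`
# package's Fernet recipe. Illustrative only.
from cryptography.fernet import Fernet

key = Fernet.generate_key()      # store securely, separate from the data
cipher = Fernet(key)

record = b"Patient: Jane Doe; Dx: diabetic retinopathy"  # hypothetical record
token = cipher.encrypt(record)   # safe to write to disk or a database

# Even if an attacker exfiltrates `token`, it is unreadable without the key
assert cipher.decrypt(token) == record
```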

Remember to update and monitor your data security practices regularly.

Safeguard your healthcare organization from deepfakes today

When artificial intelligence first came to the public's attention, its uses were primarily positive. In healthcare, for instance, synthetic media was, and still is, helpful in research, training, and the development of new technologies.

Sadly, the same technology can also take a darker turn, with fraudsters using it to impersonate doctors, gain access to sensitive patient data, or disrupt operations. Solutions like Pindrop® Passport and the Pindrop® Pulse™ Tech add-on offer a powerful way to authenticate voices and detect audio deepfakes before they can infiltrate healthcare communication channels.

By combining proactive detection tools with strong data security practices, healthcare providers can better protect themselves, their patients, and their operations from the devastating consequences of deepfakes.

Nearly six months ago, we launched our Pindrop Pulse™ solution, a cutting-edge deepfake detection technology for our enterprise customers to help detect AI-generated voices in their call centers. Since then, we have collaborated with news organizations, governments, the music and entertainment industry, and corporate security teams to assess hundreds of suspected deepfakes. From AI-generated robocalls aimed at voter suppression to sophisticated smear campaigns, and from general misinformation in conflicts worldwide to attempts to distort public perception, each case underscores the critical need for robust deepfake detection mechanisms.

The implications of these deepfakes are profound: they threaten the integrity of news organizations, social media platforms, and elections worldwide. The potential for misinformation to sway public opinion and disrupt social order is a stark reality that we now face. 

In response to these grave threats, we're thrilled to announce Pindrop Pulse™ Inspect in Preview, an audio deepfake detection solution to assist fact-checkers, misinformation experts, security departments, trust and safety teams, and social media platforms. As a forensics tool, Pindrop Pulse is designed to detect AI-generated speech in audio or video media, including both digital media (e.g., deepfakes on social media) and phone call media (e.g., voicemails). Users log into the web application, upload their media files, and within seconds receive a determination on whether the content contains AI-generated speech. Additionally, users can integrate the Pindrop Pulse award-winning deepfake detection technology programmatically into their own workflows via our simple-to-use APIs.
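As a sketch of what such a programmatic integration might look like, the snippet below uploads a media file to a detection endpoint. The URL, field names, and authentication scheme are placeholders, not the actual Pindrop Pulse API; refer to the official API documentation for real integration details.

```python
# Hypothetical sketch of calling a deepfake detection API from a review
# workflow. Endpoint, fields, and auth header are placeholders, NOT the
# real Pindrop Pulse API.
import requests

API_URL = "https://api.example.com/v1/analyze"  # placeholder endpoint
API_KEY = "YOUR_API_KEY"                        # placeholder credential

def check_media(path: str) -> dict:
    with open(path, "rb") as f:
        resp = requests.post(
            API_URL,
            headers={"Authorization": f"Bearer {API_KEY}"},
            files={"media": f},
            timeout=60,
        )
    resp.raise_for_status()
    return resp.json()  # e.g., a score or verdict for the uploaded file

result = check_media("suspect_clip.mp4")
print(result)
```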

A Rapidly Growing Problem

Simply stated, ‘deepfakes’ are AI-altered images, text, video, and audio files.

Specifically for speech, this means creating highly realistic audio clips that can convincingly mimic someone’s voice by training an AI-model from their publicly available speech. 

This problem is growing for several reasons. First, the technology has advanced so significantly that the quality of synthetic speech is remarkably high. Second, commercial platforms offering these services have become incredibly affordable. Third, the number of available tools for deepfake creation, i.e., Text-to-Speech (TTS) and Speech-to-Speech (STS) systems, has exploded over the past two years: there are now close to 2,000 open-source TTS tools on Hugging Face alone.

Humans are notoriously bad at detecting deepfakes. In a study, humans were only able to detect fake audio 54.5% of the time, and in the real world, distinguishing between genuine and fake audio is even more challenging. Scammers who are creating these deepfakes are becoming increasingly sophisticated, often adding background noise or music, or using very short clips of speech to make detection more challenging. These fraudsters are continuously evolving their techniques, making it imperative for us to stay one step ahead in the fight against misinformation.

Over the past 13 years, Pindrop has built a platform based on real-time analysis of more than 5 billion audio interactions. We hold over 270 patents on voice and security, including 25 on audio deepfake detection alone. Today, we're proud to package our experience and technology into a tool that helps combat the most deceptive audio deepfakes, particularly for news media and other organizations that rely on the accuracy of their content to maintain customer trust and credibility.

Good AI to Fight Bad AI in the Media

Pindrop has partnered with some of the market and technology leaders fighting misinformation online. For example, TrueMedia.org was among the first adopters to test our solution in their workflows and reported that the Pindrop Pulse audio deepfake detection had better accuracy than other alternatives in detecting synthetic speech. 

According to Oren Etzioni, CEO of TrueMedia.org: “TrueMedia.org is a non-profit, non-partisan AI project to fight disinformation in political campaigns by identifying manipulated media. Our comprehensive evaluation found Pindrop’s audio deepfake detection has better accuracy than other alternatives in detecting synthetic speech. We are excited to partner with Pindrop in this mission, and add Pindrop’s deepfake detection technology in the solution for our customers and users across the world.”

Pulse Inspect offers trust and safety teams a forensics tool to enhance their disinformation detection workflows.

  • Best-in-class Performance: Pindrop has trained its deepfake detection model on over 370 deepfake generation tools and over 20M statements (both genuine and synthetic), enabling us to achieve over 99% accuracy against previously seen deepfake models and 90% accuracy against “zero-day” attacks that use new or previously unseen tools. Third parties have also confirmed that our solution had over 40 percentage points higher accuracy on audio than competing solutions.
  • Resilience: News and social media are global businesses and need support to detect deepfakes across various languages. Pindrop Pulse™ Inspect is language agnostic, and its underlying models have been tested and validated on over 40 languages covering more than 90% of the internet’s spoken languages. The technology is also resilient to adversarial attacks such as the addition of noise, reverberation, or speech changes.
  • Breadth of Audio: The same Pindrop Pulse technology that identifies over a million social engineering attempts in the call center has now expanded to digital media. Pulse Inspect supports both phone call audio (8kHz) and high-fidelity social media audio (44.1kHz). It also provides detection regardless of whether synthetic speech was created using text-to-speech, speech-to-speech, or voice conversion techniques.
  • Video Support: Pulse Inspect supports audio deepfake detection in videos. The platform analyzes video files for AI-generated speech by extracting audio content out of video media types. 

  • Explainability: Pulse Inspect offers segmental analysis of uploaded media to aid in the detection of partial deepfakes. This feature provides a visual indicator to help users determine which segments in a long-form media file are synthetically generated versus which most likely do not contain synthetic speech.
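As a conceptual sketch, segment-level output might be consumed as shown below; the scores, threshold, and segment length are hypothetical values for illustration, not actual Pulse Inspect output.

```python
# Sketch of how per-segment synthetic-speech scores could surface a
# partial deepfake: flag segments whose score exceeds a threshold.
# All values here are hypothetical.
segment_scores = [0.03, 0.05, 0.91, 0.88, 0.07]  # hypothetical scores
THRESHOLD = 0.5          # assumed decision threshold
SEGMENT_SECONDS = 5      # assumed fixed-length segments

for i, score in enumerate(segment_scores):
    start, end = i * SEGMENT_SECONDS, (i + 1) * SEGMENT_SECONDS
    label = "LIKELY SYNTHETIC" if score >= THRESHOLD else "likely genuine"
    print(f"{start:>3}s-{end:>3}s  score={score:.2f}  {label}")
```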

Free trial

With Pulse Inspect in Preview, we invite those responsible for identifying and reporting on deepfakes to evaluate our technology at no cost.

Request access to a free trial here.

1. https://www.pindrop.com/blog/pindrop-named-a-winner-in-the-ftc-voice-cloning-challenge
2. https://synthical.com/article/c51439ac-a6ad-4b8d-82ed-13cf98040c7e
3. https://www.pindrop.com/blog/exposing-the-truth-about-zero-day-deepfake-attacks-metas-voicebox-case-study
4. In the NPR study, Pindrop detected 81 out of possible 84 (96.4%) voice samples correctly, compared to the nearest competitor who detected 47 out of 84 (56% – excludes samples identified as inconclusive).
5. Statista: Languages most frequently used for web content as of January 2024
6. Terms and conditions apply.
