When a powerful new technology like AI is dropped into society with no regulation, some people will inevitably use the tool for their own benefit. Unsurprisingly, AI is making scammers’ jobs easier and their schemes more believable. We wrote about the “virtual kidnapping” scam making the rounds: scammers use generic voice recordings of children in distress and call families, hoping the targets will think it’s their child on the other end. But now, scammers can clone your child’s actual voice with the help of AI voice generators.
What is an AI voice generator?
This is how Voicemod, a Spanish sound effect software company, defines AI voice generator technology:
AI voice is a synthetic voice that mimics human speech using artificial intelligence and deep learning. These voices can be used by converting text to speech, like with Voicemod Text to Song, or speech to speech, which is how our AI voice collection works.
The technology is now being used to enhance what the Federal Trade Commission categorizes as “imposter scams,” where scammers pretend to be a family member or friend to swindle victims, usually elderly people, out of their money. In 2022, there were over 5,100 reports of imposter scams over the phone, totaling $11 million in losses, according to the Washington Post.
There are services readily available for people to generate voices using AI, such as Voicemod, with very little oversight. Microsoft’s VALL-E text-to-speech AI model claims to be able to simulate anyone’s voice with just three seconds of audio, as reported by Ars Technica.
“It can then re-create the pitch, timbre and individual sounds of a person’s voice to create an overall effect that is similar,” Hany Farid, a professor of digital forensics at the University of California at Berkeley, told the Washington Post. “It requires a short sample of audio, taken from places such as YouTube, podcasts, commercials, TikTok, Instagram or Facebook videos.”
How does the AI voice generator scam work?
Scammers use the AI voice generator technology to mimic someone’s voice, often a young child, to fool a relative into thinking the child is being held for ransom. The scammer demands a sum of money in return for releasing the child safely.
As you can see from this NBC News report, it’s easy to grab samples of a person’s voice from social media and use them to generate anything you want that person to say. The result is so realistic that the reporter was able to fool coworkers into thinking the AI-generated voice was actually her, with one of them agreeing to lend her their company card to make some purchases.
The scam is incredibly realistic, and people are being caught off guard. There are also different variations using the same technology. A couple from Canada lost $21,000 CAD (about $15,449 in U.S. dollars) following a phony phone call from a “lawyer”: The scammer pretended to represent their son, who had allegedly killed a diplomat’s son in a car accident and needed money for “legal fees” while he “was in jail,” according to the same Washington Post report.
What can you do to avoid falling for the AI voice generator scam?
The best line of defense is awareness. If you know about the scam and understand how it works, you’re more likely to recognize it if it happens to you or one of your loved ones.
If you receive such a phone call, immediately call or video chat the child or “victim” who was supposedly kidnapped. As counterintuitive as it may seem to do this instead of contacting the authorities or reaching for your credit card to pay the ransom, contacting them yourself will shatter the scammer’s illusion. If it is, in fact, a scam, you will hear or see the “victim” on the other end going about their regular day.
Beyond awareness and verification, unfortunately, there is not much else you can do to avoid becoming a target. Technology exists that can detect AI-generated video, images, audio, and text, but it is not yet available to the public in a form that helps in situations like this. The hope is that the same technology that created this problem will eventually provide the means to manage it.