Published by Security Testing and Assurance on 10 August 2023
Over the past year, artificial intelligence (AI) has rapidly shifted from the stuff of science fiction to a commonplace concept. Tools that generate intricate images out of thin air, along with groundbreaking generative AI applications like ChatGPT, have made AI more accessible, applicable and popular than ever before.
It shouldn’t come as a surprise to anyone that these AI tools are firmly on the radar of scammers too. That’s what my team at CyberCX set out to show viewers of 60 Minutes when the program approached us to demonstrate just how easy it is for scammers to utilise seemingly complex AI technology to target their victims.
We decided to show how AI tools can be used by scammers to evolve the age-old scamming technique of call spoofing.
Here’s how we did it, as well as what you need to know to keep yourself safe from being scammed.
Spoofing 60 Minutes
Call spoofing is a technique in which scammers disguise the calling party ID of a phone call. You might think you’re getting a call from your bank – because that’s what the caller ID says – but really it’s a scammer pretending to be your bank.
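For the technically minded, the reason caller ID can’t be trusted is that it’s just a field the calling system fills in. Below is a minimal, purely illustrative Python sketch of a SIP INVITE – the message that starts a VoIP call – with a forged From header. Every name, number and address in it is made up.

```python
# Illustrative only: a skeleton SIP INVITE showing why caller ID is easy to forge.
# The From header is supplied by the caller's own equipment, so a scammer's
# system can put any name and number in it. All details below are made up.

def build_spoofed_invite(victim_uri: str, fake_name: str, fake_number: str) -> str:
    """Return the opening lines of a SIP INVITE with a forged From header."""
    return "\r\n".join([
        f"INVITE {victim_uri} SIP/2.0",
        # Nothing in basic SIP verifies this header against the real caller.
        f'From: "{fake_name}" <sip:{fake_number}@scammer.example>;tag=1234',
        f"To: <{victim_uri}>",
        "CSeq: 1 INVITE",
    ])

print(build_spoofed_invite("sip:victim@telco.example", "Your Bank", "0400000000"))
```

Nothing in the basic protocol checks that header against the line actually placing the call, which is why spoofing has to be tackled at the carrier level rather than by the person answering the phone.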
With AI, spoofing is about to become even harder to detect.
Working with Nine’s 60 Minutes, we created an AI clone of reporter Amelia Adams’ voice. In a spoofed call that appeared to come from Amelia’s number, we used AI tools to convince one of her colleagues that they were having a conversation with her. We then asked the colleague to hand over Amelia’s passport details.
Overcoming the roadblocks of voice cloning
Nearly all publicly available services that allow for easy, rapid voice cloning impose at least one of two restrictions: they only support American accents, and/or they require a spoken authorisation letter recorded in the voice being cloned.
While bypassing the authorisation letter can be straightforward for criminal scammers, we couldn’t find a service whose Australian accent we thought would be convincing enough. So, we trained our own model from scratch.
Jason Edelstein, Executive Director of Security Testing and Assurance
The voice
To convince Amelia’s colleague to hand over her passport details, we had to ensure the AI voice sounded as close to Amelia’s actual voice as possible.
- We needed a good-quality recording of Amelia speaking as the basis of the cloning process. The more audio, the better. Searching YouTube, we found an old podcast recording featuring Amelia.
- We used a program to analyse the recording and learn the unique characteristics of Amelia’s voice. The program then used this data to create a ‘model’, or blueprint, that captured the important features and patterns of her voice (a sketch of this step follows the list).
- The final step was to feed the application audio of a voice actor saying the lines we needed Amelia to say. The program could pick out words and reshape them to mimic the way Amelia would say them. When it finished, we were left with an audio file of Amelia saying things she never said.
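We haven’t published our actual pipeline, but as a rough illustration of the analysis step described above, here is a minimal Python sketch using the open-source librosa library. The file name is made up, and the features shown – timbre and pitch – are just two of the characteristics a voice model captures.

```python
# A minimal sketch of the 'analyse the recording' step, using the open-source
# librosa library. This is not our actual pipeline, just an illustration of
# the kind of features a voice model is built from. The file name is made up.
import numpy as np
import librosa

# Load the source recording (e.g. a podcast episode) as mono audio.
audio, sr = librosa.load("amelia_podcast.wav", sr=22050, mono=True)

# Timbre: MFCCs summarise the spectral 'shape' that makes a voice recognisable.
mfcc = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=13)

# Pitch: the fundamental frequency contour captures intonation and speaking style.
f0, voiced_flag, _ = librosa.pyin(
    audio, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C7"), sr=sr
)

print(f"{mfcc.shape[1]} frames of timbre features, "
      f"mean pitch {np.nanmean(f0):.0f} Hz on voiced frames")
```

A real cloning pipeline trains a neural model on features like these; the point is simply that everything it needs can be extracted from audio that is already public.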
By predicting the approximate flow of the conversation, we could prepare the sentences and phrases most likely to come up. Instead of playing a prepared script and hoping it worked verbatim, this let us engage in a genuine back-and-forth with Amelia’s colleague, with answers ready for a range of possible responses.
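Conceptually, the result was a ‘soundboard’ of clips in the cloned voice, keyed by the conversational turn they answer. Here is a simple Python sketch of the idea – the file names, phrase keys and the simpleaudio playback library are our illustration, not the tooling used on the day.

```python
# A sketch of the 'soundboard' approach: pre-generated clips in the cloned
# voice, keyed by the conversational turn they answer. File names, keys and
# the simpleaudio dependency are illustrative, not the tooling actually used.
import simpleaudio as sa

PHRASES = {
    "greeting": "clips/hi_its_amelia.wav",
    "why_call": "clips/need_my_passport_details.wav",
    "stall":    "clips/sorry_bad_line_say_again.wav",
    "thanks":   "clips/great_thanks_so_much.wav",
}

def play(key: str) -> None:
    """Play the prepared clip for this point in the conversation; block until done."""
    sa.WaveObject.from_wave_file(PHRASES[key]).play().wait_done()

# The operator listens to the callee and picks the best-fitting response live:
play("greeting")
```

The ‘stall’ clip matters as much as the scripted answers: a generic line buys the operator time whenever the conversation goes somewhere unexpected.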
Text-to-Speech and morphing
The most common form of AI speech generation is Text-to-Speech (TTS), which lets the user type a sentence that is then read aloud by the voice model. While this technology can be useful, for our scenario we opted for a technique known as ‘morphing’.
Morphing allowed us to use a voice actor to give emphasis and emotion to the phrases we needed Amelia to say, adding an aspect of realism that isn’t present in TTS voice clones.
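To make the distinction concrete, here is a short sketch. The TTS path uses the real offline pyttsx3 library; morph_voice() is a deliberately unimplemented placeholder standing in for a voice-conversion model, since we aren’t naming the actual tools.

```python
# TTS vs morphing. The TTS half uses the real pyttsx3 library; morph_voice()
# is a hypothetical placeholder for a voice-conversion (morphing) model.
import pyttsx3

def tts_line(text: str, out_path: str) -> None:
    """Text-to-Speech: typed text in, flat synthetic delivery out."""
    engine = pyttsx3.init()
    engine.save_to_file(text, out_path)
    engine.runAndWait()

def morph_voice(actor_wav: str, target_voice_model: str, out_path: str) -> None:
    """Morphing (hypothetical API): a voice actor's performance in; the same
    emphasis and emotion out, reshaped to sound like the target voice."""
    raise NotImplementedError("stand-in for a real voice-conversion model")

tts_line("Hi, it's Amelia. Can you grab my passport details?", "tts_flat.wav")
```

The trade-off is flexibility versus realism: TTS can say anything typed on the spot, while morphing needs a human performance for every line but carries that performance’s emotion into the cloned voice.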
The end result of all our work was a spoofed call that appeared to come from Amelia’s phone number, with an AI voice model that sounded exactly like Amelia on the other end of the line.
What does this mean?
It is an unfortunate reality that wherever technology goes, scammers will follow. The proliferation of AI technology is the latest example of this.
That’s why we felt it was important for us to demonstrate to 60 Minutes’ audience how scams are evolving with AI and emphasise the need to stay vigilant. There are a few important lessons you can take away from this:
- Call spoofs are real and common – just because you recognise the number or the caller ID doesn’t mean the call is legitimate. New AI tools can now also mimic the voice and speaking style of someone you might know or trust.
- If anything ever seems suspicious – for example, someone you know asking for personal or sensitive information they wouldn’t normally ask you for – hang up and call the person back, or contact them via a different communication channel.
- If you’re unsure, ask the person questions that only they would know the answer to. If they don’t understand or can’t remember, hang up.
- Don’t trust someone just because they have some of your personal information – this type of data can be scraped from the web or exposed by data breaches, which are becoming increasingly common.
- Don’t give out sensitive or personal information over the phone. This might seem hard to avoid, but if there’s a safer way (like in person), always try that.
You can watch the episode of 60 Minutes ‘Scamdemic’ with our spoof call here: https://www.9now.com.au/60-minutes/2023/episode-22
Author: Jason Edelstein – Executive Director, Security Testing and Assurance
We are hiring! CyberCX currently has open offensive security roles in penetration testing, adversary simulation, and AppSec across Australia and New Zealand. If you are interested in working with the largest and most capable team in the region in a fun, rewarding and challenging environment, please send your CV to [email protected]