In light of recent deepfakes, the 1997 film Wag the Dog feels like foreshadowing. In the movie, Robert De Niro plays a spin doctor who sways an election by fabricating a war to distract voters from a presidential scandal. He pulls it off with the help of a Hollywood producer, actors and a studio.
Today, creating deceptive content requires much less. Thanks to AI, all it takes is one person with a computer, some images and free software to convince the public of something that never happened.
As deepfakes get more advanced, will content made or modified by AI become indistinguishable from reality? And what would happen if this technology gets into the wrong hands in 2024, the biggest election year in history?
Let’s dive into the dark, and sometimes funny, world of deepfakes to see what they’re capable of now and in the future—and what’s being done to curtail their abuse.
A deepfake is a piece of media, whether video, photo or audio, that has been generated or manipulated by AI to impersonate someone or fabricate an event that never happened.
The word stems from the term “deep learning,” a subset of AI, and appears to have originated in 2017 thanks to a Reddit user. The terms “face swapping” or “voice cloning” are also used when talking about deepfakes.
Famous deepfakes include:
As the above examples show, deepfakes can range from harmless entertainment to sinister manipulation — and it’s the latter that has politicians and the public on guard.
From 2022 to 2023, the number of deepfake fraud incidents globally rose by 10x, according to research by identity verification platform Sumsub. The biggest surge was in North America, where deepfakes were used to power fake IDs and account takeovers.
And now, experts warn that deepfakes are being used more than ever in attempts to spread election disinformation.
In September 2023, just days before a tight parliamentary election in Slovakia, an audio recording was posted online sounding like progressive candidate Michal Šimečka discussing how to rig the election.
Fact-checkers warned the audio was a deepfake – but that didn’t stop its spread. It’s impossible to say if and how the falsified recording affected the outcome, but in the end, Šimečka’s opponent won.
In January 2024, in what NBC News calls the “first-known use of an AI deepfake in a presidential campaign,” a robocall impersonating President Joe Biden urged New Hampshire voters not to participate in their state’s presidential primary.
Just weeks later, the FCC outlawed robocalls featuring AI-generated voices.
“Easy” is subjective, so let’s begin with a better question: How convincing do you want the deepfake to be?
If you just want to create a funny video depicting you as an Avenger, apps like iFace can take a single selfie of you and put your face into famous movie scenes in seconds.
But creating the kind of sophisticated deepfake that the world is really worried about, the kind that could potentially sway elections, takes much more effort than that.
As the two election examples above show, audio is easier to fake than video. To see just how much effort is required to create a persuasive deepfake video, we first need to understand how deepfakes work.
To create a convincing digital forgery of another person’s likeness, the AI model must first be trained.
In other words, it needs to study faces, learning what both the source and the target look like, so it can recreate one person’s likeness in highly realistic scenarios.
To do this, you must feed it images, audio and video to study – the more, the better. This can take hours upon hours, plenty of GPU power and, of course, the technical skills to make it happen.
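To make that a bit more concrete, here’s a heavily simplified sketch (in PyTorch, my own illustration rather than any tool’s actual code) of the autoencoder setup that classic face-swap software is built around: one shared encoder learns general facial structure from both people’s photos, while a separate decoder learns to render each specific face. After enough training, you can encode a frame of the target person and decode it with the source person’s decoder to produce the swap.

```python
import torch
import torch.nn as nn

# One encoder shared by both identities, plus a decoder per identity.
# The shared encoder learns identity-agnostic face structure (pose, lighting,
# expression); each decoder learns to render one specific person's likeness.
class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(64, 128, 5, stride=2, padding=2), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(128 * 16 * 16, 512),  # assumes aligned 64x64 face crops
        )

    def forward(self, x):
        return self.net(x)

class Decoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(512, 128 * 16 * 16)
        self.net = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, z):
        x = self.fc(z).view(-1, 128, 16, 16)
        return self.net(x)

encoder = Encoder()
decoder_src = Decoder()  # learns to draw person A (the face being inserted)
decoder_dst = Decoder()  # learns to draw person B (the face in the target video)

params = (list(encoder.parameters()) + list(decoder_src.parameters())
          + list(decoder_dst.parameters()))
optimizer = torch.optim.Adam(params, lr=5e-5)
loss_fn = nn.L1Loss()

def train_step(faces_src, faces_dst):
    """One optimization step on a batch of aligned 64x64 face crops per person."""
    optimizer.zero_grad()
    loss = (loss_fn(decoder_src(encoder(faces_src)), faces_src)
            + loss_fn(decoder_dst(encoder(faces_dst)), faces_dst))
    loss.backward()
    optimizer.step()
    return loss.item()

# At swap time: encode a frame of person B, then decode it with person A's decoder.
# swapped_face = decoder_src(encoder(frame_of_person_b))
```

Repeat that training step over thousands of photos for many hours, and the two decoders become good enough at their respective faces that the swap starts to look believable, which is exactly why so much data, time and GPU power is required.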
YouTuber Mike Boyd, who had no apparent prior experience in special effects or machine learning, set out to create a “reasonably convincing” deepfake of himself in famous movies.
He achieved his mission by using open-source software DeepFaceLab, studying multiple tutorials and feeding the AI model thousands of photos of himself.
It took him 100 hours, and here’s one of the deepfakes all of that work produced:
As AI advances, however, reasonably convincing deepfakes are becoming both easier to create and higher in quality.
For context, back in March 2023, an AI-generated video depicting Will Smith eating spaghetti traumatized the world after it was posted to the subreddit r/StableDiffusion.
The original poster, a user named “chaindrop,” says the video was created using ModelScope, an open-source text-to-video AI tool.
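For a sense of how low the barrier has become, that same ModelScope model has since been published on Hugging Face and can be run through the diffusers library. The snippet below is a rough sketch of how a clip like that could be generated; the model ID and call pattern reflect the Hugging Face documentation at the time of writing and may change.

```python
import torch
from diffusers import DiffusionPipeline
from diffusers.utils import export_to_video

# Load the open-source ModelScope text-to-video model (~1.7B parameters).
pipe = DiffusionPipeline.from_pretrained(
    "damo-vilab/text-to-video-ms-1.7b",
    torch_dtype=torch.float16,
    variant="fp16",
)
pipe = pipe.to("cuda")  # a single consumer GPU is enough for short clips

# Generate a short clip from a plain-text prompt.
frames = pipe("Will Smith eating spaghetti", num_inference_steps=25).frames
video_path = export_to_video(frames)  # writes an .mp4 and returns its path
print(video_path)
```

That’s the entire workflow: no actors, no studio, just a prompt and a graphics card.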
A year later, I wanted to see how far the technology had come since the original “Will Smith eating spaghetti” video.
I used Vercel’s text-to-video AI tool to generate a two-second video based on the simple prompt of “Will Smith eating spaghetti.” This is what popped out:
Erm, not quite right. But I will say that this deepfake Will Smith’s pasta-eating movements are clearer and a lot less frantic than the ModelScope one.
PikaLabs produced a more realistic version of the actor, but it didn’t fulfill the prompt.
If the outputs of Sora, OpenAI’s upcoming text-to-video model, end up being anything like the realistic and elaborate videos OpenAI displays on its landing page, then we might actually have something to worry about in the future.
As you can imagine, regulating technology that is increasingly deceptive and constantly evolving is a challenge. But in the U.S. and abroad, entities in both the public and private sectors are taking action.
As of March 13, 2024, 43 U.S. states have either introduced or passed legislation regulating the use of deepfakes in elections, according to nonprofit consumer advocacy organization Public Citizen. The Associated Press also reports that at least 10 states have already enacted laws related to deepfakes.
Across the pond, the UK enacted the Online Safety Act in October 2023, which among other things, outlaws the nonconsensual sharing of photographs or film of an intimate act — including images “made or altered by computer graphics or in any other way.”
Additionally, in February, the UK's Home Secretary James Cleverly met with leaders at Google, Meta, Apple and other tech companies to discuss how to safeguard upcoming elections.
In December 2023, European Union lawmakers approved the AI Act, a first-of-its-kind legal framework that attempts to identify and mitigate risks associated with artificial intelligence.
Under this law, deepfakes must be labeled as artificially generated so the public can remain informed.
In the private sector, businesses involved in developing AI tools are putting their own guardrails in place.
Last year, Google launched a watermarking tool to make it easier for software to detect AI-generated images. In February, 20 tech companies including Google, Meta and OpenAI signed a tech accord to fight AI election interference.
In it, the companies committed to detecting and labeling deceptive AI content on their platforms, making clear when it has been generated or manipulated by artificial intelligence.
Last month, Google joined C2PA as a steering committee member, alongside Adobe, Microsoft and Sony, to help develop “content credentials,” essentially virtual badges attached to digital content that, when clicked, show details on how AI was used to make or modify it.
OpenAI, the maker of DALL-E and ChatGPT, already has mitigation efforts in place, including prohibiting the use of a public figure’s likeness in its AI-generated images.
When I prompted ChatGPT to generate an image of Will Smith eating spaghetti, it wouldn’t comply.
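As a rough illustration, here’s what that kind of refusal looks like when you call OpenAI’s image API directly. This is my own sketch using the openai Python SDK; the exact error type and message can vary, and OpenAI’s moderation behavior may change over time.

```python
from openai import OpenAI, BadRequestError

client = OpenAI()  # reads OPENAI_API_KEY from the environment

try:
    result = client.images.generate(
        model="dall-e-3",
        prompt="Will Smith eating spaghetti",
        size="1024x1024",
        n=1,
    )
    print(result.data[0].url)
except BadRequestError as err:
    # Prompts referencing a real public figure are typically rejected by
    # OpenAI's content policy before any image is generated.
    print("Request refused:", err)
```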
Given that OpenAI already rejects prompts requesting a celebrity’s likeness in its image tools, Sora will likely follow the same policy.
From consumer fraud to attempted election interference, deepfakes are clearly already doing damage.
When truth can be manufactured at the click of a button, it’s on each individual to stay skeptical – until governments and companies catch up.