I don’t know the exact date when deepfakes were invented. I imagine this is akin to asking who invented sliced bread. There were knives, and then there was bread, but whoever was the first person to plunge a blade into that hot lump of gluten we will probably never know. It doesn’t matter; the person I’m interested in is whoever decided to grill that sliced bread with butter and cheese. Mmmmm, grilled cheese… Where was I again? Oh, right, deepfakes. Similarly, human faces have been around for a few years, and then this thing by the name of machine learning (ML) escaped containment and forever changed our lives.
Machine learning, specifically neural networks, powers deepfakes. While the term deepfake has become a catchall for any image or video that seems fraudulent, for a deepfake to be authentic (oxymoronic statements make me laugh), the fake must be built using a machine learning toolset.
From here on, I will use the term shallow fake to mean a deepfake of one’s self. So when was the first shallow fake introduced? I do not know, so instead let’s use the release of the movie Tron: Legacy (2010) as our baseline. Why this movie? Mostly because it provides an entertaining milestone as our first publicly observed shallow fake. Tron: Legacy featured actor Jeff Bridges as both himself and a version of himself 28 years younger. After the film was released, Jeff was asked how he felt about being motion captured and having his face scanned with lasers and recreated via CGI. Jeff replied, “Of course, there’s always the chance that one day actors won’t be needed at all, as filmmakers fully create their characters in a computer. I go between worry and gratefulness. Maybe I’ll be ready to do something else by that time,” he laughed about the possibility. “There is going to be a time — it’s probably already here — when they say, ‘Let’s get a combination of [co-star] Garrett [Hedlund] and Bridges and let’s put a little Bela Lugosi in there. What the hell — let’s see what happens!’” Just 11 years later, we can do this… on a cell phone.
Other than using shallow fakes to make Disney movies, how can this technology be applied “productively”? Zoom meetings, obviously! In October 2020, NVIDIA introduced its MAXINE video-streaming platform, which enables users to shallow fake in real time. MBAs will note that NVIDIA created a platform instead of a downloadable app.
MAXINE includes features that grant end-users the ability to remove nose piercings and blemishes and to correct lighting, all while reducing video bandwidth to a fraction of today’s standards. Bad haircut? Don’t worry. Forgot to put on makeup? Oh well. Failed to wipe away your eye boogers? If no one can see them, they never happened. Apparently, how one views oneself during a Zoom session has led to a record number of elective surgeries. Check out this article published by The New York Times, titled “Don’t Like What You See on Zoom? Get a Face-Lift and Join the Crowd.” WOW! “Cosmetic surgeons say business is booming after elective surgery opened up, with quarantine proving ample time to heal in secrecy from renovation of face and body.” 1
If this hasn’t scared you enough, wait until you hear what’s on the horizon. Need a hint? Deepfake audio… like in The Terminator (1984), when the T-800 impersonates Sarah Connor’s mother over the phone, or in Terminator 2 (1991), when the T-1000 mimics the voice of John’s foster mother.
Part 2: This is where I attempt to explain how deepfake technology works, using simple concepts. Above was the meat and potatoes; below is the room-temperature sparkling water to wash it all down. Fair warning: I am not an expert on this subject… yet, so please don’t use this information in your dissertation.
Deepfake broad concept
If I asked you, given 3x = 30, what is x? You would answer 10. Great, you’ve just mastered the extreme basics behind deepfake technology. The resulting graph would look like this. Okay, hold onto this thought.
Computers are great with numbers, so the question becomes: how can we represent an image as numbers and feed it to an algorithm that’s insanely good at finding x, where x is a facial feature? Mathematically, you would get something that looks like the neural network below, but let’s use cartoons instead.
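To make “an image is just numbers” concrete, here is a tiny sketch. The 2x2 “image” and its pixel values are invented purely for illustration; real face images are just much bigger grids of the same kind of numbers.

```python
import numpy as np

# A grayscale image is a grid of brightness values (0 = black, 255 = white).
# This 2x2 "image" is made up for illustration.
image = np.array([[0, 255],
                  [128, 64]], dtype=np.uint8)

# A neural network sees it as a flat list of numbers.
pixels = image.flatten()
print(list(pixels))  # [0, 255, 128, 64]
```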
Hold it! We still aren’t there yet. We actually have to do this twice: once for the original face and again for the imposter.
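For the curious, the classic face-swap setup (the shared-encoder, two-decoder design popularized by early open-source deepfake tools) is why we “do this twice.” Below is a minimal sketch in plain NumPy; single random matrices stand in for real deep networks, and every size here is an arbitrary assumption, so treat it as a diagram in code rather than a working deepfake.

```python
import numpy as np

rng = np.random.default_rng(0)

# One shared encoder compresses any face into a small "essence" vector.
encoder = rng.normal(size=(16, 8))
# Two decoders: one learns to redraw face A, the other face B.
decoder_a = rng.normal(size=(8, 16))
decoder_b = rng.normal(size=(8, 16))

def reconstruct(face, decoder):
    latent = face @ encoder   # encode: face -> essence
    return latent @ decoder   # decode: essence -> face

# Training would tune encoder + decoder_a on face-A images and
# encoder + decoder_b on face-B images. The swap trick: encode
# face A's expression, then decode it with B's decoder.
face_a = rng.normal(size=16)
swapped = reconstruct(face_a, decoder_b)
print(swapped.shape)  # (16,)
```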
Now we get to the good part. The computer has mapped out how face A and face B look and move. To create the deepfake, we say to the computer: if 3x = 30 and x = 10, what is x when 3.4x = 30? Think of the ‘3’ as the original face and the ‘3.4’ as the imposter’s face. The computer would say, “I don’t have an exact number, but I can get close,” and reply that x equals roughly 8.82353, which is close enough to fool the human eye. The more data points the algorithm is trained with, the more accurate the estimate will be, which is why deepfaking celebrities is easy: there are plenty of images and videos of their faces in different expressions. With enough data points, impersonating your own face is not that hard; the ‘3.4’ becomes more like ‘3.02’.
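The arithmetic in that analogy works out like this. This is a toy calculation standing in for the real math, not anything a deepfake system actually computes.

```python
# The "original face" is the coefficient 3, the "imposter's face" is 3.4,
# and both are asked to satisfy coefficient * x = 30.
target = 30
x_original = target / 3.0   # exactly 10.0
x_imposter = target / 3.4   # the "close enough" estimate

print(x_original)            # 10.0
print(round(x_imposter, 5))  # 8.82353
```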
I hope you enjoyed the crash course in deepfakes :)