Deepfakes could become easier to create, and Google is limiting Bard to prevent misuse, says Sundar Pichai.
Concerns about deepfake technology have been surfacing on the internet for a while now, even before the recent AI boom in popular culture. Speaking on CBS’ 60 Minutes, Google CEO Sundar Pichai sounded the alarm about deepfakes being used to impersonate public figures.
It will be possible with AI to create, you know, a video easily. Where it could be Scott saying something, or me saying something, and we never said that. And it could look accurate. But you know, on a societal scale, you know, it can cause a lot of harm.
Sundar Pichai, CEO, Google
Pichai also told interviewer Scott Pelley that his company is deliberately limiting its Bard AI to prevent this kind of misuse.
Google is relying partly on an incremental release to let society acclimatize to the technology, and partly on user feedback to develop “more robust safety layers before … [they] deploy more capable models,” Pichai said in the interview.
Current AI models can create fabricated imagery and audio that mimic public figures. All these models need is some training data, which is easy to gather off the internet for public figures. Today’s fake voice generation is still distinguishable from the real thing and sounds a little robotic (the same goes for the current generation of deepfake videos, which tend to look unpolished, unnatural, or both). But that might soon change.
If it does, anyone would be able to create convincing video and audio deepfakes of known personalities. That prospect invites widespread paranoia, as such footage can be instrumental in spreading chaos, false accusations, misinformation, or simply obscenity.
Citing the example of Bard picking up Bengali on its own without having been trained on the language, Pichai noted that even Google doesn’t fully understand how Bard arrives at its answers. He added that we don’t fully understand how the human mind works either. In AI, this is known as the “black box” problem.
Black Box Problem: AI models are not built from explicit, hand-written instructions the way typical software is. Most models work like this: you provide an input and get an output, but the mapping between the two is learned from training data rather than programmed step by step. Because of that, it is effectively impossible to trace exactly what “train of thought” the model followed to arrive at one particular output. And because many models sample their outputs randomly, the same input can produce a different output each time.
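To make that last point concrete, here is a minimal, hypothetical sketch of temperature sampling, the randomness most generative text models apply when picking their next token. The tiny vocabulary and the logit values are made up for illustration; real models score tens of thousands of tokens at every step.

```python
import numpy as np

def sample_next_token(logits, temperature=0.8, rng=None):
    """Pick one token index by sampling from the softmax distribution."""
    rng = rng or np.random.default_rng()
    scaled = np.asarray(logits, dtype=float) / temperature  # temperature reshapes the distribution
    probs = np.exp(scaled - scaled.max())                   # numerically stable softmax
    probs /= probs.sum()
    return rng.choice(len(probs), p=probs)                  # a random draw, not a fixed argmax

vocab = ["yes", "no", "maybe", "unsure"]  # made-up vocabulary
logits = [2.1, 1.9, 0.4, 0.1]             # the same "input" every run

for run in range(5):
    print(run, vocab[sample_next_token(logits)])
```

Run it a few times: the input never changes, but the sampled token often does, which is why identical prompts can yield different answers.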
Deepfake: Generative AI models trained on somebody’s facial expressions, voice samples, or existing images learn to identify that person’s distinctive patterns, features, and shapes. Based on this data, the model can then generate new images, audio, or video that mimic the person. Using this, people can create everything from fake voice recordings and questionable images to derogatory videos of a person.
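To illustrate the structure (not any specific tool), below is a minimal sketch of the classic face-swap setup behind early deepfake software: a shared encoder learns features common to both faces, one decoder per person learns to reconstruct that person, and the swap comes from decoding person A’s frame with person B’s decoder. The PyTorch layer sizes and the 64×64 frames here are assumptions chosen to keep the example small.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Shared encoder: learns features common to both faces."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1),   # 64x64 -> 32x32
            nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1),  # 32x32 -> 16x16
            nn.ReLU(),
        )

    def forward(self, x):
        return self.net(x)

class Decoder(nn.Module):
    """Per-person decoder: learns to reconstruct one specific face."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1),  # 16x16 -> 32x32
            nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1),   # 32x32 -> 64x64
            nn.Sigmoid(),
        )

    def forward(self, z):
        return self.net(z)

encoder = Encoder()
decoder_a = Decoder()  # trained (not shown) to reconstruct person A
decoder_b = Decoder()  # trained (not shown) to reconstruct person B

# The "swap": encode a frame of person A, decode it with B's decoder,
# producing B's face with A's pose and expression.
frame_of_a = torch.rand(1, 3, 64, 64)  # placeholder for a real video frame
fake_b = decoder_b(encoder(frame_of_a))
print(fake_b.shape)  # torch.Size([1, 3, 64, 64])
```

A real system would also need face detection, alignment, and thousands of training frames per person; the point here is only the shared-encoder, per-person-decoder structure that makes the mimicry possible.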