As with much of the hype surrounding AI, claims about the current abilities of voice cloning are somewhat overblown. Earlier this year a track called “Heart on my Sleeve” was touted as an “AI-generated” Drake and The Weeknd song, which led many to assume that the song, its lyrics, and its vocals were entirely an AI creation. In fact, a representative for the song’s anonymous creator confirmed to the New York Times that it was “an original composition written and recorded by humans.” The only “AI” involvement was the use of voice-changing technology to make the singer’s voice sound like Drake and The Weeknd, essentially an advanced version of Auto-Tune. Similarly, “vocaloids” like Hatsune Miku and voice assistants like Siri are created using banks of voice samples from real people (Saki Fujita and Susan Bennett, respectively).
While AI enthusiasts frequently declare that further huge leaps are just around the corner, the gulf between voice-cloning technology and a “true” AI Robin Williams performance is analogous to the difference between flying around the world in a hot air balloon and traveling to Alpha Centauri. Because generative AI works by analyzing datasets and generating the most probable response, it’s not very good at being funny: comedy largely depends on subverting expectations. The challenge is compounded by the complexity of comedic performance, where a joke can live or die depending on how it’s delivered.
Nonetheless, AI is a threat to writers and actors because studio executives don’t necessarily care about quality. As Zelda Williams observed:
“These recreations are, at their very best, a poor facsimile of greater people, but at their worst, a horrendous Frankensteinian monster, cobbled together from the worst bits of everything this industry is, instead of what it should stand for.”