Conversational AI Is A Novel Opportunity for Hollywood — but Do Talent or Consumers Want It?

Photo Illustration: Variety VIP+; Adobe Stock

An emerging outgrowth of generative AI development is focused on conversational AI — speech-based applications, including voice clones of celebrities, designed for consumer entertainment.

Use of AI voices in Hollywood has so far been discussed most commonly for content localization, enabling broader distribution of film and TV dubs into overseas markets that media companies wouldn’t otherwise have had the budget to create (discussed in VIP+’s June 2024 special report, “Generative AI in Film & TV”).

Yet synthetic voices could have uses in professional content beyond localization. With consent and compensation for talent or their estates, these could include voiceover narration (e.g., for sports content, animation or documentaries), scripted audio applications (e.g., audiobooks) or voice-enabled interactive chatbots powered by large language models (LLMs).

Certain kinds of uses could offer audiences personalized content experiences at a speed and scale that wouldn’t be achievable without AI, while still being high-quality — in turn expanding the opportunity for talent to monetize their likeness and personalize interactions with fans.

In a notable recent example from a media company, NBCUniversal announced that Peacock will deliver personalized 10-minute video recaps with highlight clips of the previous day’s Paris Olympics coverage, narrated by an AI voice clone of TV sportscaster Al Michaels. To personalize their versions, users will be able to select their three favorite sports and preferred topics on the platform.

Yet more recent developments signal growing interest among AI and tech companies to create their own entertaining content experiences with conversational AI — in some cases by pursuing partnerships with celebrity and creator talent to customize applications or AI chatbots with their voices or personas:

  • ElevenLabs launched Iconic Voices for its new Reader app, featuring AI voices of four famous deceased actors — Judy Garland, James Dean, Burt Reynolds and Laurence Olivier — that will read digital text aloud, provided in various formats including links, PDF and ePub. Exact licensing terms weren’t disclosed but were negotiated with celebrity rights management agency CMG Worldwide to compensate the actors’ estates.
  • Character.ai, the AI firm offering chatbots with customizable personas, launched Character Calls, a feature that allows users to engage in real-time interactive voice conversations with any of its AI bots. Following the March launch of Character Voices, which simply allowed chatbots to speak responses aloud during text-based chats, users can select a voice for their chatbot character from a voice library or create their own by supplying a sample of recorded speech. In an important difference from ElevenLabs’ Iconic Voices, the text content of these chatbot characters is “unscripted,” automatically generated by an LLM and then converted to speech.
  • Meta said its AI Studio will begin testing the ability for a select group of about 50 creators to make AI chatbot versions of themselves on Instagram, which are being made publicly available for some U.S. users to direct message on the app, with a wider rollout coming in August. The announcement follows Meta’s launch last September of 28 AI characters modeled on real-world celebrities and creators as text-based chatbots (powered by Llama 3) for users to message on Instagram, Messenger and WhatsApp. A Meta spokesperson this week told VIP+ the company had nothing to share about voice capabilities, but TechCrunch previously reported that voice would be coming to Meta’s AI chatbots sometime this year.
  • Google is reportedly developing a product similar to Character.ai that would allow users to create, customize and talk to their own chatbot personas and could launch sometime this year, per The Information. In addition, chatbots might be modeled on real-world celebrities, and Google has reportedly discussed partnerships with influencers. It’s unclear if interactive voice would come to Google’s personified chatbots, but its multimodal model Gemini could certainly be adapted for it.

Now, with the wave of new activity around celebrity AI chatbots, a bigger question is whether app users will be comfortable with and want to engage with them. It’s still early days: use of these applications and features remains limited to a minority of U.S. consumers, but more are at least aware of them, even if they haven’t used them, according to a VIP+ survey conducted by HarrisX in May 2024.

As development proceeds, generative AI and media companies will need to consider the format such AI experiences take to be compelling and engage audiences. About two-thirds of U.S. consumers were at least somewhat interested in interactive digital experiences enabled by gen AI — whether text-based chatbots, a voice application or a hyperrealistic video avatar — modeled after a celebrity or someone they personally know (e.g., a relative), per the VIP+/HarrisX survey.

But consumer interest drops substantially when respondents are asked about specific ways those experiences might be presented. In fact, most U.S. consumers said they wouldn’t be interested in various types of digital experiences with synthetic beings. For example, only a third (33%) of consumers said they’d be very or somewhat interested in an interactive virtual version of a famous dead person, and 30% said the same for a living celebrity.

While a third expressing interest isn’t inconsequential, the discrepancy could suggest that these experiences are compelling in theory but less so in practice. For most consumers, acceptance and engagement with such experiences isn’t likely to be automatic.

Beyond consumer acceptance, talent’s own openness to such partnerships isn’t guaranteed. AI voice tech brings substantial risk of unauthorized use, which has likely tainted whatever appeal it may have held. The voices of screen and voice actors have been cloned without consent or compensation, then publicly distributed or used in content.

User-generated nonconsensual deepfakes of famous voices have proliferated, with Morgan Freeman and Joe Biden among the notable victims.

Unauthorized use has come not only from bad actors wielding voice-cloning tools but also from the AI companies that provide the enabling tech. Among some artists, trust in certain companies to be fair partners has likely already been broken.

Although OpenAI claimed it had used a different voice actor to create the “Sky” voice for ChatGPT, it sounded enough like Scarlett Johansson to have the same effect as a nonconsensual clone, especially after she had previously declined to participate. Voice actors have also filed a class action lawsuit against AI voice startup LOVO for using clones of their voices to promote its tool without permission or payment; the alleged clones include celebrities Johansson, Conan O’Brien, Ariana Grande and Joe Rogan.

Most consumers and entertainment workers alike oppose nonconsensual deepfakes of voice and image likeness. Aside from potential for lawsuits, applications that make use of unauthorized voices are more likely to see backlash.

Even so, not all consumers, or AI companies, share a common understanding of the ethical use of such systems. Meanwhile, laws penalizing misuse remain scant and uneven across states, though more are in progress. Such laws would allow actors and individuals to protect their voices as personal or intellectual property, given that voice likeness itself cannot be copyrighted.

In the near term, media companies or talent who embark on such partnerships will need to put safety measures in place to prevent misuse, though risk-free use will be hard to guarantee. Text- or voice-based chatbots and applications that make use of an actor’s voice or persona carry reputational risk, as user interactions with an LLM would by nature have celebrity AIs say things the person never did.

To avoid consumer confusion, celebrity-based AI would need to be intentionally packaged, labeled, disclosed or disclaimed as being AI-powered, such that it’s immediately understood that the celebrity isn’t the speaker. If a celebrity AI chatbot, whether text or voice enabled, is powered by an LLM, there would need to be controls programmed into the model, restricting certain responses deemed out of bounds or off-brand (e.g., profanity filters).

Still, LLM-based chatbots are ultimately less controllable than scripted speech (providing specific text to a text-to-speech model), and savvy users might attempt to “jailbreak” models to bypass their programmed restrictions. In the case of ElevenLabs’ Reader app, where users can upload any text for celebrity AI voices to recite, the company moderates text with a combination of automated and human review and, per its terms of service, doesn’t generate speech for content identified as harmful (e.g., hate speech).

It’s worth noting the ElevenLabs partnership involved deceased celebrities, who can no longer say anything new themselves and whose historical record is unlikely to be challenged. Chatbots for beloved fictional characters from owned IP (e.g., Yoda, Homer Simpson, Tony Soprano) could also be considered safer from reputational risk, as their purpose would be more easily understood as a novel form of entertainment not associated with a real person.

“That’s the question we are seeing frequently now where people are trying to figure out how to use film and TV characters and make them available in that kind of an easy chatbot application, where, say, you speak with your favorite character from a book or a movie,” ElevenLabs co-founder Mati Staniszewski said to VIP+ in August 2023.
